Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

version control - Difference between GIT and CVS

What is the difference between Git and CVS version control systems?

I have been happily using CVS for over 10 years, and now I have been told that Git is much better. Could someone please explain what the difference between the two is, and why one is better than the other?

1 Reply

The main difference is that (as was already said in other responses) CVS is an (old) centralized version control system, while Git is distributed.

But even if you use version control as a single developer, on a single machine (single account), there are a few differences between Git and CVS:

  • Setting up repository. Git stores the repository in the .git directory at the top of your project; CVS requires setting up CVSROOT, a central place for storing version control info for different projects (modules). The consequence of that design for the user is that importing existing sources into version control is as simple as "git init && git add . && git commit" in Git, while it is more complicated in CVS.

  • Atomic operations. Because CVS began as a set of scripts around the per-file RCS version control system, commits (and other operations) are not atomic in CVS; if an operation on the repository is interrupted in the middle, the repository can be left in an inconsistent state. In Git all operations are atomic: either they succeed as a whole, or they fail without any changes.

  • Changesets. Changes in CVS are per file, while changes (commits) in Git always refer to the whole project. This is a very important paradigm shift. One consequence is that it is very easy in Git to revert (create a change that undoes) or undo a whole change; another is that in CVS it is easy to do partial checkouts, while it is currently next to impossible in Git. The fact that changes are per-file but grouped together led to the invention of the GNU ChangeLog format for commit messages in CVS; Git users use (and some Git tools expect) a different convention: a single line describing (summarizing) the change, followed by an empty line, followed by a more detailed description of the changes.

  • Naming revisions / version numbers. There is another issue connected with the fact that in CVS changes are per file: version numbers (as you can sometimes see in keyword expansion, see below) like 1.4 reflect how many times a given file has been changed. In Git each version of the project as a whole (each commit) has a unique name given by its SHA-1 id; usually the first 7-8 characters are enough to identify a commit (you can't use a simple numbering scheme for versions in a distributed version control system, since that requires a central numbering authority). In CVS, to have a version number or symbolic name referring to the state of the project as a whole, you use tags; the same is true in Git if you want to use a name like 'v1.5.6-rc2' for some version of a project... but tags in Git are much easier to use.

  • Easy branching. Branches in CVS are in my opinion overly complicated and hard to deal with. You have to tag branches to have a name for a whole repository branch (and even that can fail in some cases, if I remember correctly, because of per-file handling). Add to that the fact that CVS doesn't have merge tracking, so you have to either remember, or manually tag, merges and branching points, and manually supply the correct info to "cvs update -j" to merge branches, and branching becomes unnecessarily hard to use. In Git creating and merging branches is very easy; Git remembers all required info by itself (so merging a branch is as easy as "git merge branchname")... it had to, because distributed development naturally leads to multiple branches.

    This means that you are able to use topic branches, i.e. develop a separate feature in multiple steps in a separate feature branch.

  • Rename (and copy) tracking. File renames are not supported in CVS, and manual renaming might break history in two, or lead to invalid history where you cannot correctly recover the state of a project before the rename. Git uses heuristic rename detection, based on similarity of contents and filename (this solution works well in practice). You can also request detection of file copies. This means that:

    • when examining a specified commit you get the information that some file was renamed,
    • merging correctly takes renames into account (for example if the file was renamed only in one branch),
    • "git blame", the (better) equivalent of "cvs annotate", a tool to show the line-wise history of a file's contents, can follow code movement across renames as well.

  • Binary files. CVS has only very limited support for binary files (e.g. images), requiring users to mark binary files explicitly when adding them (or later using "cvs admin", or via wrappers that do this automatically based on file name), to avoid mangling of binary files via end-of-line conversion and keyword expansion. Git automatically detects binary files based on contents, in the same way GNU diff and other tools do; you can override this detection using the gitattributes mechanism. Moreover, binary files are safe against unrecoverable mangling thanks to 'safecrlf' being on by default (and the fact that you have to request end-of-line conversion, although this might be turned on by default depending on distribution), and because (limited) keyword expansion is strictly opt-in in Git.

  • Keyword expansion. Git offers a very, very limited set of keywords compared to CVS (by default). This is because of two facts: changes in Git are per repository and not per file, and Git avoids modifying files that did not change when switching to another branch or rewinding to another point in history. If you want to embed a revision number using Git, you should do this using your build system, e.g. following the example of the GIT-VERSION-GEN script in the Linux kernel sources and in the Git sources.

  • Amending commits. Because in a distributed VCS such as Git the act of publishing is separate from creating a commit, one can change (edit, rewrite) the unpublished part of history without inconveniencing other users. In particular, if you notice a typo (or other error) in a commit message, or a bug in a commit, you can simply use "git commit --amend". This is not possible (at least not without heavy hackery) in CVS.

  • More tools. Git offers many more tools than CVS. One of the more important is "git bisect", which can be used to find the commit (revision) that introduced a bug; if your commits are small and self-contained, it should then be fairly easy to discover where the bug is.
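
To make the "Setting up repository", "Amending commits" and "Naming revisions" points above a bit more concrete, here is a minimal sketch of importing an existing project, fixing a typo in the not-yet-published commit message, and tagging the result. The file names, the identity values and the tag name v0.1 are made up for illustration, and it assumes a reasonably recent git on the PATH.

```shell
#!/bin/sh
set -e

# Hypothetical existing project in a throwaway directory.
proj=$(mktemp -d)
cd "$proj"
printf 'int main(void) { return 0; }\n' > main.c
printf 'all: main\n' > Makefile

# The whole repository lives in ./.git; no central CVSROOT is involved.
git init -q
# Repository-local identity so the sketch does not rely on global config.
git config user.name 'Example Dev'
git config user.email 'dev@example.com'
git add .
git commit -q -m 'Import exisitng sources'

# Nothing is published yet, so the typo in the message can be amended.
git commit -q --amend -m 'Import existing sources'
subject=$(git log -1 --format=%s)

# One id names the state of the whole project, not of a single file.
full_id=$(git rev-parse HEAD)
short_id=$(git rev-parse --short HEAD)

# A lightweight tag gives that state a symbolic name.
git tag v0.1
tagged_id=$(git rev-parse v0.1)
file_count=$(git ls-tree -r --name-only v0.1 | wc -l | tr -d ' ')

echo "$short_id ($subject) tagged as v0.1, $file_count files"
```

The abbreviated id printed at the end is just a prefix of the full one; any unambiguous prefix can be used wherever a commit is expected.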

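The "Easy branching" and "Rename (and copy) tracking" points above can be tried out roughly as follows. This is only a sketch: the branch name topic/docs, the file names and the identity are assumptions, and the merge relies on Git's default rename detection.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example Dev'
git config user.email 'dev@example.com'

printf 'line 1\nline 2\nline 3\n' > notes.txt
git add notes.txt
git commit -q -m 'Initial commit'
# Remember the default branch, whatever it is called (master, main, ...).
base=$(git symbolic-ref --short HEAD)

# Step 1 on a hypothetical topic branch: rename a file.
git checkout -q -b topic/docs
git mv notes.txt manual.txt
git commit -q -m 'Rename notes.txt to manual.txt'
# The rename is not recorded; Git detects it from the identical content.
rename_status=$(git diff -M --name-status HEAD~1 HEAD | cut -f1)

# Step 2 on the same branch: add the actual feature.
printf 'feature\n' > feature.txt
git add feature.txt
git commit -q -m 'Add feature'

# Meanwhile the base branch edits the file under its old name.
git checkout -q "$base"
printf 'line 4\n' >> notes.txt
git commit -q -a -m 'Add line 4 to notes'

# Git finds the branching point itself and follows the rename.
git merge -q --no-edit topic/docs
last_line=$(tail -n 1 manual.txt)
echo "rename status: $rename_status; manual.txt ends with: $last_line"
```

No tags or "-j" style revision arguments are needed here; the edit made to notes.txt on the base branch ends up in manual.txt after the merge.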

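For the "More tools" and "Changesets" points above, the following sketch builds a tiny history with a deliberately planted regression, lets "git bisect run" find the offending commit, and then reverts that whole commit. The file names, commit messages and the grep-based "test" are hypothetical stand-ins for a real test suite.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example Dev'
git config user.email 'dev@example.com'

# Six small commits; the hypothetical bug enters in commit 4.
printf 'ok\n' > status.txt
for i in 1 2 3 4 5 6; do
    if [ "$i" -eq 4 ]; then
        printf 'BUG\n' > status.txt
    fi
    printf 'change %s\n' "$i" > "file$i.txt"
    git add .
    git commit -q -m "Commit $i"
done

first=$(git rev-list --max-parents=0 HEAD)

# HEAD is known bad, the first commit is known good.
git bisect start HEAD "$first" >/dev/null
# Exit status 0 marks a commit good, 1 marks it bad.
git bisect run grep -q ok status.txt >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
culprit_subject=$(git log -1 --format=%s "$culprit")
git bisect reset >/dev/null 2>&1

# Undo the whole offending change with one new commit.
git revert --no-edit "$culprit" >/dev/null
echo "first bad commit: $culprit_subject; status is now $(cat status.txt)"
```

Because each commit is small and self-contained, the revert removes both parts of commit 4 (the status change and file4.txt) without touching the later commits.
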
If you are collaborating with at least one other developer, you will also find the following differences between Git and CVS:

  • Commit before merge. Git uses commit-before-merge rather than, like CVS, merge-before-commit (or update-then-commit). If, while you were editing files in preparation for a new commit (new revision), somebody else created a new commit on the same branch and it is now in the repository, CVS forces you to first update your working directory and resolve conflicts before allowing you to commit. This is not the case with Git. You first commit, saving your state in version control, then you merge the other developer's changes. You can also ask the other developer to do the merge and resolve conflicts.

    If you prefer to have linear history and avoid merges, you can always use a commit-merge-recommit workflow via "git rebase" (and "git pull --rebase"), which is similar to CVS in that you replay your changes on top of the updated state. But you always commit first.

  • No need for central repository. With Git there is no need to have a single central place where you commit your changes. Each developer can have their own repository (or better, repositories: a private one in which they do development, and a public bare one where they publish the part which is ready), and they can pull/fetch from each other's repositories, in symmetric fashion. On the other hand, it is common for a larger project to have a socially defined/nominated central repository from which everyone pulls (gets changes).
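
As a rough illustration of the "Commit before merge" point above, the sketch below has two hypothetical developers, Alice and Bob, sharing one bare repository. Bob publishes first; Alice still commits her own work before she looks at his, and only then merges. All names and paths are assumptions, and an explicit fetch followed by merge is used in place of "git pull" to keep each step visible.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Alice starts the project.
git init -q alice
cd alice
git config user.name 'Alice'
git config user.email 'alice@example.com'
printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add .
git commit -q -m 'Initial commit'
branch=$(git symbolic-ref --short HEAD)
cd ..

# A bare clone plays the shared, CVS-like central role here.
git clone -q --bare alice shared.git
git -C alice remote add origin "$work/shared.git"

# Bob clones, commits and publishes first.
git clone -q shared.git bob
cd bob
git config user.name 'Bob'
git config user.email 'bob@example.com'
printf 'b2\n' > b.txt
git commit -q -a -m 'Bob edits b.txt'
git push -q origin "$branch"
cd ..

# Alice commits her own work without updating first...
cd alice
printf 'a2\n' > a.txt
git commit -q -a -m 'Alice edits a.txt'
# ...and only then integrates what Bob has published.
git fetch -q origin
git merge -q --no-edit "origin/$branch"
git push -q origin "$branch"
echo "history now has $(git rev-list --count HEAD) commits"
```

Replacing the merge step with git rebase "origin/$branch" would give the linear, CVS-like history mentioned above, but Alice's commit would still have been created first.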

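The "No need for central repository" point above can be sketched with two plain repositories that pull from each other directly. The developer names and the file are hypothetical; in practice the two repositories would usually sit on different machines and be reached by URL, not by local path.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

git init -q alice
cd alice
git config user.name 'Alice'
git config user.email 'alice@example.com'
printf 'start\n' > project.txt
git add project.txt
git commit -q -m 'Initial commit'
branch=$(git symbolic-ref --short HEAD)
cd ..

# Bob clones Alice's repository directly; no server sits in between.
git clone -q alice bob
cd bob
git config user.name 'Bob'
git config user.email 'bob@example.com'
printf 'from bob\n' >> project.txt
git commit -q -a -m 'Bob: extend project.txt'
cd ..

# Alice registers Bob's repository as a remote and pulls his work.
cd alice
git remote add bob "$work/bob"
git pull -q --ff-only bob "$branch"
printf 'from alice\n' >> project.txt
git commit -q -a -m 'Alice: extend project.txt'
cd ..

# Bob pulls back from Alice in exactly the same way.
cd bob
git pull -q --ff-only origin "$branch"
cd ..

alice_head=$(git -C alice rev-parse HEAD)
bob_head=$(git -C bob rev-parse HEAD)
echo "alice: $alice_head"
echo "bob:   $bob_head"
```

Neither repository is special here: each side simply names the other as a remote and fetches from it.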

Finally, Git offers many more possibilities when collaboration with a large number of developers is needed. Below are the differences between CVS and Git for different stages of interest and positions in a project (under version control using CVS or Git):

  • lurker. If you are interested only in getting the latest changes from a project (no propagation of your changes), or in doing private development (without contributing back to the original project); or you use foreign projects as a basis for your own project (changes are local and it doesn't make sense to publish them).

    Git supports here anonymous unauthenticated read-only access via the custom efficient git:// protocol, or if you are behind a firewall blocking DEFAULT_GIT_PORT (9418) you can use plain HTTP.

    For CVS the most common solution (as I understand it) for read-only access is a guest account for the 'pserver' protocol on CVS_AUTH_PORT (2401), usually called "anonymous" and with an empty password. Credentials are stored by default in the $HOME/.cvspass file, so you have to provide them only once; still, this is a bit of a barrier (you have to know the name of the guest account, or pay attention to CVS server messages) and an annoyance.

  • fringe developer (leaf contributor). One way of propagating your changes in OSS is sending patches via email. This is the most common solution if you are a (more or less) accidental developer, sending a single change or a single bugfix. BTW, patches might also be sent via a review board (patch review system) or similar means, not only via email.

    Git offers here tools which help in this propagation (publishing) mechanism both for the sender (client) and for the maintainer (server). For people who want to send their changes via email there is "git rebase" (or "git pull --rebase") to replay your own changes on top of the current upstream version, so that your changes are fresh, and "git format-patch" to create an email with the commit message (and authorship) and the change in (extended) unified diff format (plus a diffstat for easier review). The maintainer can turn such an email directly into a commit, preserving all information (including the commit message), using "git am".

    CVS offers no such tools: you can use "cvs diff" / "cvs rdiff" to generate changes, and use GNU patch to apply them, but as far as I know there is no way to automate applying the commit message. CVS was meant to be used in client <-> server fashion...

  • lieutenant. If you are the maintainer of a separate part of a project (subsystem), or if development of your project follows the "network of trust" workflow used in development of the Linux kernel... or just if you have your own public repository, and the changes you want to publish are too large to send via email as a patch series, you can send p

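The email-based flow described for the fringe developer above ("git format-patch" on the sender's side, "git am" on the maintainer's side) can be sketched locally as follows. The repository names, identities and the outbox directory are assumptions, and the actual mailing step is skipped: the patch file is simply handed over on disk.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the maintainer's repository.
git init -q upstream
cd upstream
git config user.name 'Maintainer'
git config user.email 'maintainer@example.com'
printf 'hello\n' > README
git add README
git commit -q -m 'Initial commit'
cd ..

# The contributor clones it and commits a fix locally.
git clone -q upstream contributor
cd contributor
git config user.name 'Contributor'
git config user.email 'contributor@example.com'
printf 'hello, world\n' > README
git commit -q -a -m 'Greet the whole world'
# Turn the last commit into a mail-formatted patch file.
# (No rebase is needed here because upstream has not moved.)
patch=$(git format-patch -1 -o "$work/outbox" HEAD)
cd ..

# The maintainer applies the patch as a real commit.
cd upstream
git am -q "$patch"
author=$(git log -1 --format='%an <%ae>')
subject=$(git log -1 --format=%s)
echo "applied '$subject' by $author"
```

The resulting commit keeps the contributor as author and the original message, while the maintainer is recorded as committer.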
