Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

git - Difference between 'rebase master' and 'rebase --onto master' from a branch derived from a branch of master

Given the following branch structure:

  *------*---*
Master        \
               *---*--*------*
               A       \
                        *-----*-----*
                        B         (HEAD)

If I want to merge my B changes (and only my B changes, no A changes) into master, what is the difference between these two sets of commands?

>(B)      git rebase master
>(B)      git checkout master
>(master) git merge B

>(B)      git rebase --onto master A B
>(B)      git checkout master
>(master) git merge B

I'm mainly interested in learning if code from Branch A could make it into master if I use the first way.


1 Reply


Bear with me for a while before I answer the question as asked. One of the earlier answers is right, but there are labeling and other relatively minor (but potentially confusing) issues, so I want to start with branch drawings and branch labels. Also, people coming from other systems, or maybe even just new to revision control and git, often think of branches as "lines of development" rather than "traces of history" (git implements them as the latter, rather than the former, so a commit is not necessarily on any specific "line of development").

First, there is a minor problem with the way you drew your graph:

  *------*---*
Master        \
               *---*--*------*
               A       \
                        *-----*-----*
                        B         (HEAD)

Here's the exact same graph, but with the labels drawn in differently and some more arrow-heads added (and I've numbered the commit nodes for use below):

0 <- 1 <- 2         <-------------------- master
           \
            3 <- 4 <- 5 <- 6      <------ A
                       \
                        7 <- 8 <- 9   <-- HEAD=B

Why this matters is that git is quite loose about what it means for a commit to be "on" some branch—or perhaps a better phrase is to say that some commit is "contained in" some set of branches. Commits cannot be moved or changed, but branch labels can and do move.

More specifically, a branch name like master, A, or B points to one specific commit. In this case, master points to commit 2, A points to commit 6, and B points to commit 9. The first few commits 0 through 2 are contained within all three branches; commits 3, 4, and 5 are contained within both A and B; commit 6 is contained only within A; and commits 7 through 9 are contained only in B. (Incidentally, multiple names can point to the same commit, and that's normal when you make a new branch.)
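To make "contained in" concrete, here is a minimal sketch in Python. It is a toy model, not how git stores anything: the names PARENT, BRANCHES, reachable, and containing_branches are made up for illustration, and the commit numbers are the ones from the drawing above.

```python
# Toy model of the graph above: each commit maps to its parent (None for the root).
PARENT = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 5, 8: 7, 9: 8}

# A branch name is just a pointer to one commit.
BRANCHES = {"master": 2, "A": 6, "B": 9}

def reachable(tip):
    """Return the set of commits found by walking parent links from tip."""
    seen = set()
    while tip is not None:
        seen.add(tip)
        tip = PARENT[tip]
    return seen

def containing_branches(commit):
    """Return the sorted names of the branches whose history includes commit."""
    return sorted(name for name, tip in BRANCHES.items() if commit in reachable(tip))

for commit in sorted(PARENT):
    print(commit, containing_branches(commit))
```

Running it lists commits 0 through 2 under all three names, 3 through 5 under A and B, 6 under A only, and 7 through 9 under B only.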

Before we proceed, let me re-draw the graph yet one more way:

0
 \
  1
   \
    2     <-- master
     \
      3 - 4 - 5
              |\
              | 6   <-- A
               \
                7
                 \
                  8
                   \
                    9   <-- HEAD=B

This just emphasizes that it's not a horizontal line of commits that matters, but rather the parent/child relationships. The branch label points to a starting commit, and then (at least the way these graphs are drawn) we move left, maybe also going up or down as needed, to find parent commits.


When you rebase commits, you're actually copying those commits.

Git can never change any commit

There's one "true name" for any commit (or indeed any object in a git repository), which is its SHA-1: that 40-hex-digit string like 9f317ce... that you see in git log for instance. The SHA-1 is a cryptographic¹ checksum of the contents of the object. For a commit, the contents are the author and committer (name and email), time stamps, the commit message, a source tree, and the list of parent commits. The parent of commit #7 is always commit #5. If you make a mostly-exact copy of commit #7, but set its parent to commit #2 instead of commit #5, you get a different commit with a different ID. (I've run out of single digits at this point. Normally I use single uppercase letters to represent commit IDs, but with branches named A and B I thought that would be confusing. So I'll call a copy of #7, #7a, below.)
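Here is a rough sketch of why changing only the parent changes the ID. It hashes a simplified commit body the way git hashes objects (the object type, the body size, a NUL byte, then the body). The tree name, parent IDs, and author line are placeholders I made up, not values from any real repository, and a real commit body has properly formatted fields.

```python
import hashlib

def commit_id(tree, parent, author, message):
    """Return the SHA-1 hex digest of a simplified commit object."""
    lines = ["tree " + tree]
    if parent is not None:
        lines.append("parent " + parent)
    lines.append("author " + author)
    lines.append("committer " + author)
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    # git hashes the object type, the body size, a NUL byte, then the body
    header = b"commit " + str(len(body)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

# Placeholder values, not real IDs from any repository
TREE = "tree-of-commit-7"
AUTHOR = "A U Thor <author@example.com> 0 +0000"

original_7 = commit_id(TREE, "id-of-commit-5", AUTHOR, "commit 7")
copy_7a = commit_id(TREE, "id-of-commit-2", AUTHOR, "commit 7")

print(len(original_7))        # 40
print(original_7 != copy_7a)  # True
```

Everything except the parent line is identical, yet the two IDs differ, which is why a rebased commit is always a new commit.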

What git rebase does

When you ask git to rebase a chain of commits—such as commits #7-8-9 above—it has to copy them, at least if they're going to move anywhere (if they're not moving it can just leave the originals in place). It defaults to copying commits from the currently-checked-out branch, so git rebase needs just two extra pieces of information:

  • Which commits should it copy?
  • Where should the copies land? That is, what's the target parent-ID for the first-copied commit? (Additional commits simply point back to the first-copied, second-copied, and so on.)

When you run git rebase <upstream>, you let git figure out both parts from one single piece of information. When you use --onto, you get to tell git about the two parts separately: you still supply an <upstream>, but git doesn't compute the target from it; it only computes the commits to copy from <upstream>. (Incidentally, I think <upstream> is not a good name, but it's what rebase uses and I don't have anything way better, so let's stick with it here. Rebase calls the target <newbase>, but I think target is a much better name.)

Let's take a look at these two options first. Both assume that you're on branch B in the first place:

  1. git rebase master
  2. git rebase --onto master A

With the first command, the <upstream> argument to rebase is master. With the second, it's A.

Here's how git computes which commits to copy: it hands the current branch to git rev-list, and it also hands <upstream> to git rev-list, but using --not—or more precisely, with the equivalent of the two-dot exclude..include notation. This means we need to know how git rev-list works.

While git rev-list is extremely complicated—most git commands end up using it; it's the engine for git log, git bisect, rebase, filter-branch, and so on—this particular case is not too hard: with the two-dot notation, rev-list lists every commit reachable from the right-hand side (including that commit itself), excluding every commit reachable from the left-hand side.

In this case, git rev-list HEAD finds all commits reachable from HEAD—that is, almost all commits: commits 0-5 and 7-9—and git rev-list master finds all commits reachable from master, which is commit #s 0, 1, and 2. Subtracting 0-through-2 from 0-5,7-9 leaves 3-5,7-9. These are the candidate commits to copy, as listed by git rev-list master..HEAD.

For our second command, we have A..HEAD instead of master..HEAD, so the commits to subtract are 0-6. Commit #6 doesn't appear in the HEAD set, but that's fine: subtracting something that isn't there leaves it not there. The resulting set of candidates to copy is therefore 7-9.
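The two subtractions can be checked with a small Python sketch. This is a toy model of the two-dot notation using set difference over the numbered graph, not git's actual rev-list code; PARENT, BRANCHES, reachable, and two_dot are names made up for illustration.

```python
# Toy model of the numbered graph: each commit maps to its parent.
PARENT = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 5, 8: 7, 9: 8}
BRANCHES = {"master": 2, "A": 6, "B": 9}
HEAD = BRANCHES["B"]  # we are on branch B

def reachable(tip):
    """Return the set of commits found by walking parent links from tip."""
    seen = set()
    while tip is not None:
        seen.add(tip)
        tip = PARENT[tip]
    return seen

def two_dot(exclude, include):
    """Model of exclude..include: reachable from include but not from exclude."""
    return sorted(reachable(include) - reachable(exclude))

print(two_dot(BRANCHES["master"], HEAD))  # [3, 4, 5, 7, 8, 9]
print(two_dot(BRANCHES["A"], HEAD))       # [7, 8, 9]
```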

That still leaves us with figuring out the target of the rebase, i.e., where should copied commits land? With the second command, the answer is "the commit identified by the --onto argument". Since we said --onto master, that means the target is commit #2.

rebase #1

git rebase master

With the first command, though, we didn't specify a target directly, so git uses the commit identified by <upstream>. The <upstream> we gave was master, which points to commit #2, so the target is commit #2.

The first command is therefore going to start by copying commit #3 with whatever minimal changes are needed so that its parent is commit #2. Its parent is already commit #2. Nothing has to change, so nothing changes, and rebase just re-uses the existing commit #3. It must then copy #4 so that its parent is #3, but the parent is already #3, so it just re-uses #4. Likewise, #5 is already good. It completely ignores #6 (that's not in the set of commits to copy); it checks #s 7-9 but they're all good as well, so the whole rebase ends up just re-using all the original commits. You can force copies anyway with -f, but you didn't, so this whole rebase ends up doing nothing.

rebase #2

git rebase --onto master A

The second rebase command used --onto to select #2 as its target, but told git to copy just commits 7-9. Commit #7's parent is commit #5, so this copy really has to do something.² So git makes a new commit (let's call this #7a) that has commit #2 as its parent. The rebase moves on to commit #8: the copy, #8a, needs #7a as its parent. Finally, the rebase moves on to commit #9, whose copy, #9a, needs #8a as its parent. With all commits copied, the last thing rebase does is move the label (remember, labels move and change!). This gives a graph like this:

          7a - 8a - 9a       <-- HEAD=B
         /
0 - 1 - 2                    <-- master
         \
          3 - 4 - 5 - 6      <-- A
                    \
                     7 - 8 - 9   [abandoned]
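Both rebases can be traced with a toy Python model. This is only a sketch under simplifying assumptions: history is linear (one parent per commit), copies are named by appending "a", and a commit is reused when its parent is already the one it needs. Real git does considerably more (it applies patches and can skip commits already present upstream), and the function names here are made up.

```python
PARENT = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 5, 8: 7, 9: 8}

def ancestry(parent, tip):
    """List tip and its ancestors, newest first (linear history only)."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = parent[tip]
    return chain

def rebase(parent, head, upstream, onto=None):
    """Toy rebase: return (new parent map, commit the branch label moves to)."""
    parent = dict(parent)  # leave the caller's graph untouched
    target = upstream if onto is None else onto
    excluded = set(ancestry(parent, upstream))
    # Commits to copy, oldest first: reachable from head but not from upstream
    to_copy = [c for c in reversed(ancestry(parent, head)) if c not in excluded]
    new_parent = target
    for commit in to_copy:
        if parent[commit] == new_parent:
            new_parent = commit  # already in place: reuse the original
        else:
            copy = f"{commit}a"  # hypothetical name for the copied commit
            parent[copy] = new_parent
            new_parent = copy
    return parent, new_parent

# git rebase master (on B): upstream is master, commit 2
graph1, tip1 = rebase(PARENT, 9, upstream=2)
print(tip1, graph1 == PARENT)   # 9 True

# git rebase --onto master A (on B): upstream is A (commit 6), target is commit 2
graph2, tip2 = rebase(PARENT, 9, upstream=6, onto=2)
print(ancestry(graph2, tip2))   # ['9a', '8a', '7a', 2, 1, 0]
```

The first call leaves the graph untouched and B stays at commit 9; the second adds 7a, 8a, and 9a on top of commit 2 and moves B to 9a, while the originals 7, 8, and 9 remain in the map with nothing pointing at them.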

OK, but what about git rebase --onto master A B?

This is almost the same as git rebase --onto master A. The difference is that extra B at the end. Fortunately, this difference is very simple: if you give git rebase that one extra argument, it runs git checkout on that argument first.³

Your original commands

In your first set of commands, you ran git rebase master while on branch B. As noted above, this is a big no-op: since nothing needs to move, git copies nothing at all (unless you use -f / --force-rebase, which you didn't). You then checked out master and used git merge B, which, if it is told to⁴, creates a new commit with the merge. Therefore Dherik's answer, as of the time I saw it at least, is correct here: the merge commit has two parents, one of which is the tip of branch B, and that branch reaches back through three commits that are on branch A, and therefore some of what's on A winds up being merged into master.

With your second command sequence, you first checked out B (you were already on B so this was redundant, but was part of the git rebase). You then had rebase copy three commits, producing the final graph above, with commits 7a, 8a, and 9a. You then checked out master and made a merge commit with B (see footnote 4 again). Again Dherik's answer is correct: the only thing missing is that the original, abandoned commits are not drawn-in and it's not as obvious that the new merged-in commits are copies.
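To check the answer to the question as asked, here is a toy Python comparison of what master can reach after each command sequence. The merge commit "M" is a hypothetical name, each commit maps to a tuple of parents so that a merge can have two, and the graphs are written out by hand from the drawings above rather than produced by git. If git fast-forwards instead of making a merge commit, master simply points at B's tip and the same commits are reachable.

```python
def reachable(parents, tip):
    """All commits reachable from tip; each commit maps to a tuple of parents."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

ORIGINAL = {0: (), 1: (0,), 2: (1,), 3: (2,), 4: (3,), 5: (4,),
            6: (5,), 7: (5,), 8: (7,), 9: (8,)}

# First sequence: the rebase was a no-op, so B is still commit 9.
first = dict(ORIGINAL)
first["M"] = (2, 9)        # merge of old master (2) and B (9)

# Second sequence: B now points at the copy 9a, whose history goes back to 2.
second = dict(ORIGINAL)
second.update({"7a": (2,), "8a": ("7a",), "9a": ("8a",)})
second["M"] = (2, "9a")    # merge of old master (2) and B (9a)

A_COMMITS = {3, 4, 5, 6}   # commits that exist because of work on A

print(sorted(reachable(first, "M") & A_COMMITS))   # [3, 4, 5]
print(sorted(reachable(second, "M") & A_COMMITS))  # []
```

So with the first sequence, commits 3, 4, and 5 from A do end up in master; with the second, none of A's commits do.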


¹ This only matters in that it's extraordinarily difficult to target a particular checksum. That is, if someone you trust tells you "I trust the commit with ID 1234567...", it's almost impossible for someone else (someone you may not trust so much) to come up with a commit that has that same ID, but has different contents. The chances of it happening by accident are 1 in 2^160, which is much less likely than you having a heart attack while being struck by lightning while drowning in a tsunami while being abducted by

