I just observed something odd about git pull, which I don't understand.

On Friday, I worked on a local branch; let's call it mybranch. Before leaving the office I pushed it to origin (which is my GitHub repo): git push origin mybranch.

Yesterday at home, I pulled mybranch to my laptop, did some more coding, and then pushed my changes back to github (origin).

Now I'm at work again, and tried to pull the changes from yesterday to my work machine (I didn't change anything in my work place's local repo over the weekend):

git pull origin mybranch

That caused a fast-forward merge, which is fine. I then ran git status, and it said:

# On branch mybranch
# Your branch is ahead of 'origin/mybranch' by 6 commits.
#
nothing to commit (working directory clean)

Huh? How can it be 6 commits ahead when I didn't even touch it over the weekend, AND just pulled from origin? So I ran a git diff origin/mybranch and the diffs were exactly the 6 changes I just pulled from remote.

I could only "fix" this by running git fetch origin:

From git@github.com:me/project
af8be00..88b0738  mybranch -> origin/mybranch

Apparently, my local repo was missing some reference objects, but how can that be? I mean, a pull does a fetch already, and I didn't work on anything except that branch, so a git fetch origin and git fetch origin mybranch should have the same result?

Should I always use git pull origin instead of git pull origin branchname?

I'm confused.

+1  A: 

What does git remote -v show return for origin?

If origin points to GitHub, the status should be up to date, not ahead of any remote repo. At least, that is what I see with the Git 1.6.5 I am using for a quick test.

Anyway, to avoid this, explicitly define the remote repo of the master branch:

$ git config branch.master.remote yourGitHubRepo.git

Then a git pull origin master followed by a git status should return a clean status (no "ahead").
Why? Because the git fetch origin master (included in the git pull origin master) would not just update FETCH_HEAD (as Charles Bailey explains in his answer), but would also update the "remote master branch" within your local Git repository.
In that case, your local master would no longer seem to be "ahead" of the remote master.


I can test this with Git 1.6.5:

First I create a workrepo:

PS D:\git\tests> cd pullahead
PS D:\git\tests\pullahead> git init workrepo
Initialized empty Git repository in D:/git/tests/pullahead/workrepo/.git/
PS D:\git\tests\pullahead> cd workrepo
PS D:\git\tests\pullahead\workrepo> echo firstContent > afile.txt
PS D:\git\tests\pullahead\workrepo> git add -A 
PS D:\git\tests\pullahead\workrepo> git commit -m "first commit"

I simulate a GitHub repo by creating a bare repo (one which can receive push from anywhere)

PS D:\git\tests\pullahead\workrepo> cd ..
PS D:\git\tests\pullahead> git clone --bare workrepo github

I add a modification to my working repo and push it to the github repo (added as a remote):

PS D:\git\tests\pullahead> cd workrepo
PS D:\git\tests\pullahead\workrepo> echo aModif >> afile.txt
PS D:\git\tests\pullahead\workrepo> git ci -a -m "a modif to send to github"
PS D:\git\tests\pullahead\workrepo> git remote add github d:/git/tests/pullahead/github
PS D:\git\tests\pullahead\workrepo> git push github

I create a home repo, cloned from GitHub, in which I make a couple of modifications that I then push to GitHub:

PS D:\git\tests\pullahead\workrepo> cd ..
PS D:\git\tests\pullahead> git clone github homerepo
PS D:\git\tests\pullahead> cd homerepo
PS D:\git\tests\pullahead\homerepo> type afile.txt
firstContent
aModif

PS D:\git\tests\pullahead\homerepo> echo aHomeModif1  >> afile.txt
PS D:\git\tests\pullahead\homerepo> git ci -a -m "a first home modif"
PS D:\git\tests\pullahead\homerepo> echo aHomeModif2  >> afile.txt
PS D:\git\tests\pullahead\homerepo> git ci -a -m "a second home modif"
PS D:\git\tests\pullahead\homerepo> git push origin

I then clone workrepo for a first experiment

PS D:\git\tests\pullahead\homerepo> cd ..
PS D:\git\tests\pullahead> git clone workrepo workrepo2
Initialized empty Git repository in D:/git/tests/pullahead/workrepo2/.git/
PS D:\git\tests\pullahead> cd workrepo2
PS D:\git\tests\pullahead\workrepo2> git remote add github d:/git/tests/pullahead/github
PS D:\git\tests\pullahead\workrepo2> git pull github master
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From d:/git/tests/pullahead/github
 * branch            master     -> FETCH_HEAD
Updating c2763f2..75ad279
Fast forward
 afile.txt |  Bin 46 -> 98 bytes
 1 files changed, 0 insertions(+), 0 deletions(-)

In that repo, git status does mention master being ahead of 'origin':

PS D:\git\tests\pullahead\workrepo2> git status
# On branch master
# Your branch is ahead of 'origin/master' by 2 commits.
#
nothing to commit (working directory clean)

But that is only because origin is not github:

PS D:\git\tests\pullahead\workrepo2> git remote -v show
github  d:/git/tests/pullahead/github (fetch)
github  d:/git/tests/pullahead/github (push)
origin  D:/git/tests/pullahead/workrepo (fetch)
origin  D:/git/tests/pullahead/workrepo (push)

But if I repeat the sequence in a repo whose origin points to github (or which has no origin at all, just a remote 'github' defined), the status is clean:

PS D:\git\tests\pullahead\workrepo2> cd ..
PS D:\git\tests\pullahead> git clone workrepo workrepo4
PS D:\git\tests\pullahead> cd workrepo4
PS D:\git\tests\pullahead\workrepo4> git remote rm origin
PS D:\git\tests\pullahead\workrepo4> git remote add github d:/git/tests/pullahead/github
PS D:\git\tests\pullahead\workrepo4> git pull github master
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From d:/git/tests/pullahead/github
 * branch            master     -> FETCH_HEAD
Updating c2763f2..75ad279
Fast forward
 afile.txt |  Bin 46 -> 98 bytes
 1 files changed, 0 insertions(+), 0 deletions(-)
PS D:\git\tests\pullahead\workrepo4> git status
# On branch master
nothing to commit (working directory clean)

If I had only origin pointing to github, the status would be clean with Git 1.6.5.
Earlier versions of Git may still show an 'ahead' warning, but in any case an explicitly defined git config branch.master.remote yourGitHubRepo.git should take care of that, even with early versions of Git.
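
To make the configuration idea concrete, here is a minimal sketch that records the upstream of a branch explicitly, reads it back, and then relies on a plain git pull. Everything in it is an assumption made for illustration: the temporary directory, the repo names hub.git, home and work, the demo identity, and the use of mybranch rather than master. It is not a transcript of the test above.

```shell
# Sketch only: every path, name and identity here is a throwaway assumption.
set -e
tmp=$(mktemp -d)
cd "$tmp"
id="-c user.name=demo -c user.email=demo@example.com"

git init -q --bare hub.git                 # stands in for the GitHub repo
git init -q home                           # the "home" machine
cd home
git symbolic-ref HEAD refs/heads/mybranch  # start on mybranch whatever the default is
git remote add origin "$tmp/hub.git"
echo first > afile.txt
git add afile.txt
git $id commit -q -m "first commit"
git push -q origin mybranch

# The "work" machine: a branch created without any tracking configuration.
cd "$tmp"
git init -q work
cd work
git remote add origin "$tmp/hub.git"
git fetch -q origin
git checkout -q --no-track -b mybranch origin/mybranch

# Record the upstream explicitly, then read it back to check it.
git config branch.mybranch.remote origin
git config branch.mybranch.merge refs/heads/mybranch
remote=$(git config branch.mybranch.remote)
merge=$(git config branch.mybranch.merge)
echo "mybranch pulls from $remote, branch $merge"

# New work is pushed from home ...
cd "$tmp/home"
echo second >> afile.txt
git $id commit -q -a -m "weekend change"
git push -q origin mybranch

# ... and a pull without repository/branch arguments picks it up at work.
cd "$tmp/work"
git pull -q --ff-only
ahead=$(git rev-list origin/mybranch..HEAD | wc -l | tr -d ' ')
echo "commits ahead of origin/mybranch: $ahead"
```

Reading the two branch.mybranch.* keys back with git config is also a simple way to see whether a local branch has an upstream configured at all.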

VonC
Thanks for taking time to look into this. The origin remote already points to my GitHub repo. I cloned that project from a GitHub url and my local master branch is tracking origin/master. As for the mybranch, I'm pretty sure I created it off the origin/mybranch branch, which should track it automatically. But still, maybe this is the problem? That the local mybranch doesn't actually track origin/mybranch? PS: I'm using git 1.6.1 (via MacPorts).
Matthias
Is there a git command which lets me see if a local branch is tracking another branch? I can't find it in the man pages.
Matthias
+1  A: 

Are you careful to add all of your remotes (except origin, which comes with your original clone) using git remote add NAME URL? I've seen this bug when they've just been added to the git config.

Pat Notz
I did this when cloning the repo. I didn't do this with each branch, however. For e.g. mybranch I would first fetch from origin, then `git checkout -b mybranch origin/mybranch`. According to the man page of git-branch, the origin/mybranch is the start point, and furthermore, it states for --track: "... Use this if you always pull from the same upstream branch into the new branch, and if you don't want to use "git pull <repository> <refspec>" explicitly. This behavior is the default when the start point is a remote branch."
Matthias
+13  A: 

git pull calls git fetch with the appropriate parameters before merging the explicitly fetched heads (or, if none were given, the remote branch configured for merge) into the current branch.

The syntax: git fetch <repository> <ref> where <ref> is just a branch name with no colon is a 'one shot' fetch that doesn't do a standard fetch of all the tracked branches of the specified remote but instead fetches just the named branch into FETCH_HEAD.

When you perform git pull <repository> <ref>, FETCH_HEAD is updated as above, then merged into your checked out HEAD but none of the standard tracking branches for the remote repository will be updated. This means that locally it looks like you are ahead of the remote branch, whereas in fact you are up to date with it.
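
One way to see this for yourself is to compare FETCH_HEAD with the remote-tracking ref after a one-shot fetch. The sketch below does that in throwaway repos; the names hub.git, home and work, the temporary directory and the demo identity are all assumptions. As far as I know, Git 1.8.4 and later also update the remote-tracking ref during such a fetch, so on a recent install the first two values may already agree.

```shell
# Sketch only: compares FETCH_HEAD with origin/mybranch after a one-shot fetch.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare hub.git                 # stands in for the GitHub repo

git init -q home                           # repo that publishes mybranch
cd home
git symbolic-ref HEAD refs/heads/mybranch
git remote add origin "$tmp/hub.git"
echo first > afile.txt
git add afile.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "first commit"
git push -q origin mybranch

cd "$tmp"
git init -q work                           # repo that has not fetched anything yet
cd work
git remote add origin "$tmp/hub.git"

# One-shot fetch of a single branch: the result always lands in FETCH_HEAD.
git fetch -q origin mybranch
fetched=$(git rev-parse FETCH_HEAD)
tracking=$(git rev-parse -q --verify refs/remotes/origin/mybranch || echo "(not updated)")
echo "FETCH_HEAD:      $fetched"
echo "origin/mybranch: $tracking"

# A standard fetch of the remote refreshes every tracking ref.
git fetch -q origin
refreshed=$(git rev-parse refs/remotes/origin/mybranch)
echo "after git fetch origin: $refreshed"
```

If the second line prints "(not updated)" or an older commit, the "ahead" message from git status is only reporting that stale ref.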

Personally I always do git fetch followed by git merge <remote>/<branch> because I get to see any warnings about forced updates before I merge, and I can preview what I'm merging in. If I used git pull a bit more than I do, I would do a plain git pull with no parameters most of the time, relying on branch.<branch>.remote and branch.<branch>.merge to 'do the right thing'.
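
The fetch-then-merge routine described above might look like the following sketch. The repos hub.git, home and work, the temporary directory and the demo identity are assumptions for illustration; only the last few commands are the routine itself.

```shell
# Sketch only: fetch, preview, then merge, in throwaway repos.
set -e
tmp=$(mktemp -d)
cd "$tmp"
id="-c user.name=demo -c user.email=demo@example.com"

git init -q --bare hub.git                 # stands in for the GitHub repo
git init -q home
cd home
git symbolic-ref HEAD refs/heads/mybranch
git remote add origin "$tmp/hub.git"
echo first > afile.txt
git add afile.txt
git $id commit -q -m "first commit"
git push -q origin mybranch

cd "$tmp"
git clone -q -b mybranch hub.git work      # the "work" machine

cd "$tmp/home"                             # weekend work pushed from home
echo second >> afile.txt
git $id commit -q -a -m "weekend change"
git push -q origin mybranch

cd "$tmp/work"
git fetch origin                           # updates origin/mybranch, reports forced updates
git log --oneline HEAD..origin/mybranch    # preview what would be merged
incoming=$(git rev-list HEAD..origin/mybranch | wc -l | tr -d ' ')
git merge -q origin/mybranch               # a fast-forward in this scenario
ahead=$(git rev-list origin/mybranch..HEAD | wc -l | tr -d ' ')
echo "merged $incoming commit(s); now $ahead ahead of origin/mybranch"
```

Because the fetch updates origin/mybranch before the merge, git status has an accurate ref to compare against afterwards.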

Charles Bailey
+1 That's really a good explanation! I knew the explanation was hiding somewhere inside of 'git help fetch' but couldn't get it out...
Stefan Näwe
+1. Good post, with an approach similar to http://gitster.livejournal.com/28309.html
VonC