Git - Revision Control Perfected

A remote named "origin" is configured automatically when a repository is created using git clone. Consider a clone of Linus Torvalds' kernel tree mirrored on GitHub:

git clone

If you look inside the new repository's config file (.git/config), you'll see these lines set up:

[remote "origin"]
  fetch = +refs/heads/*:refs/remotes/origin/*
  url =
[branch "master"]
  remote = origin
  merge = refs/heads/master

The fetch line above defines the remote tracking branches. This "refspec" specifies that all branches in the remote repository under "refs/heads" (the default path for branches) should be transferred to the local repository under "refs/remotes/origin". For example, the remote branch named "master" will become a tracking branch named "origin/master" in the local repository.
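
The refspec can be observed on a throwaway repository. The following is a minimal sketch, not the kernel clone from the article: the directory names "upstream" and "local", the "Demo" identity and the empty commit are all assumptions made up for illustration, and a local path stands in for the remote URL.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commit does not depend on global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)   # scratch directory for the demo
cd "$work"

# A tiny stand-in for the remote repository, with one commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "initial commit"

# Cloning sets up the "origin" remote automatically.
git clone -q upstream local

# The fetch refspec that git clone wrote into local/.git/config.
refspec=$(git -C local config --get remote.origin.fetch)
echo "$refspec"

# The remote tracking refs created under refs/remotes/origin.
git -C local for-each-ref --format='%(refname)' refs/remotes/origin
```

The last command lists full ref names, so the remote branch "master" shows up as refs/remotes/origin/master, which is the "origin/master" tracking branch described above.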

The lines under the branch section provide defaults—specific to the master branch in this example—so that git pull can be called with no arguments to fetch and merge from the remote master branch into the local master branch.

The git pull command is actually a combination of the git fetch and git merge commands. If you do a git fetch instead, the tracking branches will be updated and you can compare them to see what changed. Then you can merge as a separate step:

git merge origin/master
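
The fetch, compare and merge steps can be tried on scratch repositories. This is a sketch under stated assumptions: the repositories "upstream" and "local", the "Demo" identity and the commit messages are invented for the example, and the merge here happens to be a fast-forward because the local branch has no commits of its own.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commits do not depend on global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first"
git clone -q upstream local

# Someone else adds a commit to the remote after the clone.
git -C upstream commit -q --allow-empty -m "second"

# Step 1: fetch updates origin/master but leaves local master alone.
git -C local fetch -q origin

# Step 2: list the commits that are on origin/master but not on master.
incoming=$(git -C local log --oneline master..origin/master)
echo "$incoming"

# Step 3: merge the tracking branch into the local branch.
git -C local merge -q origin/master
```

After step 3, master in "local" points at the same commit as master in "upstream".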

Git also provides the git push command for uploading to a remote repository. The push operation is essentially the inverse of the pull operation, but since it won't do a remote "checkout" operation, it is usually used with "bare" repositories. A bare repository is just the git database without a working copy. It is most useful for servers where there is no reason to have editable files checked out.
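
A bare repository and a push into it can be sketched as follows. The names "server.git" and "local" and the "Demo" identity are assumptions for illustration, and the remote is added by hand with git remote add instead of being set up by git clone, to keep the example short.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commit does not depend on global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A bare repository: only the git database, no working copy.
git init -q --bare server.git
bare=$(git -C server.git rev-parse --is-bare-repository)
echo "server.git is bare: $bare"

# An ordinary repository with one commit on master.
git init -q local
git -C local symbolic-ref HEAD refs/heads/master
git -C local commit -q --allow-empty -m "first"

# Point "origin" at the bare repository and upload master.
git -C local remote add origin "$work/server.git"
git -C local push -q origin master

# The commit is now stored in the bare repository.
git -C server.git log --oneline master
```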

For safety, git push will allow only a "fast-forward" merge where the local commits derive from the remote head. If the local head and remote head have both changed, you must perform a full merge (which will create a new commit deriving from both heads). Full merges must be done locally, so all this really means is you must call git pull before git push if someone else committed something first.
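
The pull-before-push situation can be reproduced with two clones sharing one bare repository. This is a sketch only: the repository names "alice", "bob" and "server.git", the "Demo" identity and the commit messages are hypothetical. The --no-rebase and --no-edit options ask git pull for a plain merge with the default merge message.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commits do not depend on global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Alice starts the project; a bare copy acts as the shared server.
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/master
git -C alice commit -q --allow-empty -m "first"
git clone -q --bare alice server.git
git -C alice remote add origin "$work/server.git"
git clone -q server.git bob

# Bob commits and pushes first: a fast-forward, so it is accepted.
git -C bob commit -q --allow-empty -m "change by bob"
git -C bob push -q origin master

# Alice also commits. Her push is not a fast-forward and is refused.
git -C alice commit -q --allow-empty -m "change by alice"
if git -C alice push -q origin master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi
echo "first push: $first_push"

# Alice pulls (fetch plus a full merge, done locally), then pushes again.
git -C alice pull -q --no-rebase --no-edit origin master
git -C alice push -q origin master
```

After the pull, the tip of master in "alice" is a merge commit with two parents (Alice's commit and Bob's), and the second push succeeds because that commit derives from the remote head.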


This article is meant only to provide an introduction to some of Git's most basic features and usage. Git is incredibly powerful and has a lot more capabilities beyond what I had space to cover here. But, once you realize all the features are based on the same core concepts, it becomes straightforward to learn the rest.

Check out the Resources section for some sites where you can learn more. Also, don't forget to read the git man page.

Resources


Git Home Page:

Git Community Book:

Why Git Is Better Than X:

Google Tech Talk: Linus Torvalds on Git:



I have several websites myself

Anonymous's picture

I have several websites myself and needed exactly what I was able to read here. I might take a look to see whether I did it right.

Kind regards

About Git

Mike Wang's picture

Worth keeping

Checking Out A Small Subset Of Files On A Small Device?

Anonymous's picture

The limitation I immediately ran into when I considered migrating to git is checking out an (essentially arbitrary) subset of files on a small, portable computing device.

Say I have a big repository and need only a very small subset of its files while on the go, both to refer to and to edit.

Originally this was a small netbook, where I could check out a few directories from a big repository and edit files while on the bus.

Netbooks may have gained more disk storage since then, but now I want to do the same on an Android phone.

git's sparse checkout feature still pulls the entire repository to the device. It only checks out a subset of files, which gives the appearance of a sparse checkout but doesn't resolve the storage issue.

I don't think git submodules help because, as far as I can tell, one can't easily move selected files across repositories with all history intact (i.e., every now and then, add some directories to the set available to small devices by moving them to a submodule when it becomes necessary), as one can easily do with CVS.

The only solution I can think of is to remotely mount the .git/objects/ directory and live with the limitations of that approach.

Does anyone with some creative brainpower see a solution that would lift this limitation?


Split-able git Tree?

Anonymous's picture

Given that:
Tree object = Blobs file names + permissions + Blobs collection.

Could splitting a git repository be implemented by splitting one of git's Tree objects into two (sub-)Tree objects on a personal workstation (perhaps with new Commit objects to keep track of the split), allowing a smaller tree to be checked out to a small device?

Remote changes (made by others) could then be merged on the personal workstation (as staging) before being merged into the split Tree branches for the small devices, if necessary.

Changes made on the small devices could be merged on the personal workstation (as staging) before being pulled by others.

Would that solve the disk space problem by limiting checkout to a small (sub-)Tree?

If this idea works, would some able developer turn it into an implementation?


On the guarantees of SHA1

Johan Commelin's picture

First of all I want to thank the author for this clear and concise article.

However, I want to point out an inaccuracy in the paragraph on SHA1. The author states that SHA1 guarantees that the data in the blobs is different, and that the chance that two pieces of data have the same SHA1 is infinitesimally small. I disagree on this point.

The 40-character string that SHA1 outputs gives us 16^40 = 2^160 ~~ 10^16 different checksums. Although this is big enough to assume the 'guarantee' described above, the claim about the infinitesimal chance is just wrong.

Consider for example 2^160 + 1 pairwise distinct files (this is data, be it hypothetical). The chance that there will be two different pieces of data in this set having the same checksum is 1. And 1 is very very different from infinitesimal.

I agree that it is highly unlikely that two such files will occur in practice, let alone in one project. (For example, each person on earth would have to create about 100.000 distinct files, to come close to the 2^160 files.) Still I wanted to point this out about the cryptographic features of SHA1.

There is not enough matter in

Anonymous's picture

There is not enough matter in the universe to store 2^64 bits, much less 2^160 bits, even if you stored 1 bit per atom.

Your math is *way* off. 2^160

Anonymous's picture

Your math is *way* off. 2^160 ~~ 10^48.

Very nice introduction

Anonymous's picture

Congrats for a very clear and concise introduction for something as difficult to teach as git.

I love git, and had to give git training to Subversion users -- hard work! It really amounts to unlearning SVN and learning something completely new.
