Git - Revision Control Perfected

Branching and Merging

The work you do in Git is specific to the current branch. A branch is simply a moving reference to a commit (SHA1 object name). Every time you create a new commit, the reference is updated to point to it—this is how Git knows where to find the most recent commit, which is also known as the tip, or head, of the branch.
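To make the "moving reference" idea concrete, here is a minimal sketch that can be run in a scratch directory. The temporary repository, file name and user identity are assumptions made for the demonstration only; `git symbolic-ref` is used here just so the first branch is called "master" whatever the local defaults are.

```shell
set -e
# Throwaway repository in a temporary directory (demo setup, not from the article).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first commit"
first=$(git rev-parse master)     # the commit the branch points to right now

echo two >> file.txt
git commit -q -a -m "second commit"
second=$(git rev-parse master)    # same branch name, now resolving to the new commit

echo "master was: $first"
echo "master is:  $second"
git rev-parse HEAD                # HEAD resolves to the tip of the current branch
```

The two printed object names differ because the branch reference moved forward when the second commit was created.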

By default, there is only one branch ("master"), but you can have as many as you want. You create branches with git branch and switch between them with git checkout. This may seem odd at first, but the reason it's called "checkout" is that you are "checking out" the head of that branch into your working copy. This alters the files in your working copy to match the commit at the head of the branch.

Branches are fast and cheap, because creating one only adds a new reference to an existing commit. That makes them a great way to try out new ideas, even for trivial things. If you are used to other systems like CVS/SVN, you might have negative thoughts associated with branches; forget all that. Branching and merging are cheap in Git and can be used without a second thought.

Run the following commands to create and switch to a new local branch named "myidea":


git branch myidea
git checkout myidea

All new commits will now be recorded on the new branch until you switch to another. You can work on more than one branch at a time by switching back and forth between them with git checkout.
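The two commands above can also be combined with the -b option of git checkout. The sketch below assumes a scratch repository with made-up file names, and shows that a file committed on myidea is absent from the working copy once you switch back to master.

```shell
set -e
# Scratch repository (demo setup, not from the article).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "base"

# Create the branch and switch to it in one step.
git checkout -q -b myidea
current=$(git rev-parse --abbrev-ref HEAD)
echo "now on: $current"

# This commit is recorded on myidea only.
echo idea > idea.txt
git add idea.txt
git commit -q -m "try an idea"

# Switching back rewrites the working copy to match the tip of master.
git checkout -q master
ls
```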

Branches are really useful only because they can be merged back together later. If you decide that you like the changes in myidea, you can merge them back into master:


git checkout master
git merge myidea

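Unless there are conflicts, this operation merges all the changes from myidea into your working copy and automatically commits the result to master in one step. When both branches have gained commits since they diverged, the new merge commit lists the previous tips of both master and myidea as parents. If master has no new commits of its own, Git by default performs a "fast-forward" instead: it simply moves master to the tip of myidea without creating a new commit.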
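As an illustration, the following sketch builds a small history in which master and myidea each gain one commit, so the merge produces a real merge commit rather than a fast-forward (where master would just be moved forward). The repository, file names and commit messages are assumptions for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

git branch myidea

# One commit on master...
echo "work on master" > master.txt
git add master.txt
git commit -q -m "master work"
master_tip=$(git rev-parse master)

# ...and one commit on myidea, touching a different file.
git checkout -q myidea
echo "work on myidea" > idea.txt
git add idea.txt
git commit -q -m "idea work"
idea_tip=$(git rev-parse myidea)

# Merge myidea into master; -m supplies the merge commit message.
git checkout -q master
git merge -q -m "Merge branch 'myidea'" myidea

# Show the history as a graph, then the merge commit followed by its parents.
git --no-pager log --oneline --graph
git rev-list --parents -n 1 HEAD
```

The last command prints three object names: the merge commit, then the old tip of master, then the tip of myidea.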

However, if there are conflicts—places where the same part of a file was changed differently in each branch—Git will warn you and update the affected files with "conflict markers" and not commit the merge automatically. When this happens, it's up to you to edit the files by hand, make decisions between the versions from each branch, and then remove the conflict markers. To complete the merge, use git add on each formerly conflicted file, and then git commit.
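The conflict workflow can be rehearsed safely in a scratch repository. In this sketch both branches change the same line of a hypothetical file.txt, so the merge stops; the "resolution" here simply overwrites the file, whereas in practice you would edit it by hand.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "original line" > file.txt
git add file.txt
git commit -q -m "base"

# Change the same line differently on each branch.
git checkout -q -b myidea
echo "idea line" > file.txt
git commit -q -a -m "idea change"
git checkout -q master
echo "master line" > file.txt
git commit -q -a -m "master change"

# The merge exits non-zero and leaves conflict markers in file.txt.
if git merge myidea; then
    merge_result=clean
else
    merge_result=conflict
fi
cat file.txt
markers=$(grep -c '^<<<<<<<' file.txt)

# Resolve (normally by hand), mark the file as resolved, then commit.
echo "resolved line" > file.txt
git add file.txt
git commit -q -m "Merge myidea, resolving conflict in file.txt"
```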

Once a branch has been merged, you can delete it if you no longer need it:


git branch -d myidea

If you decide you want to throw myidea away without merging it, use an uppercase -D instead of a lowercase -d as listed above. As a safety feature, the lowercase switch won't let you delete a branch that hasn't been merged.
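A short sketch of that safety check, using a hypothetical unmerged branch named scratch in a throwaway repository:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "base"

# A branch with one commit that is never merged into master.
git checkout -q -b scratch
echo experiment > experiment.txt
git add experiment.txt
git commit -q -m "experiment"
git checkout -q master

# Lowercase -d refuses, because scratch has unmerged work.
if git branch -d scratch 2>/dev/null; then
    safe_delete=deleted
else
    safe_delete=refused
fi
echo "git branch -d: $safe_delete"

# Uppercase -D deletes the branch regardless.
git branch -D scratch
```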

To list all local branches, simply run:


git branch

Viewing Changes

Git provides a number of tools to examine the history and differences between commits and branches. Use git log to view commit histories and git diff to view the differences between specific commits.
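A few common invocations are sketched below against a scratch repository; the branch and file names are assumptions for the demo. The --no-pager option only keeps the output from being sent through a pager.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "base"
git checkout -q -b myidea
echo idea > idea.txt
git add idea.txt
git commit -q -m "add idea"
git checkout -q master

# History of all branches, one line per commit, drawn as a text graph.
git --no-pager log --oneline --graph --all

# Commits that are on myidea but not on master.
git --no-pager log --oneline master..myidea

# Differences between the tips of the two branches, in full and as a summary.
git --no-pager diff master myidea
git --no-pager diff --stat master myidea
```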

These are text-based tools, but graphical tools also are available, such as the gitk repository browser, which essentially is a GUI version of git log --graph to visualize branch history. See Figure 2 for a screenshot.

Figure 2. gitk

Remote Repositories

Git can merge from a branch in a remote repository simply by transferring needed objects and then running a local merge. Thanks to the content-addressed storage design, Git knows which objects to transfer based on which object names in the new commit are missing from the local repository.
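The idea behind content addressing can be illustrated with two plumbing commands. This is only a sketch of the principle, not of the actual transfer protocol: git hash-object computes an object name from content, and git cat-file -e checks whether an object with that name exists in the local repository. The two scratch repositories and the sample content are assumptions.

```shell
set -e
a=$(mktemp -d)
b=$(mktemp -d)
git init -q "$a"
git init -q "$b"

# In repository A, store some content as a blob and get its object name.
cd "$a"
name=$(echo "some file content" | git hash-object -w --stdin)

# The same content always hashes to the same name.
same=$(echo "some file content" | git hash-object --stdin)
echo "object name: $name"

# Repository B has never seen this content, so the object is missing there
# and would have to be transferred.
cd "$b"
if git cat-file -e "$name" 2>/dev/null; then
    status=present
else
    status=missing
fi
echo "in repository B the object is $status"
```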

The git pull command performs both the transfer step (the "fetch") and the merge step together. It accepts the URL of the remote repository (the "Git URL") and a branch name (or a full "refspec") as arguments. The Git URL can be a local filesystem path, or an SSH, HTTP, rsync or Git-specific URL. For instance, this would perform a pull using SSH:


git pull user@host:/some/repo/path master

Git provides some useful mechanisms for setting up relationships with remote repositories and their branches so you don't have to type them out each time. A saved URL of a remote repository is called a "remote", which can be configured along with "tracking branches" to map the remote branches into the local repository.
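Here is a minimal sketch of that setup, using a local filesystem path as the Git URL so it can run without a server. The remote name "origin" is a common convention; the directories and file contents are assumptions for the demo.

```shell
set -e
upstream=$(mktemp -d)
work=$(mktemp -d)

# A repository that plays the role of the remote.
cd "$upstream"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"

# A separate local repository that will pull from it.
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git remote add origin "$upstream"   # save the URL under the name "origin"
git fetch -q origin                 # transfer objects; creates origin/master
git branch -r                       # list the remote branches known locally
git pull -q origin master           # long form: remote name plus branch

# Record that local master follows origin/master.
git branch --set-upstream-to=origin/master master

# Later, after new work appears upstream, a bare "git pull" is enough.
cd "$upstream"
echo two >> file.txt
git commit -q -a -m "second"
upstream_tip=$(git rev-parse HEAD)
cd "$work"
git pull -q
```

Cloning a repository with git clone sets up the same kind of remote and upstream configuration automatically.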

______________________

Comments

About Git

Mike Wang

Worth keeping

Checking Out A Small Subset Of Files On A Small Device?

Anonymous

The limitation I ran into immediately when I considered migrating to git is checking out a (rather randomly selected) subset of files on a small, portable computing device.

Say I have a big repository and need only a very small subset of its files while on the go, both to refer to and to edit.

Originally this was a small netbook, where I could check out a few directories from a big repository and edit the files while on the bus.

Netbooks may have gained disk space since then, but now I want to do the same on an Android phone.

git's sparse checkout feature still pulls the entire repository to the device. It checks out only a subset of files to give the appearance of a sparse checkout, but it doesn't resolve the storage issue.

I don't think git submodules help, because, as far as I can tell, one can't easily move selected files across repositories with all history intact (i.e., every now and then, add some additional directories to the list available to small devices by moving them to a submodule when it becomes necessary), as one can easily do with CVS.

The only solution I can think of is to mount the .git/objects/ directory remotely and live with the limitations of that approach.

Is there anyone with the creative brain power to find a solution that lifts this limitation?

Thanks.

Split-able git Tree?

Anonymous

Given that:
Tree object = Blobs file names + permissions + Blobs collection.

Can splitting a git repository be implemented by splitting one of git's Tree objects into two (sub-)Tree objects on a personal workstation (perhaps with new Commit objects to keep track of the split), allowing a smaller tree to be checked out to a small device?

Remote changes (made by others) could then be merged on the personal workstation (as staging) before being merged into the split Tree branches for the small devices, if necessary.

Likewise, changes made on the small devices could be merged on the personal workstation (as staging) before being pulled by others.

Would that solve the disk space problem by limiting the checkout to a small (sub-)Tree?

If this idea works, would some able developer turn it into an implementation?

Thanks.

On the guarantees of SHA1

Johan Commelin

First of all I want to thank the author for this clear and concise article.

However, I want to point out an inaccuracy in the paragraph on SHA1. The author states that SHA1 guarantees that the data in the blobs is different, and that the chance that two pieces of data have the same SHA1 is infinitesimally small. I disagree on this point.

The 40-character string that SHA1 outputs gives us 16^40 = 2^160 ~~ 10^16 different checksums. Although this is big enough to assume the 'guarantee' described above, the claim about the infinitesimal chance is just wrong.

Consider for example 2^160 + 1 pairwise distinct files (this is data, be it hypothetical). The chance that there will be two different pieces of data in this set having the same checksum is 1. And 1 is very very different from infinitesimal.

I agree that it is highly unlikely that two such files will occur in practice, let alone in one project. (For example, each person on earth would have to create about 100.000 distinct files, to come close to the 2^160 files.) Still I wanted to point this out about the cryptographic features of SHA1.

There is not enough matter in

Anonymous

There is not enough matter in the universe to store 2^64 bits, much less 2^160 bits, even if you stored 1 bit per atom.

Your math is *way* off. 2^160

Anonymous

Your math is *way* off. 2^160 ~~ 10^48.

Very nice introduction

Anonymous

Congrats on a very clear and concise introduction to something as difficult to teach as git.

I love git, and I have had to give git training to Subversion users. Hard work! It really amounts to unlearning SVN and learning something completely new.
