Git - Revision Control Perfected
A commit is meant to record a set of changes introduced to a project. What it really does is associate a tree object—representing a complete snapshot of a directory structure at a moment in time—with contextual information about it, such as who made the change and when, a description, and its parent commit(s).
A commit doesn't actually store a list of changes (a "diff") directly, but it doesn't need to. What changed can be calculated on-demand by comparing the current commit's tree to that of its parent. Comparing two trees is a lightweight operation, so there is no need to store this information. Because there actually is nothing special about the parent commit other than chronology, one commit can be compared to any other just as easily regardless of how many commits are in between.
All commits should have a parent except the first one. Commits usually have a single parent, but they will have more if they are the result of a merge (I explain branching and merging later in this article). A commit from a merge still is just a snapshot in time like any other, but its history has more than one lineage.
By following the chain of parent references backward from the current commit, the entire history of a project can be reconstructed and browsed all the way back to the first commit.
A commit is expanded recursively into a project history in exactly the same manner as a tree is expanded into a directory structure. More important, just as the SHA1 of a tree is a fingerprint of all the data in all the trees and blobs below it, the SHA1 of a commit is a fingerprint of all the data in its tree, as well as all of the data in all the commits that preceded it.
This happens automatically because references are part of an object's overall content. The SHA1 of each object is computed, in part, from the SHA1s of any objects it references, which in turn were computed from the SHA1s they referenced and so on.
A tag is just a named reference to an object—usually a commit. Tags typically are used to associate a particular version number with a commit. The 40-character SHA1 names are many things, but human-friendly isn't one of them. Tags solve this problem by letting you give an object an additional name.
There are two types of tags: object tags and lightweight tags. Lightweight tags are not objects in the repository, but instead are simple refs like branches, except that they don't change. (I explain branches in more detail in the Branching and Merging section below.)
Setting Up Git
If you don't already have Git on your system, install it with your package manager. Because Git is primarily a simple command-line tool, installing it is quick and easy under any modern distro.
You'll want to set the name and e-mail address that will be recorded in new commits:
git config --global user.name "John Doe" git config --global user.email email@example.com
This just sets these parameters in the config file ~/.gitconfig. The config has a simple syntax and could be edited by hand just as easily.
Git's interface consists of the "working copy" (the files you directly interact with when working on the project), a local repository stored in a hidden .git subdirectory at the root of the working copy, and commands to move data back and forth between them, or between remote repositories.
The advantages of this design are many, but right away you'll notice that there aren't pesky version control files scattered throughout the working copy, and that you can work off-line without any loss of features. In fact, Git doesn't have any concept of a central authority, so you always are "working off-line" unless you specifically ask Git to exchange commits with your peers.
The repository is made up of files that are manipulated by invoking the git command from within the working copy. There is no special server process or extra overhead, and you can have as many repositories on your system as you like.
You can turn any directory into a working copy/repository just by running this command from within it:
Next, add all the files within the working copy to be tracked and commit them:
git add . git commit -m "My first commit"
You can commit additional changes as frequently or infrequently as you
like by calling
git add followed by
commit after each modification
you want to record.
If you're new to Git, you may be wondering why you need to call
add each time. It has to do with the process of
"staging" a set of
changes before committing them, and it's one of the most common sources of
confusion. When you call
git add on one or more files, they are added
to the Index. The files in the Index—not the working copy—are what
get committed when you call
Think of the Index as what will become the next commit. It simply provides an extra layer of granularity and control in the commit process. It allows you to commit some of the differences in your working copy, but not others, which is useful in many situations.
You don't have to take advantage of the Index if you don't want to, and
you're not doing anything "wrong" if you don't. If you want to pretend
it doesn't exist, just remember to call
git add . from the root of
the working copy (which will update the Index to match) each time and
git commit. You also can use the -a option with
git commit to add changes automatically; however, it will not add new
files, only changes to existing files. Running
git add. always
will add everything.
The exact work flow and specific style of commands largely are left up to you as long as you follow the basic rules.
git status command shows you all the differences between your
working copy and the Index, and the Index and the most recent commit
(the current HEAD):
This lets you see pending changes easily at any given time, and it even
reminds you of relevant commands like
git add to stage pending changes
into the Index, or
git reset HEAD <file> to remove (unstage) changes
that were added previously.
Free DevOps eBooks, Videos, and more!
Regardless of where you are in your DevOps process, Linux Journal can help!
We offer here the DEFINITIVE DevOps for Dummies, a mobile Application Development Primer, and advice & help from the expert sources like:
- Linux Journal
- Users, Permissions and Multitenant Sites
- New Products
- Flexible Access Control with Squid Proxy
- Security in Three Ds: Detect, Decide and Deny
- High-Availability Storage with HA-LVM
- Tighten Up SSH
- DevOps: Everything You Need to Know
- Solving ODEs on Linux
- Non-Linux FOSS: MenuMeters
- March 2015 Issue of Linux Journal: System Administration