Git - Revision Control Perfected
A commit is meant to record a set of changes introduced to a project. What it really does is associate a tree object—representing a complete snapshot of a directory structure at a moment in time—with contextual information about it, such as who made the change and when, a description, and its parent commit(s).
A commit doesn't actually store a list of changes (a "diff") directly, but it doesn't need to. What changed can be calculated on-demand by comparing the current commit's tree to that of its parent. Comparing two trees is a lightweight operation, so there is no need to store this information. Because there actually is nothing special about the parent commit other than chronology, one commit can be compared to any other just as easily regardless of how many commits are in between.
All commits should have a parent except the first one. Commits usually have a single parent, but they will have more if they are the result of a merge (I explain branching and merging later in this article). A commit from a merge still is just a snapshot in time like any other, but its history has more than one lineage.
By following the chain of parent references backward from the current commit, the entire history of a project can be reconstructed and browsed all the way back to the first commit.
A commit is expanded recursively into a project history in exactly the same manner as a tree is expanded into a directory structure. More important, just as the SHA1 of a tree is a fingerprint of all the data in all the trees and blobs below it, the SHA1 of a commit is a fingerprint of all the data in its tree, as well as all of the data in all the commits that preceded it.
This happens automatically because references are part of an object's overall content. The SHA1 of each object is computed, in part, from the SHA1s of any objects it references, which in turn were computed from the SHA1s they referenced and so on.
A tag is just a named reference to an object—usually a commit. Tags typically are used to associate a particular version number with a commit. The 40-character SHA1 names are many things, but human-friendly isn't one of them. Tags solve this problem by letting you give an object an additional name.
There are two types of tags: object tags and lightweight tags. Lightweight tags are not objects in the repository, but instead are simple refs like branches, except that they don't change. (I explain branches in more detail in the Branching and Merging section below.)
Setting Up Git
If you don't already have Git on your system, install it with your package manager. Because Git is primarily a simple command-line tool, installing it is quick and easy under any modern distro.
You'll want to set the name and e-mail address that will be recorded in new commits:
git config --global user.name "John Doe" git config --global user.email email@example.com
This just sets these parameters in the config file ~/.gitconfig. The config has a simple syntax and could be edited by hand just as easily.
Git's interface consists of the "working copy" (the files you directly interact with when working on the project), a local repository stored in a hidden .git subdirectory at the root of the working copy, and commands to move data back and forth between them, or between remote repositories.
The advantages of this design are many, but right away you'll notice that there aren't pesky version control files scattered throughout the working copy, and that you can work off-line without any loss of features. In fact, Git doesn't have any concept of a central authority, so you always are "working off-line" unless you specifically ask Git to exchange commits with your peers.
The repository is made up of files that are manipulated by invoking the git command from within the working copy. There is no special server process or extra overhead, and you can have as many repositories on your system as you like.
You can turn any directory into a working copy/repository just by running this command from within it:
Next, add all the files within the working copy to be tracked and commit them:
git add . git commit -m "My first commit"
You can commit additional changes as frequently or infrequently as you
like by calling
git add followed by
commit after each modification
you want to record.
If you're new to Git, you may be wondering why you need to call
add each time. It has to do with the process of
"staging" a set of
changes before committing them, and it's one of the most common sources of
confusion. When you call
git add on one or more files, they are added
to the Index. The files in the Index—not the working copy—are what
get committed when you call
Think of the Index as what will become the next commit. It simply provides an extra layer of granularity and control in the commit process. It allows you to commit some of the differences in your working copy, but not others, which is useful in many situations.
You don't have to take advantage of the Index if you don't want to, and
you're not doing anything "wrong" if you don't. If you want to pretend
it doesn't exist, just remember to call
git add . from the root of
the working copy (which will update the Index to match) each time and
git commit. You also can use the -a option with
git commit to add changes automatically; however, it will not add new
files, only changes to existing files. Running
git add. always
will add everything.
The exact work flow and specific style of commands largely are left up to you as long as you follow the basic rules.
git status command shows you all the differences between your
working copy and the Index, and the Index and the most recent commit
(the current HEAD):
This lets you see pending changes easily at any given time, and it even
reminds you of relevant commands like
git add to stage pending changes
into the Index, or
git reset HEAD <file> to remove (unstage) changes
that were added previously.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Sony Settles in Linux Battle
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Libarchive Security Flaw Discovered
- Profiles and RC Files
- Maru OS Brings Debian to Your Phone
- Understanding Ceph and Its Place in the Market
- Snappy Moves to New Platforms
- The Giant Zero, Part 0.x
- Git 2.9 Released
- Astronomy for KDE
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide