Subversion: Not Just for Code Anymore

Here is a subversive way to handle multiple versions of your personal information instead of just versions of code.

Have you ever needed some information from a file, only to remember that you modified the file a week ago and removed the very information you're interested in? Or, have you ever spent hours sifting through dozens of inconsistently named copies of the same file trying to find one particular version? If you're like me, the answer is probably a resounding yes to both questions. Of course, if you're a programmer, you've probably already solved that problem in your development activities by using a version control system like CVS or Subversion. What about everything else though? Mom's cherry pie recipe may not change as frequently as rpc_init.c, but if you do decide to create a low-cal version, you're not going to want to lose the original. As it turns out, version control isn't only for source files anymore. Many of the features of Subversion make it ideal for versioning all kinds of files.

With Subversion, you can keep a history of changes made to your files, so you can easily go back and see exactly what a given file contained at a particular point in time. You also save space, because Subversion stores deltas from one version to the next: when you change a versioned file, the repository needs only enough extra space to store the changes rather than a complete second copy of the file. And unlike CVS, Subversion applies delta storage to binary files as well as text files.
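As a rough sketch of what going back in time looks like once a file is under version control, the helper below lists a file's history and then prints the file as it stood on a given date. The function name and its arguments are placeholders for illustration only.

```shell
# Hypothetical helper: show a file's history, then print an older version.
# $1 is a path inside a working copy, $2 is a date such as 2005-03-01.
show_old_version() {
    svn log "$1"                  # list the revisions that changed the file
    svn cat -r "{$2}" "$1"        # print the contents as of the start of that date
}
```

For example, `show_old_version budget.txt 2005-03-01` would show the log for a file named budget.txt and then its contents as of that date.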

Subversion makes it easy to access your files from multiple computers too. Instead of worrying whether the copy of the budget report on your laptop reflects the changes you made last night on your desktop system at home, you can simply run an update on your laptop and Subversion automatically updates your file to the latest version in the repository. Also, because all of the versions are stored in a single repository, there is a single location that you need to back up in order to keep all of your data safe.

What to Version

So your interest is piqued. You're sold on the advantages of versioning your files, and you'd like to give it a try. The first question to answer is which files you're going to put under version control. One obvious possibility would be to version your entire hard drive. In practice though, that approach doesn't work well. When you store a portion of a repository's contents locally (in what's called a working copy), Subversion stores a second copy of each file so it can compare the changes you have made locally with the last version from the repository. Therefore, if you version the entire hard drive, you'll need twice as much disk space.
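To get a feel for that cost before committing to it, you can measure a directory and double the result. This is only a rough estimate under the assumption stated above (one extra copy of every file); it ignores Subversion's own bookkeeping files, and the function name is made up for this example.

```shell
# Hypothetical helper: estimate the disk space (in KB) that a working copy
# of directory $1 would need, assuming a second copy of every file is kept.
estimate_working_copy_kb() {
    size_kb=$(du -sk "$1" | cut -f1)   # current size of the directory in KB
    echo $((size_kb * 2))              # doubled for the extra copies
}
```

Running `estimate_working_copy_kb "$HOME/documents"` prints the estimate for a documents directory in your home directory.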

There's also little reason to keep full revision history on the largely static parts of your filesystem, such as /usr or /opt. On the other hand, directories that contain a lot of custom files/modifications, such as /etc or /home, are prime candidates for versioning, because the advantage of tracking those changes is more likely to outweigh the disadvantages of extra storage requirements. Furthermore, with Subversion, you can opt to create a working copy from a subtree in the repository hierarchy. That way, you don't need to store any copies of infrequently accessed data locally, which often results in a net reduction in hard drive requirements, even though the files you are storing locally take up twice as much space.
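Checking out a subtree is a matter of appending the subdirectory to the repository URL. The sketch below assumes the repository location used later in this article ($HOME/.documents_repository); the function name and the recipes directory in the usage example are hypothetical.

```shell
# Hypothetical helper: check out one subdirectory of the repository.
# $1 is a directory inside the repository, $2 is where to put the working copy.
checkout_subtree() {
    svn checkout "file://$HOME/.documents_repository/$1" "$2"
}
```

For example, `checkout_subtree recipes "$HOME/recipes"` would create a working copy containing only the recipes directory.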

Getting Subversion Up and Going

Now, let's dive in and get Subversion running on your machine. Installing is generally pretty easy. You can, of course, download the Subversion source and compile that, but in most cases, it's going to be much easier to install the precompiled binary package for your Linux distribution of choice. Fortunately, Subversion has matured to the point where such a package is available for almost every major distribution. In fact, I don't know of any off the top of my head that it isn't available for.

Once you have Subversion installed, it's time to create a repository. Let's say you have a documents directory in your home directory that you'd like to version. First, you need to create a new empty repository using the svnadmin create command. For instance, the following creates a new repository in your home directory:

$ svnadmin create $HOME/.documents_repository

Next, you need to import your existing documents into the newly created repository. To do that, use the svn import command with the directory to import and a URL that points to the repository. In this example, the URL refers directly to the repository using a file://-type URL. If your repository will be used only locally, a file:// URL is the easiest way to access it (though there are other, better ways to access repositories, which I'll discuss in a bit):

$ svn import $HOME/documents file://$HOME/.documents_repository
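The URL above may look as though it has one slash too few. Because $HOME already begins with a slash, the shell expands it into a URL with three. A quick way to check, as a sketch (repo_url is just an illustrative variable name):

```shell
# Build the repository URL the same way the import command does and print it.
repo_url="file://$HOME/.documents_repository"
echo "$repo_url"
```

With a home directory of /home/bill, this prints file:///home/bill/.documents_repository.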

When you run the import command, Subversion opens an editor and asks you for a log message. Whatever message you enter will be associated with the newly created repository revision and can be seen by examining the repository history logs. Enter something brief, such as “imported documents directory”. As soon as you save the log message and leave the editor, Subversion performs the import and outputs something like the following:

Adding         documents/file1.txt
Adding         documents/file2.txt
Adding         documents/file3.jpg

Committed revision 1.
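If you would rather not have an editor pop up, the log message can be given on the command line with the -m option instead. A minimal sketch, wrapping the same import in a function whose name is made up for this example:

```shell
# Same import as above, but with the log message supplied through -m,
# so Subversion does not open an editor.
import_documents() {
    svn import -m "imported documents directory" \
        "$HOME/documents" "file://$HOME/.documents_repository"
}
```

Afterward, running svn log with the repository URL should list the message alongside revision 1.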

You can now safely remove the original $HOME/documents and then re-create it as a working copy of the repository, using the svn checkout command:

$ rm -rf $HOME/documents
$ svn checkout file://$HOME/.documents_repository $HOME/documents
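If deleting the originals outright makes you nervous, a more cautious variant is to move them aside, check out the working copy, and delete the originals only after confirming that both trees match. This is a sketch rather than part of the procedure above; the function name and the documents.orig backup name are assumptions.

```shell
# Hypothetical, more cautious replacement for the rm/checkout pair above.
replace_with_working_copy() {
    backup="$HOME/documents.orig"               # assumed name for the backup
    mv "$HOME/documents" "$backup" || return 1  # keep the originals for now
    svn checkout "file://$HOME/.documents_repository" "$HOME/documents" || return 1
    # diff prints nothing and succeeds when the trees match (ignoring .svn),
    # and only then is the backup removed.
    diff -r -x .svn "$backup" "$HOME/documents" && rm -rf "$backup"
}
```

If diff reports any differences, the backup is left in place so you can investigate.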

So far, so good. However, if you want to take advantage of Subversion from multiple machines, you're going to need to set up a server. Several options are available to you, but the best choice is generally to use Apache with mod_dav and mod_dav_svn, which together serve a Subversion repository using the WebDAV protocol.

From a basic Apache installation, getting WebDAV to work is fairly simple. First, you need to make sure that mod_dav and mod_dav_svn are being loaded:

LoadModule      dav_module        modules/mod_dav.so
LoadModule      dav_svn_module    modules/mod_dav_svn.so
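To confirm that Apache actually picked up both modules, you can ask it for its list of loaded modules and filter for the two names. The sketch below assumes the control script is called apachectl (some distributions name it differently, for example apache2ctl); the function name is made up for this example.

```shell
# Hypothetical check: print the DAV modules Apache reports as loaded.
# Assumes the control script is named apachectl on this system.
dav_modules_loaded() {
    apachectl -M 2>/dev/null | grep -E 'dav(_svn)?_module'
}
```

Both dav_module and dav_svn_module should appear in the output; if either is missing, recheck the LoadModule lines.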

Next, you need to set up a <Location> directive to point to your repository. For example, if you want your repository to be referenced with the URL http://example.net/bill/documents, and the repository is located in /srv/repositories/bill_documents, you could use the following Location directive:


<Location /bill/documents>
   DAV svn
   SVNPath /srv/repositories/bill_documents
   AuthType None
</Location>
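With that in place (and Apache restarted), another machine can create its own working copy from the URL and later pull in changes committed elsewhere. A minimal sketch, using the example URL above; the function names and the $HOME/documents target are assumptions.

```shell
# On a second machine: create a working copy from the server.
checkout_from_server() {
    svn checkout "http://example.net/bill/documents" "$HOME/documents"
}

# Later, on any machine: bring the working copy up to date.
refresh_documents() {
    svn update "$HOME/documents"
}
```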

Or, if you want more security, you could allow valid users only:


<Location /bill/documents>
   DAV svn
   SVNPath /srv/repositories/bill_documents
   AuthType Basic
   AuthName "Bill's Documents"
   AuthUserFile /srv/repositories/bill_documents/passwd
   Require valid-user
</Location>
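The file named in AuthUserFile has to exist before anyone can log in. One way to create it is Apache's htpasswd utility, sketched here with a made-up function name. The -c option creates the file (replacing any existing one), and htpasswd prompts for the password.

```shell
# Hypothetical helper: create the password file from the configuration above
# and add a first user. $1 is the user name.
create_password_file() {
    htpasswd -c /srv/repositories/bill_documents/passwd "$1"
}
```

For example, `create_password_file bill` would add a user named bill. To add more users later, run htpasswd without -c so the existing file is kept.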

______________________

Comments

Labels for certain important changes

Alvaro Arenas

Hi

Good article. However, I missed a short paragraph about labelling important versions of a file. For example, I am trying to modify my grandmother’s cake recipe. I made some changes and got a good recipe, but I am still not totally satisfied. I would like to save this version of the recipe with a label "good-enough" and continue trying. Can I do this? How do I do it?

Greetings

Alvaro.

Re: Labels for certain important changes

William Nagel

Thanks, I'm glad you enjoyed the article.

Creating a "label" of a version of your file couldn't be easier. All you have to do is make a copy of the file and rename it to whatever you want. Subversion uses what it calls "cheap copies" when you make a copy of a file in the repository. Basically, that means it doesn't really make a copy of the file. Instead, all it does is make a new entry under the new filename that points back to the revision of the original file from which it was created, which uses almost zero extra space on disk.

Assuming you are using WebDAV, you can do the copy by just copying the file as you would any other file, and Subversion will "do the right thing" (don't create the copy by using Save As from the text editor though, as that will create a copy of the file's contents in the repository). On the other hand, if you're using a working copy of the repository, you can do the copy from the command line using "svn copy" followed by "svn commit".
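As a rough sketch of the working-copy route, the two commands can be wrapped in a small helper. The function name, the file names in the usage example, and the wording of the log message are all made up for illustration.

```shell
# Hypothetical helper: "label" a version by copying it under a new name.
# $1 is the existing file, $2 is the new name; run it inside a working copy.
label_version() {
    svn copy "$1" "$2" &&
        svn commit -m "Label $1 as $2" "$2"
}
```

For example, `label_version recipe.txt recipe-good-enough.txt` would record the current recipe under the good-enough name while you keep editing the original.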

import syntax

padyer

I think you need to use file:/// (3 /'s instead of 2) when not including a hostname.

Awesome article. I really want to try the webdav stuff.

phil

Re: import syntax

William Nagel

"I think you need to use file:/// (3 /'s instead of 2) when not including a hostname."

You are correct. However, if you look back at the article, you'll notice that I use $HOME, which includes the leading slash, so file://$HOME expands to file:///home/bill (with the correct number of /'s).

"Awesome article. I really want to try the webdav stuff."

Thank you very much! I'm glad you enjoyed it, and I wish you luck with getting WebDAV going.
