The Subversion Project: Building a Better CVS
If you work on any kind of open-source project, you've probably worked with CVS. You probably remember the first time you learned to do an anonymous checkout of a source tree over the Net, your first commit or learning how to look at CVS diffs. And then the fateful day came: you asked your friend how to rename a file. “You can't”, was the reply.
“What? What do you mean?” you asked.
“Well, you can delete the file from the repository and then re-add it under a new name.”
“Yes, but then nobody would know it had been renamed.”
“Let's call the CVS administrator. She can hand-edit the repository's RCS files for us and possibly make things work.”
“And by the way, don't try to delete a directory either.”
You rolled your eyes and groaned. How could such simple tasks be so difficult?
No doubt about it, CVS has evolved into the standard software configuration management (SCM) system of the Open Source community, and rightly so. CVS is free software and has a wonderful nonlocking development model that allows hundreds of far-flung programmers to collaborate. In fact, one might argue that, without CVS, it's doubtful whether sites like Freshmeat or SourceForge ever would have flourished as they do now. CVS and its semi-chaotic development model have become an essential part of Open Source culture.
So what's wrong with CVS? Because it uses the RCS storage system under the hood, CVS can only track file contents, not tree structures. As a result, the user has no way to copy, move or rename items without losing history. Tree rearrangements are always ugly server-side tweaks.
The RCS back end cannot store binary files efficiently, and branching and tagging operations can become very slow. CVS also uses the network inefficiently; many users are annoyed by long waits, because file differences are sent in only one direction (from server to client, but not from client to server), and binary files are always transmitted in their entirety.
From a developer's standpoint, the CVS codebase is the result of layers upon layers of historical “hacks”. (Remember that CVS began life as a collection of shell scripts to drive RCS.) This makes the code difficult to understand, maintain or extend. For example, CVS's networking ability was essentially stapled on. It was never designed to be a native client/server system.
Rectifying CVS's problems is a huge task, and we've only listed a few of the many common complaints here.
In 1995, Karl Fogel and Jim Blandy founded Cyclic Software, a company for commercially supporting and improving CVS. Cyclic made the first public release of a network-enabled CVS (contributed by Cygnus Software). In 1999, Karl Fogel published a book about CVS and the open-source development model it enables (cvsbook.red-bean.com). Karl and Jim had long talked about writing a replacement for CVS; Jim had even drafted a new, theoretical repository design. Finally, in February 2000, Brian Behlendorf of CollabNet (www.collab.net) offered Karl a full-time job to write a CVS replacement. Karl gathered a team and work began in May.
The team settled on a few simple goals. Subversion would be designed as a functional replacement for CVS: it would do everything that CVS does, preserving the same development model while fixing the flaws in CVS's (lack of) design. Existing CVS users would be the target audience; any CVS user should be able to start using Subversion with little effort. Any other bonus features were of secondary importance (at least before a 1.0 release).
At the time of this writing, the original team has been coding for a little over a year, and we have a number of excellent volunteer contributors. (Subversion, like CVS, is an open-source project.)
Here's a quick rundown of some of the reasons you should be excited about Subversion:
Real copies and renames: the Subversion repository doesn't use RCS files at all; instead, it implements a virtual versioned filesystem that tracks tree structures over time (described below). Files and directories are versioned. At last there are real client-side mv and cp commands that behave just as you think.
Atomic commits: a commit either goes into the repository completely or not at all.
Advanced network layer: the Subversion network server is Apache, and client and server speak WebDAV(2) to each other. (See the “Subversion Design” section below.)
Faster network access: a binary diffing algorithm is used to store and transmit deltas in both directions, regardless of whether a file is of text or binary type.
Filesystem properties: each file or directory has an invisible hash table attached. You can invent and store any arbitrary key/value pairs you wish: owner, perms, icons, app-owner, MIME type, personal notes, etc. This is a general-purpose feature for users. Properties are versioned, just like file contents. And some properties are auto-detected, like the MIME type of a file (no more remembering to use the -kb switch).
Extensible and hackable: Subversion has no historical baggage; it was designed and implemented as a collection of shared C libraries with well-defined APIs. This makes Subversion extremely maintainable and usable by other applications and languages.
Easy migration: the Subversion command-line client is very similar to CVS; the development model is the same, so CVS users should have little trouble making the switch. Development of a cvs2svn repository converter is in progress.
It's free: Subversion is released under an Apache/BSD-style, open-source license.
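To make the ideas of real renames, atomic commits and versioned properties from the list above more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Subversion's actual filesystem design or API: the ToyRepo class, its commit method and the "put" and "move" operation names are all hypothetical names invented for this illustration. Each commit builds a complete new tree from a copy of the latest one, so a rename carries the file's history with it, and a commit that fails halfway leaves no trace.

```python
import copy


class ToyRepo:
    """Toy model of a versioned tree (hypothetical, not Subversion's real design)."""

    def __init__(self):
        # Revision 0 is the empty tree. Each revision maps path -> node.
        self.revisions = [{}]

    def commit(self, changes):
        """Apply every change to a copy of the latest tree, or none at all."""
        tree = copy.deepcopy(self.revisions[-1])
        new_rev = len(self.revisions)
        for change in changes:
            op = change[0]
            if op == "put":
                _, path, text, props = change
                old = tree.get(path)
                tree[path] = {
                    "text": text,
                    "props": dict(props),  # arbitrary key/value pairs, versioned with the file
                    "history": (old["history"] if old else []) + [(new_rev, path)],
                }
            elif op == "move":
                _, src, dst = change
                if src not in tree:
                    raise KeyError(f"no such path: {src}")
                node = tree.pop(src)
                # The node keeps its earlier history under the new name.
                node["history"] = node["history"] + [(new_rev, dst)]
                tree[dst] = node
            else:
                raise ValueError(f"unknown operation: {op}")
        # Only a fully successful commit becomes a new revision.
        self.revisions.append(tree)
        return new_rev


repo = ToyRepo()
r1 = repo.commit([("put", "README", "hello", {"mime-type": "text/plain"})])
r2 = repo.commit([("move", "README", "README.txt")])
print(repo.revisions[r2]["README.txt"]["history"])

# The second change fails, so the first one must not be recorded either.
try:
    repo.commit([("put", "a.c", "int x;", {}), ("move", "missing", "b.c")])
except KeyError:
    pass
print(len(repo.revisions))
```

The deep copy of the whole tree on every commit is only there to keep the sketch short; a real repository would need to share unchanged data between revisions rather than duplicate it.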