The Subversion Project: Building a Better CVS

This open-source project aims to produce a compelling replacement for the Concurrent Versions System (CVS).

If you work on any kind of open-source project, you've probably worked with CVS. You probably remember the first time you did an anonymous checkout of a source tree over the Net, your first commit, or the first time you looked at CVS diffs. And then the fateful day came: you asked your friend how to rename a file. “You can't,” was the reply.

“What? What do you mean?” you asked.

“Well, you can delete the file from the repository and then re-add it under a new name.”

“Yes, but then nobody would know it had been renamed.”

“Let's call the CVS administrator. She can hand-edit the repository's RCS files for us and possibly make things work.”

“What?”

“And by the way, don't try to delete a directory either.”

You rolled your eyes and groaned. How could such simple tasks be so difficult?

The Legacy of CVS

No doubt about it, CVS has evolved into the standard software configuration management (SCM) system of the Open Source community, and rightly so. CVS is free software and has a wonderful nonlocking development model that allows hundreds of far-flung programmers to collaborate. In fact, one might argue that, without CVS, it's doubtful whether sites like Freshmeat or SourceForge ever would have flourished as they do now. CVS and its semi-chaotic development model have become an essential part of Open Source culture.

So what's wrong with CVS? Because it uses the RCS storage system under the hood, CVS can only track file contents, not tree structures. As a result, the user has no way to copy, move or rename items without losing history. Tree rearrangements are always ugly server-side tweaks.

The RCS back end cannot store binary files efficiently, and branching and tagging operations can become very slow. CVS also uses the network inefficiently; many users are annoyed by long waits, because file differences are sent in only one direction (from server to client, but not from client to server), and binary files are always transmitted in their entirety.
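To make the cost of sending whole files concrete, here is a minimal sketch of a binary delta: the new version of a file is encoded as instructions to copy byte ranges the receiver already holds, plus the literal bytes that are genuinely new. It uses Python's standard difflib purely for illustration; it is not the algorithm used by CVS or Subversion, and the function names (make_delta, apply_delta, literal_bytes) are invented for this example.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Encode `new` as copy/insert instructions against `old` (illustrative only)."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # The receiver already has these bytes: send only (offset, length).
            ops.append(("copy", i1, i2 - i1))
        elif j2 > j1:
            # "replace" or "insert": these bytes are new and must be sent literally.
            ops.append(("insert", new[j1:j2]))
        # "delete" needs no instruction: the old bytes are simply never copied.
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new version from the old version plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            out += old[start:start + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops) -> int:
    """Count the payload bytes that actually have to cross the network."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")
```

For a large binary file with a small change, literal_bytes(make_delta(old, new)) is a small fraction of len(new), and the same encoding works in either direction between client and server.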

From a developer's standpoint, the CVS codebase is the result of layers upon layers of historical “hacks”. (Remember that CVS began life as a collection of shell scripts to drive RCS.) This makes the code difficult to understand, maintain or extend. For example, CVS's networking ability was essentially stapled on. It was never designed to be a native client/server system.

Rectifying CVS's problems is a huge task, and we've only listed a few of the many common complaints here.

Enter Subversion

In 1995, Karl Fogel and Jim Blandy founded Cyclic Software, a company for commercially supporting and improving CVS. Cyclic made the first public release of a network-enabled CVS (contributed by Cygnus Software). In 1999, Karl Fogel published a book about CVS and the open-source development model it enables (cvsbook.red-bean.com). Karl and Jim had long talked about writing a replacement for CVS; Jim had even drafted a new, theoretical repository design. Finally, in February 2000, Brian Behlendorf of CollabNet (www.collab.net) offered Karl a full-time job to write a CVS replacement. Karl gathered a team and work began in May.

The team settled on a few simple goals. Subversion would be a functional replacement for CVS: it would do everything that CVS does, preserving the same development model while fixing the flaws in CVS's (lack of) design. Existing CVS users would be the target audience, and any CVS user should be able to start using Subversion with little effort. All other features would be of secondary importance (at least before a 1.0 release).

At the time of this writing, the original team has been coding for a little over a year, and we have a number of excellent volunteer contributors. (Subversion, like CVS, is an open-source project.)

Subversion's Features

Here's a quick rundown of some of the reasons you should be excited about Subversion:

  • Real copies and renames: the Subversion repository doesn't use RCS files at all; instead, it implements a virtual versioned filesystem that tracks tree structures over time (described below). Both files and directories are versioned. At last there are real client-side mv and cp commands that behave just as you would expect.

  • Atomic commits: a commit either goes into the repository completely or not at all.

  • Advanced network layer: the Subversion network server is Apache, and client and server speak WebDAV(2) to each other. (See the “Subversion Design” section below.)

  • Faster network access: a binary diffing algorithm is used to store and transmit deltas in both directions, regardless of whether a file is of text or binary type.

  • Filesystem properties: each file or directory has an invisible hash table attached. You can invent and store any arbitrary key/value pairs you wish: owner, perms, icons, app-owner, MIME type, personal notes, etc. This is a general-purpose feature for users. Properties are versioned, just like file contents. And some properties are auto-detected, like the MIME type of a file (no more remembering to use the -kb switch).

  • Extensible and hackable: Subversion has no historical baggage; it was designed and implemented as a collection of shared C libraries with well-defined APIs. This makes Subversion extremely maintainable and usable by other applications and languages.

  • Easy migration: the Subversion command-line client is very similar to CVS; the development model is the same, so CVS users should have little trouble making the switch. Development of a cvs2svn repository converter is in progress.

  • It's free: Subversion is released under an Apache/BSD-style, open-source license.
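Several of the features above (versioned trees, real copies and renames, atomic commits and versioned properties) can be illustrated together with a toy model. The sketch below is only an assumption-laden illustration, not Subversion's actual design or API: the names ToyRepo, Node, commit and history are hypothetical, and every revision is stored as a full snapshot for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    content: bytes
    props: Dict[str, str] = field(default_factory=dict)   # versioned key/value pairs
    copied_from: Optional[Tuple[int, str]] = None          # (revision, path) of the source


class ToyRepo:
    """Toy versioned tree: each revision maps path -> Node (hypothetical model)."""

    def __init__(self):
        self.revisions = [{}]          # revision 0 is the empty tree

    @property
    def head(self) -> int:
        return len(self.revisions) - 1

    def commit(self, operations) -> int:
        tree = dict(self.revisions[-1])            # private working copy of the tree
        for op, *args in operations:
            if op == "put":
                path, content, props = args
                old = tree.get(path)
                tree[path] = Node(content, dict(props), old.copied_from if old else None)
            elif op in ("copy", "move"):
                src, dst = args
                if src not in tree or dst in tree:
                    raise ValueError(f"cannot {op} {src!r} to {dst!r}")
                node = tree[src]
                # The new node remembers where it came from, so history survives.
                tree[dst] = Node(node.content, dict(node.props), (self.head, src))
                if op == "move":
                    del tree[src]
            elif op == "delete":
                (path,) = args
                if path not in tree:
                    raise ValueError(f"cannot delete {path!r}")
                del tree[path]
            else:
                raise ValueError(f"unknown operation {op!r}")
        # Atomicity: the new tree becomes visible only if every operation succeeded.
        self.revisions.append(tree)
        return self.head

    def history(self, path: str, rev: Optional[int] = None):
        """Follow copy-from links back through earlier names of the same item."""
        rev = self.head if rev is None else rev
        chain = []
        while True:
            node = self.revisions[rev][path]
            chain.append((rev, path))
            if node.copied_from is None:
                return chain
            rev, path = node.copied_from
```

In this model, committing [("put", "README", b"hello", {"mime-type": "text/plain"})] and then [("move", "README", "README.txt")] leaves a README.txt whose history still leads back to README, with its properties intact. A commit whose second operation fails leaves the repository exactly as it was, because the modified tree is discarded before it is ever published.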

______________________

Comments

Regarding svn+ssh

kashide

Where can I find detailed steps for setting up the server configuration?

I have the database on server A and the working folder on server B.

I am not able to tunnel over SSH.

I think I am making a mistake somewhere in the following steps:
$ svn list svn+ssh://host.A.com/repo/project
Kashide@host.A.com's password: *****

I get the error message "Authentication failed." Please explain why this happens and what the solution is.

Not as compelling as one would hope

Brian Gallew

I've been a systems administrator for 15 years. I've been a revision control evangelist for at least 10 of those years. I've used RCS, CVS, and Perforce. I've never actually used Subversion, though, and for good reason: it's impossible to build.

I'll admit, things have certainly improved over the past few years. At least it's actually possible to get a Linux distribution with Subversion installed. Of course, if you're already a user, then you're probably using an older version, and the one that comes with your distro of choice isn't compatible.

Then again, if you happen to actually work for a living with Unix(tm), then you probably aren't in a Linux-only environment. How about HP-UX? Solaris? AIX? The only way to actually build Subversion on those platforms is to either completely ignore several vendor-supplied packages and build them all from scratch, or else to dig back through Usenet and try to find the magic combination of Subversion, Neon, ... that will work with your installed version of Apache. Or Python. Or Swig. Or whatever.

While I understand that the features offered by Subversion are not only useful, but cover some extremely large gaps in CVS, I can build CVS on any platform, and the *only* dependency is RCS, which also builds cleanly on any platform. Frankly, at this point the only way Subversion is going to become interesting to a huge number of people out there is if they ship it in a pure Python/Java/Perl/Ruby/whatever form (i.e. write it in a high-level scripting language that stands a reasonable chance of running on multiple platforms).

Re: The Subversion Project: Building a Better CVS

Anonymous

Subversion is really a great system now. It's making strides against CVS in terms of market share.

From my customers' forums:
"One of the features that sold me is the ease of access via http. Using CVS, you always have to struggle with firewalls and setting up ssh tunnels or else put up with the weak security of pserver."

Feel free to check out our Subversion [wush.net] hosting service if you want to give Subversion a spin. We've got instant setup and a week-long free trial. You'll be able to start playing with your repository in 5 minutes :)

Regarding downloading Subversion for Linux

Ritu

I want to download Subversion for free on Fedora Linux. Please tell me how to do this.

Release Date

Anonymous

As of today, more than a year after the proposed v1.0 date, it looks like it's at v0.27 :/

v0.37 = v1.0 RC1 (New target is Feb 23, 2004)

Anonymous

According to their site, v0.37 (dubbed 1.0 RC1) will become 1.0 in another week or so if nothing major gets discovered during testing.

Re: Release Date

Anonymous

Yes, what happened?

Subversion rocks!

Anonymous

This is going to be really good... -- mbp

What is the 'intelligent merging' that is slated?

Anonymous

What does the term 'intelligent merging' mean? It sounds like 'intelligent merging' means that if I check in a file that's been updated (i.e., changed by someone else) since I checked it out, the VCS *intelligently* merges the changes. Does this system currently support concurrent development or not?

Re: What is the 'intelligent merging' that is slated?

Anonymous

Actually, "intelligent merging" means a solution to the classic "repeated merge" problem you see so often in CVS.

That is, you merge some changes from the trunk to a branch. Then later on, you want to merge more changes from trunk to branch, but end up getting *conflicts*, because some of the changes have already been ported over previously.

With intelligent merging, the system remembers which changesets have been ported to each branch. So it can automatically avoid repeated merges.

This feature won't be in 1.0, but should follow soon after.
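The idea described above can be sketched in a few lines. This is only an illustration of the bookkeeping (remember which changesets each branch has already received, and skip them next time); the Branch class and merge function are hypothetical and say nothing about how Subversion will actually implement the feature.

```python
class Branch:
    """A branch that remembers which changesets have already been ported to it."""

    def __init__(self, name):
        self.name = name
        self.merged = set()   # IDs of changesets already applied to this branch
        self.log = []         # descriptions of applied changesets, in order


def merge(source_changesets, target):
    """Apply only the changesets the target has not seen; return the IDs applied."""
    applied = []
    for changeset_id, description in source_changesets:
        if changeset_id in target.merged:
            continue          # already ported earlier: skipping avoids a repeated merge
        target.log.append(description)
        target.merged.add(changeset_id)
        applied.append(changeset_id)
    return applied
```

Merging the trunk into a branch twice then applies each trunk changeset exactly once: the second merge picks up only what was committed to the trunk after the first.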

Re: What is the 'intelligent merging' that is slated?

Anonymous

Of course it supports concurrent development.

Re: What is the 'intelligent merging' that is slated?

Anonymous

It probably refers to merging a "sandbox" sub-project, where, say, new functionality is being added to a project, into the main project code tree. That way, new functionality can be added and tested separately without affecting the stability of the main project code. Once it is ready, this stream can be merged into the main project. Just a guess.

Re: The Subversion Project: Building a Better CVS

Anonymous

Found the following link to the project:

http://subversion.tigris.org/

Re: The Subversion Project: Building a Better CVS

Anonymous

Was there some reason not to include the project URL in this article? Not that it took long to find with Google, but it does seem like an obvious oversight. The URL in question is http://subversion.tigris.org/.

Re: The Subversion Project: Building a Better CVS

Anonymous

I had exactly the same question: why is no link to Subversion available?

Thanks anyway for posting it here, though it shows up only to those who read beyond the article's end :-)

-shalz
