An Automated Reliable Backup Solution

Creating an unattended, encrypted, redundant, network backup solution using Linux, Duplicity and COTS hardware.

The Hardware

Once I had committed to building this backup solution, I had to decide which hardware components I was going to use. Given my functionality, reliability, performance and general requirements, I decided to build a network solution based on a RAID 1 (mirrored) array. This meant that I needed two hard drives and a RAID controller that would support at least two hard drives.

I started by looking at small form-factor motherboards that I might use. I had used Mini-ITX motherboards in a number of other projects and knew that there was close to full Linux support for them. Given that this project did not require a fast CPU, I decided on the EPIA Mini-ITX ML8000A motherboard, which has an 800MHz CPU, a 100Mb network interface and one 32-bit PCI slot built in. This met my motherboard, CPU and network interface requirements and provided a PCI slot for the RAID controller.

After deciding on the form factor and motherboard, I had to choose a case and power supply that would provide enough space to fit a PCI hardware RAID controller, the Mini-ITX motherboard and two full-size hard drives, while complying with my general requirements. I compared a large number of Mini-ITX cases. I found only one, the Silver Venus 668, that was flexible enough to support everything I needed. After choosing the motherboard and case, I looked at the RAM requirement, and I chose 512MB of DDR266 RAM. I had great difficulty finding US Mini-ITX distributors. Luckily, I found a company, Logic Supply, which provided me with the motherboard, case, power supply and RAM as a package deal for a total of $301.25 US, including shipping. At this point, I had all of the components except the RAID controller and hard drives.

Finding a satisfactory RAID controller was extremely difficult. Many RAID controllers actually do their processing in operating system-level drivers rather than on a chip in the RAID controller card itself. The 3ware 8006-2LP SATA RAID Controller is a two-drive SATA controller that does its processing on the controller card. I acquired the 3ware 8006-2LP from Monarch Computer Systems for a total of $127.83 US, including shipping.

With the RAID controller in hand, I needed only the hard drives. I eventually decided on buying two 200GB Western Digital #2000JS SATA300 8MB Cache drives from Bytecom Systems, Inc., for a total of $176.69 US, including shipping. That satisfied all of my hardware requirements. In the end, the hardware components for this system cost a total of $604.77 US, well below the approximate $1,000 US cost of the RAID array network appliances that failed to satisfy most of my requirements.

Figure 3. Silver Venus 668 Case (Inside with Hardware)

File Server

After building the computer, I decided to install Debian stable 3.1r2 on the newly built server's RAID array because of its superior package management system. I then installed an SSH dæmon so that the file server could be accessed securely. Once the SSH package was installed, I created a user account for myself on the file server. The user account home directory is where the backup data is stored, and all users who want to back up to the server will have their own accounts on the file server.
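
The server-side steps described above could be sketched roughly as follows. This is an outline under stated assumptions rather than the exact commands used: it assumes that the Debian ssh package provides the SSH dæmon and that adduser creates the account along with its home directory. The run and provision_backup_user helper names are hypothetical. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: meant to be run as root on the file server.
# Package and command names assume a Debian system; the helper
# function names below are hypothetical.

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Install the SSH daemon, then create an account whose home
# directory will hold that user's backup data.
provision_backup_user() {
    run apt-get install ssh
    run adduser "$1"
}
```

Calling provision_backup_user with a user name, once for each person who will back up to the server, would give every user a separate account and home directory.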

Client Setup

Once the file server was set up, I had to configure a computer to be backed up. Because Duplicity is integrated with GnuPG and SSH, I configured GnuPG and SSH to work unattended with Duplicity. I set up the following configuration on all the computers that I wanted to back up onto my newly created file server.

Installing Duplicity

I installed Duplicity on a Debian Linux computer using apt-get with the following command as superuser:

# apt-get install duplicity

SSH DSA Key Authentication

Once Duplicity was installed, I created a DSA key pair and set up SSH DSA key authentication to provide a means of using SSH without having to enter a password. Some people implement this by creating an SSH key without a password. This is extremely dangerous, because anyone who obtains the key instantly has the same access as the original key owner. With a password-protected key, someone who gets the key also needs the key's password before gaining access. To create an SSH key pair and set up SSH DSA key authentication, I ran the following command sequence on the client machine:


$ ssh-keygen -t dsa
$ scp ~/.ssh/id_dsa.pub <username>@<server>:
$ ssh <username>@<server>
$ cat id_dsa.pub >> ~/.ssh/authorized_keys2
$ exit

The first command creates the DSA key pair. The second command copies the previously generated public key to the backup server. The third command starts a remote shell on the backup server. The fourth command appends the public key to the list of authorized keys, enabling key authentication between the client machine and the backup server. The fifth and final command exits the remote shell.
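
The fourth command above appends unconditionally, so repeating the sequence leaves duplicate entries in authorized_keys2. In addition, sshd with its default StrictModes setting generally refuses key files or directories that other users can write to. A minimal sketch of a more defensive version of that step, to be run on the backup server, might look like the following. The function name add_authorized_key is hypothetical, and the permission modes are the conventional ones rather than something specified by the procedure above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original procedure: append a
# public key to an authorized-keys file only if it is not already
# listed, then restrict permissions to the account owner.
add_authorized_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    # Make sure the directory and file exist before touching them.
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"

    # -F: fixed string, -x: whole-line match, -q: no output.
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}
```

On the server, after the public key has been copied over, it could be called as add_authorized_key id_dsa.pub ~/.ssh/authorized_keys2.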

______________________

Comments

Duplicity for Windows

Nic Sandfield

Those with Windows clients should check out the wonderful Duplicati implementation.

awesome

Neal

This article is fantastic. Great work. Just what I needed to jumpstart my move to this solution without having to learn too much before I get it working.

Thanks again.

-N

Any updates on sourcing of components?

gmaya

Andrew:
Are there any updates on sourcing of components and their features?

Unclear

adeponte

I am having difficulty understanding what you are specifically referring to. If you are referring to the hardware and its functionality, not much has changed since the article was released. If not, please drop me an e-mail at cyphactor@socal.rr.com with further questions.

Is something missing....?

PatrickT

When I read this article I was led to believe that, since the author has "12 computers, which run a combinations of Linux, Mac OS X, and Windows. Losing my work is unacceptable!", we were going to see a solution that provided for backup of all the OSes he listed. Unfortunately, it appears that only Linux-like OSes are supported. Foiled again!

Patrick

Try BackupPC

Muyiwa Taiwo

You may want to check out BackupPC here. I've done a write-up here about integrating Windows Active Directory clients with the BackupPC server.

Limitations of Reality

adeponte

You are correct: the article did lead you to believe that I have 12 computers running a variety of operating systems (Linux, Mac OS X and Windows). The limitations of reality are that there is a word limit for articles. Hence I was not able to cover every aspect. Getting it working on Mac OS X is pretty close to what is required for getting it working on Linux. However, Windows is a completely different experience; it required a huge amount of work on my part, and I have not had a chance to write it all up yet in final form (if I can remember all that I did). Work has been consuming most of my time as of late, but I am still trying to get something out to help people like yourself. My ultimate goal is to expand this current solution into a more complete, feature-filled solution that is pretty trivial to set up. Sadly it isn't there yet, but it is on the back burner. If you have any questions, feel free to e-mail me at cyphactor@socal.rr.com.

Actually, you will also have

Anonymous

Actually, you will also have the added complication of file system issues if backing up the forked HFS+ file system on the Mac to the single fork file system on the Linux box.

Backup for Windows

Tabare Perez

Maybe a solution for your Windows machine is a free program called Cobian Backup (http://www.educ.umu.se/~cobian/cobianbackup.htm). It works very well.

Best regards.
Tabare

Rsync backup for Windows to a Linux server

Alan

Not that rsync is the best solution out there (I do really like the Duplicity backup solution outlined above), but there is a way to use Cygwin and rsync to back up to a Linux server.
I have not tried it, but I may if I cannot get Duplicity to play well with Cygwin. Check it out here: http://www.gaztronics.net/rsync.php

Try using this page--Running Duplicity in Cygwin

Alan

I haven't set this up yet, but tomorrow's the day. I will try to post to let you know how it goes. See this site for instructions on running duplicity in Cygwin. I don't see why it wouldn't work.... http://katastrophos.net/andre/blog/2006/04/03/duplicity-042-on-cygwin/
