Build a Home Terabyte Backup System Using Linux

Build a low-cost, terabyte-sized backup server using Linux and back up your digital audio files, digital images and digital movie recordings.

A terabyte-plus backup and storage system is now an affordable option for Linux users. This article discusses options for building and configuring an inexpensive, expandable, Linux-based backup server.

Server Design

High-capacity disk drives are now widely available at prices that are incredibly cheap compared to those of only a few years ago. In addition, with so many Linux users now ripping CDs to disk, saving images from their digital cameras and recording video using digital camcorders and DVRs, such as MythTV, the need for backing up and archiving large amounts of data is becoming critical. Losing pictures and videos of your kids—or your audio music library—because of a disk crash would be a catastrophe. Fortunately, a high-capacity, Linux-based backup server can be built easily and cheaply using inexpensive disk drives and free software.

Virtually any home PC can meet the basic requirements for a backup server. If you have long backup windows or relatively small amounts of data, a slow computer is not an obstacle. Make sure your network is fast enough to transfer data within your backup window. For older equipment, the bottleneck for backups can be the disk data transfer bandwidth (30-150MBps depending on disk technology).
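As a rough sanity check on the backup window, the transfer time follows from the amount of data and the sustained throughput of the slowest link. The sketch below assumes decimal units (1GB = 8,000 megabits) and a throughput figure you have measured yourself; the function name transfer_hours and the example numbers are illustrative and do not come from the article.

```shell
#!/bin/sh
# transfer_hours SIZE_GB RATE_MBPS
# Prints the hours needed to move SIZE_GB gigabytes at a sustained
# RATE_MBPS megabits per second (decimal units: 1 GB = 8000 megabits).
transfer_hours() {
    awk -v gb="$1" -v mbps="$2" 'BEGIN { printf "%.1f\n", gb * 8000 / mbps / 3600 }'
}

# Example: 200 GB over a link that sustains 100 Mbps.
transfer_hours 200 100
```

If the result is longer than the window you have available, either the link or the amount of data per run has to change.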

Many consumer-level computers do not have cooling capacity for more than two internal hard disks. Most motherboards support a maximum of four onboard disks (often four ATA/IDE devices, but a combination of two ATA/IDE and two SATA devices is becoming common). External USB high-capacity drives are also available. If your computer is older and has USB1, purchase an inexpensive USB2 PCI expansion card, which is up to 40 times faster (480Mbps versus 12Mbps).

SCSI has fewer limitations, but it is expensive and has tended to lock purchasers in to “flavor-of-the-month” SCSI technologies. One option for disk expansion and upgrade is the Host Bus Adaptor (HBA), such as those made by Promise Technology. An HBA is a disk controller on a PCI expansion card. HBAs typically require no additional software, have their own BIOS and are not constrained by PC BIOS limits on disk size. HBAs let you put large disks (more than 120GB) into systems with legacy BIOSes, upgrade from ATA-33 to ATA-150 or mix ATA and SATA disks.

You may want to consider purchasing a dedicated fileserver. A bare-bones server capable of holding six disks (fully preassembled, no disks or OS) can cost less than $1,500 US. With this initial investment, you can expand disk space as needed for less than $0.80 per GB or grow by plugging in USB disks. Once you have decided how many disks you need, consider their space, cooling and noise requirements. Figure 1 shows an example of a backup system built from an old server. The system has well over a terabyte of storage capacity.

Figure 1. Storage array built from an old server (capacity of nine IDE disks, including five in a converted SCSI RAID stack). Additional IDE slots added with a Promise HBA.

Even if you choose to build a server from scratch and populate it with high-capacity disks, you can expect the per-gigabyte cost of your terabyte-plus backup server to be minimal, because storage costs have decreased so dramatically. Table 1 provides a variety of configurations for a backup server, along with estimated prices per gigabyte for each (note: prices are estimates and do not include taxes or shipping costs). As you can see from the table, a new server equipped with more than two terabytes of storage can be built for less than $1.50 per gigabyte. That will back up a lot of home movies, digital pictures and music files!

Table 1. Some Backup Options, with Estimated per-GB Costs

Type | Configuration | Capacity (TB) | Cost per GB ($)
ATA/SATA Disk | Internal disk | 0.4 | 0.56
Linux Desktop* | Three internal disks | 1.2 | 0.84
Linux Desktop* | Three internal disks plus two USB external | 2.0 | 0.73
LaCie 2TB Storage | Network server appliance | 2 | 1.15
Linux Server** | Six internal disks | 2.4 | 1.21
Linux Server** | Six internal plus two USB external | 3.2 | 1.08

*Intel Celeron D 478 325 2.53GHz, 256MB of RAM. **Intel SC5275 chassis, Intel ATX Motherboard, dual-3GHz Xeon CPUs, 2GB of RAM.




Backup Project Versus Backup Solution

Backup Appliance's picture

The core concepts espoused here are dead on; the trouble tends to be not when you're trying to back up one or two Linux servers but when you want a backup solution for an enterprise (even a small or medium-sized one) - and you want to be able to put it out there and then forget about it. That's where the software that operates on top of Linux and the underlying hardware platform gets critical.

I think these days with higher density drives I'd tend to use RAID-6 to avoid MTTDL issues potentially occurring during RAID rebuilds.

Take rsync for example - it's great. And if you have a pristine WAN out there that stays pristine in terms of quality, then you can set and forget it. But when you're using it for disaster recovery and take into account lower quality WAN then you're either going to make an investment in increasing its fault resiliency over lower quality lines or you're going to spend a lot of time handling failure.

Simple live Linux backup using SSH

Vlatko Šurlan's picture

I've devised a nice way to back up the stuff from my server directly over an ssh connection to my home machine. It works well, costs nothing and does not clutter the disk space on my server. Check it out: Backing up Linux web server live via SSH.


JohnnyBoyClub's picture

For my Linux machine I use a free backup program called Dmailer, because it is a lot easier for me to save all my backups online on their servers than to keep them on another hard disk, and it is also more secure.

Awesome Solution

Jared Henry's picture

We are actually starting to use this kind of solution for some of our offsite backups and it is working great so far. I highly recommend using rsync in an offsite backup solution.

Backup Server

Madhu Sudan Rohatgi's picture

Very helpful for me.


of interest:New ‘SEP sesam’ V 3.4 Data Recovery: SAP Certified a

Xenia's picture

This is a new Linux backup option: 'SEP sesam' version 3.4

Salt Lake City, UT [BrainShare] - March 17, 2008 – SEP Software, the technology leader in cross-platform data backup, data restore and disaster recovery, today announced the release of ‘SEP sesam’ Version 3.4, with full certifications from SAP and VMware. SAP has certified ‘SEP sesam’ 3.4 for their enterprise level database software products, i.e. SAP R/3, SAP DB/MAXDB. This latest release features many user-friendly enhancements and important technological advancements including: Encryption; VMware and VMware VCB; Disaster Recovery down to Bare Metal; Added Groupware Modules; Disk-to-Disk-Tape: Data Storage; Off-Site Storage; Automated Data Restores; Secure Data Transfer and more. The new, highly scalable ‘SEP sesam’ V.3.4 is now available for download and will be showcased at Novell Brainshare in Salt Lake City, Utah March 16-21, 2008.

“This is a perfect fit for medium to large enterprise customers and fulfills all of their data backup, data restore and disaster recovery requirements,” explained Tim Wagner, President of SEP Software LLC. “Recent certifications from SAP AG validate ‘SEP sesam’ as a highly functional and desirable solution for the SAP user community.”

New Features and Expanded Technical Functionality Details:
- Encryption
o With data security becoming an increasingly disconcerting issue, SEP offers Blowfish encryption and additional AES 256-bit encryption. All data is encrypted on the client side, while keys are stored on the backup server. Only encrypted data travels across the network.
- VMware and VMware VCB
o ‘SEP sesam’ now supports VMware virtual environments at the enterprise level with the ‘SEP sesam’ ESX-client as well as VMware Virtual Consolidated Backup (VCB) in all variations.
- Disaster Recovery down to Bare Metal
o Recent enhancements to the ‘SEP sesam’ product allow restoration of data to the lowest levels possible. Bare System Recovery (‘SEP sesam’ BSR) can be integrated into centralized management backup strategies using the ‘SEP sesam’ GUI. New tasks and activities for a fast and universal Disaster Recovery on systems running Windows – including recovery to new hardware – have been integrated into the new release.
- Groupware Modules Added
o Novell Groupwise along with the Netware Client is fully-supported. SEP offers a complete backup solution for Novell OES2 environments.
o The Zarafa Module allows easier and faster backup including single mail and single item restore with the new online backup module for Zarafa Groupware Server.
- Disk-to-Disk-Tape: Data Storage, Off-Site Storage
o System Administrators can store data from Disk-to-Disk-Tape. Using the ‘SEP sesam’ virtual tape library, the transfer of data to tape or removable media is fast and easy. Customized production planning and control is now easier than ever for the effective and secure back-up of important enterprise data.
- ‘SEP sesam’ Improved Availability for SAN and NAS environments
o ‘SEP sesam’ 3.4 offers enhanced data availability, making it ideal for SAN, NAS and all other common network storage devices. The new version ensures against ANY loss of critical company data. The selected back-up strategy can be automated and controlled from either a central or remote location.
o ‘SEP sesam’ methodology is readily available for all popular operating systems, including Microsoft Windows, Unix and Linux.
- Automated Data Restores
o ‘SEP sesam’ can be set up to perform automated restores to constantly monitor data and to ensure data backup integrity assurance. Enhancements to the restore function ease the workload on overtaxed IT management organizations. The powerful and easy-to-use ‘SEP sesam’ scheduler (SEPuler) with calendaring allows for management of even the most complex data management tasks. Data files can be restored using the included Restore Wizard (for example a complete generation restore). The restore can be accomplished simply and easily.
- Secure Data Transfer
o Secure network data transfer will be accomplished through ‘SEP sesam’ SSL and SSH handling. The Firewall Port-Control protects against intrusion from outside the network and prevents unauthorized access from within. Using our Java-based GUI ‘SEP sesam’ allows remote administration from every operating system.
- ‘SEP sesam’ Online Modules
o ‘SEP sesam’ Online Modules maximize the security of all database and groupware solutions. Online Modules allow the efficient scheduling and planning of backups to take place during the production day. All ‘SEP sesam’ Online Modules are certified for Groupware (Novell Groupwise, OpenXchange, Scalix, etc.) and ERP Applications (SAP, ABAS). The ‘SEP sesam’ Online Database modules and SEP Live Recovery (Manageable Database Shadowing) allow automated data mirroring to keep the data available at all times.

Additionally, SEP has a corresponding certified release running SAP on Oracle 10g, which is now product-ready and tested for medium to very large enterprise customers. The certification includes Linux 64-Bit, Microsoft Windows 64-bit and Unix 64-bit operating systems.

Pricing and availability
‘SEP sesam’ is available for immediate download at 'SEP sesam' prices start at $325 for an OES2 or Linux server and $215 for any client.

About SEP Software
SEP Software is the technology leader in cross-platform data backup, data restore and disaster recovery. Flagship product ‘‘SEP sesam’’ delivers storage management and network-wide data security software solutions for worldwide Linux, Unix and Windows systems. Based in Boulder, Colorado, SEP Software LLC is a wholly owned subsidiary of SEP AG, whose labs and headquarters are based in Weyarn, Germany. For more information, please go to For inquiries please call (303) 417-6316 or mail to

Excellent Write-Up

ncc74656m's picture

For someone who is experienced in Linux, but has never done anything serious with it, I feel that this article is most beneficial and offers a lot of promise for the average newbie. It gives the user some sense of accomplishment when done without being complex and devastatingly hard.

I plan to do this for myself, and possibly a family member's small office when I have the chance and have proven the plausibility of it.

Great, but what about backups?

systemloc's picture

I've been running a 1TB usable RAID 5 box using Linux and six 200GB drives. Having the storage space is great, and it was very cheap to build, but here's the rub: How do you inexpensively build a backup solution for 1TB of changing data? I download, and I cull, and large amounts of data flow across that array, so how do I back up 200GB/month easily and cheaply? DVD-R is not an option, and building another array is not an option IMO, since I want the option of keeping old backups. Blu-ray or tape is what I'm inclined to look at, but both add $500-600 to the cost off the top, plus media.


Anonymous's picture

If you are running Windows/Linux/MacOS, then use Jungledisk with Amazon S3 for the storage. $0.15/GB plus $5/month...inexpensive/simple and "powerful enough"


ssh problem for not skilfull linux user

Anonymous's picture

Hi All:

I found the article simple and easy to follow, but I had problems implementing it on my system. First of all, I am not a skillful Linux user, so I asked for help and found the solution.


In my machine and network environment, ssh authentication "as is" posted was not working.


Check that you are using SSH 2. If your machine uses SSH 1 as default, as mine does, you should rewrite the line:

rsync -az /home -e ssh bob@bar:/data1/foo

as:

rsync -az /home -e "ssh -2" bob@bar:/data1/foo

Of course, follow all the steps of key pair generation and so on.

hope it helps

I don't know how to do...

paco69's picture


I want to adapt your tutorial to my needs.
My situation:
I have a number of Linux servers to back up, and the backups go to a Windows server.

How can I configure smb.conf? What is the right way to write the "mount -t smbfs ....." line? Do I have to add an entry to fstab? If yes, what do I have to write?

Thank you very much for your replies.


ps:excuse me if my english is bad.


Ivan Minic's picture

The idea about this and realization of it is very nice indeed!

Report Script, change?

KGW's picture

The line at the end of the report script assumes you have an MTA running on the local machine, I think. I don't, and I guess that is why I get "./ line 16: mail: command not found".
Is there a variation of this line that would send the generated report to my local Mozilla Thunderbird?

problems with Generating the Key Pair

Ridgid's picture

I tried the Generating the Key Pair section on my Fedora core 4 box connecting to a slackware 10.2 box. I had no luck getting the key pair to let me connect without a password. After some research I found this how-to on it that worked better for me.


i had trouble with this..

sergio_rrd's picture

i had some trouble with this..

i think the problem is:


should be:




should be:


nice, how do you do restore though?

Anonymous's picture

how do you do restore though?

Another backuptool:

khoffrath's picture

Bacula has a Windows client which allows backing up data from Windows clients without the need to set up a share (if I understand the docs correctly).

Hope to find some time to test it out ;)

Try KeyChain instead of passphraseless SSH keys

Erik Postma's picture


There is an alternative for passphraseless SSH keys that works quite well if you keep your linux-based backup client on for long times at a stretch: Keychain is a small program where you enter your SSH passphrase just once per power cycle. More info at

Power Requirements

timetrap's picture

I was just looking at starting the same project, but a modern (power saving) server will cost about $80-$100 a year for power (considering $0.07-$0.10 per kWh). I think the Linksys NSLU2 (which is already running Linux) is a much better option for low power (1.5W instead of 60W). The only caveat: you need external USB drives to hook up to it.
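The arithmetic behind power estimates like these can be sketched as follows. This is a minimal example that assumes the machine draws a constant wattage around the clock; the function name annual_cost is illustrative, and the wattages and rate are simply figures quoted in the comment above, so substitute the measured draw of your own hardware.

```shell
#!/bin/sh
# annual_cost WATTS RATE_PER_KWH
# Prints the yearly electricity cost of a device drawing WATTS
# continuously (24 hours a day, 365 days a year).
annual_cost() {
    awk -v w="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", w / 1000 * 24 * 365 * rate }'
}

annual_cost 60 0.10     # 60 W machine at $0.10 per kWh
annual_cost 1.5 0.10    # 1.5 W appliance at the same rate
```

External USB drives have their own power supplies, so their draw has to be added to the appliance figure for a fair comparison.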

Great article nonetheless, I hope that more people begin to build these so that the market place will have more (low power) options available.

Tsync is also worth to mention

Anonymous's picture

Basically it is a modern rsync with many improvements (redundancy, peer to peer, ...). It's still in beta, but probably already more stable than rsync and other sync technologies (unison, ...).

Diverse Disks

Chris's picture

A comment on the purchase of HDD that I didn't see in the article.
For those of you considering building a home backup server, do not go to your nearest computer store and pick up 4 of the same type of drive.
As was pointed out in a previous comment, the failure rate of IDE drives increases after ~12 months.
If you purchase a number of HDD from the same manufacturer at the same time from the same location, you run the risk of getting a similar HDD failure on all your HDD at the same time due to possible manufacturing defects.
It is recommended to either purchase similar spec'd drives from different manufacturers or, if you really want the same brand of drive, ensure that the drives you purchase are from separate manufacturing runs.
The diversity of drives will significantly reduce your chances of data loss in the event that a particular manufacturer has a manufacturing defect that causes the drive to fail.
Do some Google lookups for IBM Deskstar or Fujitsu HDD failures in the past few years for examples.

Anyway, happy 'backuping' ;)

Why Use Linux At All?

Richard's picture

Why on Earth would I want to go to all this trouble when I can slap Server Elements NASLite onto a CD and 10 minutes later dispense with an operating system altogether?

NASLite is Linux Powered

Kim Yamoto's picture

NASLite is Linux based - 2.4.26 to be exact! I'll be impressed if someone can do the impossible and cram a Win32 app of that capability into a 4M ramdisk, let alone boot and run it from a floppy disk.

Re: Why Use Linux At All?

Michael Hearne's picture

It's simple, here's the first clue; from the NASLite-in-a-Nutshell.pdf:

"At the DOS prompt, check to make sure that you can view the C drive."

This is a Linux group. We are Linux users. As a think-tank, we are creating our own solutions, and do not depend on commercial groups like the Microsoft Corp. to do it for us.

We share our sources and discoveries, and are not paid for it. However, our companies make more money for the effort, and as a result, we get raises.

The old Microsoft model worked like this: The programmer put a function into an exe file, and then sold the file.

The new Linux model works like this: The programmer shares his code and research; and as a result, his company profits, as do all concerned.

Can you produce the source code for this machine? I ask because there are many of us who also assemble our own hardware, and if we had the sources, then we would not have to purchase the machine at all.

The purpose of this article was to be a howto on assembling our own gear, and not to purchase Windows-based commercial stuff. I really don't understand the purpose of your question, unless it is an attempt to sell Server Elements' (Windows-based) equipment.

I think it's pointless here.

Michael Hearne

You didn't read very close

Jim Budler's picture

The NASLite in a Nutshell is an example of buying a settop box on e-bay for $32 which has a built in flash disk. The DOSlike commands you listed were being used to load a Linux boot image into that flash disk. So $32 for the machine, $25 for the software, and whatever for the USB2 hard drive and you have an NAS. A dumb one but it will work.

It is not Windows based. The source code for the FOSS portion is available.

It isn't a solution I would use, as I would prefer to have the capabilities of having it be a member of a Domain instead of just a LAN community disk server, but I see no need to get so heated up about it.

You could emulate it easily using a live CD distro, and configure it for all that function, and have it able to be a member of a Domain. But that's work, and some people prefer a $25 solution that "just works."



Jeff's picture

NASLite takes a minute to set up and just works. Best of all, it takes only 5 minutes to explain to a secretary how to administer it. Prior to using NASLite, RH was my choice, but when something goes wrong, I have to make a trip to correct the problem. With NASLite, a phone call usually resolves the issue.

Customer frustration is considerably lower this way.

NASLite just works.

Cheap Hardware and Low Time Investment

Jeff's picture

One can make a lot of arguments for and against Server Elements NASLite, but if you need to get a high capacity NAS server, built on low end hardware, there really are no alternatives. The damn thing runs in a mere 4M ramdisk. That exports your shares via CIFS, NFS, FTP and HTTP nicely and coherently.

I’ve been using it for 8 months on a 120MHz/64M Gateway box with 4x250G Maxtors. I had it set up and running in no time. Formatting the drives was the most time-consuming task. Considering what I charge per hour for my consulting services, the $25 investment in the NASLite software was a no-brainer. I’ve purchased multiple copies and installed it at many of my customers’ locations. That way I get to do the job inexpensively for them and profitably for me.

Not a bad choice if you don’t need a NAS fortress but a simple storage bin for a small office. I’d highly recommend it if you consider your time and customer’s dollars at all valuable.

No affiliation with Server Elements, just like the product enough to speak up…

Why on Earth use NASLite

Patrick Rea's picture

Why would I pay for something like this when I can do it for free? $25 and I can only run copy? I don't think so!

Doesn't NASLite use Linux

Anonymous's picture

Doesn't NASLite use Linux for OS portion?

RE: Why use Linux at all?

Anonymous's picture

Perhaps because

"By design, NASLite v1.x is a community file server and does not support features such as user management , the ability to join domains or disk quotas."


Anirban's picture

I am using LVM2, Samba, rdiff-backup and pyBackpack to set up my home backup system. It works quite well. Here are the details.

When building a server with

Juhani Tali's picture

When building a server with many disks, keep in mind that hard drives mostly use 12V, but most PSUs have limited power on 12V. They give plenty of 5V or 3.3V, but not 12V. You should make certain that the combined seek power consumption of all the HDDs is less than what the PSU can deliver at 12V. Do not be fooled by "480W"; it might not be enough for 8 HDDs!
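The check described here can be sketched as a simple current budget on the 12V rail. The rail rating and per-drive peak current below are hypothetical; take the real values from the PSU label and the drive datasheets. The function name rail_headroom is illustrative. Other components also draw from the 12V rail, so treat the result as optimistic.

```shell
#!/bin/sh
# rail_headroom RAIL_AMPS DRIVES AMPS_PER_DRIVE
# Prints the 12 V current (in amps) left over after DRIVES disks each
# draw AMPS_PER_DRIVE at peak. A negative result means the rail is overloaded.
rail_headroom() {
    awk -v rail="$1" -v n="$2" -v each="$3" 'BEGIN { printf "%.1f\n", rail - n * each }'
}

# Hypothetical figures: an 18 A 12 V rail and eight disks at 2.5 A peak each.
rail_headroom 18 8 2.5
```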

I highly recommend NOT to make a backup server on nonredundant IDE disks, especially if you use raid0. If you lose one disk you lose all the data! IDE hdd reliability starts to decline after ~12 months.

Also, consider the power consumption. 100+W of constant usage adds up.

failure rates for drives

tgh's picture

I've seen the "~12 months" show up twice now in this thread.

Most hard drives fail according to a "bathtub curve" pattern. Meaning, you will get a few that fail early in the lifecycle, then very few failures until the end of the lifecycle. The ones that failed early probably had manufacturing defects.

This is why you want to mix/match your drives in a RAID set from different manufacturing batches, so that a process glitch in the factory only affects one of the drives in the array. If all of your drives came from the same faulty factory line, they might all fail within the same timeframe. If this timeframe happens to be shorter than the recovery period for the RAID, then you will lose everything on the RAID.

Using hot-spare drives shortens the recovery period (the RAID array can immediately start the rebuild as soon as a drive fails). That gives you good odds of getting redundancy again before a 2nd drive fails.

IDE disk statistics?

TimJowers's picture

Does IDE have spare sectors like SCSI? Does it allow reporting of disk problem statistics and remaining spare sectors?

Also, do IDE disks still not support simultaneous data requests or has a work-around been made? At the price point it seems IDE is the choice for data backups.


EVMS and BackupPC

kbob's picture

BackupPC is a preferable choice to rsync if (a) you are backing up Windows boxes, or (b) you want non-expert users to be able to restore their own files. BackupPC's HTTP interface is very nice for non-experts.

EVMS is probably preferable to LVM + RAID. It gives the same capabilities and even more configuration flexibility. I say "probably" because I built my backup server before EVMS was available.

BackupPC not so great on Windows

Mick's picture

Backup solutions like this work great on any other OS but Windows.

A list of limitations:

These limitations are Windows limitations or similar and not really limitations of BackupPC.

I really recommend BackupPC; it is great. Just keep these things in mind when you consider using it with Windows.

no it isnt

Anonymous's picture

backup pc will do linux too. not just windoze

why not LVM or even raid

jeskritt's picture

With so many disks in the machine you should use LVM or at least raid0 to make them look like one big uniform drive. Newer versions of RHEL or Fedora give you the option to set these up if you manually partition with Disk Druid.

With LVM you can even add disks to the LVM after the system is operational and increase the file system size. If you are worried about failing disks destroying all your data set up raid with hot spares or use raid5. As the article said, disks are cheap.
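Adding a disk to an existing LVM setup typically comes down to four commands. The sketch below uses hypothetical names (/dev/sdd1 for the new partition, vg_backup for the volume group, lv_data for a logical volume holding an ext2/ext3 filesystem) and only prints the commands unless DRY_RUN=0 is set, since running them requires root and real devices. Depending on the kernel and filesystem, the volume may need to be unmounted before resizing.

```shell
#!/bin/sh
# Hypothetical names: adjust the device, volume group and logical volume
# to match your system. With DRY_RUN=1 (the default) nothing is executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# grow_volume NEW_PARTITION VOLUME_GROUP LOGICAL_VOLUME
grow_volume() {
    new_pv=$1; vg=$2; lv=$3
    run pvcreate "$new_pv"                      # initialize the new partition for LVM
    run vgextend "$vg" "$new_pv"                # add it to the volume group
    run lvextend -l +100%FREE "/dev/$vg/$lv"    # grow the logical volume into the free space
    run resize2fs "/dev/$vg/$lv"                # grow the ext2/ext3 filesystem to match
}

grow_volume /dev/sdd1 vg_backup lv_data
```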

Never use raid0. Ever.

Anonymous's picture

Never use raid0. Ever.

RAID 0 is very useful

Anonymous's picture

If you are doing something where you want the fastest read write performance, with data that you can recover from another source if need be.

Let's say you capture DV video from your video camera to your RAID 0 partition. This video is still available from its original source. You edit the video and then when you are done you compress the video and save it to a RAID 5 drive with a hot spare.

I am doing this with a set of five SCSI drives, all 18GB each. This gives me 90GB of the fastest available hard drive to play with. The read/write speed is no longer a bottleneck. :D
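The capacity figures quoted in this thread (five 18GB drives striped, six 200GB drives in RAID 5) follow from how many disks' worth of space each RAID level gives up to redundancy. A minimal sketch, assuming equal-sized disks and no hot spares; the function name raid_usable is illustrative:

```shell
#!/bin/sh
# raid_usable LEVEL DISKS SIZE_GB
# Prints the usable capacity in GB for equal-sized disks.
# RAID 0 stripes across all disks, RAID 5 gives up one disk's worth
# of space to parity, and RAID 6 gives up two.
raid_usable() {
    case "$1" in
        0) parity=0 ;;
        5) parity=1 ;;
        6) parity=2 ;;
        *) echo "unsupported RAID level: $1" >&2; return 1 ;;
    esac
    echo $(( ($2 - parity) * $3 ))
}

raid_usable 0 5 18     # five 18GB drives striped
raid_usable 5 6 200    # six 200GB drives in RAID 5
```

A hot spare sits idle until a failure, so subtract it from the disk count before calling the function.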

Not true.

Anonymous's picture

RAID-0 is often used in combination with other RAID configurations (mainly when hardware RAID is involved) to gain high performance while minimizing risk. I've configured RAID "enclosures" with multiple disk "trays" that are configured as RAID-5 w/hot-spare, with multiple trays spanning multiple HBAs, which have in turn been striped RAID-0. In order for a failure with data loss you would have to have 3 failures in one tray or more than 2 failures in multiple trays. Please don't say never when clearly the risks can be mitigated. RAID-5 does not replace proper backups either.

Better-er is rsync's --link-dest option

Karl O. Pinc's picture

You can dramatically improve the backup system using rsync's --link-dest option. Done right this will get you daily snapshots, like a full backup, but using only the disk space of an incremental backup.

(There are some issues with meta-information, file permissions and the like. mtree can be used to handle this.)

Rather than rolling your own, it could be better to use something like dirvish, which uses the --link-dest technique.
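For readers who do want to see the technique, a hand-rolled version might look like the sketch below. The function name snapshot, the directory layout and the "latest" symlink are assumptions made for illustration, not part of the article or of dirvish. The destination should be an absolute path, because rsync interprets a relative --link-dest directory relative to the destination directory.

```shell
#!/bin/sh
# snapshot SRC DEST_ROOT
# Copies SRC into DEST_ROOT/YYYY-MM-DD. Files unchanged since the previous
# snapshot are hard-linked to it instead of being stored again, so every
# dated directory looks like a full backup.
snapshot() {
    src=$1
    dest_root=$2                      # use an absolute path
    today=$(date +%Y-%m-%d)
    mkdir -p "$dest_root"
    if [ -d "$dest_root/latest" ]; then
        rsync -a --delete --link-dest="$dest_root/latest" "$src/" "$dest_root/$today/"
    else
        rsync -a --delete "$src/" "$dest_root/$today/"    # first run: plain full copy
    fi && ln -sfn "$today" "$dest_root/latest"            # point "latest" at the new snapshot
}

# Example with hypothetical paths:
#   snapshot /home /data1/snapshots
```

Old snapshots can be deleted with rm -rf on the dated directory; data shared with newer snapshots survives because the hard links keep it alive.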

A nice backup software

Anonymous's picture

A nice backup software backup to try is BackupPC.

Making this work with Win/Mac client installs

Anonymous's picture

Thank you for this. Always knew those storage boxes were obscenely priced. A question: Have you tried this with Vembu StoreGrid? It seems to support all OSes and uses rsync/zlib etc. with a friendly UI. How would this work with this Linux TB box if I'm lazy and don't wanna go through implementing rsync etc. manually? Especially since I'm planning a 'box' for backing up Win/Mac clients and another smaller FreeBSD server?

Go for it!

Duncan_Napier's picture

The setup presented is a completely generic one. I tried to show the most vanilla implementation of a terabyte-capacity storage system. The tools I presented are merely the most rudimentary utilities that come standard with virtually every Linux installation. If you have other tools that will run on Linux, then there is little to stop you implementing them as well. Have fun!

Go for it!

Anonymous's picture

Thank you for the article! As a Linux newbie, I learned a lot from this article. Although simplistic, I now have a backup server using linux.

Thanks again!