Cooking with Linux - Mirror, Mirror, of It All
The -r indicates a recursive copy, and the -p tells scp to preserve modification times, ownership and permissions from the original files and directories. If you are transferring large amounts of data, you might consider using the -C option, which does compression on the fly. It can make a substantial difference in throughput.
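To see those flags in action without a remote machine handy, here is a small runnable sketch using a local-to-local copy (the directory names are made up for the demonstration; against a real host, the source would be written user@remote.hostname:/path):

```shell
# Create a small source tree to copy (paths invented for the demo)
mkdir -p /tmp/scp-src/sub
echo "hello" > /tmp/scp-src/sub/file.txt

# -r recursive copy, -p preserve modification times and permissions,
# -C compress data in transit (useful for large transfers over a network)
scp -rpC /tmp/scp-src /tmp/scp-dst
```

The same command line works unchanged when one side is remote; only the path changes.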
Possibly the biggest problem with all these methods of mirroring data is that they can take a great deal of time. wget will download new files from an FTP server, but it has no option to keep a directory fully in sync by deleting files that have disappeared from the source. Secure copy is nice, but it has no mechanism for transferring only the files that have changed, so every run copies everything. Keeping data in sync without transferring every single file and directory requires a program with a bit more finesse.
The best program I know for this is probably Andrew Tridgell's rsync. Linux Journal's own Mick Bauer did a superb job of covering this package in the March and April 2003 issues of this fine magazine, so I won't go over it again other than to say you might want to look up his two-parter on the subject.
In many cases, that leaves us with our old friend FTP, well, sort of. On one side (the machine you want to mirror), you run your FTP server, whether that's ProFTPD or wu-ftpd. On the other side, you use Uwe Ohse's ftpcopy program. ftpcopy is a fast, easy-to-set-up and easy-to-use program that does a nice job of copying entire directory hierarchies, maintaining permissions and modification dates and times as it goes. Furthermore, it keeps track of files that already have been downloaded. This is handy because the next time you run ftpcopy, it transfers only those files that have changed, thus making your backup even faster.
Some distributions come with ftpcopy, but for the latest version of ftpcopy, go to www.ohse.de/uwe/ftpcopy/ftpcopy.html to pick up the download. Building the package is easy and takes only a few steps:
tar -xzvf ftpcopy-0.6.2.tar.gz
cd web/ftpcopy-0.6.2
make
In the directory called command, you'll find three binaries: ftpcopy, ftpcp and ftpls. You can run it from here or copy the three files to /usr/local/bin or somewhere else in your $PATH.
Here's how it works. Let's say I wanted to mirror or back up my home directory on a remote system. A basic ftpcopy command looks something like this:
ftpcopy -u marcel -p secr3t! \
    remote.hostname /home/marcel /mirdir/
The -u and -p options are obviously for my user name and (fake) password on the remote system. What follows is the path to the directory you want to copy and then the local directory where this directory structure will be re-created. As the download progresses, you will see something like this:
/mirdir/scripts/backup.log: download successful
/mirdir/scripts/checkhosts.pl: download successful
/mirdir/scripts/ftplogin.msg: download successful
/mirdir/scripts/gettime.pl: download successful
If you want a little more information on your download, add the --bps option. The results then report the rate of data transfer in bytes per second.
You should consider running ftpcopy with the --help option at least once, because a few of its options deserve attention. For instance, -s deals with symbolic links, and -l lets you increase the level of logging. If you want mirroring to run by means of a cron job, you might want to set logging to 0. Another option to know about is -n: by default, if a file has been deleted on the remote side, it also will be deleted locally the next time you run ftpcopy. If you truly are trying to keep systems in sync, that is exactly what you want. To override this behavior, add -n and no deletes will occur.
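Putting this into cron is straightforward. Here is a sketch of what such a crontab entry might look like (the schedule, user name, password and paths are placeholders; this assumes you copied the binary to /usr/local/bin as suggested earlier):

```
# Mirror /home/marcel from remote.hostname every night at 2:30am,
# with logging silenced (-l 0) for unattended operation
30 2 * * * /usr/local/bin/ftpcopy -l 0 -u marcel -p secr3t! remote.hostname /home/marcel /mirdir/
```

Keep in mind that a password in a crontab is visible to anyone who can read the file, so set its permissions accordingly.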
Well, mes amis, the hour has arrived, and we must all go to our respective homes. Still, it is early enough for a final glass of wine, non? François, mon ami, if you will do the honors—in fact, make it two glasses, one to mirror the other, non? Until next time, mes amis, let us all drink to one another's health. À votre santé! Bon appétit!