Automating Remote Backups
For users who prefer to use a desktop tool instead of scripts for setting up and performing backups, there is the Grsync tool. This is a GTK+-based tool that provides a nearly complete front end to rsync. It can be used to select a single source and destination and is likely available from Linux distribution repositories.

Figure 2. Grsync is a desktop tool for scheduling backups. Although generally useful, it lacks include/exclude options and direct cron management.
Although previous versions appear to have had an integrated cron configuration, the current version available with Fedora does not. Also, Grsync does not allow selection of multiple source files or directories nor does it allow setting exclusion lists. Both of these are supported by the rsync command line. Grsync can create a session file that can be called from cron, but it does not include information on how to notify the user of the results of the backup.
Due to the lack of cron integration, missing include and exclude options and no integration of user notification, Grsync is not an ideal backup solution. The scripts described here, along with the addition of ssmtp for simplified SMTP-based notification, are a better solution.
With SSH set up and the choice to script backups instead of using a desktop application out of the way, it is time to consider what files to back up. Four sets of files should be considered: system configuration files, database files, users' home directories and Web files.
System configuration files include files such as the password and group files, hosts, exports and resolver files, MySQL and PHP configurations, SSH server configuration and so forth. Backup of various system configuration files is important even if it's not desirable to reuse them directly during a system re-install. The password and group files, for example, shouldn't be copied verbatim to /etc/passwd and /etc/group but rather used as reference to re-create user logins matched to their home directories and existing groups. The entire /etc directory can be backed up, although in practice, only a few of these files need to be re-installed or merged after a distribution re-installation.
Some applications built from source, such as ssmtp, which will be used for notification in the backup scripts, may install to /usr/local or /opt. Those directories can be backed up too, or the applications can be rebuilt after a distribution upgrade.
MySQL database files can be backed up verbatim, but it may be easier to dump the databases to a text file and then reload them after an upgrade. This method should allow for the database to handle version changes cleanly.
User home directories typically contain all user data. Generally, all files under /home except the /home/lost+found directory should be backed up. This assumes that all user logins are kept on /home. Check your distribution documentation to verify the location of user home directories.
Home users may not use Web servers internally, but there is no reason they shouldn't be. Wikis, blogs, media archives and the like are easy to set up and offer a family a variety of easy-to-use communication systems within the home. Setting up document root directories (using Apache configuration files) under /home makes backing up these files identical to any other user files.
There are also files and directories to avoid when performing backups. The lost+found directory always should be excluded, as should $HOME/.gvfs, which is created for GNOME users after they log in.
All of the backups can be handled by a single script, but because backup needs change often, I find it easier to keep with UNIX tradition and created a set of four small scripts for managing different backup requirements.
The first script is used to run the other scripts and send e-mail notifications of the reports on the backup process. This script is run by root via cron each night:
#!/bin/bash
HOST=`hostname`
date=`date`
mailfile="/tmp/$$.bulog"
# Mail Header
echo "To: userid@yourdomain.org" > $mailfile
echo "From: userid@yourdomain.org" >> $mailfile
echo "Subject: $HOST: Report for $date" >> $mailfile
echo " " >> $mailfile
echo "$HOST backup report:" >> $mailfile
echo "-------------------------------" >> $mailfile
# Run the backup.
$1 >> $mailfile 2>&1
# Send the report.
cat $mailfile | \
/usr/local/ssmtp/sbin/ssmtp -t \
-auuserid@yourdomain.org -apyourpassword \
-amCRAM-MD5
rm $mailfile
The first argument to the script is the backup script to run. An enhanced version would verify the command-line option before attempting to run it.
This script uses an external program (ssmtp) for sending backup reports. If you have an alternative tool for sending e-mail from the command line, you can replace ssmtp usage with that tool. Alternatively, you can skip using this front end completely and run the backup scripts directly from cron and/or the command line.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- RSS Feeds
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Designing Electronics with Linux
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- A Topic for Discussion - Open Source Feature-Richness?
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Validate an E-Mail Address with PHP, the Right Way
- What's the tweeting protocol?
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




8 hours 26 min ago
12 hours 53 min ago
16 hours 29 min ago
17 hours 2 min ago
19 hours 25 min ago
19 hours 28 min ago
19 hours 30 min ago
23 hours 54 min ago
1 day 1 hour ago
1 day 6 hours ago