Practical backups for the small Solaris system

by Richard Auletta

Regular automated file-system backups are as important for the single desktop or small departmental server as they are for the enterprise. For the enterprise, products such as Sun Microsystems' Solstice Backup and Legato Networker, along with other software systems such as the University of Maryland's Amanda, drive large automated tape libraries and offer backup solutions that can handle large, heterogeneous networks. But for the single Solaris desktop or a small departmental server, such solutions are unnecessary. Instead, a few simple UNIX commands will allow you to set up a reliable automated backup procedure that will protect against data loss.

Why back up?

You use backups to guard against the permanent loss of data. Although such a loss usually results from a hardware failure, good backups also protect against inadvertent data loss caused by users deleting their own files or by a system administration error. A good backup procedure not only protects against catastrophic file loss, but also against an rm gone wrong.

With the advent of inexpensive disk storage, you might be tempted to back up to a disk drive or to set up a disk mirror instead of buying a tape drive. While these methods will protect against catastrophic failure, they can't retrieve a file that's deleted one day and needed the next. A tape backup will allow you to keep a running archive of the files created and deleted on your system. It's the ultimate "trash can."

What to back up

Most small systems approximate the default operating system and applications configuration closely enough that they can usually be recovered by simply reinstalling the operating system and applications. After a catastrophic hardware failure, you may even use the occasion to upgrade both the operating system and applications to their latest revisions. However, you'll still want to preserve the hard-won information that's contained and remembered only in your configuration files.

The most volatile data is that which your users create on a daily basis. This difference in the file-system dynamics argues for backups to be split into two distinct phases: a level 0 backup that captures the current state of the system, and incremental level 1 backups that will record daily changes from the date of the level 0 backup. For larger systems, you can limit the level 1 backup to only user data. However, in typical systems, the level 1 backups of other than user data will use little space on a tape, since these backups contain few day-to-day changes.

The backup scheme

The backup scheme for a single desktop or small server will typically use just two tapes. To make that work, plan your file-system partitions so that each one fits onto a single tape. If your tape drive can hold 5GB, then don't make a partition larger than 5GB. Otherwise, no matter what backup scheme you adopt, your overnight backup will require a tape change.
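The sizing rule above can be checked mechanically. What follows is a minimal sketch, not part of the original procedure: the function name oversized_filesystems is hypothetical, the 5GB capacity is only the example figure from the text, and the sketch assumes that df -k prints each file system on one line with the total size in kilobytes in the second column and the mount point in the last column.

```sh
#!/bin/sh
# Hypothetical helper (not from the article): list file systems whose
# total size exceeds the capacity of one tape.
# Reads `df -k`-style output on standard input.
# $1 = tape capacity in kilobytes (5GB = 5242880KB, an example value).
oversized_filesystems() {
    # Skip the header line; print the mount point and size of each
    # file system that is larger than the tape.
    awk 'NR > 1 && NF >= 6 && $2 + 0 > cap + 0 { print $NF, $2 }' cap="$1" -
}

# Usage: df -k | oversized_filesystems 5242880
```

Any file system this reports would need either a larger tape or a tape change during the dump.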

The backup scheme will perform a level 0 dump of the system onto one tape, then use a second tape for nightly incremental level 1 dumps. You can usually get a month's worth or more of nightly level 1 dumps onto a single tape. Each new dump gives you a running history of file-system changes. In addition to archiving files, the procedure of writing a series of level 1 backups spreads the wear over the entire tape, writing the latest dump to a new section of tape.

Systems will frequently have several very large static file systems, which might preclude doing a level 0 dump of the entire system onto one tape. Instead, you can individually process these large static file systems as level 0 backups. If your level 1 nightly backups exceed the capacity of a single tape, you'll need to consider a more advanced backup system and tape library.

Backup implementation

Creating automated backups requires some familiarity with the ufsdump, mt, and cron commands. ufsdump is the UNIX file-system dump command on Solaris. We prefer it over other commands, such as tar, since it will capture the entire state of the file system. Some versions of tar have path-length limitations and won't archive device files. Other versions of tar, such as star and gnutar, overcome some of these problems and can support incremental archives. But either way, ufsdump is a good choice because the corresponding restore utility, ufsrestore, allows interactive selection and recovery of files. ufsrestore is part of the base operating system and is immediately available for you to use when you're recovering from a catastrophe. ufsrestore is also statically linked and will allow file recovery even when system libraries have been lost.

The mt command is the magnetic tape program that allows you to control a tape with such actions as rewinding, erasing, ejecting, and positioning the tape. It uses UNIX device files for accessing the tape drive. Different device files open the tape device with rewind on close or no rewind on close. To access the tape drive with no rewind at the end of an operation, append an n to the device file name, as in /dev/rmt/0cn. To have the tape rewind at the end of an operation, omit the n, as in /dev/rmt/0c. In both of these examples, the c operates the tape drive in compressing mode.
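The device-naming convention just described can be written down as a small function. This is only an illustrative sketch: the function name tape_device and its arguments are hypothetical, and it covers only the two compressing-mode names used in this article.

```sh
#!/bin/sh
# Hypothetical helper (not from the article): build a tape device name.
# $1 = drive number, $2 = "norewind" or "rewind".
# The c suffix selects compressing mode, as in the article's examples.
tape_device() {
    case $2 in
        norewind) echo "/dev/rmt/${1}cn" ;;   # tape stays where the operation ended
        *)        echo "/dev/rmt/${1}c" ;;    # tape rewinds when the device is closed
    esac
}

tape_device 0 norewind   # prints /dev/rmt/0cn
tape_device 0 rewind     # prints /dev/rmt/0c
```

The resulting name is what you would pass to mt with the -f option or to ufsdump with the f option.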

Listing A: Level 0 dump using the ufsdump script

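#!/bin/sh
#Level 0 dump of every file system onto one tape.
TAPE=/dev/rmt/0cn
LOG=/tmp/dump.log

#Start at the beginning of the tape.
mt -f $TAPE rewind
date > $LOG 2>&1
#The no-rewind device leaves the tape positioned after each dump.
ufsdump 0uf $TAPE /      >> $LOG 2>&1
ufsdump 0uf $TAPE /usr   >> $LOG 2>&1
ufsdump 0uf $TAPE /var   >> $LOG 2>&1
ufsdump 0uf $TAPE /accts >> $LOG 2>&1
ufsdump 0uf $TAPE /opt   >> $LOG 2>&1
#Rewind and eject the tape, then mail the log to root.
mt -f $TAPE rewind
mt -f $TAPE offline
mailx -s DUMPLOG root < $LOG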

Ufsdump scripts

All that you need to automate a level 0 dump is a basic shell script, here a simple Bourne sh script. The script shown in Listing A rewinds the tape, writes the date to a temporary file, then performs the dump of the file systems /, /usr, /var, /accts, and /opt. By using a non-rewind tape device (/dev/rmt/0cn), you can record all of the level 0 dumps, one after another, on a single tape. The output of each dump command is recorded in a log file. After the dump is finished, the tape is rewound and ejected from the tape drive so that it's not inadvertently overwritten, and the log file is mailed to root. You can list your partitions and file systems with the command df -k.

The level 1 script of Listing B is almost as simple. The major difference is that the first mt command positions the tape to the end of the last volume written on the tape. This way, each time the script runs, the new sequence of level 1 dumps is appended to the tape. Notice the tape isn't ejected from the drive, so that it's ready for the next day's backup.

The mail message will indicate when the tape is full. When the tape is full, the process can begin again, with a level 0 dump on one tape, then the nightly level 1 dumps on another tape. Make sure to label the tapes clearly and to clean your tape drive regularly. If you wish to restart a level 1 dump sequence, you'll need to first erase the tape using mt erase. Better yet, save the current level 0 and level 1 dump sequences and start with a new pair of tapes.
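Since the two scripts differ only in the dump level and the initial tape positioning, the repeated ufsdump lines could be collapsed into a loop. The following is a sketch under stated assumptions rather than the article's own script: the names run, dump_all, FILESYSTEMS, and DRY_RUN are hypothetical, and DRY_RUN defaults to yes so that the commands are only printed until you deliberately switch it off.

```sh
#!/bin/sh
# Hypothetical refactoring (not from the article): one loop for any dump level.
TAPE=${TAPE:-/dev/rmt/0cn}
FILESYSTEMS="/ /usr /var /accts /opt"
DRY_RUN=${DRY_RUN:-yes}     # set DRY_RUN=no to really run the commands

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = yes ]; then
        echo "$@"
    else
        "$@"
    fi
}

# $1 = dump level (0 or 1). Dumps each file system in order.
dump_all() {
    level=$1
    for fs in $FILESYSTEMS; do
        run ufsdump "${level}uf" "$TAPE" "$fs"
    done
}

dump_all 1   # in dry-run mode, prints the five level 1 ufsdump commands
```

Keeping the file-system list in one variable means a new partition has to be added in only one place.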

Listing B: Level 1 dump using the ufsdump script

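#!/bin/sh
#Nightly level 1 dumps appended to the second tape.
TAPE=/dev/rmt/0cn
LOG=/tmp/dump.log

#Position the tape after the last dump already written.
mt -f $TAPE eom
date > $LOG 2>&1
ufsdump 1uf $TAPE /      >> $LOG 2>&1
ufsdump 1uf $TAPE /usr   >> $LOG 2>&1
ufsdump 1uf $TAPE /var   >> $LOG 2>&1
ufsdump 1uf $TAPE /accts >> $LOG 2>&1
ufsdump 1uf $TAPE /opt   >> $LOG 2>&1
#Rewind but leave the tape loaded for the next night.
mt -f $TAPE rewind
mailx -s DUMPLOG root < $LOG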
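Both listings mail the log to root, and a quick scan of that log can flag an incomplete run before anyone reads the mail. The sketch below is an assumption-laden illustration: the function name dumps_completed is hypothetical, and it assumes that each successful ufsdump run writes a line containing the text DUMP IS DONE to the log. Check the wording against your own logs before relying on it.

```sh
#!/bin/sh
# Hypothetical check (not from the article): count the completed dumps
# recorded in the log file.
# Assumption: every successful ufsdump logs a line containing "DUMP IS DONE".
# $1 = log file, $2 = number of file systems that should have been dumped.
dumps_completed() {
    done_count=`grep -c "DUMP IS DONE" "$1" 2>/dev/null`
    done_count=${done_count:-0}
    if [ "$done_count" -eq "$2" ]; then
        echo "OK: $done_count of $2 dumps completed"
    else
        echo "WARNING: $done_count of $2 dumps completed"
        return 1
    fi
}

# Usage: dumps_completed /tmp/dump.log 5
```

The result could be used as the subject line of the mail message, so that a short count stands out in root's mailbox.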

Automating the scripts

You can run the two scripts manually. However, if you use the clock daemon cron to automate the times they'll run, your scripts can run at night without operator intervention. Simply edit your root crontab file and schedule commands to run at the desired time. Listing C, the crontab modification, shows the line to add to the end of the root crontab file. The command /etc/Dumpdir/DUMP runs the shell script at 5:30 a.m. every day. You can change the root crontab file with the command crontab -e root. If you edit the root crontab file directly, you'll need to restart cron.

You can place the DUMP0 and DUMP1 scripts in a directory, such as /etc/Dumpdir, with DUMP as a symbolic link to either the DUMP0 or the DUMP1 script. Changing the link will cause cron to run the appropriate script: DUMP0 the first night, then DUMP1 until the tape is full. Make sure the scripts are executable (chmod +x DUMP1 DUMP0). If your system is small enough and your tape capacity large enough, you can place the level 0 dump and the corresponding level 1 dumps on a single tape.

Finally, running Solaris ufsdump on an active file system isn't a cause for concern. While the dump might miss changes made during the dump, the dump will still be valid and can be recovered with ufsrestore.
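Switching the DUMP link by hand is a two-command job, and it can be wrapped in a function so that the old link is always removed first. This is a minimal sketch: the function name select_dump is hypothetical, and the directory is passed as an argument rather than fixed to /etc/Dumpdir.

```sh
#!/bin/sh
# Hypothetical helper (not from the article): point the DUMP link at the
# script cron should run next.
# $1 = directory holding the scripts (the article suggests /etc/Dumpdir),
# $2 = script the link should name (DUMP0 or DUMP1).
select_dump() {
    # The subshell keeps the cd from affecting the caller.
    ( cd "$1" && rm -f DUMP && ln -s "$2" DUMP )
}

# Usage: select_dump /etc/Dumpdir DUMP0   (before the level 0 night)
#        select_dump /etc/Dumpdir DUMP1   (for the nights that follow)
```

Because the crontab entry always names DUMP, the crontab file itself never has to change when you move between dump levels.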

Listing C: Modifications to /var/spool/cron/crontabs/root

#Add to /var/spool/cron/crontabs/root
30 5 * * * /etc/Dumpdir/DUMP

Recovering a file

Doing a full recovery from a catastrophic disk failure with ufsrestore is beyond the scope of this article. However, recovering the data from the level 0 or level 1 tape may not be as obvious as it might seem, since there are multiple volumes on the single tape. Typically, you'll want to work with the last set of dumps on the level 1 tape, which will require you to position the tape to the needed volume.

To recover a file, first position the tape to the end by using the command mt -f /dev/rmt/0cn eom. Notice the use of the non-rewind device. Then use mt -f /dev/rmt/0cn nbsf cnt to backspace cnt volumes to the needed dump volume. Assuming you've been recording /, /usr, /var, /accts, and /opt as nightly level 1 dumps, you can recover yesterday's user files that were inadvertently deleted this morning by following the manual file-recovery steps listed below.

Another mode of operation is to ask ufsrestore to skip over volumes on the tape with the s option. Of course, this method requires that you know how many volumes are on the tape and which one you need to access, but it works well for the level 0 tape that has a known order and number of volumes.
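For the level 0 tape, the volume number follows directly from the order of the ufsdump lines in Listing A. The sketch below only prints the ufsrestore command for a given file system. The names restore_command and ORDER are hypothetical, and the sketch assumes the tape has been rewound and that the s option takes the number of the dump file to skip to, counting the first file as 1.

```sh
#!/bin/sh
# Hypothetical helper (not from the article): print the ufsrestore command
# that skips to one file system's dump on the level 0 tape.
# ORDER must match the order of the ufsdump lines in Listing A.
TAPE=/dev/rmt/0cn
ORDER="/ /usr /var /accts /opt"

# $1 = file system to restore from.
restore_command() {
    n=0
    for fs in $ORDER; do
        n=`expr $n + 1`
        if [ "$fs" = "$1" ]; then
            # Rewind the tape first; s counts dump files from the start.
            echo "ufsrestore ifs $TAPE $n"
            return 0
        fi
    done
    echo "unknown file system: $1" >&2
    return 1
}

restore_command /accts   # prints: ufsrestore ifs /dev/rmt/0cn 4
```

Printing the command rather than running it lets you confirm the volume number before the tape moves.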

Listing C: Manual file-recovery steps

#Create a temporary directory.
mkdir /tmp/lostfiles

#Change to that directory.
cd /tmp/lostfiles

#Position the tape to the end.
mt -f /dev/rmt/0cn eom

#Backspace two volumes to the /accts volume.
mt -f /dev/rmt/0cn nbsf 2

#Invoke ufsrestore interactively.
ufsrestore -if /dev/rmt/0cn

#Use the commands ls, cd, add,
#and extract at the interactive
#ufsrestore prompt to locate
#and add files or directories to
#be extracted. The recovered
#directories and files will
#reside in /tmp/lostfiles.
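The count of 2 in the steps above follows from the dump order: /accts is the fourth of five file systems, so its most recent dump is the second volume back from the end of the tape. A small sketch of that arithmetic follows. The names nbsf_count and ORDER are hypothetical, and the sketch assumes the tape is positioned at the end of recorded media and that the most recent volumes are one complete night of dumps in the order used by Listing B.

```sh
#!/bin/sh
# Hypothetical helper (not from the article): work out how many volumes to
# backspace from the end of the tape to reach last night's dump of a file system.
# ORDER must match the order of the ufsdump lines in Listing B.
ORDER="/ /usr /var /accts /opt"

# $1 = file system whose latest dump is wanted.
nbsf_count() {
    total=0
    index=0
    for fs in $ORDER; do
        total=`expr $total + 1`
        [ "$fs" = "$1" ] && index=$total
    done
    # Fail if the file system is not in the list.
    [ "$index" -gt 0 ] || return 1
    expr $total - $index + 1
}

nbsf_count /accts   # prints 2, the count used in the steps above
```

To reach the dumps from an earlier night, you would add the number of file systems (five here) for each night you go back.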

Details and bother

Backups are simply a contingency plan that you must execute properly before the catastrophic event occurs. They depend more on preparation and execution than on reaction to the actual event. In this regard, several simple steps can improve the likelihood that your backup procedure will work when it's finally needed.

First, keep a sequence of tapes to guard against a bad tape. Since you need just one or two tapes on hand, a series of old level 0 and level 1 tapes can be kept off-site. You may also want to make a duplicate level 0 tape each time a level 0 dump is performed. Replace your tapes regularly and watch /var/adm/messages for any indications of system troubles that are related to either the tape system or the disk drives. Check your backups once in a while by recovering a file.

Getting fancy

At this point, you probably have many ideas on how to improve the procedures and scripts outlined in this article. You can modify the shell scripts to suit the requirements of your particular system. For example, the backup scripts could empty users' CDE .dt/Trash directories after each level 1 backup, or format and archive the log file.
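As one example of such a modification, the log could be copied to a dated file before the next run overwrites it. This is a minimal sketch: the function name archive_log, the archive directory, and the file-naming pattern are all hypothetical choices, not part of the original scripts.

```sh
#!/bin/sh
# Hypothetical helper (not from the article): keep a dated copy of the dump log.
# $1 = log file, $2 = archive directory (both names are illustrative).
archive_log() {
    stamp=`date +%Y%m%d`
    # Create the archive directory if needed, then copy the log into it.
    mkdir -p "$2" && cp "$1" "$2/dump.$stamp.log"
}

# Usage: archive_log /tmp/dump.log /var/adm/dumplogs
```

Called just before the mailx line, this would leave one log per day, so a run of failed dumps can be traced back to the night it started.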

You can also incorporate another small improvement by creating a single script, shown in Listing D, that creates and tests for the file doing.dump1 to determine if a level 0 or level 1 dump is executed. Just remember that the time you most need a successful backup is probably when a fancy backup script will fail.

Listing D: Improved ufsdump DUMP script

#!/bin/sh
TAPE=/dev/rmt/0cn
LOG=/tmp/dump.log

case $1 in

#Called with the argument restart,
#this script runs ufsdump level 0
#the next time it is called by cron.
#Note: doing.dump1 is a relative path, so
#run restart from the directory cron uses.

'restart')
rm -f doing.dump1
;;
*)
#Level 1 dump if doing.dump1 exists;
#otherwise level 0 dump.
if [ -f doing.dump1 ]; then
 mt -f $TAPE eom
 date > $LOG 2>&1
 ufsdump 1uf $TAPE /      >> $LOG 2>&1
 ufsdump 1uf $TAPE /usr   >> $LOG 2>&1
 ufsdump 1uf $TAPE /var   >> $LOG 2>&1
 ufsdump 1uf $TAPE /accts >> $LOG 2>&1
 ufsdump 1uf $TAPE /opt   >> $LOG 2>&1
 mt -f $TAPE rewind
 mailx -s DUMPLOG root < $LOG
else
 mt -f $TAPE rewind
 date > $LOG 2>&1
 ufsdump 0uf $TAPE /      >> $LOG 2>&1
 ufsdump 0uf $TAPE /usr   >> $LOG 2>&1
 ufsdump 0uf $TAPE /var   >> $LOG 2>&1
 ufsdump 0uf $TAPE /accts >> $LOG 2>&1
 ufsdump 0uf $TAPE /opt   >> $LOG 2>&1
 mt -f $TAPE rewind
 mt -f $TAPE offline
 mailx -s DUMPLOG root < $LOG
 #Remember that the level 0 dump is done.
 touch doing.dump1
fi
 ;;
esac

Conclusion

While Windows NT comes with a GUI backup tool, you can use an editor to create a shell script of basic commands. With an appropriate crontab entry, even a single Solaris desktop can run a custom backup procedure every night with minimum operator intervention. Once you're confident that the procedures we've discussed actually work, you can rest easier. Then, should disaster ever strike, you'll be prepared to start the recovery process.

Web resources

Sun Solstice Backup: http://www.sun.com
Legato Networker: http://www.legato.com/
Amanda: http://www.cs.umd.edu/projects/amanda/
Star: ftp://ftp.fokus.gmd.de/pub/UNIX/star/
Stokely Consulting UNIX SysAdmin Resources: http://www.stokely.com/UNIX.sysadm.resources/autosysmgm.backup.html
Sun Answerbook Online: http://docs.sun.com

Richard Auletta has been hacking around with UNIX since its days on a Digital Equipment PDP-11/70. His current reason for associating with UNIX is to support the advanced electronic design automation tools used in the courses he teaches at the University of Colorado at Denver.
