Practical backups for the small Solaris system
by Richard Auletta
Regular automated file system backups are as important for the single desktop
or small departmental server as they are for the enterprise. For the enterprise,
the large automated tape libraries of Sun Microsystems' Solstice Backup,
Legato Networker, and other software systems, such as the University of
Maryland's Amanda, offer backup solutions that can handle large, heterogeneous
networks. But for the single Solaris desktop or a small departmental server,
such solutions are unnecessary. Instead, a few simple UNIX commands will
allow you to set up a reliable automated backup procedure that will protect
against data loss.
Why back up?
You use backups to guard against the permanent loss of data. Although such
a loss usually results from a hardware failure, good backups also protect
against inadvertent data loss caused by users deleting their own files
or by a system administration error. A good backup procedure not only protects
against catastrophic file loss, but also against an rm gone wrong.
With the advent of inexpensive disk storage, you might be tempted
to back up to a disk drive or to set up a disk mirror instead of buying
a tape drive. While these methods will protect against catastrophic failure,
they can't retrieve a file that's deleted one day and needed the next.
A tape backup will allow you to keep a running archive of the files created
and deleted on your system. It's the ultimate "trash can."
What to back up
Most small systems approximate the default operating system and applications
configuration closely enough that they can usually be recovered by simply
reinstalling the operating system and applications. After a catastrophic
hardware failure, you may even use the occasion to upgrade both the
operating system and applications to their latest revisions. However, you'll
still want to preserve the hard-won information that's contained and remembered
only in your configuration files.
The most volatile data is that which your users create on a daily
basis. This difference in the file-system dynamics argues for backups to
be split into two distinct phases: a level 0 backup that captures the current
state of the system, and incremental level 1 backups that will record daily
changes from the date of the level 0 backup. For larger systems, you can
limit the level 1 backup to only user data. However, in typical systems,
the level 1 backups of everything other than user data will use little space on a
tape, since these backups contain few day-to-day changes.
The backup scheme
The backup scheme for a single desktop or small server will typically use
just two tapes. When planning your disk layout, size each file-system
partition so that it fits onto a single tape. If
your tape drive can hold 5GB, then don't make a partition larger than 5GB.
Otherwise, no matter what backup scheme you adopt, your overnight backup
will require a tape change.
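One way to check this rule against an existing system is to compare the sizes reported by df -k with your tape capacity. The following is a minimal sketch rather than part of the procedure described here; the function name oversized_filesystems is hypothetical, and it assumes that df -k prints each file system on a single line with the size in the second column and the mount point in the sixth.

```shell
#!/bin/sh
# Hypothetical helper: list file systems that are larger than one tape.
# Reads `df -k` style output on stdin; $1 is the tape capacity in KB.
# Assumes each file system occupies one line (no wrapped device names).
oversized_filesystems() {
    capacity_kb=$1
    # Discard the header line.
    read header
    while read device size_kb used_kb avail_kb percent mount_point; do
        # Skip lines whose size column is missing or not a number.
        case $size_kb in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$size_kb" -gt "$capacity_kb" ]; then
            echo "$mount_point"
        fi
    done
}
```

For a 5GB tape, you might run df -k | oversized_filesystems 5242880, where 5242880 is 5GB expressed in kilobytes. Any mount point it prints is a partition that can't fit on one tape without relying on compression.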
The backup scheme will perform a level 0 dump of the system onto
one tape, then use a second tape for nightly incremental level 1 dumps.
You can usually get a month's worth or more of nightly level 1 dumps onto
a single tape. Each new dump gives you a running history of file-system
changes. In addition to archiving files, the procedure of writing a series
of level 1 backups spreads the wear over the entire tape, writing the latest
dump to a new section of tape.
Systems will frequently have several very large static file systems,
which might preclude doing a level 0 dump of the entire system onto one
tape. Instead, you can individually process these large static file systems
as level 0 backups. If your nightly level 1 backups exceed the capacity
of a single tape, you'll need to consider a more advanced backup system
and tape library.
Backup implementation
Creating automated backups requires some familiarity with the ufsdump,
mt, and cron commands. ufsdump is the UNIX file-system
dump command on Solaris. We prefer it over other commands, such as tar,
since it will capture the entire state of the file system. Some versions
of tar have path-length limitations and won't archive device files.
Other versions of tar, such as star and gnutar, overcome
some of these problems and can support incremental archives. But either
way, ufsdump is a good choice because the corresponding restore
utility, ufsrestore, allows interactive selection and recovery of
files. ufsrestore is part of the base operating system and is immediately
available for you to use when you're recovering from a catastrophe. ufsrestore
is also statically linked and will allow file recovery even when system
libraries have been lost.
The mt command is the magnetic tape program
that allows you to control a tape with such actions as rewinding, erasing,
ejecting, and positioning the tape. It uses UNIX device files for accessing
the tape drive. Different device files open the tape device with either
rewind on close or no rewind on close. To access the tape drive with no
rewind at the end of an operation, append an n to the device name, as in /dev/rmt/0cn.
To have the tape rewind at the end of an operation, omit the n, as in
/dev/rmt/0c. In both of these examples, the c selects the drive's compressing
mode.
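The naming rule can be expressed as a small shell function. This is only an illustrative sketch; the function name norewind_device is hypothetical and isn't used by the listings in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the no-rewind form of a tape device name.
norewind_device() {
    case $1 in
        *n) echo "$1" ;;      # already a no-rewind name, leave it alone
        *)  echo "${1}n" ;;   # append n, e.g. /dev/rmt/0c -> /dev/rmt/0cn
    esac
}
```

Calling norewind_device /dev/rmt/0c prints /dev/rmt/0cn, the device name used in the scripts.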
Listing A: Level 0 dump using the ufsdump script
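#!/bin/sh
# Level 0 dump of every file system onto one tape.
# The no-rewind device keeps the tape positioned between dumps.
TAPE=/dev/rmt/0cn
LOG=/tmp/dump.log
# Start from the beginning of the tape.
mt -f $TAPE rewind
date > $LOG 2>&1
# 0 = dump level, u = record the dump in /etc/dumpdates, f = dump to $TAPE.
ufsdump 0uf $TAPE / >> $LOG 2>&1
ufsdump 0uf $TAPE /usr >> $LOG 2>&1
ufsdump 0uf $TAPE /var >> $LOG 2>&1
ufsdump 0uf $TAPE /accts >> $LOG 2>&1
ufsdump 0uf $TAPE /opt >> $LOG 2>&1
# Rewind and eject so the tape isn't inadvertently overwritten.
mt -f $TAPE rewind
mt -f $TAPE offline
mailx -s DUMPLOG root < $LOG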
Ufsdump scripts
All that you need to automate a level 0 dump is a basic shell script, in this
case a simple Bourne shell (sh) script. The script shown in Listing A rewinds
the tape, writes the date to a log file, then performs the dump of
the file systems /, /usr, /var, /accts, and /opt. By using a non-rewind
tape device (/dev/rmt/0cn), you can record all of the level 0 dumps on a single
tape. The output of each dump command is recorded in the log file. After
the dump is finished, the tape is rewound and ejected from the tape drive
so that it's not inadvertently overwritten, and the log file is mailed
to root. You can list your partitions and file systems with the command
df -k.
The level 1 script of Listing B is almost as simple. The major
difference is that the first mt command positions the tape to the
end of the last volume written on the tape. This way, each time the script
runs, a new sequence of level 1 dumps is appended onto the tape. Notice that
the tape isn't ejected from the drive, so that it's ready for the next
day's backup. The mail message will indicate when the tape is full. When
the tape is full, the process can begin again, with a level 0 dump on one
tape, then the nightly level 1 dumps on another tape. Make sure to label
the tapes clearly and to clean your tape drive regularly. If you wish to
restart a level 1 dump sequence, you'll need to first erase the tape using
mt erase. Better yet, save the current level 0 and level 1 dump
sequences and start with a new pair of tapes.
Listing B: Level 1 DUMP1 using a ufsdump script
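#!/bin/sh
# Nightly level 1 dump, appended after the dumps already on the tape.
TAPE=/dev/rmt/0cn
LOG=/tmp/dump.log
# Position the tape at the end of the recorded data.
mt -f $TAPE eom
date > $LOG 2>&1
ufsdump 1uf $TAPE / >> $LOG 2>&1
ufsdump 1uf $TAPE /usr >> $LOG 2>&1
ufsdump 1uf $TAPE /var >> $LOG 2>&1
ufsdump 1uf $TAPE /accts >> $LOG 2>&1
ufsdump 1uf $TAPE /opt >> $LOG 2>&1
# Rewind, but leave the tape loaded for the next night's run.
mt -f $TAPE rewind
mailx -s DUMPLOG root < $LOG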
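Both listings keep going after a dump fails and rely on you reading the mailed log. If you'd rather have the script stop at the first failure, for instance when the tape fills, a loop over the file systems can test each exit status. The following is a sketch under assumptions: the function name dump_all and the DUMP_CMD variable are hypothetical, and it assumes that ufsdump returns a nonzero exit status when a dump doesn't complete normally.

```shell
#!/bin/sh
# DUMP_CMD is a hypothetical override so the function can be tried
# without a tape drive; by default it is ufsdump.
DUMP_CMD=${DUMP_CMD:-ufsdump}

# dump_all LEVEL TAPE LOG FILESYSTEM...
# Dumps each file system in turn and stops at the first failure.
dump_all() {
    level=$1
    tape=$2
    log=$3
    shift 3
    for fs in "$@"; do
        if $DUMP_CMD "${level}uf" "$tape" "$fs" >> "$log" 2>&1; then
            echo "OK: level $level dump of $fs" >> "$log"
        else
            echo "FAILED: level $level dump of $fs" >> "$log"
            return 1
        fi
    done
    return 0
}
```

In a level 1 script, the five ufsdump lines could then become dump_all 1 $TAPE $LOG / /usr /var /accts /opt, with the OK and FAILED lines making the mailed log quicker to scan.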
Automating the scripts
You can run the two scripts manually. However, if you use the clock daemon
cron to automate the times they'll run, your scripts can run at night without
operator intervention. Simply edit your root crontab file and schedule
commands to run at the desired time. Listing C shows the line to add
to the end of the root crontab file. It runs the shell script
/etc/Dumpdir/DUMP at 5:30 a.m. every day. You can change the root crontab
file with the command crontab -e root. If you edit the root crontab
file directly, you'll need to restart cron.
You can place the DUMP0
and DUMP1 scripts in a directory, such as /etc/Dumpdir, with DUMP as a
symbolic link to either the DUMP0 or the DUMP1 script. Changing the link
causes cron to run the appropriate script: DUMP0 the first night, and then
DUMP1 until the tape is full. Make sure the scripts are executable (chmod
+x DUMP1 DUMP0). If your system is small enough and your tape capacity
large enough, you can place the level 0 dump and the corresponding level
1 dumps on a single tape. Finally, running Solaris ufsdump on an
active file system isn't a cause for concern. While the dump might miss
changes made during the dump, the dump will still be valid and can be recovered
with ufsrestore.
Listing C: Modifications to /var/spool/cron/crontabs/root
#Add to /var/spool/cron/crontabs/root
30 5 * * * /etc/Dumpdir/DUMP
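Switching the DUMP link between the two scripts takes only rm and ln -s. The sketch below wraps those two commands in a function; the name select_dump is hypothetical, while the DUMP, DUMP0, and DUMP1 names are the ones used in this article.

```shell
#!/bin/sh
# select_dump DIR LEVEL
# Points DIR/DUMP at DIR/DUMP0 or DIR/DUMP1.
select_dump() {
    dir=$1
    level=$2
    case $level in
        0|1) ;;
        *) echo "usage: select_dump DIR 0|1" >&2
           return 2 ;;
    esac
    # Remove the old link, then create the new one. The link target is
    # relative, so it resolves inside DIR.
    rm -f "$dir/DUMP"
    ln -s "DUMP$level" "$dir/DUMP"
}
```

For example, select_dump /etc/Dumpdir 0 prepares the next cron run for a level 0 dump, and select_dump /etc/Dumpdir 1 switches to the nightly level 1 dumps afterward.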
Recovering a file
Doing a full recovery from a catastrophic disk failure with ufsrestore
is beyond the scope of this article. However, recovering the data from
the level 0 or level 1 tape may not be as obvious as it might seem, since
there are multiple volumes on the single tape. Typically, you'll want to
work with the last set of dumps on the level 1 tape, which will require
you to position the tape to the needed volume.
To recover a file, you must first position the tape to the end
by using the command mt -f /dev/rmt/0cn eom. Notice the use of the
non-rewind device. Then use mt -f /dev/rmt/0cn nbsf cnt to backspace
cnt volumes to the needed dump volume. Assuming you've been recording /,
/usr, /var, /accts, and /opt as nightly level 1 dumps, you can recover yesterday's
user files that were inadvertently deleted this morning by following the
manual file-recovery steps in Listing C.
Another mode of operation is to ask ufsrestore to skip over volumes
on the tape with the s option. Of course, this method requires that you
know how many volumes are on the tape and which one you need to access,
but it works well for the level 0 tape that has a known order and number
of volumes.
Listing C: Manual file-recovery steps
#Create a temporary directory.
mkdir /tmp/lostfiles
#Change to that directory.
cd /tmp/lostfiles
#Position the tape to the end.
mt -f /dev/rmt/0cn eom
#Back up two volumes to the /accts volume.
mt -f /dev/rmt/0cn nbsf 2
#Invoke ufsrestore interactively.
ufsrestore -if /dev/rmt/0cn
#Use the commands ls, cd, add,
#and extract at the interactive
#ufsrestore prompt to locate
#and add files or directories to
#be extracted. The recovered
#directories and files will
#reside in /tmp/lostfiles.
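The count passed to nbsf depends on where the file system sits in the nightly dump order and on how many nights back you want to go. The following sketch generalizes the nbsf 2 used above for /accts. The function name nbsf_count is hypothetical, and the arithmetic assumes that every nightly run wrote the same file systems in the same order and that the tape has first been positioned with eom. Check the result against mt status before trusting it.

```shell
#!/bin/sh
# nbsf_count TARGET NIGHTS_BACK FILESYSTEM...
# Prints the count to give "mt nbsf" after "mt eom".
# NIGHTS_BACK is 0 for the most recent run, 1 for the run before it.
nbsf_count() {
    target=$1
    nights_back=$2
    shift 2
    total=$#
    index=0
    found=
    # Find the 0-based position of the target in the dump order.
    for fs in "$@"; do
        if [ "$fs" = "$target" ]; then
            found=$index
        fi
        index=`expr $index + 1`
    done
    if [ -z "$found" ]; then
        echo "nbsf_count: $target is not in the list" >&2
        return 1
    fi
    # Volumes written by the runs we must cross, minus the target's position.
    sets=`expr $nights_back + 1`
    files=`expr $total \* $sets`
    expr $files - $found
}
```

With the dump order used in this article, nbsf_count /accts 0 / /usr /var /accts /opt prints 2, which matches the step above, and a NIGHTS_BACK of 1 prints 7 for the /accts dump from the previous run.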
Details and bother
Backups are simply a contingency plan that you must execute properly before
the catastrophic event occurs. They depend more on preparation and execution
than on reaction to the actual event. In this regard, several simple steps
can improve the likelihood that your backup procedure will work when it's
finally needed.
First, keep a sequence of tapes to guard against a bad tape. Since
you need just one or two tapes on-hand, a series of old level 0 and level
1 tapes can be kept off-site. You may also want to make a duplicate level
0 tape each time a level 0 dump is performed. Replace your tapes regularly
and watch /var/adm/messages for any indications of system troubles that
are related either to the tape system or disk drives. Check your backups
once in a while by recovering a file.
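A simple way to make that check repeatable is to compare a recovered file with the copy still on disk. This is a minimal sketch; the function name verify_restore is hypothetical, and it's only meaningful for a file that hasn't changed since the dump was taken.

```shell
#!/bin/sh
# verify_restore ORIGINAL RESTORED
# Succeeds only when the restored copy is byte-for-byte identical.
verify_restore() {
    if cmp -s "$1" "$2" 2>/dev/null; then
        echo "MATCH: $2"
    else
        echo "DIFFERS or missing: $2"
        return 1
    fi
}
```

After extracting a file into /tmp/lostfiles with ufsrestore, you could run verify_restore with the path of the live file and the path of the recovered copy under /tmp/lostfiles.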
Getting fancy
At this point, you probably have many ideas on how to improve the procedures
and scripts outlined in this article. You can modify the shell scripts
to suit your particular system and backup
requirements. For example, the backup scripts could empty users' CDE /.dt/Trash
directories after each level 1 backup or format and archive the log file.
You can also incorporate another small improvement by creating a single
script, shown in Listing D, that creates and tests for the file doing.dump1
to determine whether a level 0 or a level 1 dump is executed. Just remember that
the time you most need a successful backup is probably when a fancy backup
script will fail.
Listing D: Improved ufsdump DUMP script
#!/bin/sh
TAPE=/dev/rmt/0cn
LOG=/tmp/dump.log
case $1 in
#Called with the argument restart,
#this script runs ufsdump level 0
#the next time it is called by cron.
'restart')
    rm -f doing.dump1
    ;;
*)
    #Level 1 dump if doing.dump1 exists;
    #otherwise level 0 dump.
    if [ -f doing.dump1 ]; then
        mt -f $TAPE eom
        date > $LOG 2>&1
        ufsdump 1uf $TAPE / >> $LOG 2>&1
        ufsdump 1uf $TAPE /usr >> $LOG 2>&1
        ufsdump 1uf $TAPE /var >> $LOG 2>&1
        ufsdump 1uf $TAPE /accts >> $LOG 2>&1
        ufsdump 1uf $TAPE /opt >> $LOG 2>&1
        mt -f $TAPE rewind
        mailx -s DUMPLOG root < $LOG
    else
        mt -f $TAPE rewind
        date > $LOG 2>&1
        ufsdump 0uf $TAPE / >> $LOG 2>&1
        ufsdump 0uf $TAPE /usr >> $LOG 2>&1
        ufsdump 0uf $TAPE /var >> $LOG 2>&1
        ufsdump 0uf $TAPE /accts >> $LOG 2>&1
        ufsdump 0uf $TAPE /opt >> $LOG 2>&1
        mt -f $TAPE rewind
        mt -f $TAPE offline
        mailx -s DUMPLOG root < $LOG
        #Remember that the level 0 dump is done.
        touch doing.dump1
    fi
    ;;
esac
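Listing D refers to doing.dump1 by a relative name, so the file is looked up in whatever directory the script happens to run from. cron normally starts jobs in the owner's home directory, but a restart typed by hand from another directory would look for the file elsewhere. A sketch of the same test with an absolute path follows; the STATE variable, its default location, and the function name next_level are assumptions for illustration.

```shell
#!/bin/sh
# STATE is a hypothetical absolute path for the marker file that
# Listing D calls doing.dump1.
STATE=${STATE:-/etc/Dumpdir/doing.dump1}

# Prints the dump level that the next run should use:
# 1 if the marker file exists, 0 otherwise.
next_level() {
    if [ -f "$STATE" ]; then
        echo 1
    else
        echo 0
    fi
}
```

The script would then create the marker with touch "$STATE" after a level 0 dump and remove it with rm -f "$STATE" when called with restart, whatever the current directory is.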
Conclusion
While Windows NT comes with a GUI backup tool, on Solaris you need only an editor to
create a shell script of basic commands. With an appropriate crontab entry,
even a single Solaris desktop can run a custom backup procedure every night
with minimum operator intervention. Once you're confident that the procedures
we've discussed actually work, you can rest easier. Then, should disaster
ever strike, you'll be prepared to start the recovery process.
Richard Auletta has been hacking around with UNIX since its days on
a Digital Equipment PDP-11/70. His current reason for associating with
UNIX is to support the advanced electronic design automation tools used
in the courses he teaches at the University of Colorado at Denver.