Copyright © 1998 Jon Hamilton
Here is a simple look at using dump and restore to back up and restore your data.
This information is provided as-is. While I've tried to make this useful, I can't guarantee that there are no errors or omissions. I also can't be responsible for your data or your backups; it is critical that any backup scheme you choose to implement is carefully thought out and that you test it carefully to make sure that you can get back what you expect to get back in the event of a problem.
dump and restore operate on entire filesystems, which allows them to efficiently handle all kinds of interesting things like device files, sparse files, links, and so on. Each dump session will back up files from one filesystem; if you have many filesystems to back up, you will need to invoke dump once for each filesystem.
Note: dump only works on UFS filesystems; if you have DOS filesystems to back up, you'll need a different tool.
Let's take an example of a typical home machine which contains the following filesystems:
To back up all the filesystems on this machine, we would need four invocations of dump.
Now we need to tell dump which files within the filesystem to back up. We can perform a full dump, which backs up all files in the filesystem, or we can perform an incremental dump, which backs up only recently changed files. We specify the type of backup to perform by specifying a dump level from 0 through 9, 0 being a full dump and 1-9 being incremental. For levels 1-9, dump will only back up files which have changed since the most recent lower level dump. That is, a level 1 dump will back up everything which has changed since the most recent level 0 backup, whereas a level 5 dump will back up only those files which have changed since the most recent level 0, 1, 2, 3, or 4 backup.
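The level rule is easier to see with a worked example. The sketch below is plain shell and never calls dump; it only walks a list of levels for dumps already taken (oldest first) and reports which of them a new dump at a given level would be measured against. The function name dump_base is invented for this illustration.

```shell
#!/bin/sh
# dump_base NEWLEVEL HISTORY...
# HISTORY is the list of levels of the dumps already taken, oldest first.
# Prints the position (1-based) and level of the most recent dump whose
# level is lower than NEWLEVEL, or "none" if there is no such dump
# (which is always the case for a level 0 dump).
dump_base() {
    new=$1
    shift
    pos=0
    base="none"
    for lvl in "$@"; do
        pos=$((pos + 1))
        if [ "$lvl" -lt "$new" ]; then
            # A later match replaces an earlier one, so we end up
            # with the most recent lower level dump.
            base="dump $pos (level $lvl)"
        fi
    done
    echo "$base"
}

dump_base 5 0 1 3   # prints: dump 3 (level 3)
dump_base 2 0 1 3   # prints: dump 2 (level 1)
```

In the second call the level 3 dump is ignored, because only dumps at a level lower than 2 count.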
Now we need somewhere to put this data! Under ordinary circumstances you will probably want to back your data up to some sort of tape drive, although you can back up to a file, to floppies, or to just about anywhere else you might like. Dump can be used to back up to either a locally attached tape drive or to a tape drive on a remote system (provided the remote system allows you access, of course).
To back up locally, we simply need to tell dump the name of the tape drive or file we wish to back up to. If you have a SCSI tape drive, you might use /dev/nrst0. Use the -f command line flag to tell dump where to put the data.
If you are backing up to a tape drive on a remote system, things are only slightly more complicated; to back up to the tape drive /dev/rmt/0mn on remote machine backuphost, you would specify -f backuphost:/dev/rmt/0mn on the dump command line. In either case, you almost certainly want to use the no-rewind device; more on this in a later section.
Some caveats of remote tape drive access:
The name of the tape drive is the name as known to the remote system. You will have to determine this name by consulting the documentation for the remote system (or a human who is familiar with the system).
The remote machine must allow you to rsh to it without a password. This can open serious security holes; be sure to assess the risk carefully before enabling it.
Note: Some variants of UNIX(TM) use the name remsh instead of rsh.
Note that dump can write backups which span more than one tape; when the first tape fills up, dump will prompt you to insert another tape and tell it when to continue writing.
Dump is a very old (but still very useful) utility, and many of its default values are more suitable for quarter inch cartridge tape than for modern DAT, 8mm, or DLT drives. Some command line flags you may be able to twiddle to improve performance include:
-a: auto size. Given this option, dump will write until the end of the tape rather than trying to calculate the amount of data which will fit on a tape. This is best for modern tape drives, but some very old units may not be able to reliably communicate to dump when they have reached end of tape, so you will have to specify density, number of feet, etc. See the dump man page for details if you need to specify these values; for most users, it's probably better and safer to simply use the -a flag.
-b: block size. This determines the number of 1K blocks per dump record.
Options of more generic usefulness include:
This is not an exhaustive listing; see the dump manual page for other options.
As mentioned above, one generally uses the no-rewind tape device when writing backups. Because dump processes only one filesystem at a time, we would like a way to minimize the number of tapes needed. It would certainly be wasteful if we could only fit one dump on each tape; imagine backing your 20MB / partition onto a 40GB DLT tape and being unable to use the rest of the tape! Fortunately, it's possible to put many dump sessions on a single tape.
To do this, you need to be sure to use the no-rewind tape device. Under FreeBSD, this is the device beginning with n, that is, /dev/nrst0 rather than /dev/rst0. The difference: the no-rewind device will write the data, then leave the tape positioned where it finished writing; you can then put another backup at that location without overwriting your prior backup. If you had used the rewind-on-close device (/dev/rst0 under FreeBSD), the data would have been written, and when dump finished writing, the tape would automatically rewind. If you were to start writing another backup to the tape at that point, you would overwrite your prior backup data.
See the later section concerning tape positioning using mt for more information on how to position the tape manually.
Dump can transparently write archives which span more than one tape; it will prompt you when it thinks a tape is full and ask that you insert another. This can occur if you have a filesystem that is larger than will fit on one tape, or if you are writing multiple dumps to one tape and you run out of room. There is nothing you need to do in order to enable this behavior, although you do need to keep it in mind if you want to run your backups out of cron; if you do, you'll need to plan so that you never fill a tape completely (because there will be no way for you to interact with dump when it tries to prompt for the next tape).
Once you have a tape with multiple dump sessions on it, you'll need to have a way to position the tape so you can restore from whichever session you need. Say you have a single tape with backups of /, /usr, and /var on it, and you want to restore a file under /usr. When you insert the tape into your drive, it will most likely rewind itself to the beginning of the tape; however, the data you want isn't in the first dump set, but the second. Never fear, mt to the rescue!
The mt command is used to position tapes, eject them, and to set various parameters on the tape drive. Many of these options are beyond the scope of this document; see the mt manual page for details. We concern ourselves here only with the options which position the tape and which get the status of the tape drive.
Given the above example in which the data we wanted to restore is in the second dump set on the tape, we would insert the tape into the drive, wait for the tape to load and rewind (usually indicated by front panel lights on the tape drive turning solid rather than flashing), then use mt -f /dev/nrst0 fsf 1 to move the tape to the second dump set (that is, move forward one backup set past the current position). Of course, you will have to adjust /dev/nrst0 to fit your circumstances if your tape drive is named differently.
Unlike dump and restore, mt cannot be used to control a remote tape device. If you are using a tape drive attached to another machine, you will have to log into the remote machine directly and use mt on that machine to do the tape positioning for dump.
Note: Some other UNIX(TM)-like operating systems do provide an mt command which can work remotely; check your system's manual pages if you're curious about this. Restore has a -s switch which can be used to specify which backup set on the tape to use, so in ordinary use, you can probably get by without having to even use mt.
A couple of other mt options bear mentioning:
mt -f tapedev status - Reports some status information about the tape drive. The level of detail varies depending on the driver for your tape drive, but this usually includes interesting information such as whether the tape is rewound to the beginning, whether the drive is ready, the write density, and so on.
mt -f tapedev rewind - Rewinds your tape.
mt -f tapedev fsf count - Moves your tape forward count backup sets from its current location. If you executed mt -f /dev/nrst0 fsf 2 when the tape was positioned at the beginning of the second backup set, you would wind up at the beginning of the fourth backup set when finished, which may not be what you intended. A common mistake is to assume that count is an absolute offset from the beginning of the tape. If you aren't sure, it's usually best to rewind the tape using mt -f tapedev rewind before using mt -f tapedev fsf count to move the tape forward.
mt -f tapedev offline - Ejects your tape from the drive, if possible (whether you can eject a tape via software depends on the drive itself as well as the driver).
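Because fsf is relative to the current position, one way to avoid the mistake described above is to always rewind and then skip forward. Here is a small sketch of that idea. The helper name goto_session is made up for this example, and it only prints the mt commands rather than running them, so you can look them over first.

```shell
#!/bin/sh
# goto_session TAPEDEV N
# Prints the mt commands that position TAPEDEV at the start of backup
# set N (1 = first set on the tape), wherever the tape is right now.
goto_session() {
    tapedev=$1
    n=$2
    echo "mt -f $tapedev rewind"
    # After a rewind we are at the start of set 1, so we need to skip
    # N-1 sets to land at the start of set N.
    if [ "$n" -gt 1 ]; then
        echo "mt -f $tapedev fsf $((n - 1))"
    fi
}

goto_session /dev/nrst0 2
```

For the second backup set this prints a rewind followed by fsf 1, matching the example earlier in this section. Once the output looks right, it can be piped to sh on the machine the tape drive is attached to.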
Of course, we need to be able to get our data back off the tapes; that's where restore comes in. There are two typical methods of restoring files: interactive and non-interactive. In interactive mode, you can easily choose which files to restore; in non-interactive mode restore will bring back all files from the backup set.
To perform a restore of all files in a backup set, follow this recipe:
Use mt as described above to position the tape to the correct spot (or use the -s flag of restore as described in the examples).
Change directory to the root of where you wish to restore. Restore will use the current directory as its starting point for extractions, so it is important to be in the correct directory when invoking restore.
Invoke restore with the -r flag:
restore -rf tapedev
Get a cup of your favorite beverage while the restore runs.
Note that restore will not delete any existing files, but it will overwrite files in the target area even if they are newer than those in the backup set.
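The recipe above could be wrapped up roughly as follows. This is only a sketch: full_restore_plan is a hypothetical helper name, it prints the commands instead of running them, and it uses restore's -s flag in place of positioning the tape with mt. The device name, set number, and target directory in the sample call are placeholders.

```shell
#!/bin/sh
# full_restore_plan TAPEDEV SESSION TARGETDIR
# Prints the commands for a non-interactive restore of backup set
# SESSION from TAPEDEV into TARGETDIR. Refuses to go on if TARGETDIR
# does not exist, since restore extracts relative to the current
# directory and a failed cd would put the files in the wrong place.
full_restore_plan() {
    tapedev=$1
    session=$2
    target=$3
    if [ ! -d "$target" ]; then
        echo "target directory does not exist: $target" >&2
        return 1
    fi
    echo "cd $target"
    echo "restore -r -s $session -f $tapedev"
}

full_restore_plan /dev/nrst0 2 /tmp
```

Checking the target directory up front is a deliberate choice here: it is cheap, and it guards against the easiest way to scatter a restore over the wrong part of the disk.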
To selectively restore only a subset of the backup set (for example, if you inadvertently deleted a particular file/directory and wish to retrieve only that file from the tape), use an interactive restore session.
To start an interactive restore, use the -i flag to restore:
restore -if tapedev
(Be sure to correctly position the tape and cd to your desired extraction directory before invoking restore.)
You will then see a restore> prompt, at which you can type interactive restore commands, including:
add [arg] - Adds [arg] to the list of files to be extracted from the backup.
cd [arg] - Use to navigate the backup set just like you'd use a shell to navigate a regular filesystem.
delete [arg] - Removes [arg] from the list of files to extract from the backup. Note that this won't actually cause any files to be removed from the disk or from the backup set; it only removes the file from the extract list.
extract - After you have used add to mark all the files and directories you want to recover, use this command to actually begin the recovery.
ls - List the contents of the backup archive. You can use this and the cd command above to view the contents of the backup set. Files and directories marked for extraction are marked with an x.
pwd - Use this to tell you where you are within the backup set.
quit - As you would expect, this quits the restore session immediately, without restoring any files.
verbose - Tells restore to be much more chatty about what it is doing during the actual extraction.
See the examples at the end of this document for some more concrete examples of actually performing an interactive restore.
Your exact backup schedule will vary depending on your needs; the important thing is to always make sure to do backups regularly. For a business or a home machine on which important files change frequently, a full backup one day per week with incremental backups the other six days is often called for. On a home machine containing less important (or more slowly changing) files, or on a client machine whose important files reside on an NFS server and are backed up there, a schedule with one full backup per month and incrementals once per week may be sufficient. You'll have to evaluate how much work you're willing to lose and schedule your backups accordingly - keep in mind that accidents and system failures seem most common right before a backup is made, so always assume that if there is a failure, it'll be in the worst case and you'll lose everything since your last backup.
Some people like to minimize their tape usage, and will start with a full backup and progress up the backup levels with a level 1 dump, followed by a level 2 dump, followed by a level 3, and so on up to level 9. While it is true that this approach minimizes tape usage, it is also quite a hassle if you need to restore in the middle of a cycle. Say you did a full backup on Sunday, a level 1 on Monday, a level 2 on Tuesday, and a level 3 on Wednesday. First thing Thursday morning, you inadvertently removed a directory you were working in, and want to restore it to the state it was in at the time of Wednesday's backup. You would then have to recover your full backup from Sunday, then your level 1 from Monday, then your level 2 from Tuesday, then your level 3 from Wednesday - a total of four restores to get your directory back the way it was yesterday. Contrast this with a more tape-wasteful but more convenient schedule where you perform a full backup on Sunday, then a level 9 backup each other day of the week. In the same scenario, you would only have to perform two restores to get your directory back - one from the full backup and one from Wednesday's level 9.
In the first scenario (where we start with a full backup and work our way up the levels each day thereafter), each day's backup contains only those files which have changed since the previous day's backup. In the second, each day's incremental backup contains all the files which have changed since the last full backup. This can all get to be somewhat mind-numbing, but it's worth spending the time to understand. Sketching the two scenarios out on paper and tracing through them is a good idea if it's still not clear.
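If paper isn't handy, the same tracing can be done with a few lines of shell. The sketch below does not touch any tapes; given a week's worth of day:level pairs (oldest first), it prints which backups you would have to restore, in order, to get back to the state of the last one. The function name restore_chain and the day:level notation are inventions of this example.

```shell
#!/bin/sh
# restore_chain DAY:LEVEL...
# Arguments are the dumps taken so far, oldest first, e.g. Sun:0 Mon:9.
# Prints the dumps that must be restored, in order, to recover the
# state as of the last dump listed.
restore_chain() {
    chain=""
    for entry in "$@"; do
        lvl=${entry#*:}
        kept=""
        for c in $chain; do
            # An earlier dump is still needed only if its level is
            # lower than the new dump's level; otherwise the new dump
            # already covers everything it contained.
            if [ "${c#*:}" -lt "$lvl" ]; then
                kept="$kept $c"
            fi
        done
        chain="$kept $entry"
    done
    # Unquoted on purpose: collapses the extra spaces in the list.
    echo $chain
}

restore_chain Sun:0 Mon:1 Tue:2 Wed:3   # prints: Sun:0 Mon:1 Tue:2 Wed:3
restore_chain Sun:0 Mon:9 Tue:9 Wed:9   # prints: Sun:0 Wed:9
```

The first call is the climbing schedule and needs four restores; the second is the level 9 schedule and needs only two.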
To make this all a little more concrete, let's walk through an example. On a system with /, /usr, /var, and /home filesystems, we'll go through a full backup and several incrementals, then perform a restore. For this example, assume that we are backing up to a remote system called squishy, and the name of the tape drive we'll be using on squishy is /dev/nrst0. As mentioned above, it is necessary that squishy allow us to remote shell to it without a password. Remote dump sessions must be run as root, and as a practical matter, you should almost always run your system backups as root even when writing to a local tape drive. For this example, we'll do a level 0 (full) dump on Sunday night, and level 9 dumps on each evening from Monday through Friday. We'll also use one tape for the full backups, and a separate tape for each subsequent incremental, so for the week we'll have a total of 6 tapes.
Sunday night, do our full backups. Ordinarily, the commands below would be put in a shell script and you would then execute that script.
dump 0uaf squishy:/dev/nrst0 / # backs up the root filesystem
dump 0uaf squishy:/dev/nrst0 /usr # backs up /usr
dump 0uaf squishy:/dev/nrst0 /var # backs up /var
dump 0uaf squishy:/dev/nrst0 /home # backs up /home
Each night Monday through Friday, do our incremental backups.
dump 9uaf squishy:/dev/nrst0 / # backs up the root filesystem
dump 9uaf squishy:/dev/nrst0 /usr # backs up /usr
dump 9uaf squishy:/dev/nrst0 /var # backs up /var
dump 9uaf squishy:/dev/nrst0 /home # backs up /home
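A script for this schedule might look something like the sketch below. It picks the dump level from the day of the week and prints the same dump commands shown above instead of running them. The helper names level_for_day and print_dump_commands are made up for this example; squishy, /dev/nrst0, and the filesystem list come from the example setup and will need adjusting for your machine.

```shell
#!/bin/sh
# Example values taken from the walkthrough; adjust for your system.
TAPE=squishy:/dev/nrst0
FILESYSTEMS="/ /usr /var /home"

# level_for_day DAY
# Prints the dump level for a three-letter weekday name, or nothing
# if no backup is scheduled that day.
level_for_day() {
    case $1 in
        Sun) echo 0 ;;
        Mon|Tue|Wed|Thu|Fri) echo 9 ;;
        *) ;;
    esac
}

# print_dump_commands DAY
# Prints one dump command per filesystem for the given day.
print_dump_commands() {
    level=$(level_for_day "$1")
    if [ -z "$level" ]; then
        return 0    # nothing scheduled (Saturday in this schedule)
    fi
    for fs in $FILESYSTEMS; do
        echo "dump ${level}uaf $TAPE $fs"
    done
}

# date +%a gives the abbreviated weekday name; LC_ALL=C keeps it English.
print_dump_commands "$(LC_ALL=C date +%a)"
```

Printing rather than executing makes it easy to check the output before trusting it with real tapes; piping the output to sh would run the commands. Keep in mind the earlier caveat that dump cannot prompt for a second tape when run unattended.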
Now we're feeling pretty good about ourselves, but on Saturday morning we sit down at our shell prompt and promptly type rm -rf $HOME as directed in a Usenet article. Oops, we just blew away our home directory, and need to restore from backup. Because of our backup scheme, we'll need to restore first the full backup from Sunday night, then the incremental from Friday night. Since we don't want to restore all of /home, but rather just our home directory, we'll use an interactive restore session.
Restore from the full backup taken on Sunday. Note that we want to restore from the /home backup, which is the fourth backup session on the tape.
mkdir /home/hamilton # remember, we removed the directory!
cd /home # names in the backup are relative to /home, so extract from there
restore -i -s 4 -f squishy:/dev/nrst0
Note: the -s 4 flag to restore tells it to restore from the fourth backup session on the tape.
We now get a restore> prompt.
restore> add hamilton
restore> extract
When that completes, we will have restored /home/hamilton to what it looked like on Sunday night. Now we need to restore from the Friday backups:
Unload the Sunday tape and load the Friday tape in squishy's tape drive.
restore -i -s 4 -f squishy:/dev/nrst0
Again, we get a restore> prompt.
restore> add hamilton
restore> extract
When that completes, we are finished. Eject the tape from squishy's tape drive. There will be a file in /home called restoresymtable which you should now remove by hand.
Planning backups is a fairly complicated thing, and as with everything else, involves many tradeoffs. Hopefully this document has been enough to get you started, but keep in mind that until you're absolutely sure that your chosen scheme is working, it's very important to actually look through the tapes and make sure that everything you think is getting backed up really is getting backed up. It's a very good idea to also periodically choose a file at random and try to restore it from a particular date, just to be sure that you can do so. It's no fun to discover in the middle of an emergency that you're not sure how to get your files back, and it's even worse to discover that you don't have the backups you thought you did.
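One way to look through a tape without restoring anything is restore's -t flag, which lists the names of the files in a backup set. The sketch below prints the commands to list each set on a local tape in turn. The helper name list_plan, the device name, and the set count are assumptions of this example; the rewind before each listing is a precaution, on the assumption that the tape may not be at its start.

```shell
#!/bin/sh
# list_plan TAPEDEV COUNT
# Prints commands that list the contents of backup sets 1..COUNT on
# TAPEDEV. restore -t only lists file names; it extracts nothing.
list_plan() {
    tapedev=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        # Rewind first so that -s starts from the beginning of the tape.
        echo "mt -f $tapedev rewind"
        echo "restore -t -s $n -f $tapedev"
        n=$((n + 1))
    done
}

list_plan /dev/nrst0 4
```

Comparing the listings against what you expect to find in /, /usr, /var, and /home is a quick sanity check, though it is no substitute for actually restoring a file now and then.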
The examples given in this document are fairly straightforward, with few complicating factors. In real life, you will run into things like bad tapes, backups that don't all fit on one tape, and other such entertaining situations. With a little thought and a decent understanding of dump and restore most of these problems are easy to overcome (bad tapes being the exception!).
Don't forget to care for your tape drive, too. Be sure to clean it as directed by the manufacturer - it really does make a difference in head life and in tape reliability.
If you are looking for a large scale backup solution, there are a number of commercial and freeware packages which act as a frontend for dump and restore, and some which use a proprietary format to store backup data. Depending on the needs of your organization, dump, restore, and some shell or perl scripts may well be all you need. You can do all sorts of extravagant things with scripts and UNIX(TM) commands, even managing tape changers and robots to do very large scale backups. If you're not comfortable doing this on your own, or if you lack the skill and need a working solution that's reliable right away, a commercial solution may be a good alternative.
This document is intended to be a work in progress, and if you have any suggestions or additions, please send them to hamilton@pobox.com. You may redistribute this document as you see fit, as long as you keep in mind that I am in no way responsible for any problems stemming from attempts to use the information herein, and that you leave this notice intact and attribute the authorship.
The SGML Docbook source for this document is also available at http://www.pobox.com/~hamilton/docs/dump.sgml. This source is quite ugly; it was marked up from an already-existing document, so don't look at it as a good example of markup!