"Backups don't fail. Restores fail," or so the old joke goes.
Unfortunately, the joke is more bitter than funny. It doesn't matter if the backup failed or the restore failed -- it's a distinction without a difference for storage administrators who are trying to recover data. And they both do fail: As many as half of all attempts to restore from tape fail, according to some estimates.
"Surprisingly, not finding the right piece of media is a big part of [the failures]," said Erik Pounds, senior product manager at EMC Dantz, a Walnut Creek, Calif., maker of backup software. "You have all this media and all these rotation strategies, and they're sometimes so unwieldy that you forget to do something. Over time you just lose track of some of the stuff."
Part of the problem is keeping track of the number of tapes and other media and laying hands on the right one when you need it. Part of it is that archival media has to be kept for long periods, often years, and the sheer length of time makes errors more likely.
Long retention periods also create problems with backward compatibility, Pounds says. "The user pulls the data off the shelf, goes to do a restore and, for some reason, the product they're using now can't read data written by the previous product." Even if it is the same product, the archived data may have been written with software four or more generations back. The problem can be particularly acute with data stored in a proprietary format. "That's why some companies have a strategy of using the format native to their operating system, or even independent of the operating system like a UDF [Universal Disk Format]," Pounds says.
There are several ways to minimize human error and its effects on backups. They include:
Know what you're storing
Do you really know what you're backing up? And are you backing up the right things? It's uncommon to back up everything on a system, even with a full backup. Temporary files and certain other files are usually excluded, which is fine as long as you're actually backing up what you think you are. In too many cases, however, you're not. Check your backup rules to make sure you're including all the files you should be.
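One way to check your rules is to list exactly which files they would skip. The following is a minimal sketch in Python, not the rule engine of any particular backup product: the function name `split_by_rules` and the wildcard patterns are hypothetical, and real products have their own rule syntax.

```python
import fnmatch
import os


def split_by_rules(root, exclude_patterns):
    """Walk root and return (included, excluded) lists of relative paths.

    exclude_patterns are shell-style wildcards (an assumption for this
    sketch); note that '*' here also matches across directory separators.
    """
    included, excluded = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")  # compare with forward slashes
            if any(fnmatch.fnmatchcase(rel, pat) for pat in exclude_patterns):
                excluded.append(rel)
            else:
                included.append(rel)
    return sorted(included), sorted(excluded)
```

Reviewing the excluded list by eye is the point of the exercise: an overly broad pattern such as `*.db` could quietly leave out a database you meant to protect.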
Simplify your backups
In theory, elaborate backup schemes involving multiple tape sets add redundancy and therefore protection. But beyond a certain point, the complexity increases the possibility of human error.
Backup processes should be as simple and as automated as possible. This includes not only making the backups, but the subsequent handling of the media, too.
Verify your data
Make sure the data was correctly written to tape. Generally, you will want to use all the verification features your backup system offers. Because verification stretches the time it takes for backup, the temptation is to leave the more elaborate verification features off. Think carefully before you do that.
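If you want an independent check on top of what your backup software provides, one generic technique is to restore a sample of files to a scratch location and compare checksums against the originals. The sketch below assumes that approach; the function names `sha256_of` and `verify_copy` are hypothetical and do not correspond to any backup product's built-in verification.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_path, restored_path):
    """True if the restored file is byte-for-byte identical to the source."""
    return sha256_of(source_path) == sha256_of(restored_path)
```

A comparison like this is only meaningful if the source file has not changed since the backup ran, so it suits static or archived data best.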
Document your backups
Every tape should be clearly identified in both machine-readable and human-readable formats. You should also maintain an audit trail of who has handled the tapes and where they are being kept.
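As an illustration of what such a record might hold, here is a minimal sketch of a tape entry with a handling history. The class and field names (`TapeRecord`, `CustodyEvent`, `barcode`, `label`) are assumptions made for this example, not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustodyEvent:
    when: datetime   # when the tape changed hands or location
    handler: str     # who handled it
    location: str    # where it ended up


@dataclass
class TapeRecord:
    barcode: str     # machine-readable identifier
    label: str       # human-readable description
    history: list = field(default_factory=list)

    def move(self, handler, location, when=None):
        """Append a custody event; defaults to the current UTC time."""
        stamp = when or datetime.now(timezone.utc)
        self.history.append(CustodyEvent(stamp, handler, location))

    @property
    def current_location(self):
        """Last recorded location, or None if the tape was never logged."""
        return self.history[-1].location if self.history else None
```

Because events are only ever appended, the history doubles as the audit trail: the last entry says where the tape should be, and the earlier ones say who touched it along the way.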
Audit your tapes
Do you actually have the tapes you think you do in the locations where you think they are? Perform regular audits to make sure.
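An audit boils down to comparing what your records say against what you actually find on the shelves. The sketch below assumes both are available as simple mappings from tape identifier to location; the function name `audit_tapes` and the three result categories are hypothetical choices for this example.

```python
def audit_tapes(catalog, scanned):
    """Compare recorded tape locations against a physical scan.

    catalog: dict of tape ID -> location the records say it is in
    scanned: dict of tape ID -> location where it was actually found
    """
    missing = sorted(t for t in catalog if t not in scanned)
    misplaced = sorted(
        t for t in catalog if t in scanned and scanned[t] != catalog[t]
    )
    unexpected = sorted(t for t in scanned if t not in catalog)
    return {"missing": missing, "misplaced": misplaced, "unexpected": unexpected}
```

Tapes that turn up in the scan but not in the catalog deserve as much attention as missing ones, since they usually mean the records were never updated.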
Duplicate critical data
For critical data you may want to keep two sets of tapes, preferably written by two different drives and stored in two different places. Simply making a copy of the tapes as they are produced isn't as secure, but it is better than nothing.
Rick Cook has been writing about mass storage since the days when the term meant an 80 KB floppy disk. The computers he learned on used ferrite cores and magnetic drums. For the past 20 years, Cook has been a freelance writer specializing in storage and other computer issues.