Following an interesting event earlier this week, I thought it would be worthwhile to write a quick blog entry on how tape drives work, why restoring small files can take an age, and a few tips on how to improve this.
Tape backup technology started out many years ago with formats such as QIC and DAT. Tape is still used today, but the tapes are more likely to be DLT, SUPER-DLT or LTO. These tapes and the corresponding drives deliver superior speed and storage capacity, but the very technology that makes them high speed and high capacity can have a seriously detrimental effect when backing up thousands or even millions of small files.
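When you back up a file to tape, the drive writes a header, then the file data, then a checksum:
[Header 1][File 1 data][Checksum 1]
The next file is encoded in the same way, so you get:
[Header 1][File 1 data][Checksum 1][Header 2][File 2 data][Checksum 2]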
All fine so far. The problem occurs when you have more than a few thousand files. The drive cannot work at its full speed: it speeds up, overshoots the next header, slows down, reverses the tape, reads the header, maybe restores the file (if it is in the list of files to be restored), then moves on to the next one, where the whole process is repeated all over again.
This means that a nice, shiny, super-fast LTO drive can actually run SLOWER than an older-technology QIC tape drive. The QIC suffers from the same problem, but to a lesser extent because the drive technology is slower.
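To put rough numbers on that, here is a toy model in Python. It is only a sketch: the function name, the transfer rates and the per-file repositioning times are all made-up assumptions chosen to illustrate the effect, not measurements from any real LTO or QIC drive.

```python
def restore_seconds(num_files, avg_file_bytes, stream_bytes_per_sec, reposition_sec):
    """Estimate total time as pure streaming time plus one
    stop/reverse/re-read penalty for every file on the tape."""
    transfer = num_files * avg_file_bytes / stream_bytes_per_sec
    penalty = num_files * reposition_sec
    return transfer + penalty


# Hypothetical workload: lots of small files.
NUM_FILES = 100_000
AVG_FILE_BYTES = 10 * 1024  # 10 KiB each

# Hypothetical drives: the fast one streams far quicker but, in this
# model, loses more time every time it has to stop and reposition.
fast_drive = restore_seconds(NUM_FILES, AVG_FILE_BYTES, 80 * 1024**2, 2.0)
slow_drive = restore_seconds(NUM_FILES, AVG_FILE_BYTES, 1 * 1024**2, 0.5)

# The same data written as one big file: only a single reposition.
single_stream = restore_seconds(1, NUM_FILES * AVG_FILE_BYTES, 80 * 1024**2, 2.0)

if __name__ == "__main__":
    print(f"fast drive, many files: {fast_drive / 3600:.1f} hours")
    print(f"slow drive, many files: {slow_drive / 3600:.1f} hours")
    print(f"fast drive, one stream: {single_stream:.1f} seconds")
```

With these invented figures the per-file penalty dwarfs the actual transfer time, which is why the nominally faster drive comes out behind. Change the assumed penalties and the ranking can change too, so treat it as an illustration of the mechanism rather than a benchmark.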
So what can you do to work around this issue?
What you do is the equivalent of stuffing all the files into a zip archive. Backup Exec calls this the 'intelligent image option'. All it is actually doing is dumping all the files into one big file so that it can write that file at the drive's top speed without having to stop, calculate the checksum, write the checksum, write the next header, and so on.
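If your backup software has no such option, you can get a similar effect by hand: bundle the small files into one archive on disk first, then back up that single archive. The Python sketch below uses the standard tarfile module. The function name bundle_directory is just something I made up for illustration, and this is not how Backup Exec implements its option, only the same general idea.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack every file under source_dir into one uncompressed tar
    archive and return the number of files added."""
    source = Path(source_dir)
    target = Path(archive_path).resolve()
    count = 0
    with tarfile.open(archive_path, "w") as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself in case it
            # was created inside the directory being bundled.
            if not path.is_file() or path.resolve() == target:
                continue
            # Store paths relative to source_dir so the archive can be
            # unpacked anywhere.
            archive.add(path, arcname=path.relative_to(source).as_posix())
            count += 1
    return count
```

You would call it with something like bundle_directory("D:/data/smallfiles", "D:/staging/smallfiles.tar") (both paths hypothetical) and then point the backup job at the .tar file. The trade-off is that restoring a single small file now means restoring the whole archive and extracting from it.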