A similar problem happened four years later to Michael Travers at the Media Lab’s music and cognition group. Here’s a message that he forwarded to UNIX-HATERS from one of his system administrators (a position now filled by three full-time staff members):

    Date: Mon, 13 Nov 89 22:06 EST
    From: saus@media-lab.mit.edu
    Subject: File Systems
    To: mt@media-lab.mit.edu

    Mike,

    I made an error when I constructed the file systems /bflat and /valis. The file systems overlapped and each one totally smashed the other. Unfortunately, I could find no way to reconstruct the file systems.

    I have repaired the problem, but that doesn’t help you, I’m afraid. The stuff that was there is gone for good. I feel bad about it and I’m sorry but there’s nothing I can do about it now.

    If the stuff you had on /bflat was not terribly recent we may be able to get it back from tapes. I’ll check to see what the latest tape we have is.

Down and Backups

Disk-based file systems are backed up regularly to tape to avoid data loss when a disk crashes. Typically, all the files on the disk are copied to tape once a week, or at least once a month. Backups are also normally performed each night for any files that have changed during the day. Unfortunately, there’s no guarantee that Unix backups will save your bacon.

    From: bostic@OKEEFFE.CS.BERKELEY.EDU (Keith Bostic)
    Subject: V1.95 (Lost bug reports)
    Date: 18 Feb 92 20:13:51 GMT
    Newsgroups: comp.bugs.4bsd.ucb-fixes
    Organization: University of California at Berkeley

    We recently had problems with the disk used to store 4BSD system bug reports and have lost approximately one year’s worth. We would very much appreciate the resubmission of any bug reports sent to us since January of 1991.

    The Computer Systems Research Group.¹
One can almost detect an emergent intelligence, as in “Colossus: The Forbin Project.” Unix managed to purge from itself the documents that prove it’s buggy.

Unix’s method for updating the data and pointers that it stores on the disk allows inconsistencies and incorrect pointers on the disk as a file is being created or modified. When the system crashes before updating the disk with all the appropriate changes, which is always, the file system image on disk becomes corrupt and inconsistent. The corruption is visible during the reboot after a system crash: the Unix boot script automatically runs fsck to put the file system back together again.

Many Unix sysadmins don’t realize that inconsistencies occur during a system dump to tape. The backup program takes a snapshot of the current file system. If there are any users or processes modifying files during the backup, the file system on disk will be inconsistent for short periods of time. Since the dump isn’t instantaneous (and usually takes hours), the snapshot becomes a blurry image. It’s similar to photographing the Indy 500 using a 1-second shutter speed, with similar results: the most important files, the ones that people were actively modifying, are the ones you can’t restore.

Because Unix lacks facilities to back up a “live” file system, a proper backup requires taking the system down to its stand-alone or single-user mode, where there will not be any processes on the system changing files on disk during the backup. For systems with gigabytes of disk space, this translates into hours of downtime every day (with a sysadmin getting paid to watch the tapes whirr). Clearly, Unix is not a serious option for applications with continuous uptime requirements.
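The blurry-snapshot problem can be made concrete with a small sketch. This is not how dump or any real backup program works. It is a minimal illustration, assuming that comparing a file’s modification time and size before and after it is copied is a good enough sign that someone wrote to it mid-backup. The names file_signature, backup_tree, and the during_copy hook are hypothetical, invented for this example.

```python
import os
import shutil
import tempfile
from pathlib import Path


def file_signature(path):
    """Return (mtime_ns, size): a cheap, imperfect proxy for 'did this change?'."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def backup_tree(src, dst, during_copy=None):
    """Copy every regular file under src into dst and report the 'blurry' ones.

    during_copy is a hypothetical hook, called right after each file is copied,
    so a caller can simulate another process writing while the backup runs.
    Returns the sorted relative paths whose signature changed between the
    moment the copy of that file started and the moment it was re-checked.
    """
    src, dst = Path(src), Path(dst)
    blurry = []
    # Materialize the file list first so the walk is not disturbed by writers.
    for path in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = path.relative_to(src)
        before = file_signature(path)
        target = dst / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        if during_copy is not None:
            during_copy(path)
        if file_signature(path) != before:
            # The copy on "tape" may not match any single state of the file.
            blurry.append(rel.as_posix())
    return blurry


def demo():
    """Back up two files while a simulated user keeps editing one of them."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "live"
        dst = Path(tmp) / "tape"
        src.mkdir()
        (src / "quiet.txt").write_text("nobody is touching this\n")
        (src / "busy.txt").write_text("draft 1\n")

        def writer(path):
            # Simulated concurrent activity: append to the busy file only.
            if path.name == "busy.txt":
                with open(path, "a") as f:
                    f.write("draft 2\n")

        return backup_tree(src, dst, during_copy=writer)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the demo flags only busy.txt, the file that was being modified. The check merely detects the blur; it does nothing to fix it, and it can miss a write that leaves both the size and the timestamp unchanged. The only remedy the text offers is to stop all writers by taking the system down to single-user mode.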
One set of Unix systems that needed continuous uptime was forced to tell its users in /etc/motd to “expect anomalies” during backup periods:

    SunOS Release 4.1.1 (DIKUSUN4CS) #2:Sun Sep 22 20:48:55 MET DST 1991
    --- BACKUP PLAN ----------------------------------------------------
    Skinfaxe:    24. Aug, 9.00-12.00   Please note that anomalies can
    Freja & Ask: 31. Aug, 9.00-13.00   be expected when using the Unix
    Odin:         7. Sep, 9.00-12.00   systems during the backups.
    Rimfaxe:     14. Sep, 9.00-12.00
    Div. Sun4c:  21. Sep, 9.00-13.00
    --------------------------------------------------------------------

¹ This message is reprinted without the permission of Keith Bostic, who said “As far as I can tell, [reprinting the message] is not going to do either the CSRG or me any good.” He’s right: the backups, made with the Berkeley tape backup program, were also bad.