The File System

for record-based operations: reading, writing, or locking a database record-by-record. (This might be one of the reasons that most Unix database companies bypass the Unix file system entirely and implement their own.)

More than simple database support, a mature file system allows applications or users to store out-of-band information with each file. At the very least, the file system should allow you to store a file "type" with each file. The type indicates what is stored inside the file, be it program code, an executable object-code segment, or a graphical image. The file system should store the length of each record, access control lists (the names of the individuals who are allowed to access the contents of the files and the rights of each user), and so on. Truly advanced file systems allow users to store comments with each file.

Advanced file systems exploit the features of modern hard disk drives and controllers. For example, since most disk drives can transfer up to 64K bytes in a single burst, advanced file systems store files in contiguous blocks so they can be read and written in a single operation. Most files get stored within a single track, so that the file can be read or updated without moving the disk drive's head (a relatively time-consuming process). They also support scatter/gather operations, so many individual reads or writes can be batched up and executed as one.

Lastly, advanced file systems are designed to support network access. They're built from the ground up with a network protocol that offers high performance and reliability. A network file system that can tolerate the crash of a file server or client and that, most importantly, doesn't alter the contents of files or corrupt information written with it is an advanced system.

All of these features have been built and fielded in commercially offered operating systems. Unix offers none of them.

UFS: The Root of All Evil

Call it what you will.
UFS occupies the fifth ring of hell, buried deep inside the Unix kernel. UFS was written as a quick hack at Bell Labs over several months; its quirks and misnomers are now so enshrined in the "good senses" of computer science that in order to criticize them, it is first necessary to warp one's mind to become fluent with their terminology.
UFS lives in a strange world where the computer's hard disk is divided into three different parts: inodes, data blocks, and the free list. Inodes are pointers to blocks on the disk. They store everything interesting about a file—its contents, its owner, group, when it was created, when it was modified, when it was last accessed—everything, that is, except for the file's name.

An oversight? No, it's a deliberate design decision. Filenames are stored in a special filetype called directories, which point to inodes. An inode may reside in more than one directory. Unix calls this a "hard link," which is supposedly one of UFS's big advantages: the ability to have a single file appear in two places. In practice, hard links are a debugging nightmare. You copy data into a file, and all of a sudden—surprise—it gets changed, because the file is really hard linked with another file. Which other file? There's no simple way to tell. Some two-bit moron whose office is three floors up is twiddling your bits. But you can't find him.

The struggle between good and evil, yin and yang, plays itself out on the disks of Unix's file system because system administrators must choose, before the system is running, how to divide the disk into bad (inode) space and good (usable file) space. Once this decision is made, it is set in stone. The system cannot trade between good and evil as it runs, but, as we all know from our own lives, too much or too little of either is not much fun. In Unix's case, when the file system runs out of inodes it won't put new files on the disk, even if there is plenty of room for them! This happens all the time when putting Unix file systems onto floppy disks. So most people tend to err on the side of caution and over-allocate inode space.
(Of course, that means that they run out of disk blocks, but still have plenty of inodes left…) Unix manufacturers, in their continued propaganda to convince us that Unix is "simple to use," simply make the default inode space very large. The result is too much allocated inode space, which decreases the usable disk space, thereby increasing the cost per useful megabyte.

UFS maintains a free list of doubly linked data blocks not currently in use. Unix needs this free list because there isn't enough online storage space to track all the blocks that are free on the disk at any instant. Unfortunately, it is very expensive to keep the free list consistent: to create a new file, the kernel needs to find a block B on the free list, remove the block from the free list by fiddling with the pointers on the blocks in front of and behind B, and then create a directory entry that points to the inode of the newly un-freed block. To ensure that files are not lost or corrupted, these operations must be performed atomically and in order; otherwise, data can be lost if the computer crashes.