Not Fully Serviceable

By design, NFS is connectionless and stateless. In practice, it is neither. This conflict between design and implementation is at the root of most NFS problems.

“Connectionless” means that the server program does not keep connections for each client. Instead, NFS uses the Internet UDP protocol to transmit information between the client and the server. People who know about network protocols realize that the initials UDP stand for “Unreliable Datagram Protocol.” That’s because UDP doesn’t guarantee that your packets will get delivered. But no matter: if an answer to a request isn’t received, the NFS client simply waits for a few milliseconds and then resends its request.

“Stateless” means that all of the information that the client needs to mount a remote file system is kept on the client, instead of having additional information stored on the server. Once a magic cookie is issued for a file, that file handle will remain good even if the server is shut down and rebooted, as long as the file continues to exist and no major changes are made to the configuration of the server.

Sun would have us believe that the advantage of a connectionless, stateless system is that clients can continue using a network file server even if that server crashes and restarts because there is no connection that must be reestablished, and all of the state information associated with the remote mount is kept on the client. In fact, this was only an advantage for Sun’s engineers, who didn’t have to write additional code to handle server and client crashes and restarts gracefully. That was important in Sun’s early days, when both kinds of crashes were frequent occurrences.

There’s only one problem with a connectionless, stateless system: it doesn’t work. File systems, by their very nature, have state. You can only delete a file once, and then it’s gone.
That’s why, if you look inside the NFS code, you’ll see lots of hacks and kludges, all designed to impose state on a stateless protocol.

Broken Cookie

Over the years, Sun has discovered many cases in which NFS breaks down. Rather than fundamentally redesign NFS, all Sun has done is hack on it. Let’s see how the NFS model breaks down in some common cases:
• Example #1: NFS is stateless, but many programs designed for Unix systems require record locking in order to guarantee database consistency.

  NFS Hack Solution #1: Sun invented a network lock protocol and a lock daemon, lockd. This network locking system has all of the state, and all of the problems associated with state, that NFS was designed to avoid.

  Why the hack doesn’t work: Locks can be lost if the server crashes. As a result, an elaborate restart procedure after the crash is necessary to recover state. Of course, the original reason for making NFS stateless in the first place was to avoid the need for such restart procedures. Instead of hiding this complexity in the lockd program, where it is rarely tested and can only benefit locks, it could have been put into the main protocol, thoroughly debugged, and made available to all programs.

• Example #2: NFS is based on UDP; if a client request isn’t answered, the client resends the request until it gets an answer. If the server is doing something time-consuming for one client, all of the other clients who want file service will continue to hammer away at the server with duplicate and triplicate NFS requests, rather than patiently putting them into a queue and waiting for the reply.

  NFS Hack Solution #2: When the NFS client doesn’t get a response from the server, it backs off and pauses for a few milliseconds before it asks a second time. If it still doesn’t get an answer, it backs off for twice as long. Then four times as long, and so on.

  Why the hack doesn’t work: The problem is that this strategy has to be tuned for each individual NFS server and each network. More often than not, tuning isn’t done. Delays accumulate. Performance lags, then drags. Eventually, the sysadmin complains and the company buys a faster LAN or leased line or network concentrator, thinking that throwing money at the problem will make it go away.
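The doubling back-off in Hack Solution #2, and the pile-up of duplicate requests it is meant to tame, can be sketched roughly as follows. This is only an illustration, not the real NFS client code: the function names, the 100 ms starting timeout, and the retry limit are all assumptions picked to make the arithmetic easy to follow.

```python
# Illustrative sketch only: names and numbers are assumptions,
# not values taken from any real NFS implementation.

def retransmit_schedule(initial_timeout_ms, max_tries):
    """Wait before each retransmission: t, 2t, 4t, ... (doubling back-off)."""
    return [initial_timeout_ms * (2 ** attempt) for attempt in range(max_tries)]


def duplicates_sent(server_busy_ms, initial_timeout_ms, max_tries=10):
    """Count how often one client resends while the server is busy elsewhere."""
    elapsed_ms = 0
    resends = 0
    for wait_ms in retransmit_schedule(initial_timeout_ms, max_tries):
        elapsed_ms += wait_ms
        if elapsed_ms >= server_busy_ms:
            break  # the reply arrives before this timer expires
        resends += 1  # timer expired with no reply: send a duplicate request
    return resends


if __name__ == "__main__":
    clients = 20  # hypothetical number of waiting clients
    per_client = duplicates_sent(server_busy_ms=1000, initial_timeout_ms=100)
    print("duplicates per client:", per_client)
    print("duplicates hitting the server:", clients * per_client)
```

With these made-up numbers, each client resends at 100 ms, 300 ms, and 700 ms while the server is busy for one second, so twenty waiting clients add sixty duplicate requests to the load. Start the timeout too low and the duplicates multiply; start it too high and every lost packet costs a long stall. That is why the strategy has to be tuned for each server and network.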
• Example #3: If you delete a file in Unix that is still open, the file’s name is removed from its directory, but the disk blocks associated with the file are not deleted until the file is closed. This gross hack allows programs to create temporary files that can’t be accessed by other programs. (This is the second method that Unix uses to create temporary files; the other technique is to use the mktemp() function and create a temporary file in the /tmp directory that has the process ID in the filename. Deciding which method is the grosser of the two