Not File System Specific? (Not Quite)

file server goes down.

Why not soft-mount the server instead? Because if a server is soft-mounted, and it is too heavily loaded, it will start corrupting data due to problems with NFS’s write-back cache.

Another way that NFS can freeze your system is with certain programs that expect to be able to use the Unix system call open() with the POSIX-standard “exclusive-create” flags (O_CREAT | O_EXCL). GNU Emacs is one of these programs. Here is what happens when you try to mount the directory /usr/lib/emacs/lock over NFS:

Date: Wed, 18 Sep 1991 02:16:03 GMT
From: meuer@roch.geom.umn.edu (Mark V. Meuer)
Organization: Minnesota Supercomputer Institute
Subject: Re: File find delay within Emacs on a NeXT
To: help-gnu-emacs@prep.ai.mit.edu [4]

In article 1991Sep16.231808.9812@s1.msi.umn.edu meuer@roch.geom.umn.edu (Mark V. Meuer) writes:

    I have a NeXT with version 2.1 of the system. We have Emacs 18.55 running. (Please don’t tell me to upgrade to version 18.57 unless you can also supply a pointer to diffs or at least s- and m- files for the NeXT.) There are several machines in our network and we are using yellow pages.

    The problem is that whenever I try to find a file (either through “C-x C-f”, “emacs file” or through a client talking to the server) Emacs freezes completely for between 15 and 30 seconds. The file then loads and everything works fine. In about 1 in 10 times the file loads immediately with no delay at all.

Several people sent me suggestions (thank you!), but the obnoxious delay was finally explained and corrected by Scott Bertilson, one of the really smart people who works here at the Center.

For people who have had this problem, one quick hack to correct it is to make /usr/lib/emacs/lock be a symbolic link to /tmp. The full explanation follows.

I was able to track down that there was a file called !!!SuperLock!!! in /usr/lib/emacs/lock, and when that file existed the delay would occur. When that file wasn’t there, neither was the delay (usually).

[4] Forwarded to UNIX-HATERS by Michael Tiemann.
We found the segment of code that was causing the problem. When Emacs tries to open a file to edit, it tries to do an exclusive create on the superlock file. If the exclusive create fails, it tries 19 more times with a one second delay between each try. After 20 tries it just ignores the lock file being there and opens the file the user wanted. If it succeeds in creating the lock file, it opens the user’s file and then immediately removes the lock file.

The problem we had was that /usr/lib/emacs/lock was mounted over NFS, and apparently NFS doesn’t handle exclusive create as well as one would hope. The command would create the file, but return an error saying it didn’t. Since Emacs thinks it wasn’t able to create the lock file, it never removes it. But since it did create the file, all future attempts to open files encounter this lock file and force Emacs to go through a 20-second loop before proceeding. That was what was causing the delay.

The hack we used to cure this problem was to make /usr/lib/emacs/lock be a symbolic link to /tmp, so that it would always point to a local directory and avoid the NFS exclusive create bug. I know this is far from perfect, but so far it is working correctly.

Thanks to everyone who responded to my plea for help. It’s nice to know that there are so many friendly people on the net.

The freezing is exacerbated by any program that needs to obtain the name of the current directory.

Unix still provides no simple mechanism for a process to discover its “current directory.” If you have a current directory, “.”, the only way to find out its name is to open the contained directory “..” (which is really the parent directory) and then to search for a directory in that directory that has the same inode number as the current directory, “.”. That’s the name of your directory. (Notice that this process fails with directories that are the target of symbolic links.)
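The search described above can be sketched in a few lines of Python. This is only an illustration of the technique, not the code of any particular C library; the function names name_in_parent and slow_getcwd are hypothetical, and comparing device numbers alongside inode numbers is an added assumption, since inode numbers are unique only within a single file system.

```python
import os


def name_in_parent(path="."):
    """Return the name under which `path` appears in its parent directory."""
    here = os.stat(path)
    parent = os.path.join(path, "..")
    for entry in os.listdir(parent):
        try:
            # lstat() does not follow symbolic links, so a link that merely
            # points at the directory never matches.
            candidate = os.lstat(os.path.join(parent, entry))
        except OSError:
            continue  # entry vanished or cannot be examined; skip it
        # Assumption: compare the device number too, not just the inode.
        if (candidate.st_dev, candidate.st_ino) == (here.st_dev, here.st_ino):
            return entry
    return None


def slow_getcwd():
    """Rebuild the absolute name of "." by walking up through ".." entries."""
    parts = []
    path = "."
    while True:
        here = os.stat(path)
        up = os.stat(os.path.join(path, ".."))
        if (here.st_dev, here.st_ino) == (up.st_dev, up.st_ino):
            break  # "." and ".." are the same directory only at the root
        name = name_in_parent(path)
        if name is None:
            raise OSError("directory not found in its parent")
        parts.append(name)
        path = os.path.join(path, "..")
    return "/" + "/".join(reversed(parts))
```

Every step of the walk calls lstat() on each sibling entry of an ancestor directory, so the work grows with the number of entries in every directory between “.” and the root. A directory reached through a symbolic link is reported under its real name, because the link itself has a different inode.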
Fortunately, this process is all automated for you by a function called getcwd(). Unfortunately, programs that use getcwd() unexpectedly freeze. Carl R. Manning at the MIT AI Lab got bitten by this bug in late 1990.