tcsh    Csh with emacs-style editing.
ksh     KornShell, another command and programming language.
zsh     The Z Shell.
bash    The GNU Bourne-Again SHell.

Hardware stores contain screwdrivers or saws made by three or four different companies that all operate similarly. A typical Unix /bin or /usr/bin directory contains a hundred different kinds of programs, written by dozens of egotistical programmers, each with its own syntax, operating paradigm, rules of use (this one works as a filter, this one works on temporary files, etc.), different strategies for specifying options, and different sets of constraints. Consider the program grep, with its cousins fgrep and egrep. Which one is fastest?¹ Why do these three programs take different options and implement slightly different semantics for the phrase “regular expressions”? Why isn’t there just one program that combines the functionality of all three? Who is in charge here?

After mastering the dissimilarities between the different commands, and committing the arcane to long-term memory, you’ll still frequently find yourself startled and surprised.

A few examples might be in order.

Shell crash

The following message was posted to an electronic bulletin board of a compiler class at Columbia University.²

¹ Ironically, egrep can be up to 50% faster than fgrep, even though fgrep only uses fixed-length strings that allegedly make the search “fast and compact.” Go figure.
² Forwarded to Gumby by John Hinsdale, who sent it onward to UNIX-HATERS.

    Subject: Relevant Unix bug
    October 11, 1991

    Fellow W4115x students—

    While we’re on the subject of activation records, argument passing, and calling conventions, did you know that typing:

        !xxx%s%s%s%s%s%s%s%s
    to any C-shell will cause it to crash immediately? Do you know why?

    Questions to think about:

    • What does the shell do when you type “!xxx”?
    • What must it be doing with your input when you type “!xxx%s%s%s%s%s%s%s%s”?
    • Why does this crash the shell?
    • How could you (rather easily) rewrite the offending part of the shell so as not to have this problem?

    MOST IMPORTANTLY:

    • Does it seem reasonable that you (yes, you!) can bring what may be the Future Operating System of the World to its knees in 21 keystrokes?

    Try it.

By Unix’s design, crashing your shell kills all your processes and logs you out. Other operating systems will catch an invalid memory reference and pop you into a debugger. Not Unix.

Perhaps this is why Unix shells don’t let you extend them by loading new object code into their memory images, or by making calls to object code in other programs. It would be just too dangerous. Make one false move and—bam—you’re logged out. Zero tolerance for programmer error.

The Metasyntactic Zoo

The C Shell’s metasyntactic operator zoo results in numerous quoting problems and general confusion. Metasyntactic operators transform a command before it is issued. We call the operators metasyntactic because they are not part of the syntax of a command, but operators on the command itself.

Metasyntactic operators (sometimes called escape operators) are familiar to most programmers. For example, the backslash character (\) within strings in C is metasyntactic; it doesn’t represent itself, but some operation on the following characters. When you want a metasyntactic operator to stand for itself, you have to use a quoting mechanism that tells the system to interpret the operator as simple text. For example, returning to our C string example, to get the backslash character in a string, it is necessary to write \\.
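The C string case can be made concrete with a small sketch. The helper names below (stored_length, newline_literal, backslash_literal, quoted_literal) are hypothetical and exist only for illustration. They report what the compiler actually stores for three literals: one where the backslash operates on the next character, one where it is quoted to stand for itself, and one that combines the two.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the text): number of characters the
   compiler actually stored for a string, excluding the terminator. */
size_t stored_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* "\n": the backslash operates on the following 'n', so the literal
   holds a single newline character. */
const char *newline_literal(void)
{
    return "\n";
}

/* "\\": the first backslash quotes the second, so the literal holds a
   single backslash character. */
const char *backslash_literal(void)
{
    return "\\";
}

/* "\\n": a quoted backslash followed by an ordinary 'n', so the
   literal holds two characters and no newline. */
const char *quoted_literal(void)
{
    return "\\n";
}
```

Called from a main function, stored_length(newline_literal()) and stored_length(backslash_literal()) both return 1, while stored_length(quoted_literal()) returns 2. The source text contains more characters than the string it denotes, which is exactly the gap a quoting mechanism has to bridge.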