The great thing about Unix is that you can write programs to do just about anything you want. The down side is that you may have to write a program every time you want to get anything done. A Unix system administrator—and the average Linux user is his own administrator—is faced with a seemingly never-ending stream of small jobs which are too tedious for human hands but too infrequent for a large programming effort.
Most of the programs I write are run once and thrown away. A significantly smaller number see action as often as once a week. I can count the number of programs run every day on one hand. Obviously, I can't afford to spend much time on any one program. I need a language in which it's easy to develop, easy to debug, and easy to extend. C, the traditional Unix programming language, doesn't offer any of these. In this article I'll introduce a language which does.
Scheme is closely related to Lisp, a language whose name once stood for “LISt Processing”. Lisp first saw the light of day in 1958; unfortunately, it has not become a stellar commercial success since then. In fact, common knowledge says that Lisp and its sister language Scheme are bloated and slow. While that may have been true in the bad old days when every programmer wrote in assembler and toggled the program into the computer's front panel with his teeth, today it's more important to maximize programmer productivity than to minimize machine cost. Scheme advances this goal by providing the programmer with a flexible but safe language which allows him to operate at a higher level than he would with C.
How exactly does it do that, you ask? First, and most importantly, Scheme provides automatic memory management. A C programmer must explicitly allocate and de-allocate every object he uses. If the program allocates more than he de-allocates, memory will leak and be wasted. If the program de-allocates too often, the program will behave incorrectly or even crash. Thanks to a process known as “garbage collection”, a Scheme programmer need only concern himself with allocation. He allocates an object when he needs it, and the Scheme runtime system frees it when the object is no longer needed.
Scheme provides a richer selection of data types than C does. While the C programmer has only numbers, characters, arrays, and pointers to choose from, the Scheme programmer has at his disposal numbers, characters, arrays, strings, lists, association lists, functions, closures, ports, and booleans. In addition, Scheme and C disagree on how to handle typing information: C assigns a type to each variable, and each variable may hold only values of that type. Scheme assigns no type to variables, but identifies each value with a type. One of the benefits of this approach is that it allows “polymorphism”, which means that one function can take arguments of many types. For example, you might have one function that could search for a word or a list of words.
An arguably peripheral issue plays an important part in making Scheme such a wonderful language for development: Scheme is available in both interpreted and compiled implementations. My experience with interpreted languages shows that development with an interpreted language leads to faster prototyping and debugging of the finished product. After two years programming in interpreted languages (mostly Perl and Scheme/Lisp), I cannot tolerate the edit-compile-run cycle that plagues the C programmer. With most Scheme interpreters, you can simply reload the particular function definition that you have changed. You have the full capabilities of the language at your disposal from the debugger, and you can even modify a running program!
Finally, a Scheme program is usually safer and more robust than its C counterpart. The Scheme primitives are type-safe—unlike the C primitives which will let you add a string, a character, and an integer, or cast any number to a pointer and then dereference it—and the Scheme environment provides significantly better error-checking than any C compiler could.
Why does C allow these deficiencies to exist? Do Scheme's conveniences come for free? Of course not.
As with most things in computer science, you are given three options: fast, cheap, and correct (you may pick two). The most common Scheme implementation, an interpreter, suffers from slowness and slightly inflated memory usage. Scheme compilers produce much faster code at the expense of larger executables and decreased (or totally removed) error checking. Which two of the attributes you pick depends on which two you need. Most programs don't need to be blindingly fast, but you can make Scheme fast if you need it.
Another drawback stems directly from the overwhelming popularity of C. Most external libraries and system interfaces are available as C-linkable libraries. Scheme users have little or no access to such libraries. My favorite Scheme implementation's solution to this is its Foreign Function Interface (FFI). An FFI allows a Scheme program to access variables and functions written in another language. Here's a short Scheme program that uses the FFI provided by Bigloo, a version of Scheme, to access the C function “printf” and the global system error variable “errno”:
(module ffi-example (foreign (int errno "errno") (int printf (string . foreign) "printf")) (main show-errno)) (define (show-errno argv) (printf "The value of errno is %d" errno) (newline))
Scheme can allow C the use of its functions through similar directives. With a decent FFI, Scheme and C programs can share data and interfaces as freely as two C programs.
Even the slowest Scheme interpreter is adequately fast for most of my day-to-day programs. Most Unix programs spend most of their time waiting for I/O to complete, and mine are no exception. The few programs that must be as computationally efficient as possible gain respectable increases in speed from using a Scheme compiler. In some cases, a program compiled by the Bigloo compiler at maximum optimization ran exactly as fast as the C equivalent compiled with “gcc -O2”. A trivial example is provided below. (See the table and two program listings).
time language 0.72 gcc -O2 0.72 Bigloo -unsafe 1.03 Bigloo 2.92 SCM/compiled 3.00 Scheme->C 79.04 Scheme48 90.30 SCM 91.76 Perl5 109.04 GNU awk 174.36 Perl4
(module optest (main main)) (define (main argv) (let ((b (string->integer (cadr argv))) (j 0)) (do ((i 1 (+ i 1))) ((> i b)) (if (even? i) (set! j (+ j 1)) (set! j (- j 1)))) (display j) (newline)))