Compile C Faster on Linux

Many people who love the GNU gcc compiler still think that it is too slow in normal use, or that it uses too much memory.

An Adaptable Tool

Its compact source code helps people adapt lcc. An early adaptation by one of us (Hanson) injected profiling code that counts executions for a profiler that comes with lcc. For example, the lines

for (<1965>r = 0; <15720>r < 8; <15720>r++)
   if (<15720>rows[r] &&
       <5508>up[r-c+7] &&
       <3420>down[r+c]) { ...

annotate source code from the lcc test suite's implementation of a program that finds all ways to place eight queens on a chessboard such that none of them attacks any other. The numbers in angle brackets report how many times the following code fragment was executed, which can differ within one line. This profiler would have been a big project without lcc, but it was a modest one with it.

Other adaptations of lcc include interpreters (www.cit.gu.edu.au/~sosic/papers/sigplan92.ps.Z), code generators for multiple targets (ftp://ftp.cs.princeton.edu/pub/lcc/contrib/), a C++ compiler, programmable debuggers that can debug across a network (http://www.cs.purdue.edu/homes/nr/ldb/), and a retargetable optimizing linker (http://www.cs.princeton.edu/~mff/mld/). A group at Stanford University has adapted lcc for use with a global optimizer (suif.stanford.edu/suif/suif.html). At least some of these efforts chose lcc over gcc because lcc's small size made it seem easier to comprehend and change. Many of these projects were begun before the lcc book was done; we expect even more adaptations now that extensive documentation is available.
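To make the execution counts in the annotated queens fragment concrete, here is a minimal hand-instrumented sketch in C. It is not how lcc's profiler works internally; the counters are written by hand rather than injected by a compiler. The search is assumed to be the usual column-by-column eight-queens backtracking, and every name (count_queens, n_calls, n_rows_tests, and so on) is hypothetical.

```c
#include <assert.h>

/* Board state: a 1 means the row or diagonal is still free. */
static int rows[8], up[15], down[15];

/* Hypothetical hand-written counters standing in for injected ones. */
long n_calls;       /* times the loop initializer "r = 0" ran     */
long n_rows_tests;  /* times rows[r] was tested                   */
long n_up_tests;    /* times up[r-c+7] was tested                 */
long n_down_tests;  /* times down[r+c] was tested                 */
long n_solutions;   /* complete placements found                  */

/* Place a queen in column c, trying every row in turn. */
static void queens(int c)
{
    int r;

    assert(c >= 0 && c < 8);
    n_calls++;
    for (r = 0; r < 8; r++)
        /* The comma operator bumps a counter just before each test,
           so short-circuiting yields different counts on one line. */
        if ((n_rows_tests++, rows[r]) &&
            (n_up_tests++, up[r - c + 7]) &&
            (n_down_tests++, down[r + c])) {
            rows[r] = up[r - c + 7] = down[r + c] = 0;
            if (c == 7)
                n_solutions++;
            else
                queens(c + 1);
            rows[r] = up[r - c + 7] = down[r + c] = 1;
        }
}

/* Reset the board and counters, run the search, return the solution count. */
long count_queens(void)
{
    int i;

    n_calls = n_rows_tests = n_up_tests = n_down_tests = n_solutions = 0;
    for (i = 0; i < 15; i++)
        up[i] = down[i] = 1;
    for (i = 0; i < 8; i++)
        rows[i] = 1;
    queens(0);
    return n_solutions;
}
```

After a call to count_queens(), the counters can be printed next to the source text to produce an annotated listing by hand. Because && short-circuits, the three tests on the same logical line accumulate different totals, which is the effect the bracketed numbers illustrate.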

Literate Programming

Most developers will use pre-built executables for lcc and never study the source code. The Linux community, however, expects source, and lcc provides an annotated version of most of its code in the form of a book. lcc's annotations, like its small size, are designed to help developers modify lcc.

lcc is written as what Knuth has termed a “literate program”, which interleaves the source code with prose explanations. Two programs process this input. One program extracts just the C source code, which can be compiled with any C compiler. The other program processes both the prose and the code and emits the typescript for the lcc book. We generate the book and the compiler from a single source because it's too easy for multiple sources to get out of sync with one another.

A brief fragment of the chapter on the X86 code generator demonstrates literate programming:

Static locals get a generated name to avoid other static locals of the same name:

<X86 defsymbol>=
if (p->scope > LOCAL && p->sclass == STATIC)
   p->x.name = stringf("L%d", genlabel(1));

Generated symbols already have a unique numeric name. Defsymbol simply prefixes a letter to make a valid assembler identifier:

<X86 defsymbol>+=
else if (p->generated)
   p->x.name = stringf("L%s",p->name);

Each of the two displays above consists of a “fragment label” in angle brackets and a “fragment” of C code. The fragment label names the piece of the C program being described (here the version of the routine defsymbol for the X86). The += in the second fragment says that the second code fragment is appended to the first.

This example is necessarily tiny, but it shows how literate programming allows one to build up a complex program a bit at a time, explaining it on the way. The lcc distribution includes conventional C code that can be modified as usual, but when some explanation would help, one can easily get it from the annotated code in the book.

Not shown in this sample are page numbers in each fragment that point to adjoining fragments, and miniature indices in the page margin that point to the page that defines each identifier that's being used. Many readers have identified these mini-indices as especially helpful.
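The extraction step described in this section, in which fragments that share a label are appended in order of appearance, can be sketched in a few lines of C. This is only an illustration under assumed conventions, not the tool used to build lcc: the input format is the simplified notation of the displays above (a header line of the form <label>= or <label>+=, with a blank line ending each fragment), and the function name extract_fragment is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the line of length len is "<label>=" or "<label>+=". */
static int is_header(const char *line, size_t len, const char *label)
{
    size_t n = strlen(label);

    if (len < n + 3 || line[0] != '<' ||
        strncmp(line + 1, label, n) != 0 || line[n + 1] != '>')
        return 0;
    if (len == n + 3)
        return line[n + 2] == '=';
    return len == n + 4 && line[n + 2] == '+' && line[n + 3] == '=';
}

/*
 * Copy the code of every fragment named label from doc into out, in order
 * of appearance, and return the number of bytes written (excluding the NUL).
 * Assumed format: a header line starts a fragment and a blank line ends it.
 * Lines that do not fit in out are dropped.
 */
size_t extract_fragment(const char *doc, const char *label,
                        char *out, size_t outsz)
{
    size_t used = 0;
    int copying = 0;
    const char *line = doc;

    assert(outsz > 0);
    out[0] = '\0';
    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        if (len == 0) {
            copying = 0;                 /* blank line closes the fragment */
        } else if (is_header(line, len, label)) {
            copying = 1;                 /* "=" and "+=" both append here  */
        } else if (copying && used + len + 1 < outsz) {
            memcpy(out + used, line, len);
            used += len;
            out[used++] = '\n';
            out[used] = '\0';
        }
        line = eol ? eol + 1 : line + len;
    }
    return used;
}
```

Applied to the two X86 defsymbol displays above, the sketch would yield the if statement followed directly by the else if clause, with the intervening prose left out. A real literate-programming tool also expands fragment references nested inside other fragments, which this sketch omits.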

Availability

lcc's C source code and Linux executables are available by anonymous ftp at ftp://ftp.cs.princeton.edu/pub/lcc/. The package is about a megabyte, so it can be downloaded in about 10 minutes even with, say, a 14.4 kbaud modem. The package includes Dennis Ritchie's preprocessor for ANSI C, but lcc is also used with gcc's preprocessor. Like gcc, lcc emits assembler code for the standard Linux assembler, debugger, and C library, so the package does not include any of these. A sub-directory collects code generators and other companion software contributed by others. The package describes mailing lists for communicating with others working on, and with, lcc.
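The download estimate is simple arithmetic, sketched here in C as a rough check. The function name download_seconds is hypothetical, and the sketch assumes a 14.4 modem moves 14,400 bits per second with no compression or protocol overhead.

```c
#include <assert.h>

/*
 * Estimated transfer time in seconds.
 * bits_per_byte is 8 for raw data, or 10 if each byte is assumed to
 * carry a start and a stop bit on the serial link.
 */
double download_seconds(double bytes, double bits_per_second,
                        double bits_per_byte)
{
    assert(bits_per_second > 0.0);
    return bytes * bits_per_byte / bits_per_second;
}
```

For 1,048,576 bytes at 14,400 bits per second, this gives roughly 583 seconds (just under 10 minutes) at 8 bits per byte, and roughly 728 seconds (about 12 minutes) at 10 bits per byte.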

Our book about lcc, A Retargetable C Compiler: Design and Implementation (ISBN 0-8053-1670-1), is available from Addison-Wesley at 800-447-2226 and from other sources listed on lcc's home page (http://www.cs.princeton.edu/software/lcc/).

lcc is free for non-commercial use. The lcc book amounts to a single-user license for lcc, so some have arranged commercial use by simply including a copy of the book with their product (and charging for it); the publisher offers substantial discounts. Other arrangements are possible.

Chris Fraser (cwf@research.att.com) has been writing compilers since 1974. He earned a Ph.D. in computer science at Yale in 1977 and does computing research at AT&T Bell Laboratories in Murray Hill, New Jersey.

Dave Hanson (drh@cs.princeton.edu) is Professor of Computer Science at Princeton University. His research interests include programming language design and implementation, software engineering, and programming environments. His Web page is at www.cs.princeton.edu/~drh/.
