Writing a GCC Front End

Language designers rejoice! Now it's easier to put a front end for your language onto GCC.
Writing the Driver

Currently GCC requires your front end to be visible at build time—there is no way to write a front end that is built separately and linked against an installed GCC. For this step, read through the appropriate section of the GCC manual to find out how to write the build infrastructure needed for your front end. Ordinarily, the simplest way is to copy another front end's files and modify them to suit.

Next, write two files to help integrate your front end into the GCC driver program. The lang-specs.h file describes your front end to the GCC driver. It tells the driver the file extensions that, when seen on the command line, should cause GCC to invoke your front end. It also gives the driver some instructions for what other programs must be run, such as whether the assembler should be run after your front end and how to pass or modify certain command-line options. It may take a while to write this file, as specs are their own strange language. However, examples in the other front ends can help.

The lang.opt file describes any command-line options specific to your front end. This is a plain-text file written in a straightforward format. Simple options, such as warning flags, can be put in lang.opt and do not require any other code on your part. Other arguments have to be handled by a lang hook you must write.

Next, implement the lang hooks needed to drive the compilation process. The important ones in this category are:

  • init_options: the first call made to your front end, before any option processing is done.

  • handle_option: called to handle a single command-line option.

  • post_options: called after all command-line processing has been done. This lang hook also is a convenient place to determine the name of the input file to parse.

  • init: called after post_options to initialize your front end.

  • finish: called after all compilation is done. You can use this to clean up after your front end, if necessary.

  • parse_file: a lang hook that does all the parsing, semantic analysis and code generation needed for the input file. It does all the actual work of compilation.

Initialization

GCC needs your front end to do some initialization. Most of GCC is self-initializing, but in order to accommodate the needs of different front ends, it is possible to initialize some tree-related global variables in atypical ways. I recommend not trying to delve too deeply into this. It is simpler to define the standard tree nodes in the standard ways and to think up your own names for trees representing, say, the standard types in your language.

During initialization you want to call build_common_tree_nodes, set_sizetype and build_common_tree_nodes_2. set_sizetype is used to set the type of the internal equivalent of size_t; it is simplest to set this always to long_unsigned_type_node.

Other setup steps can be done in this phase. For instance, in the initialization code for gcjx, we build types representing various structures that we need to describe Java classes and methods.

Compiling to GENERIC

Your parse_file lang hook calls your compiler to generate your internal data structures. Assuming this completes without errors, your front end now is ready to generate GENERIC trees from your AST. In gcjx, this is done by walking the AST for a class using a special visitor API. The GENERIC-specific implementation of this API incrementally builds trees representing the code and then hands this off to GCC.

All the details of generating trees are outside the scope of this article. Below are examples, however, showing three major tree types so you can see what each looks like.

Type

One kind of tree represents a type. Here is an example from gcjx of the Java char type:

tree type_jchar = make_node (CHAR_TYPE);
TYPE_PRECISION (type_jchar) = 16;
fixup_unsigned_type (type_jchar);

You can represent any type using trees. In particular, there are tree types representing records, unions, pointers and integers of various sizes.

Decl

Decl represents a declaration or, in other words, a name given to some object. For instance, a local variable in the source code is represented by a decl:

tree local = build_decl (VAR_DECL, get_identifier ("variable_name"),
			 type_jchar);

There are decls representing various named objects in a program: translation units, functions, fields, variables, parameters, constants, labels and types. A type decl represents the declaration of the type, as opposed to the type itself.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Incremental development of a gcc front end from source.

Joe Garvey's picture

Tom Tromey reported herein 2005...April 6th
"Writing the Driver
Currently GCC requires your front end to be visible at build time—there is no way to write a front end that is built separately and linked against an installed GCC. "

Has that situation change or is it still so that if I even change one variable declaration in a new frontend source , that the WHOLE (if only core) gcc would need to be rebuilt from sources?

Any idea of best resources for writing such a frontEnd?

Perl

dude's picture

For years the Perl compiler has always been "any day now".
I wonder how hard it would be make a perl front-end to GCC.

Actually since Perl 6 is to be translated into Parrot
you only need to make a Parrot front end.

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix