Writing a GCC Front End

Language designers rejoice! Now it's easier to put a front end for your language onto GCC.

Writing the Driver

Currently GCC requires your front end to be visible at build time—there is no way to write a front end that is built separately and linked against an installed GCC. For this step, read through the appropriate section of the GCC manual to find out how to write the build infrastructure needed for your front end. Ordinarily, the simplest way is to copy another front end's files and modify them to suit.

Next, write two files to help integrate your front end into the GCC driver program. The lang-specs.h file describes your front end to the GCC driver. It tells the driver the file extensions that, when seen on the command line, should cause GCC to invoke your front end. It also gives the driver some instructions for what other programs must be run, such as whether the assembler should be run after your front end and how to pass or modify certain command-line options. It may take a while to write this file, as specs are their own strange language. However, examples in the other front ends can help.
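To make the suffix-to-front-end idea concrete, here is a small stand-alone C model of the lookup the driver performs. It is only a sketch: the struct suffix_entry type, the lookup_front_end function and the ".toy" language are hypothetical names invented for illustration, and real lang-specs.h entries use GCC's own spec strings rather than these fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a driver table entry.
   This is NOT the structure GCC's lang-specs.h actually fills in.  */
struct suffix_entry
{
  const char *suffix;    /* file extension, including the dot */
  const char *language;  /* name of the front end to invoke */
  int run_assembler;     /* nonzero if the assembler runs afterwards */
};

static const struct suffix_entry suffix_table[] = {
  { ".c",    "c",    1 },
  { ".java", "java", 1 },
  { ".toy",  "toy",  1 },  /* made-up language for illustration */
};

/* Return the entry whose suffix matches the end of FILENAME,
   or NULL if no front end claims the file.  */
const struct suffix_entry *
lookup_front_end (const char *filename)
{
  size_t i, n, len;

  assert (filename != NULL);
  len = strlen (filename);
  for (i = 0; i < sizeof suffix_table / sizeof suffix_table[0]; i++)
    {
      n = strlen (suffix_table[i].suffix);
      /* Require at least one character before the suffix.  */
      if (len > n
          && strcmp (filename + len - n, suffix_table[i].suffix) == 0)
        return &suffix_table[i];
    }
  return NULL;
}
```

In this model, calling lookup_front_end ("Hello.java") selects the "java" entry, while a name with no known suffix yields NULL, which roughly corresponds to the driver not knowing which front end to run.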

The lang.opt file describes any command-line options specific to your front end. This is a plain-text file written in a straightforward format. Simple options, such as warning flags, can be put in lang.opt and do not require any other code on your part. Other options have to be handled by the handle_option lang hook, which you must write.

Next, implement the lang hooks needed to drive the compilation process. The important ones in this category are:

  • init_options: the first call made to your front end, before any option processing is done.

  • handle_option: called to handle a single command-line option.

  • post_options: called after all command-line processing has been done. This lang hook is also a convenient place to determine the name of the input file to parse.

  • init: called after post_options to initialize your front end.

  • finish: called after all compilation is done. You can use this to clean up after your front end, if necessary.

  • parse_file: a lang hook that does all the parsing, semantic analysis and code generation needed for the input file. It does all the actual work of compilation.
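The order in which these hooks run can be sketched with a small stand-alone C model. Everything here is an assumption made for illustration: struct toy_lang_hooks, run_compiler and the toy_ functions are hypothetical, and the real GCC lang hooks have different signatures. The model only records which hook ran when.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook table; not GCC's real lang hooks structure.  */
struct toy_lang_hooks
{
  void (*init_options) (void);
  int (*handle_option) (const char *option);  /* nonzero if recognized */
  void (*post_options) (const char *input_file);
  int (*init) (void);                         /* nonzero on success */
  void (*parse_file) (void);
  void (*finish) (void);
};

char hook_trace[32];            /* one letter appended per hook call */
static const char *toy_input_file;
int toy_warn_all;

static void
trace (char c)
{
  size_t n = strlen (hook_trace);
  assert (n + 1 < sizeof hook_trace);
  hook_trace[n] = c;
  hook_trace[n + 1] = '\0';
}

static void
toy_init_options (void)
{
  /* First call: reset all front end state before options are seen.  */
  hook_trace[0] = '\0';
  toy_warn_all = 0;
  toy_input_file = NULL;
  trace ('O');
}

static int
toy_handle_option (const char *option)
{
  trace ('h');
  if (strcmp (option, "-Wall") == 0)
    {
      toy_warn_all = 1;
      return 1;
    }
  return 0;
}

static void
toy_post_options (const char *input_file)
{
  trace ('P');
  toy_input_file = input_file;  /* remember which file to parse */
}

static int toy_init (void) { trace ('I'); return toy_input_file != NULL; }
static void toy_parse_file (void) { trace ('F'); }
static void toy_finish (void) { trace ('X'); }

const struct toy_lang_hooks toy_hooks = {
  toy_init_options, toy_handle_option, toy_post_options,
  toy_init, toy_parse_file, toy_finish
};

/* Drive one compilation.  Returns 0 on success, 1 on failure.  */
int
run_compiler (const struct toy_lang_hooks *hooks, int argc, const char **argv)
{
  const char *input = NULL;
  int i;

  hooks->init_options ();
  for (i = 0; i < argc; i++)
    {
      if (argv[i][0] == '-')
        {
          if (!hooks->handle_option (argv[i]))
            return 1;
        }
      else
        input = argv[i];
    }
  hooks->post_options (input);
  if (!hooks->init ())
    return 1;
  hooks->parse_file ();
  hooks->finish ();
  return 0;
}
```

Running this model with the arguments "-Wall" and "hello.toy" leaves the string "OhPIFX" in hook_trace: options are initialized, each option is handled, post_options sees the input file, and only then do init, parse_file and finish run.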

Initialization

GCC needs your front end to do some initialization. Most of GCC is self-initializing, but in order to accommodate the needs of different front ends, it is possible to initialize some tree-related global variables in atypical ways. I recommend not trying to delve too deeply into this. It is simpler to define the standard tree nodes in the standard ways and to think up your own names for trees representing, say, the standard types in your language.

During initialization you want to call build_common_tree_nodes, set_sizetype and build_common_tree_nodes_2. set_sizetype is used to set the type of the internal equivalent of size_t; it is simplest to set this always to long_unsigned_type_node.

Other setup steps can be done in this phase. For instance, in the initialization code for gcjx, we build types representing various structures that we need to describe Java classes and methods.

Compiling to GENERIC

Your parse_file lang hook calls your compiler to generate your internal data structures. Assuming this completes without errors, your front end is now ready to generate GENERIC trees from your AST. In gcjx, this is done by walking the AST for a class using a special visitor API. The GENERIC-specific implementation of this API incrementally builds trees representing the code and then hands them off to GCC.

All the details of generating trees are outside the scope of this article. Below are examples, however, showing three major tree types so you can see what each looks like.

Type

One kind of tree represents a type. Here is an example from gcjx of the Java char type:

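/* Build a tree for Java's char, an unsigned 16-bit type.  */
tree type_jchar = make_node (CHAR_TYPE);
TYPE_PRECISION (type_jchar) = 16;
/* Lay out the type as unsigned, using the precision set above.  */
fixup_unsigned_type (type_jchar);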

You can represent any type using trees. In particular, there are tree types representing records, unions, pointers and integers of various sizes.
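As a rough illustration of what the precision in the char example implies, the stand-alone C helper below computes the largest value an unsigned type with a given number of bits can hold. The function name unsigned_max_for_precision is hypothetical and is not part of GCC; the sketch only mirrors the arithmetic, not how GCC stores a type's bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of an unsigned integer type with PRECISION bits.
   Hypothetical helper for illustration; not a GCC function.  */
uint64_t
unsigned_max_for_precision (unsigned precision)
{
  assert (precision >= 1 && precision <= 64);
  if (precision == 64)
    return UINT64_MAX;  /* shifting by 64 would be undefined */
  return ((uint64_t) 1 << precision) - 1;
}
```

For a precision of 16 this gives 65535, which matches the range of Java's char, 0 through 65535.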

Decl

A decl represents a declaration or, in other words, a name given to some object. For instance, a local variable in the source code is represented by a decl:

/* Declare a variable named "variable_name" of type type_jchar.  */
tree local = build_decl (VAR_DECL, get_identifier ("variable_name"),
			 type_jchar);

There are decls representing various named objects in a program: translation units, functions, fields, variables, parameters, constants, labels and types. A type decl represents the declaration of the type, as opposed to the type itself.

______________________

Comments

Incremental development of a gcc front end from source.

Joe Garvey

Tom Tromey reported here on April 6th, 2005:
"Writing the Driver
Currently GCC requires your front end to be visible at build time—there is no way to write a front end that is built separately and linked against an installed GCC. "

Has that situation changed, or is it still the case that if I change even one variable declaration in a new front end's source, the WHOLE (if only core) GCC would need to be rebuilt from source?

Any idea of the best resources for writing such a front end?

Perl

dude

For years the Perl compiler has always been "any day now".
I wonder how hard it would be to make a Perl front end for GCC.

Actually, since Perl 6 is to be translated into Parrot,
you only need to make a Parrot front end.
