Writing a GCC Front End
Currently GCC requires your front end to be visible at build time—there is no way to write a front end that is built separately and linked against an installed GCC. For this step, read through the appropriate section of the GCC manual to find out how to write the build infrastructure needed for your front end. Ordinarily, the simplest way is to copy another front end's files and modify them to suit.
Next, write two files to help integrate your front end into the GCC driver program. The lang-specs.h file describes your front end to the GCC driver. It tells the driver the file extensions that, when seen on the command line, should cause GCC to invoke your front end. It also gives the driver some instructions for what other programs must be run, such as whether the assembler should be run after your front end and how to pass or modify certain command-line options. It may take a while to write this file, as specs are their own strange language. However, examples in the other front ends can help.
The lang.opt file describes any command-line options specific to your front end. This is a plain-text file written in a straightforward format. Simple options, such as warning flags, can be put in lang.opt and do not require any other code on your part. Other arguments have to be handled by a lang hook you must write.
Next, implement the lang hooks needed to drive the compilation process. The important ones in this category are:
init_options: the first call made to your front end, before any option processing is done.
handle_option: called to handle a single command-line option.
post_options: called after all command-line processing has been done. This lang hook also is a convenient place to determine the name of the input file to parse.
init: called after post_options to initialize your front end.
finish: called after all compilation is done. You can use this to clean up after your front end, if necessary.
parse_file: a lang hook that does all the parsing, semantic analysis and code generation needed for the input file. It does all the actual work of compilation.
GCC needs your front end to do some initialization. Most of GCC is self-initializing, but in order to accommodate the needs of different front ends, it is possible to initialize some tree-related global variables in atypical ways. I recommend not trying to delve too deeply into this. It is simpler to define the standard tree nodes in the standard ways and to think up your own names for trees representing, say, the standard types in your language.
During initialization you want to call build_common_tree_nodes, set_sizetype and build_common_tree_nodes_2. set_sizetype is used to set the type of the internal equivalent of size_t; it is simplest to set this always to long_unsigned_type_node.
Other setup steps can be done in this phase. For instance, in the initialization code for gcjx, we build types representing various structures that we need to describe Java classes and methods.
Your parse_file lang hook calls your compiler to generate your internal data structures. Assuming this completes without errors, your front end now is ready to generate GENERIC trees from your AST. In gcjx, this is done by walking the AST for a class using a special visitor API. The GENERIC-specific implementation of this API incrementally builds trees representing the code and then hands this off to GCC.
All the details of generating trees are outside the scope of this article. Below are examples, however, showing three major tree types so you can see what each looks like.
One kind of tree represents a type. Here is an example from gcjx of the Java char type:
tree type_jchar = make_node (CHAR_TYPE); TYPE_PRECISION (type_jchar) = 16; fixup_unsigned_type (type_jchar);
You can represent any type using trees. In particular, there are tree types representing records, unions, pointers and integers of various sizes.
Decl represents a declaration or, in other words, a name given to some object. For instance, a local variable in the source code is represented by a decl:
tree local = build_decl (VAR_DECL, get_identifier ("variable_name"), type_jchar);
There are decls representing various named objects in a program: translation units, functions, fields, variables, parameters, constants, labels and types. A type decl represents the declaration of the type, as opposed to the type itself.
Special Reports: DevOps
Have projects in development that need help? Have a great development operation in place that can ALWAYS be better? Regardless of where you are in your DevOps process, Linux Journal can help!
With deep focus on Collaborative Development, Continuous Testing and Release & Deployment, we offer here the DEFINITIVE DevOps for Dummies, a mobile Application Development Primer, advice & help from the experts, plus a host of other books, videos, podcasts and more. All free with a quick, one-time registration. Start browsing now...
- The Ubuntu Conspiracy
- Science on Android
- A First Look at IBM's New Linux Servers
- Vigilante Malware
- Disney's Linux Light Bulbs (Not a "Luxo Jr." Reboot)
- Vagrant Simplified
- Bluetooth Hacks
- System Status as SMS Text Messages
- Libreboot on an X60, Part I: the Setup
- October 2015 Issue of Linux Journal: Raspberry Pi