Compiling Java with GCJ

Although Java isn't a popular choice for free projects, GCJ can make it a viable option.

Kawa: Compiling Scheme to Native via Java

Java bytecodes are a fairly direct encoding of Java programs and were not really designed for anything else. However, they have been used to encode programs written in other languages. See grunge.cs.tu-berlin.de/~tolk/vmlanguages.html for a list of other programming languages implemented on top of Java. Most of these are interpreters, but a few actually compile to bytecode. The interpreters can use GCJ as is; the compilers can potentially use GCJ to turn the bytecode they emit into native code.

One such compiler is Kawa, which I have been developing since 1996. Kawa is both a toolkit for implementing languages using Java and an implementation of the Scheme programming language. You can build and run Kawa using GCJ without needing any non-free software. The Kawa home page (www.gnu.org/software/kawa) has instructions for downloading and building Kawa with GCJ.

You can use Kawa in interactive mode. Here, we first define the factorial function and then call it:

$ kawa
#|kawa:1|# (define (factorial x)
#|(---:2|#  (if (< x 2) x (* x (factorial (- x 1)))))
#|kawa:3|# (factorial 30)
265252859812191058636308480000000

An interesting thing to note is that the factorial function actually gets compiled by Kawa to bytecode and is immediately loaded as a new class. This process uses Java's ClassLoader mechanism to define a new class at runtime from a byte array containing the bytecodes for the class. The methods of the new class are interpreted by GCJ's bytecode interpreter.
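
The ClassLoader step can be sketched in plain Java. The following is only an illustration of the mechanism, not Kawa's actual code: the names RuntimeClassDemo, ByteArrayLoader, emitClass and Generated are made up for this example, and the "compiler" is a hand-written emitter for one trivial method. It writes a class file into a byte array (using the old version 49 format, an assumption made so that no stack map table is needed), passes the array to defineClass and calls the new method through reflection.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class RuntimeClassDemo {

    /** Exposes the protected defineClass so a byte array can become a class. */
    static final class ByteArrayLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytecode) {
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    /**
     * Emits a class file equivalent to:
     *   public class NAME { public static int answer() { return VALUE; } }
     * VALUE must fit in a signed byte because the code uses bipush.
     */
    static byte[] emitClass(String name, int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0xCAFEBABE);                            // magic number
        out.writeShort(0);                                   // minor version
        out.writeShort(49);                                  // major version
        out.writeShort(8);                                   // constant pool count (entries + 1)
        out.writeByte(1); out.writeUTF(name);                // #1 Utf8: class name
        out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
        out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: superclass name
        out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
        out.writeByte(1); out.writeUTF("answer");            // #5 Utf8: method name
        out.writeByte(1); out.writeUTF("()I");               // #6 Utf8: method descriptor
        out.writeByte(1); out.writeUTF("Code");              // #7 Utf8: attribute name
        out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
        out.writeShort(2);                                   // this_class
        out.writeShort(4);                                   // super_class
        out.writeShort(0);                                   // no interfaces
        out.writeShort(0);                                   // no fields
        out.writeShort(1);                                   // one method
        out.writeShort(0x0009);                              // ACC_PUBLIC | ACC_STATIC
        out.writeShort(5);                                   // method name index
        out.writeShort(6);                                   // method descriptor index
        out.writeShort(1);                                   // one attribute: Code
        out.writeShort(7);                                   // attribute name index
        out.writeInt(15);                                    // attribute length in bytes
        out.writeShort(1);                                   // max_stack
        out.writeShort(0);                                   // max_locals
        out.writeInt(3);                                     // code length
        out.writeByte(0x10); out.writeByte(value);           // bipush VALUE
        out.writeByte(0xAC);                                 // ireturn
        out.writeShort(0);                                   // no exception table
        out.writeShort(0);                                   // no Code sub-attributes
        out.writeShort(0);                                   // no class attributes
        return bytes.toByteArray();
    }

    /** Generates the class, defines it in a fresh loader and calls answer(). */
    static int loadAndCall(int value) throws Exception {
        byte[] bytecode = emitClass("Generated", value);
        Class<?> generated = new ByteArrayLoader().define("Generated", bytecode);
        return (Integer) generated.getMethod("answer").invoke(null);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadAndCall(42));
    }
}
```

Each call to loadAndCall uses a fresh loader, so the same class name can be defined again without conflict, much as an interactive session keeps defining new classes.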

Of course, it is usually more convenient to put the code in a file:

$ cat > factorial.scm
(define (factorial x)
  (if (< x 2) x (* x (factorial (- x 1)))))
(format #t "Factorial ~d is ~d.~%~!" 30 (factorial 30))
^D
$ kawa -f factorial.scm
Factorial 30 is 265252859812191058636308480000000.
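
For comparison, here is a hand-written Java version of the same function. It is a sketch for readers who know Java better than Scheme, not the code Kawa generates; the class name Factorial is chosen for this example. Scheme integers grow as needed, whereas a Java long overflows at 21!, so the Java version has to use BigInteger explicitly.

```java
import java.math.BigInteger;

public class Factorial {

    /** Mirrors the Scheme definition: (if (< x 2) x (* x (factorial (- x 1)))). */
    static BigInteger factorial(BigInteger x) {
        if (x.compareTo(BigInteger.valueOf(2)) < 0) {
            return x;
        }
        return x.multiply(factorial(x.subtract(BigInteger.ONE)));
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.valueOf(30);
        System.out.println("Factorial " + n + " is " + factorial(n) + ".");
    }
}
```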

You can increase the performance of Scheme code by using Kawa to compile it ahead of time, creating one or more .class files:

$ kawa --main -C factorial.scm
(compiling factorial.scm)

You can then load the compiled file:

$ kawa -f factorial.class
Factorial 30 is 265252859812191058636308480000000.

To compile the class file to native code, you can use gckawa, a script that sets up appropriate environment variables (LD_LIBRARY_PATH and CLASSPATH) and calls gcj:

$ gckawa -o factorial --main=factorial -g -O factorial*.class

Using the wildcard in factorial*.class is not needed in this case, but it is a good idea in case Kawa needs to generate multiple .class files.

Then, you can execute the resulting factorial program, which is a normal GNU/Linux ELF executable. It links with the shared libraries libgcj.so (the GCJ runtime library) and libkawa.so (the Kawa runtime library).

The same approach can be used for other languages. For example, I am currently working on implementing XQuery, W3C's new XML-query language, using Kawa.

Other applications that have been built with GCJ include Apache modules, GNU-Paperclips and Jigsaw.

Conclusion

GCJ has seen a lot of activity recently and is a solid platform for many tasks. We hope that you consider Java for your free software project, using GCJ as your preferred Java implementation, and that some of you will help make GCJ even better.

email: per@bothner.com

Per Bothner (www.bothner.com/per) has worked on GNU software since the 1980s. At Cygnus he was technical leader of the GCJ Project. He is currently an independent consultant.

______________________

Comments

Re: Compiling Java with GCJ

I agree with the remarks about the beginning of the article. I would go further and criticize the claim that the two phase approach of Java is similar to C - it is much more similar to Basic with an initial compilation to a state that is later interpreted. Some may not like seeing Java compared to Basic or the original implementation of Pascal with p-code, but that is the Java model.

As a former OS internals developer, I wonder exactly what is supposed to be meant by "when the JVM becomes part of the OS" and how that is supposed to improve startup time to any great extent. It is the class load time and the byte code interpretation time that is the big issue here, not how long it takes to start the process running the JVM. Putting the JVM inside the kernel like a device driver would not be at all helpful in a virtual memory environment. Requiring a service trap from the application code to the kernel to get to the JVM would slow things down much more. If the claim is that putting the JVM on the same distribution CD as the OS will somehow speed things up, that does not make any sense. As for the JVM becoming part of the OS in JDK 1.5, I have not seen any announcement at www.java.sun.com about that. Besides, the JDK is irrelevant at runtime because the JRE is what interprets the byte codes in the class file.

As for transporting classes, a better example would be running Java classes inside a web browser for an applet. That is where the portability of Java classes is worth the cost at runtime to interpret everything or use a "not quite Just In Time" compiler that detects when too much time has been wasted and then optimizes code that may not run again.

In a servlet environment, noticing repeatedly executed code can have a payoff for future requests. For a Java application that runs for a while and then exits, the JIT optimizations are too late for any big payoff in performance.

Sun has resisted having Java be a compiled language all along. Whether this has helped them sell faster hardware is unknown but it certainly has slowed acceptance of Java in many cases. GCJ has the potential to be the perfect solution for cases where Java as source is desired but execution speed is important. This would allow source to be in a language that has many advantages, yet allow the installation to be specific to the hardware and OS for execution speed similar to C++.

Re: Compiling Java with GCJ

Java hasn't been "interpreted" for a long while now. It is compiled "just-in-time", which is a totally different thing. The code that runs is real machine code for the actual processor type it is running on, unlike "p-code" or similar.

I don't know what the comment "when the JVM becomes part of the OS" means either. However, there is a feature in Java 1.5 where starting a new Java application will *not* start a new JVM instance. Instead, it just loads the classes associated with the new application into an existing JVM. And a JVM can be left "idle in the background" when no Java apps at all are running, so that when one is started it starts much quicker.

So whether you are running 1, 2, 5 or 50 java-based apps, there is only one JVM. This is possible because using different class-loaders can totally isolate applications from each other; they aren't aware that they are sharing a JVM.

Possible issues that I can see, though, involve:
* process priority ("nice" etc)
* process killing (kill -9)
* JNI libraries loaded by one app crashing the JVM

In many cases, however, sharing a JVM could be beneficial, particularly if the java standard libraries only need to be loaded once (and JIT'ed once).

Yes, the app itself still needs to be "JIT'ed" when run.

Re: Compiling Java with GCJ

Most people don't need to use RMI in a JINI environment. They just want something that works, nicely, for writing apps. They want a rich programming environment with overflow detection, garbage collection, and a nice simple, usable object model. You are right that for a lot of things Java is currently used for, gcj probably won't work, but for a lot of things C and C++ are used for, gcj will work *better*.

Re: Compiling Java with GCJ

Your vision of the uses for Java is very limited. There are times that we want to reuse code for Windows client apps from other types of apps and we don't want to re-write to C++ or VB or something.
GCJ (and other native compilers) are useful in these cases because we want to protect our source code (not all software is free.. some of us need to eat). Platform portability is not needed and not even desirable.
The Java class format and obfuscators are not good at protecting source code well enough. Native compilers are much better.
Also, telling a user to copy one file is much easier than telling him to install a JVM, set the classpath, etc.
GCJ + SWT is very attractive.

Re: Compiling Java with GCJ

Hello,
Can I import C++ code into Java code using the CNI interface and compile it to a class file (and how)?
Sbile

Re: Compiling Java with GCJ

"Although Java isn't a popular choice for free projects [...]"

Yeah, sure... Number of projects registered at Sourceforge, by technology:

C (10368)

C++ (9957)

Java (8101)

Perl (4413)

PHP (6103)

Re: Compiling Java with GCJ

Yeah, but how many of those have made it into a Linux distribution?

Re: Compiling Java with GCJ

After reading this, I decided to do some rudimentary benchmarking. Here are my results and comments:

CPU     OS   Compiler        JVM            Parsing (ms)  Unparse (ms)
500Cel  W2K  Javac 1.4.1_01  JSE 1.4.1_01   1.8 (2.25)    1.6 (2)
400Cel  RH8  Javac 1.4.1_01  JSE 1.4.1_01   2.5           2.3
400Cel  RH8  Jikes           JSE 1.4.1_01   2.5           2.3
400Cel  RH8  IBM-1.4.0       IBM-1.4.0      5.3           2.3
400Cel  RH8  gcj             Native Code    6.8           4.3
400Cel  RH8  Javac 1.4.1_01  J2ME/Personal  46.9          11.0
400Cel  RH8  Javac 1.4.1_01  gnu (gij)      142           6.7
400Cel  RH8  Jikes           Wonka          170           347

Notes: All tests were repeated for 10,000 iterations. The figures are the average per iteration in milliseconds; the values in parentheses in the W2K row are interpolated to 400MHz. The tested routine is an XML parser, which is processor intensive. There is very little network/disk utilization. Most of the other JVM projects seem dead. I could not test the WebLogic JRockit JVM as that requires (?) Red Hat Advanced Server. Sun seems to be doing something right as their JVM seems to smoke all the others...

Re: Compiling Java with GCJ

Strange. When testing numerical array operations (say, the dot product of two double arrays of length 1000, repeated 4 x 1000000 times), gcj is almost as fast as gcc (10 sec vs 17 sec), while Sun jdk1.4 is 2.5 to 3 times slower (55 sec).

This is great stuff for me: developing in Java, with all its comfort, and then compiling it to be as fast as gcc.
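
A micro-benchmark of the kind described might look like the sketch below. This is a guess at the shape of the test, not the code that was actually timed: the class name DotProductBench, the array contents and the use of System.currentTimeMillis are all assumptions made for this example.

```java
public class DotProductBench {

    /** Plain dot product of two equally long arrays. */
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Runs the dot product the given number of times and returns the elapsed milliseconds. */
    static long timeMillis(int length, int repetitions) {
        double[] a = new double[length];
        double[] b = new double[length];
        for (int i = 0; i < length; i++) {
            a[i] = i * 0.5;
            b[i] = length - i;
        }
        double sink = 0.0;
        long start = System.currentTimeMillis();
        for (int r = 0; r < repetitions; r++) {
            sink += dot(a, b);
        }
        long elapsed = System.currentTimeMillis() - start;
        // Use the accumulated value so the loop cannot be optimized away entirely.
        if (Double.isNaN(sink)) {
            System.out.println("unexpected NaN");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Defaults follow the comment: arrays of length 1000, 4 x 1000000 repetitions.
        int repetitions = args.length > 0 ? Integer.parseInt(args[0]) : 4000000;
        System.out.println(timeMillis(1000, repetitions) + " ms");
    }
}
```

The same source can be run on a JVM or compiled with gcj using the options of interest, which keeps the comparison to one variable at a time.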

Re: Compiling Java with GCJ

To correct my numbers in the previous message:
gcj (-O3, no bounds check): 20 seconds, gcc (-O3): 17 seconds, JDK HotSpot 1.4.1_01: 55 seconds. All this is very approximate, but given the size of the difference there is no need for a statistical test.

Re: Compiling Java with GCJ

Did you test with the server version of hotspot jvm?
java -server
Default is client.

Re: Compiling Java with GCJ

The IBM 1.3.1 JVM smokes the IBM 1.4.0 JVM (and all of the Sun JVMs, last I heard). For some reason, the IBM JVM has gotten slower in the latest revision. Also, what optimizations did you use with gcj? Use -O2 at the minimum. You should also consider -O3, -fno-bounds-check and -fomit-frame-pointer.

On the other hand, I'm currently working with some acoustic modeling code that someone else translated from Fortran to Java. The gcj dynamically linked binary takes 10 times as long to run as the Sun 1.4.1_01 JVM, and statically linking the binary makes it take 20 times longer than the JVM. The source code is identical. I don't know what's wrong. It may be that StrictMath is too young in gcj. StrictMath isn't available in GCJ 3.0 (the default version under Debian Linux), but is available in GCJ 3.2.

When can we expect support for Swing?

I would be really, really impressed if you could compile Swing apps.

Re: When can we expect support for Swing?

Actually, some Swing applications compile. The main limitation seems to be missing methods. For example, to get one of my applications to compile I had to change code from:

JEditorPane pane = new JEditorPane(url);
pane.setEditable(false);

to:

JEditorPane pane = new JEditorPane(url);
// Look the method up by name at runtime so the compiler does not need to see it.
final Class[] params = { Boolean.TYPE };
final Object[] args = { Boolean.FALSE };
try {
    JEditorPane.class.getMethod("setEditable", params).invoke(pane, args);
} catch (final Throwable ignored) {}

With this change, my application compiled.

Of course, compiling and running are two different things. Only about 4% of the Swing code is actually implemented. The rest of the methods are placeholders.

Bill

Re: When can we expect support for Swing?

Hear! Hear!

Re: Compiling Java with GCJ

Can gcj compile from a jar file and/or .class files rather than from Java source code? I want to compile an app which uses a couple of jar files, and recompiling them from source is hard.

Re: Compiling Java with GCJ

great stuff!

i came from: http://www.rhoads.com/papers/holygrail.jsp

~hugh

Re: Compiling Java with GCJ

Interesting... unfortunately I could not even get HelloWorld to compile using Cygwin.

Oh well, I will try it again on my Linux machine when I get home.

Keep up the good work.

Re: Compiling Java with GCJ

You need to install the libiconv.a package and add all the missing libraries on the command line. It doesn't all seem to be set up under Cygwin.

gcc -c Hello.java
gcc --main=Hello -o Hello Hello.o -l gcj -l iconv -l z

Damn there is still something missing. What is this _WinMain@16 ? Oh well never mind.

Re: Compiling Java with GCJ

That second line should use 'gcj', not 'gcc'. Then you can leave off all the '-l' options.

Re: Compiling Java with GCJ

I love Java, and I can only thank the GCJ team for the very good job they are doing. Java, as released by Sun, is not open source, and this has limited the language's acceptance in the Linux world: I can only hope that the GCJ project will solve this! For my part, I will for sure start writing my Java code for gcj!

Thanks, and keep up the good work

Enrico

Re: Compiling Java with GCJ

You're damn right, my friend....
Unfortunately I am still learning for now, but one day I will be like you :)

Tried to compile a simple

Andre

Tried to compile a simple "Hello world" program, and the exe created is 4.2 MB!! Whoa! So I stripped it and its size went down to 2.1 MB
....
Isn't this some kind of bloat? It works well though

that plus has one dll dependency

MoFoQ

yea...for me...it was 3.4MB plus a 980KB dll (libiconv2.dll).
With UPX 1.25, the size of the exe is almost 2MB and the dll is 650KB.
Even after strip (strip hello.exe), it's 2.1MB (and it's 640KB after strip and UPX).

in FPC, it's about 20KB (2.0) (1.x is around 8KB or less) before UPX.

so much bloat.
it needs lipo...badly.
