Eclipse Goes Native

Red Hat's Eclipse Engineering team has freed the popular integrated development environment from its ties to a proprietary Java Virtual Machine.

Changes to libgcj

Our first round of changes to libgcj was bug fixing only. We implemented protection domains properly. Then, we made a pass over the entire runtime, fixing bugs related to class loading. Because of the way class loading had been implemented in libgcj, we had to modify all the places in the native code that conceivably might load a class to forward the request to the appropriate class loader.

Once this was done, we were able to start Eclipse using the libgcj bytecode interpreter. At this point the question became, how can we take real advantage of GCJ to compile Eclipse?

The naïve approach to this dilemma, namely precompiling all the classes and linking them all together, had been ruled out by our investigations into Eclipse's internals. This approach would clash with Eclipse's relatively sophisticated class loading strategy.

More investigation revealed that most classes are loaded by instances of the DelegatingURLClassLoader, which is a subclass of the standard URLClassLoader that has been extended to understand Eclipse's plugin architecture. It seemed like the best approach was to modify Eclipse to allow it to load precompiled shared libraries as well as bytecode files. We reasoned that the required changes would be localized due to the way plugin class loading had been structured.

In fact, we had to go one step further and extend libgcj a bit as well. libgcj knew how to load shared libraries invisibly in response to a call to, for example, Class.forName(). However, this magic always happened at the level of the bootstrap class loader. That wouldn't work well for Eclipse or for any other application that defines its own class loaders, so we invented a new gcjlib URL type. This is like a jar URL, but it points to a shared library. We also made some minor extensions to our implementation of URLClassLoader so that gcjlib URLs would be treated specially.
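
To see why loading at the bootstrap level clashes with application-defined class loaders, it helps to look at which loader defines a given class. The following is a minimal Java sketch, not code from libgcj or Eclipse; the class name LoaderIdentity and the describe helper are illustrative assumptions. It relies only on the standard behavior that Class.getClassLoader() returns null for classes defined by the bootstrap loader.

```java
public class LoaderIdentity {
    // Report which class loader defined a class; null means the bootstrap loader.
    public static String describe(Class<?> c) {
        ClassLoader loader = c.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core library classes are defined by the bootstrap loader.
        System.out.println("java.lang.String -> " + describe(String.class));
        // Application classes are defined by some other, non-bootstrap loader.
        System.out.println("LoaderIdentity -> " + describe(LoaderIdentity.class));
    }
}
```

An application such as Eclipse expects each plugin's classes to be defined by that plugin's own loader, so a class that silently appeared at the bootstrap level would sit outside that structure. Letting a URLClassLoader load from a gcjlib URL keeps the class associated with the loader that asked for it.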

Doing this wasn't enough, however. We also had to solve the linkage problems. In particular, if we compiled a jar file to a shared library, how could we prevent the dlopen() of such a shared library from immediately failing due to unresolved symbols? The solution to this problem was to resurrect and clean up the -fno-assume-compiled option in GCJ. This option, which never had been finished, enabled an alternative ABI that caused GCJ's output to resolve most references at runtime rather than at link time.
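
The idea of resolving references at runtime rather than at link time can be illustrated by analogy in plain Java. The sketch below is not how GCJ implements -fno-assume-compiled; it is an assumption-laden model in which a hypothetical LateBinding class keeps a table of symbolic names and fills each entry by reflection the first time it is used. The point it shows is that a missing target causes a failure only when the reference is first used, not when the code is loaded.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class LateBinding {
    // Stand-in for an indirection table: symbolic name -> resolved target, filled on first use.
    private static final Map<String, Method> TABLE = new HashMap<>();

    // Resolve a method by name the first time it is needed, then reuse the cached result.
    public static Method resolve(String className, String methodName, Class<?>... params) {
        String key = className + "#" + methodName + Arrays.toString(params);
        Method target = TABLE.get(key);
        if (target == null) {
            try {
                target = Class.forName(className).getMethod(methodName, params);
            } catch (ReflectiveOperationException e) {
                // An unresolved reference surfaces here, at first use, not at load time.
                throw new IllegalStateException("unresolved reference: " + key, e);
            }
            TABLE.put(key, target);
        }
        return target;
    }

    // Call Math.abs(int) through the table instead of through a direct, link-time reference.
    public static int absViaTable(int value) {
        try {
            return (Integer) resolve("java.lang.Math", "abs", int.class).invoke(null, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The extra lookup on first use also hints at where the inefficiency of such a scheme comes from: every reference goes through a level of indirection instead of being bound once by the linker.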

The -fno-assume-compiled option has various limitations and inefficiencies. A cleaner way to achieve the same goal is planned for the future. On the GCJ mailing list (see the on-line Resources section), this replacement is referred to either as the binary compatibility ABI or as -findirect-dispatch. The new ABI does everything -fno-assume-compiled does, but in a much more efficient and compatible way. Development of this new feature, one of several contributing to GCJ's enterprise readiness, is underway and coming along nicely.

Changes to Eclipse

Once all this was in place, we finally were ready to make our changes to Eclipse. These turned out to be remarkably small. Most of the work involved making the same sort of change in three different places. In essence, we modified Eclipse so that when it's looking for a plugin's jar file, it also looks for a similarly named shared library installed alongside it. If there is one, we rewrite the URL passed to the class loader from a jar URL to a gcjlib URL. All rewriting is done conditionally, so our natively compiled Eclipse still works with an unmodified JVM. In other words, users are not locked in to native compilation if they would rather use a JVM instead.
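
The conditional rewriting described above can be sketched roughly as follows. This is an illustration, not the actual Eclipse patch: the class PluginUrlRewriter, the "<jar>.so" naming convention, the nativeRuntime flag and the exact gcjlib URL syntax (modeled here on the jar: form) are all assumptions made for the example.

```java
import java.util.function.Predicate;

public class PluginUrlRewriter {
    // Hypothetical naming convention: the shared library sits next to the jar as <jar>.so.
    public static String sharedLibraryFor(String jarPath) {
        return jarPath + ".so";
    }

    // Choose the URL to hand to the class loader for one plugin jar.
    // nativeRuntime: whether the running VM understands gcjlib URLs (assumed to be detected elsewhere).
    // exists: file-existence check, passed in so the logic can be exercised without touching the disk.
    public static String urlFor(String jarPath, boolean nativeRuntime, Predicate<String> exists) {
        String library = sharedLibraryFor(jarPath);
        if (nativeRuntime && exists.test(library)) {
            // Assumed gcjlib syntax, modeled on the jar: URL form.
            return "gcjlib:file:" + library + "!/";
        }
        // Fall back to the ordinary jar URL so an unmodified JVM keeps working.
        return "jar:file:" + jarPath + "!/";
    }
}
```

In real use the predicate could be something like p -> new java.io.File(p).exists(). Keeping the jar URL as the fallback is what lets the same installation run under an ordinary JVM.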

Once that was done, we wrote our own launcher that understood how to bootstrap the Eclipse platform from shared libraries. This was accomplished in a modest 90 lines of code.

Profiling

After all that, Eclipse was mysteriously slow. Had we done something wrong? Was GCJ-compiled code substantially worse than the code generated on the fly by the current crop of just-in-time (JIT) compilers? Did -fno-assume-compiled have enormous overhead?

One nice advantage of GCJ is its output generally can be treated in the same way one treats any object code. That is, existing tools such as OProfile can be applied to it directly without any change. And that, in fact, is how we investigated our performance problem.

The first thing we noticed was a large number of exceptions being thrown during platform startup. Amid the grumblings of compiler writers (exceptions should be for exceptional circumstances), and although we were considering changes to the GCJ runtime that would violate Java semantics, we noticed a strange symbol in the OProfile output. It turned out that a small bit of buggy assembly code deep in the libgcj runtime was causing a linear search of exception handling tables rather than the expected binary search. The overhead of this search through the entire program every time an exception was thrown was vast. A fix to the errant assembly code proved this was the problem, and suddenly our natively compiled Eclipse was able to start a second faster than the stock version using a JVM. To quantify it a bit further, the startup time dropped from more than a minute before the fix to less than 15 seconds after it.
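
The difference between the two search strategies is easy to see in a simplified model. The Java sketch below is not libgcj's table layout or its assembly code; the class HandlerTableLookup and its representation of the table as a sorted array of range start addresses are assumptions for illustration. Both methods return the index of the range containing a given program counter, but the linear version may visit every entry, while the binary version needs roughly log2(n) probes.

```java
public class HandlerTableLookup {
    // starts[] holds range start addresses in ascending order;
    // range i is assumed to cover every address from starts[i] up to the next start.

    // Linear scan: walks the table from the beginning until it passes pc.
    public static int linearSearch(long[] starts, long pc) {
        int found = -1;
        for (int i = 0; i < starts.length; i++) {
            if (starts[i] <= pc) {
                found = i;
            } else {
                break;
            }
        }
        return found;
    }

    // Binary search: halves the candidate interval on every probe.
    public static int binarySearch(long[] starts, long pc) {
        int lo = 0;
        int hi = starts.length - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (starts[mid] <= pc) {
                found = mid;   // candidate; a later range may still match
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }
}
```

With a table covering an entire program, and a startup path that throws many exceptions, the gap between the two adds up quickly, which is consistent with the startup times reported above.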

______________________

Comments

I'll stick to my linux box

Charles's picture

I'll stick to my linux box thanks

Re: Eclipse Goes Native -- Eclipse went native in Dec 2003

Anonymous's picture

Sweeping out the custom classloaders from source code is a bad idea. Eclipse evolves rapidly, so you have to introduce these patches into each new release, not to mention weekly integration builds if you wanna try them. It nails the compiled Eclipse down to a particular fixed version.

Even more problems exist for server-side apps. Imagine you use an application server (free, open source, etc.) on which you deploy your web app and want to compile it to native code. The EJB mechanism employed by any J2EE app server *heavily* relies on custom classloading (and class reloading, BTW). Would you like to patch the app server source code to get your app working after native compilation?

Check JBoss, for instance, and you will be convinced that custom classloaders are used everywhere in the JBoss code. A proper solution would be handling custom classloaders appropriately, not patching the source.

-----------

One possible way around the custom classloading problem for a static compiler is used in a proprietary JVM (Excelsior JET, available for both Windows and Linux). They have a static Java compiler and a caching just-in-time compiler that saves the results of dynamic compilation. The innovative point is that the JIT cache can be later optimized by the static compiler! I think that a similar technique has to be implemented in GCJ as well to really harness the advantages of native Java compilation.

Guess what's the app on which they illustrated such an approach? Right, that's the Eclipse platform "AS IS" (without code patching). See

http://www.excelsior-usa.com/kb/000010.html

for details. BTW, this article dates back to Dec 2003. Nothing is new in this world. ;)

Re: Eclipse Goes Native

Anonymous's picture

What about a Fedora Core 2 port?
The FC2 tree seems to already contain native ports of many java tools, it seems most of the work is done and yet no native eclipse for Fedora!
Are there any announced plans for that?

Best Regards,
rkpr

Re: Eclipse Goes Native

Anonymous's picture

Agreed with the last poster. The important thing is that it has been freed...as in freedom...from a proprietary piece of software.

Perhaps more examples like this will get Sun off of its duff so that they GPL their Java implementation. Note that this strategy worked pretty darned well for getting Troll Tech to GPL the Qt library; in that case, it was the GNOME dev team with Harmony.

For those who want to dog Red Hat for making a buck selling Free Software, that same buck is what funds efforts like this, which is a contribution back to the entire community. I thank them for being around and for doing what they do, as long as they keep doing stuff like this.

Re: Eclipse Goes Native

John Raller's picture

Well - didn't I read in the last week that Sun is going GNU with this piece of software?

Re: Eclipse Goes Native

Cindy Banter's picture

Hi John, that's wrong - I think. Eclipse is not GNU.

Re: Eclipse Goes Native

John Raller's picture

Oh, thanks a lot. I think that was my mistake then 8-)

Re: Eclipse Goes Native

Anonymous's picture

This may be naive, but since when do you have to pay for a JVM? And what isn't free about Sun's JVM? Please enlighten me.

Re: Eclipse Goes Native

Anonymous's picture

Sun's JVM is not Free Software (http://www.gnu.org), even if it costs nothing. It doesn't come with a source and you can't modify it and distribute your modifications.

"Sun's JVM is not Free

Green Laser Pointer's picture

"Sun's JVM is not Free Software (http://www.gnu.org), even if it costs nothing. It doesn't come with a source and you can't modify it and distribute your modifications."

Agreed, a lot of people make this assumption without realizing the importance of obtaining source (hi open source!)

~Doug

If you get a free plane

Anonymous's picture

If you get a free plane ticket, does that mean you can do whatever you want with the plane ? No, but the ticket was still free.

Thus, I would say Sun's JVM is free.... Although it is true you cannot do whatever you want with it, you don't have to pay to get it and you can use it for free.

I agree that open source is better than free software, but to me, free software is free.

It's about freedom!

Anonymous's picture

A lot of the comments here seem to be missing the point: "[RH] has FREED [Eclipse] from its ties to a PROPRIETARY [JVM]!" Frankly, I'm excited by this - it means I may finally get to try Eclipse! I've been hearing great things about it, but the requirement to install Sun's non-free, unsupported (at least by my vendor) software to get it to run has just been more of a hassle than I've been willing to undergo. I'd have been just as happy if they'd gotten it to run with Kaffe (which my vendor also supports), but from what I hear, gcj is more advanced, so the approach RH took here doesn't surprise me.

Another quote from the article: "A full-featured and completely open-source Java environment is an attractive alternative to proprietary JVMs, and it's now within reach." To which I can only say, amen!

I suspect this will also be popular with the Gentoo, "I want everything optimized for *my* machine," crowd.

Re: Eclipse Goes Native

Anonymous's picture

All this work just to find out that throwing exceptions is expensive?

Where are the real results? Improving the startup time is nice, but I only start my IDE about once a week.

Re: Eclipse Goes Native

Anonymous's picture

The real results are the improvements made to gcj and libgcj.

Compiling eclipse is a major challenge, and the fact that it now works is testament to how far the gcj project has come.

And that is a very important result, IMHO, since having a full free software implementation of Java, with the added bonus of native compilation for multiple platforms would mean quite a lot for the acceptance of Java in the Linux and Free Software communities, which have long been quite wary of Java.

> The real results are the improvements made to gcj and libgcj.

Barney's picture

Yes, but will these improvements be seen outside of Redhat? These are gcc projects, and the Free Software Foundation generally says that for large contributions (although not for small changes) they want either copyright assigned to the FSF or the code put in the public domain, presumably so that they can relicense the code in future under GPL v.3 or something else.

I imagine Redhat would be reluctant to either assign its work to the fsf or to make it public domain.

Seems a shame that the fsf is rigid about this issue. Isn't that what split Emacs in two? (no flame war intended.)

Re: Eclipse Goes Native

Anonymous's picture

For swing maybe they should use SwingWT ? (a swing implementation based on SWT)

Re: Eclipse Goes Native

Anonymous's picture

Eclipse already uses SWT, they invented the whole SWT thing...

Re: Eclipse Goes Native

Anonymous's picture

Either you can't read or you're very, very stupid, or both.

Re: Eclipse Goes Native

Anonymous's picture

> Either you can't read or you're very, very stupid, or both.

I'd say *you're* the stupid one for suggesting SWT on Swing. The whole design of SWT is philosophically different from Swing; the designers of Eclipse wrote SWT simply because they couldn't get by with the poor performance and non-native look and feel of Swing. Read up.

Poor performance?

Anonymous's picture

Both of which are complete non-issues today

Re: Eclipse Goes Native

Anonymous's picture

NEWS FLASH! Man compiles program! Story at Eleven.

Zzzzzz. Wake me when there's some real news.

Re: Eclipse Goes Native

Anonymous's picture

Hi from Joe in Edmonton joejoseph00@hotmail.com

I read your article on Eclipse java byte-code being compiled into native code. Isn't that what the JVM does on the fly already? Is this to speed up startup times? Why would a compiled Eclipse run any faster than a JVM Eclipse aside from startup time?

In Microsoft.NET the .NET environment plugs into non-virtual machine APIs such as the proprietary platform-dependent unmanaged win32 API.

With processing speeds astronomically fast these days, isn't platform independence worth the performance hit? I mean, when I first ran SunONE on my Athlon 550 MHz, it was a bit slow, but after running it on my AthlonXP 1800+ it became lightning fast instantly. And now with a 64-bit JVM on the horizon, programming time becomes a lot more costly than the cost of a new 3800+ Athlon64 :)

Yes yes, I do see the point of your efforts with Eclipse, I might have to try that some day, hearing lots of good things about it. I am thrilled with SunONE though, it is super cool. Once I figure out CVS versioning control over WAN that will be cool.

Joe (joe@k9k.net if you're not on the spammer blacklist you might be able to send me an e-mail) or joejoseph00@hotmail.com

Re: Eclipse Goes Native

Anonymous's picture

Some VMs, known as JITs (just-in-time compilers), do compile some of the code into native machine code while running. Usually, what the JIT does is figure out which parts need to run fast, and compile those.

That process takes some time, and JIT VMs usually are a bit slower starting up than ordinary VMs.

Anyway, a completely native binary will always be significantly faster than any VM.

Whether platform independence is worth the performance hit is obviously a judgement call on the programmer's part. Note the idea is that for most programs, you don't need to sacrifice the platform independence of the source code to be able to compile it to native. As noted in the article, it's if you use classloaders that you have to be careful. Given that it is already quite possible to write nonportable programs in Java, I personally don't see that as a major problem.

Or why not distribute your software as both bytecode and native code for the supported platforms? Best of both worlds. It's nice to have that option.

It's also worth noting that GCJ can also compile to java bytecode and work as an ordinary VM if you want that.

Re: Eclipse Goes Native

Anonymous's picture

I have done some tests for java programs compiled by gcj (native) versus running them on JDK 1.4.2, and the latter was almost always faster. It has been claimed often that a JIT can do a better job in principle (since it knows the dynamic access path of the code and thus knows what optimisation variant to choose), and I have found that it also does a better job in reality.

I think this native eclipse is a very very bad idea. It probably won't perform much better (rather worse) and binds the release to a specific platform (incl. specific versions of libraries etc).

Re: Eclipse Goes Native

Anonymous's picture

You have compared an old GNU C-based native compiler with the modern HotSpot dynamic compiler. That says nothing about static vs. dynamic compilation as approaches.

The HotSpot compiler employs the state-of-the-art SSA (Static Single Assignment) form to highly optimize code. As for the GNU compilers, that work was started only recently by Kenneth Zadeck

http://www.naturalbridge.com/

He's one of the first implementors of SSA for IBM compilers (Hi Kenny!).

No doubt, Kenny is the right person to do this in the GNU compiler family (wish good luck to him) but right now GCJ is far away from the state of the art in compiler construction technology.

If you wanna compare dynamic vs. native Java compilers, check out Excelsior JET for Linux which employs many modern optimizations including SSA-based ones.

http://www.excelsior-usa.com/jetlatest.html

If you do not use proprietary JVMs, this solution is not for you. However, it shows the strength of native Java compilers that can be potentially achieved by GCJ at some point in the future.

Re: Eclipse Goes Native

Anonymous's picture

> Anyway, a completely native binary will always be significantly
> faster than any VM.

Not necessarily.

State-of-the-art JVMs contain JITs that create machine code that is tailored to the current execution. Not only do such systems choose which methods are important to optimize, but they also optimize those methods based on the paths/values being used during the current execution. An ahead-of-time compiler does not have this advantage. It typically assumes all paths through a method are equally likely.

Well, state-of-the-art ahead-

HRJ's picture

Well, state-of-the-art ahead-of-time compilers can also incorporate runtime profile information to generate optimal code.

gcc can do this (investigate

Anonymous's picture

gcc can do this (investigate -fprofile-arcs). You build the program with profiling, run it for a while, and then rebuild it with the profile information which is then used to reprioritize code paths.

Re: Eclipse Goes Native

Anonymous's picture

Can you compile AWT? Was compiling SWT a challenge?

Re: Eclipse Goes Native

Anonymous's picture

You can compile AWT. Will it work? A little, but not enough for real programs.
AWT support is almost there though, and Swing is being worked on (Swing being heavily AWT dependent).

There is a gui-branch of gcj in which Swing and AWT is rapidly being developed, and Redhat is very active in this.

(Note: I'm not the author or a Redhat employee.. but big kudos to them for their contributions to gcj and indeed gcc)

Re: Eclipse Goes Native

Anonymous's picture

You could use the SwingWT.

Re: Eclipse Goes Native

Anonymous's picture

"Posted on Thursday, July 01, 2004"

yeah right... (june 11th)

Re: Eclipse Goes Native

Anonymous's picture

Someone needs to read that article on NTP.. :-)

Re: Eclipse Goes Native

Anonymous's picture

What is the performance of native?
