Eclipse Goes Native
Our first round of changes to libgcj was bug fixing only. We implemented protection domains properly. Then, we made a pass over the entire runtime, fixing bugs related to class loading. Because of the way class loading had been implemented in libgcj, we had to modify all the places in the native code that conceivably might load a class to forward the request to the appropriate class loader.
Once this was done, we were able to start Eclipse using the libgcj bytecode interpreter. At this point the question became, how can we take real advantage of GCJ to compile Eclipse?
The naïve approach to this dilemma, namely precompiling all the classes and linking them all together, had been ruled out by our investigations into Eclipse's internals. This approach would clash with Eclipse's relatively sophisticated class loading strategy.
More investigation revealed that most classes are loaded by instances of the DelegatingURLClassLoader, which is a subclass of the standard URLClassLoader that has been extended to understand Eclipse's plugin architecture. It seemed like the best approach was to modify Eclipse to allow it to load precompiled shared libraries as well as bytecode files. We reasoned that the required changes would be localized due to the way plugin class loading had been structured.
In fact, we had to go one step further and extend libgcj a bit as well. libgcj knew how to load shared libraries invisibly in response to a call to, for example, Class.forName(). However, this magic always happened at the level of the bootstrap class loader. That wouldn't work well for Eclipse or for any other application that defines its own class loaders, so we invented a new gcjlib URL type. This is like a jar URL, but it points to a shared library. We also made some minor extensions to our implementation of URLClassLoader so that gcjlib URLs would be treated specially.
Doing this wasn't enough, however. We also had to solve the linkage problems. In particular, if we compiled a jar file to a shared library, how could we prevent the dlopen() of such a shared library from immediately failing due to unresolved symbols? The solution to this problem was to resurrect and clean up the -fno-assume-compiled option in GCJ. This option, which never had been finished, enabled an alternative ABI that caused GCJ's output to resolve most references at runtime rather than at link time.
The -f-no-assume-compiled option has various limitations and inefficiencies. On the boards for the future is a cleaner way to achieve this same goal. On the GCJ mailing list (see the on-line Resources section) this option is referred to either as the binary compatibility ABI or -findirect-dispatch. This new ABI does everything -fno-assume-compiled does, but in a much more efficient and compatible way. Development is underway and is coming along nicely on this new feature, one of several contributing to GCJ's enterprise readiness.
Once all this was in place, we finally were ready to make our changes to Eclipse. These turned out to be remarkably small. Most of the work involved making the same sort of change in three different places. In essence, we modified Eclipse so that when it's looking for a plugin's jar file, it also looks for a similarly named shared library installed alongside it. If there is one, we rewrite the URL passed to the class loader from a jar URL to a gcjlib URL. All rewriting is done conditionally, so our natively compiled Eclipse still works with an unmodified JVM. In other words, users are not locked in to native compilation if they would rather use a JVM instead.
Once that was done, we wrote our own launcher that understood how to bootstrap the Eclipse platform from shared libraries. This was accomplished in a modest 90 lines of code.
After all that, Eclipse was mysteriously slow. Had we done something wrong? Was GCJ-compiled code substantially worse than the code generated on the fly by the current crop of just-in-time (JIT) compilers? Did -fno-assume-compiled have enormous overhead?
One nice advantage of GCJ is its output generally can be treated in the same way one treats any object code. That is, existing tools such as OProfile can be applied to it directly without any change. And that, in fact, is how we investigated our performance problem.
The first thing we noticed was a large number of exceptions being thrown during platform startup. Amid the grumblings of compiler writers (exceptions should be for exceptional circumstances), and although we were considering changes to the GCJ runtime that would violate Java semantics, we noticed a strange symbol in the OProfile output. It turned out that a small bit of buggy assembly code deep in the libgcj runtime was causing a linear search of exception handling tables rather than the expected binary search. The overhead of this search through the entire program every time an exception was thrown was vast. A fix to the errant assembly code proved this was the problem, and suddenly our natively compiled Eclipse was able to start a second faster than the stock version using a JVM. To quantify it a bit further, the startup time dropped from more than a minute before the fix to less than 15 seconds after it.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- ServersCheck's Thermal Imaging Camera Sensor
- The Italian Army Switches to LibreOffice
- Linux Mint 18
- Chris Birchall's Re-Engineering Legacy Software (Manning Publications)
- Petros Koutoupis' RapidDisk
- Oracle vs. Google: Round 2
- The FBI and the Mozilla Foundation Lock Horns over Known Security Hole
- Privacy and the New Math
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide