Eclipse Goes Native
Eclipse is an open-source, extensible integrated development environment (IDE) that's growing quickly in popularity. Written in Java, it provides a multilanguage development environment that allows developers to code in Java, C and C++. In response to the need for improved performance and additional platform coverage for the Red Hat Developer Suite, of which Eclipse is the core, we created a version of Eclipse that's compiled natively. Instead of running on top of a virtual machine the way Java programs usually do—although that can still be done if the user prefers—Red Hat's version of Eclipse is compiled to binary and runs natively using the libgcj runtime libraries, similar to the way a C program runs using the GNU C libraries.
To compile Eclipse natively, Red Hat's Eclipse Engineering team used GCJ, a free, optimizing, ahead-of-time compiler for Java. GCJ can compile Java source code to native machine code, Java source code to Java bytecode and Java bytecode to native machine code. The approach we took involves using GCJ to compile Java bytecode to native machine code.
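For illustration, the three modes look roughly like this on the command line (the Hello class name is hypothetical; the flags are GCJ's standard ones):

```sh
# Java source to native machine code:
gcj --main=Hello -o hello Hello.java

# Java source to Java bytecode (produces Hello.class):
gcj -C Hello.java

# Java bytecode to native machine code, the route this project took:
gcj --main=Hello -o hello Hello.class
```

The --main= option tells GCJ which class's main() method becomes the program's entry point when linking a native executable.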
This article discusses why native compilation was an attractive choice; explains what we had to do to GCJ, libgcj and Eclipse to make it possible; and shows, using a real-world example, that open-source Java has come a long way and now is useful commercially.
Two main factors from the early days of Developer Suite planning and engineering drove us toward native compilation: platform coverage and performance. Red Hat Enterprise Linux was scheduled to ship on several 64-bit architectures, and we wanted to make sure Developer Suite could run on all of them. One big problem was that Eclipse had never been run on a 64-bit platform, and it contained some code, specifically the interface between SWT (Eclipse's graphics toolkit) and its native C libraries, that assumed 32-bit addresses. Aside from having to create a clean 64-bit version of SWT, we faced a more significant problem: no 64-bit Java Virtual Machine (JVM) existed at the time for x86_64, AMD's 64-bit architecture, and it seemed unlikely that one would be available before we had to ship.
Another problem we had was performance. Eclipse worked well on Microsoft Windows, but the version available at the time was quite slow on Linux. We found that startup alone took well over a minute, and early user testing found the interface a little too sluggish for comfortable use. For example, Eclipse is based on perspectives, collections of views and editors of which only one perspective is visible at a time. Switching between perspectives is something a user does fairly frequently, yet changing perspectives introduced substantial delays that we considered unacceptable for the enterprise development market Red Hat Developer Suite was targeting.
The solution we came up with was to use GCJ to compile Eclipse into native binaries that could run without having a JVM installed. We knew that native compilation would help with the performance problems, because we would no longer have the overhead that comes with the JVM layer. It also would solve the platform coverage problem, as GCJ/libgcj was available on all of the 64-bit platforms we had to support, although in some cases, such as x86_64, it still needed a lot of work. Native compilation solved the technical problems we had and gave us the additional benefits of reducing our external dependencies, allowing us to make some significant improvements to open-source Java and to demonstrate that open-source Java has matured to the point of being useful commercially.
At the outset of this project, we really didn't know if it was possible to compile Eclipse with GCJ and expect it to run. First, Eclipse is a large program—more than two million lines of code as counted by wc. We didn't know much about Eclipse internals or what runtime facilities it might use. Second, GCJ's background is in embedded systems, and we knew that work remained on parts of the Java programming language, class loaders in particular, which are used heavily by Eclipse. Third, the free class libraries were not complete. We didn't know if Eclipse could use facilities we hadn't written yet or even whether Eclipse might break the rules and use internal, undocumented com.sun.* interfaces, as too many Java programs seem to do.
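Class loaders were a particular concern because Eclipse's plugin architecture leans on custom loaders heavily. At their core, such loaders follow the pattern in this minimal sketch (the MemoryLoader name is ours and purely illustrative): override findClass() to locate the class file bytes yourself, then hand them to defineClass().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// A minimal custom class loader: it reads a class file's bytes itself and
// defines the class, the same basic mechanism a plugin loader uses to give
// each plugin its own class namespace.
public class MemoryLoader extends ClassLoader {
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = getResourceAsStream(path)) {
            if (in == null)
                throw new ClassNotFoundException(name);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1)
                out.write(buf, 0, n);
            byte[] bytes = out.toByteArray();
            // Turn the raw bytes into a Class object owned by this loader.
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        MemoryLoader ml = new MemoryLoader();
        // Call findClass() directly, bypassing parent delegation, so our
        // loader defines the class itself.
        Class<?> c = ml.findClass("MemoryLoader");
        System.out.println(c.getClassLoader() == ml); // true: we defined it
        System.out.println(c == MemoryLoader.class);  // false: a second, distinct class
    }
}
```

The second line of output illustrates why loaders matter for correctness: the same class file defined by two loaders yields two incompatible classes, which is exactly the kind of behavior a runtime must get right before a loader-heavy program like Eclipse can work.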
We therefore took a two-pronged approach to determining whether a project like this could succeed. First, we used GCJ to make a list of the APIs used by Eclipse that we did not or could not implement. To accomplish this, we wrote a shell script that would try to compile each Eclipse Java archive library (jar file) to object code. We then looked through the error messages to see what was missing. The results of this script were not encouraging: we found a large number of missing packages. Still, more investigation was required because some things didn't make sense. For instance, there were dependencies on the Swing graphical user interface classes, but we knew that Eclipse used SWT and not Swing.
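A script along these lines conveys the idea (paths and the error pattern are illustrative, not the exact script we used):

```sh
#!/bin/sh
# For each Eclipse jar, attempt native compilation and log the failures.
for jar in eclipse/plugins/*.jar; do
    gcj -c -o "${jar%.jar}.o" "$jar" 2>> compile-errors.log
done
# Missing classes and methods surface as compile errors; list them uniquely.
grep 'not found' compile-errors.log | sort -u
```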
Further investigation showed that many of the weird undefined references came not from Eclipse itself but from the third-party jar files included with it. For example, Eclipse includes its own copy of the Ant build tool and its own copy of the Apache Tomcat dynamic Web server. We knew that in many cases, the referenced classes would not actually be invoked in the Eclipse environment. This encouraged us to take another look at how to get Eclipse working.
Our second angle of attack was to try running Eclipse using the bytecode interpreter that comes with libgcj. By doing this, we reasoned, we would concentrate on runtime bugs, including the aforementioned class loader problems and missing functionality actually used by Eclipse.
This approach also was discouraging initially. We ran into problems not only with class loading but also with libgcj's implementation of protection domains, which needed work. Protection domains are the basis of Java's secure sandbox architecture, which allows untrusted code to be run safely. Problems in this area had an unfortunate shadowing effect: we had to fix each bug before we could discover the next one.
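To make the concept concrete: every loaded class carries a ProtectionDomain recording where its code came from and what permissions it holds, and a runtime must populate these correctly for security checks to work. A small probe (the DomainProbe name is ours, purely illustrative) shows the standard API:

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Probe the protection domain the runtime assigned to this class.
public class DomainProbe {
    public static void main(String[] args) {
        ProtectionDomain pd = DomainProbe.class.getProtectionDomain();
        CodeSource cs = pd.getCodeSource();
        // Every class has a protection domain...
        System.out.println(pd != null);
        // ...and classes loaded from the class path typically record a
        // non-null code source naming their origin.
        System.out.println(cs != null);
    }
}
```

A runtime that returns incomplete domains here breaks any code, like Eclipse's, that consults them before granting access.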