Writing a Java Class to Manage RPM Package Content

A look inside RPM packages and how to use Java to extract information.
A Portable Tool to View RPM Packages

We now know RPM packages are interesting. Many are available on the Net today, and a portable tool able to analyze an RPM package before installing it would be a useful utility.

The Choice of Language

I think there are only two possibilities if you want to be portable to multiple UNIX and non-UNIX systems and easy to use in the Internet context: Perl or Java. From a technical point of view, there is no reason to prefer one over the other. The choice is a personal decision.

I have more experience programming in Java than in Perl. After a long and difficult thought process, I decided to start in Java, reasoning that if I later needed to add graphical presentation classes to the component, I could use the Java Swing package (which is available with JDK 1.1 or JDK 1.2).

Where to Start

If you look at the /usr/lib directory of a Red Hat distribution, you will find a librpm.a static archive library. This library is provided with its corresponding C language prototypes: rpmlib.h, header.h and dbindex.h, located in /usr/include/rpm.

You can use those prototypes if you need to develop C utilities which deal with RPM resources. Chapter 21 of E. C. Bailey's book (see Resources) provides detailed information on how to do this. But, since we want to provide an independent Java package, these prototypes are of no interest to us.

The right place to start (in the same resource) is Appendix A, “Format of the RPM File”, which describes the RPM file format. The same appendix also provides us with the following sage advice: “RPM file format is subject to change.”

If an RPM file is to be manipulated, you are strongly urged to use the RPM routines to access the package file. Why? “RPM file format is subject to change”!

In our case, we will assume there is no immediate danger in querying an existing RPM package, since we commit to never modifying its structure inside our Java package.


The RPM Class Design

Figure 2. Structure of the Java RPM Classes Design in UML Format

Figure 2 shows the structure of the Java RPM class design in UML (Unified Modeling Language) format. Let's explain it in more detail. The UML class design provides a clean, high-level representation of what an RPM package is.

The interesting content is the information on the package and its installation rules. The content itself (not represented in the UML picture, for clarity) is only a compressed archive. When uncompressed, it is a cpio archive in SVR4 format with a CRC checksum (see Resources).

I cleanly separate the RPM object from its graphical representation. The classes in Figure 2 implement only operations on RPM files; they don't provide any graphical representation of them. Another class, called RpmFilePanel, will be added to provide a simple Swing display, which will graphically manipulate the basic RpmFile class, designed to implement the behavior of an RPM file.

The first interesting class is the RpmException class. This class inherits from the basic Java Exception class and implements a default constructor with no parameters and a constructor which takes a String message parameter. This class is the only exception thrown by the RpmFile Java package. I am convinced that, when writing a new Java package, the first thing you should do is build an exception wrapper for the package. Later on, all the classes of the RpmFile package will throw an RpmException with an accurate message when something goes wrong. From an object-oriented design point of view, this technique improves your design's robustness, isolating your package fully from the basic system layer. You can, of course, do the same thing in C++. The only problem is that some C++ compiler implementations may not support exceptions, which could make your C++ code harder to port.
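As a minimal sketch of what this paragraph describes, the exception wrapper could look like the following. The class name and the two constructors come from the text; the serialVersionUID field and the package-private visibility (chosen so the snippet compiles in any file) are assumptions, not the author's code.

```java
// Sketch of a package-level exception wrapper, as described in the text.
// Visibility and serialVersionUID are assumptions made for this example.
class RpmException extends Exception {
    private static final long serialVersionUID = 1L;

    /** Default constructor: no detail message. */
    RpmException() {
        super();
    }

    /** Constructor carrying a message that explains what went wrong. */
    RpmException(String message) {
        super(message);
    }
}
```

Callers of the package then only have to catch this one type, whatever low-level error occurred underneath.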

The next public Java class is the RpmFile class itself. Its constructor takes no parameters and simply builds a class instance. The public methods made available by the RpmFile class implement the following basic services:

  • set_rpmFileName(fileName) method: this takes a URL fileName string as its parameter. This method binds the RpmFile instance to a URL representing a valid RPM package to view. If a problem occurs during the bind, an RpmException is thrown.

  • Vector get_rpmReport() method: once the RPM package has been bound to the RpmFile instance, this method can be called to get the package information. The information is returned as a Vector of Strings containing everything found in the RPM package header structures.
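The two services listed above can be sketched as the skeleton below. Only the method names set_rpmFileName and get_rpmReport, the URL string parameter and the Vector return type come from the text. The class name RpmFileSketch, the nested stand-in exception, the rpmUrl field and the placeholder report line are hypothetical, and having get_rpmReport throw when nothing is bound is an assumption.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Vector;

// Hypothetical skeleton of the public API described above; not the author's code.
class RpmFileSketch {

    // Stand-in for the package's exception wrapper.
    static class RpmException extends Exception {
        RpmException(String message) { super(message); }
    }

    private URL rpmUrl; // set once the bind succeeds (assumed field name)

    /** Binds this instance to the URL of an RPM package. */
    public void set_rpmFileName(String fileName) throws RpmException {
        try {
            rpmUrl = new URI(fileName).toURL();
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            // Low-level errors are hidden behind the package exception.
            throw new RpmException("Cannot bind to " + fileName + ": " + e.getMessage());
        }
    }

    /** Returns the report lines; this sketch only records the bound URL. */
    public Vector<String> get_rpmReport() throws RpmException {
        if (rpmUrl == null) {
            throw new RpmException("No RPM package bound yet");
        }
        Vector<String> report = new Vector<>();
        report.add("Package URL: " + rpmUrl); // placeholder for real header data
        return report;
    }
}
```

A caller would create the instance, call set_rpmFileName with something like "file:///tmp/example.rpm" (an illustrative path), then iterate over the Vector returned by get_rpmReport.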

The RpmFile logic is based on the following two inner classes:

  • _RPM_LEAD_: internally instantiated by the RpmFile class to validate the RPM lead structure of the loaded RPM URL file.

  • _RPM_STRUCTURE_HEADER_: instantiated by the RpmFile class once the _RPM_LEAD_ has been validated, and used to check the RPM file header content. The header content consists of multiple _RPM_INDEX_ENTRY_ elements stored in an internal array. Each element of this array represents a piece of header information which will be made available later via the get_rpmReport method.

Since there is no reason to make those classes visible to mere mortals outside the RpmFile, they have been implemented as inner Java classes inside the RpmFile class. A more precise UML graph is provided in Figure 3.

I used the JVision 1.2 tool to automatically generate the UML class diagram from the Java source code. JVision is a very interesting, easy-to-use tool from Object Insight (http://www.object-insight.com/). Although not free, the license price is reasonable compared to other products. I have been using it for more than one year now, and it helps me in producing Java project documentation. A Linux beta version (free for non-commercial use) of the product is available on the Object Insight web site.
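To make the lead and header checks performed by the two inner classes more concrete, here is a rough sketch based on the published RPM file layout: a 96-byte lead starting with the magic bytes ED AB EE DB, and a header structure starting with 8E AD E8, a version byte, four reserved bytes, an entry count and a data size, followed by 16-byte index entries (tag, type, offset, count), all big-endian. The class and method names (RpmLeadSketch, readLead, readIndex, IndexEntry) and the sanity limit on the entry count are hypothetical. The sketch throws IOException for brevity where the package described here would throw an RpmException. In a real package, the first header structure after the lead is the signature, which this sketch does not distinguish.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch of the lead and header-index checks; not the author's code.
class RpmLeadSketch {
    static final int LEAD_SIZE = 96;           // total size of the lead in bytes
    static final int LEAD_MAGIC = 0xEDABEEDB;  // first four bytes of every RPM file
    static final int HEADER_MAGIC = 0x8EADE8;  // three-byte header structure magic

    /** One 16-byte index entry of a header structure. */
    static final class IndexEntry {
        final int tag, type, offset, count;
        IndexEntry(int tag, int type, int offset, int count) {
            this.tag = tag; this.type = type; this.offset = offset; this.count = count;
        }
    }

    /** Checks the lead magic, then consumes the rest of the lead. */
    static void readLead(DataInputStream in) throws IOException {
        if (in.readInt() != LEAD_MAGIC) {
            throw new IOException("Not an RPM package: bad lead magic");
        }
        in.readFully(new byte[LEAD_SIZE - 4]); // version, type, name, etc. ignored here
    }

    /** Reads one header structure preamble and its index entries. */
    static IndexEntry[] readIndex(DataInputStream in) throws IOException {
        int magicAndVersion = in.readInt();    // 3 magic bytes followed by 1 version byte
        if ((magicAndVersion >>> 8) != HEADER_MAGIC) {
            throw new IOException("Bad header structure magic");
        }
        in.readInt();                          // 4 reserved bytes
        int entryCount = in.readInt();         // number of index entries
        in.readInt();                          // size of the data store, unused here
        if (entryCount < 0 || entryCount > 65536) { // assumed sanity limit
            throw new IOException("Implausible index entry count: " + entryCount);
        }
        IndexEntry[] entries = new IndexEntry[entryCount];
        for (int i = 0; i < entryCount; i++) {
            // Java evaluates arguments left to right: tag, type, offset, count.
            entries[i] = new IndexEntry(in.readInt(), in.readInt(), in.readInt(), in.readInt());
        }
        return entries;
    }
}
```

DataInputStream reads integers in big-endian order, which matches the network byte order used by the RPM format, so no manual byte swapping is needed.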

Figure 3. RPM

