Writing a Java Class to Manage RPM Package Content

A look inside RPM packages and how to use Java to extract information.
A Portable Tool to View RPM Packages

Now we know RPM packages are interesting. Many of them are available on the Net today, and a portable tool that can analyze an RPM package before installing it would be a useful utility.

The Choice of Language

I think there are only two possibilities if you want to be portable to multiple UNIX and non-UNIX systems and easy to use in the Internet context: Perl or Java. From a technical point of view, there is no reason to prefer one over the other. The choice is a personal decision.

I have more experience programming Java than Perl. After a long and difficult thought process, I decided to start in Java, reasoning that if I later needed to add graphical presentation classes to the component, I could use the Java Swing package (which is available with JDK 1.1 or JDK 1.2).

Where to Start

If you look at the /usr/lib directory of a Red Hat distribution, you will find a librpm.a static archive library. This library is provided with its corresponding C language prototypes: rpmlib.h, header.h and dbindex.h, located in /usr/include/rpm.

You can use those prototypes if you need to develop C utilities which deal with RPM resources. Chapter 21 of E. C. Bailey's book (see Resources) provides detailed information on how to do this. But, since we want to provide an independent Java package, these prototypes are of no interest to us.

The right place to start (in the same resource) is Appendix A: Format of the RPM file, which describes the RPM file format. The same appendix also provides us with the following sage advice: “RPM file format is subject to change.”

If an RPM file format is to be manipulated, you are strongly urged to use RPM routines to access the package file. Why? “RPM file format is subject to change”!

In our case, we will assume there is no immediate danger in querying an existing RPM package, since we commit to never modifying its structure inside our Java package.


The RPM Class Design

Figure 2. Structure of the Java RPM Classes Design in UML Format

Figure 2 represents the structure of the Java RPM class designs in UML format (Unified Modeling Language). Let's explain it in more detail. The UML class design provides a clean high-level representation of what an RPM package is.

Content is interesting information on the package and its installation rules. The content itself (not represented in the UML picture, for clarity) is only a compressed archive. When uncompressed, it is a cpio archive in SVR4 format with a CRC checksum (see Resources).

I cleanly separate the RPM object from its graphical representation. The classes in Figure 2 implement only operations on RPM files; they don't provide any graphical representation of them. Another class, called RpmFilePanel, will be added to provide a simple Swing display, which will graphically manipulate the basic RpmFile class, designed to implement the behavior of an RPM file.

The first interesting class is the RpmException class. This class inherits from the basic Java Exception class and implements a default constructor with no parameters and a constructor which takes a String message parameter. It is the only exception thrown by the RpmFile Java package. I am convinced that, when writing a new Java package, the first thing you should do is build an exception wrapper for the package. Later on, all the classes of the RpmFile package will throw an RpmException with an accurate message when something goes wrong. From an object-oriented design point of view, this technique improves your design's robustness by isolating your package's callers from the basic system layer. You can, of course, do the same thing in C++. The only problem is that some C++ compilers may not support exceptions, which can make your C++ code harder to port.
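As a minimal sketch of what such an exception wrapper might look like, the class below follows the description above: it extends Exception and offers the two constructors mentioned. The RpmStreams helper and its readByte method are hypothetical names, added only to illustrate how a low-level I/O error could be re-thrown as an RpmException; they are not taken from the original package.

```java
import java.io.IOException;
import java.io.InputStream;

// Package-level exception, as described in the text.
class RpmException extends Exception {
    // Default constructor: no detail message.
    RpmException() {
        super();
    }

    // Constructor carrying a message that says what went wrong.
    RpmException(String message) {
        super(message);
    }
}

// Hypothetical helper (not from the article) showing the wrapping idea:
// callers only ever see RpmException, never the underlying IOException.
class RpmStreams {
    static int readByte(InputStream in) throws RpmException {
        try {
            int b = in.read();
            if (b < 0) {
                throw new RpmException("Unexpected end of RPM stream");
            }
            return b;
        } catch (IOException e) {
            throw new RpmException("I/O error while reading RPM stream: " + e.getMessage());
        }
    }
}
```

With this arrangement, code that uses the package needs a single catch clause for RpmException, whatever the underlying failure was.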

The next public Java class is the RpmFile class itself. Its constructor takes no parameters and simply builds a class instance. Its public methods implement the following basic services:

  • set_rpmFileName (fileName) method: this takes a URL fileName string as its parameter. This method binds the RpmFile instance to a URL representing a valid RPM package to view. If a problem occurs during the bind, an RpmException is thrown.

  • Vector get_rpmReport() method: once the RPM package has been bound to the RpmFile instance, this method can be called to get the package information. The information is returned as a Vector of Strings containing everything found in the RPM package header structures.
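The calling sequence for these two methods can be sketched as follows. Only the class and method names come from the description above; the method bodies are placeholders, and the use of java.net.URI for parsing, the Vector<String> generic type, and the choice to have get_rpmReport throw when nothing is bound are all assumptions made for illustration. The example URL is hypothetical as well.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Vector;

// Minimal stand-in for the package exception described earlier.
class RpmException extends Exception {
    RpmException(String message) {
        super(message);
    }
}

// Skeleton only: method names follow the article, bodies are placeholders.
class RpmFile {
    private URL packageUrl; // stays null until set_rpmFileName succeeds

    RpmFile() {
    }

    void set_rpmFileName(String fileName) throws RpmException {
        try {
            packageUrl = new URI(fileName).toURL();
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            throw new RpmException("Cannot bind to " + fileName + ": " + e.getMessage());
        }
        // A real implementation would open the URL here and validate its contents.
    }

    Vector<String> get_rpmReport() throws RpmException {
        if (packageUrl == null) {
            throw new RpmException("No RPM package bound; call set_rpmFileName first");
        }
        Vector<String> report = new Vector<>();
        report.add("Package URL: " + packageUrl); // placeholder, not real header data
        return report;
    }
}

class RpmFileDemo {
    public static void main(String[] args) {
        RpmFile rpm = new RpmFile();
        try {
            // Bind first, then ask for the report.
            rpm.set_rpmFileName(args.length > 0 ? args[0] : "file:/tmp/example.rpm");
            for (String line : rpm.get_rpmReport()) {
                System.out.println(line);
            }
        } catch (RpmException e) {
            System.err.println("RPM error: " + e.getMessage());
        }
    }
}
```

Keeping the bind step separate from the report step means a graphical front end can report a bad URL before any display is attempted.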

The RpmFile logic is based on the following two inner classes:

  • _RPM_LEAD_: internally instantiated by the RpmFile class to validate the RPM lead structure of the loaded RPM URL file.

  • _RPM_STRUCTURE_HEADER_: instantiated by the RpmFile class once the _RPM_LEAD_ has been validated, and used to check the RPM file header content. The header content consists of multiple _RPM_INDEX_ENTRY_ objects stored in an internal array. Each element of this array represents a piece of header information which will be made available later via the get_rpmReport method.

Since there is no reason to make those classes visible to mere mortals outside the RpmFile, they have been implemented as inner Java classes inside the RpmFile class. A more precise UML graph is provided in Figure 3.

I used the JVision 1.2 tool to automatically generate the UML class diagram from the Java source code. JVision is a very interesting, easy-to-use tool from Object Insight (http://www.object-insight.com/). Although not free, the license price is reasonable compared to other products. I have been using it for more than one year now, and it helps me in producing Java project documentation. A Linux beta version (free for non-commercial use) of the product is available on the Object Insight web site.
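To make the lead and header checks described above more concrete, here is a rough sketch of how they might be done with a DataInputStream. It is not the article's code: the class name RpmLeadSketch and the readIndexEntries method are hypothetical, and the byte layout (a 96-byte lead beginning with the magic bytes ED AB EE DB, and a header structure beginning with 8E AD E8 followed by 16-byte index entries) reflects my understanding of the commonly documented RPM file format. IOException is used here only to keep the example self-contained; the inner classes described above would throw RpmException instead.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of lead validation and header index reading.
class RpmLeadSketch {
    static final int LEAD_MAGIC = 0xEDABEEDB; // first four bytes of the lead
    static final int HEADER_MAGIC = 0x8EADE8; // first three bytes of a header structure

    final int major, minor, type, archNum, osNum, signatureType;
    final String name;

    // Reads the 96-byte lead; RPM stores integers in big-endian order,
    // which is what DataInputStream expects.
    RpmLeadSketch(DataInputStream in) throws IOException {
        if (in.readInt() != LEAD_MAGIC) {
            throw new IOException("Not an RPM package: bad lead magic");
        }
        major = in.readUnsignedByte();
        minor = in.readUnsignedByte();
        type = in.readUnsignedShort();
        archNum = in.readUnsignedShort();
        byte[] rawName = new byte[66]; // NUL-padded package name
        in.readFully(rawName);
        int len = 0;
        while (len < rawName.length && rawName[len] != 0) {
            len++;
        }
        name = new String(rawName, 0, len, StandardCharsets.ISO_8859_1);
        osNum = in.readUnsignedShort();
        signatureType = in.readUnsignedShort();
        in.readFully(new byte[16]); // reserved bytes, ignored
    }

    // Reads a header structure's preamble and its index entries.
    // Each returned row holds {tag, type, offset, count}.
    static int[][] readIndexEntries(DataInputStream in) throws IOException {
        int magicAndVersion = in.readInt(); // 3 magic bytes + 1 version byte
        if ((magicAndVersion >>> 8) != HEADER_MAGIC) {
            throw new IOException("Bad header structure magic");
        }
        in.readInt(); // reserved
        int entryCount = in.readInt();
        int dataSize = in.readInt(); // size of the data store that follows the index
        if (entryCount < 0 || dataSize < 0) {
            throw new IOException("Corrupt header structure");
        }
        int[][] entries = new int[entryCount][4];
        for (int[] entry : entries) {
            for (int field = 0; field < 4; field++) {
                entry[field] = in.readInt();
            }
        }
        return entries;
    }
}
```

Each index entry only locates a value; turning it into a line of the report still requires reading the data store at the given offset according to the entry's type, which this sketch leaves out.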

Figure 3. RPM
