The REDACLE Work-Flow Management System
Subnuclear particles, tiny objects indeed, need to be revealed and measured by huge detectors. This field is known as high-energy Physics (HEP), and experimental HEP is a cutting-edge science. It uses and promotes the most recent technologies, it invents new tools and it encourages knowledge exchange. For all of these reasons, HEP has long been the realm of open-source software.
The bad news is HEP has become increasingly complicated; what was built in a craftsman-like style yesterday is now an industrial process that requires dedicated management software, usually expensive. We are living this experience in our experiment: a large international collaboration engaged in the construction of a 12,500-ton detector, called CMS (Compact Muon Solenoid), scheduled to take data at the CERN, Geneva, Large Hadron Collider in 2007. Our group in Rome, endowed by the Italian Institute for Nuclear Physics (INFN) and located in the Physics Department of the University La Sapienza, is working on the construction of the electromagnetic calorimeter. The calorimeter is made from about 500,000 parts, including scintillating crystals and photo-detectors. This process requires data management, quality control and bookkeeping, all of which relies on work-flow management.
A work-flow management system (WFMS) is “software that enables the automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules” (www.e-workflow.org). Using a WFMS allows a coordinator to establish the flow of operations needed to realize a product. Operators are guided through the construction sequence, and unforeseen deviations from the sequence are avoided. Each operation generates data, such as measurements, comments and tags, that are recorded in a database.
Originally, a WFMS based on proprietary components was used in our production for about two years. It proved to be clumsy, slow, resource-demanding, hard to resume after hang-ups and troublesome to integrate with other tools. When the flow of incoming calorimeter parts became higher and the assembly rate could not catch up, we made the decision to develop our own solution, based on open-source components. Our requirements were to avoid the previous inefficiencies, to interface transparently with input and output data and to have a flexible solution. We chose to implement a system based on the LAMP (Linux, Apache, MySQL and Perl/PHP/Python) platform. Each component of LAMP has an important role: Linux and Apache provide the basic infrastructure for services and programming; MySQL is the back end of our WFMS; and Perl/PHP manage the interaction operator database.
Our WFMS is called REDACLE (Relational ECAL Database at Construction LEvel). In more detail, our requirements for the database design were:
High flexibility: the database structure should not change when adding new products or activities.
Ability to store quality control (QC) data: quality assurance is an important part of our work and collected data must be available to everyone for statistical analysis.
Variety of access: the database should be able to be queried through different methods, including shells, programs, scripts and the Web.
Requirement 3 automatically was satisfied by MySQL, and this fact, together with its simplicity and completeness, was the main reason we adopted LAMP. In order to satisfy the first two requirements, we developed a set of tables following a pattern, which is a common and standard way to solve a given problem, as in OO programming. We used the pattern called homomorphism, which is a simple representation of a many-to-one relationship. In practice, each part of the specific process with which we are dealing is represented in the database as records in two tables, an object table and an object definition table. Each object definition has an ID, actually a MySQL primary key number. Many objects share the same object definition, and the relationship between them is provided by a foreign key in the object table containing the corresponding definition ID.
An example might explain this design better. As stated in the introduction, our calorimeter is composed of many parts of different types. Each kind of part, such as a crystal, has many instances. The whole calorimeter has about 75,000 crystals. Parts and part definitions are kept in two separate database tables, as shown in Tables 1 and 2. Different instances of a part share the same part definition by the proper part ID in the partDefinition_id column. In these two tables, the part 33105000006306 is a type 1L barrel crystal, as shown by its partDefinition_id 195 found in the partDefinition table.
Table 1. The Part Table in REDACLE
Table 2. The partDefinition Table in REDACLE
The real benefit of this approach is flexibility. If, for any reason, new parts enter the game, the REDACLE database structure will not be modified. It is enough to add a new record to the definition table and relate it to new parts. But the REDACLE database is even more flexible; if we were building cars rather than calorimeters, the database structure would be exactly the same.
Activities are represented using the same approach: an Activity table holds instances of records described in the ActivityDescription table. Inserting a new activity within the work flow is a matter of supplying its description to the definition table and relating it to its occurrences in the Activity table. Again, with this design it is possible to describe a completely different business seamlessly. For mail delivery, for instance, the definition records could contain the description of the operations to be done on reception, shunting and delivery, while the Activity table could contain records with information about when a given operation was done on which parcel.
The work flow is defined by collecting several activity definitions and defining the order in which they should be executed. The interface software then checks that the activity being executed at a given time follows, in the work-flow definition, the last completed activity performed on a part. Activities can be skipped or repeated according to the interface software.
For quality control data we adopted the same homomorphic pattern by adding a further level of abstraction. We defined characteristics as data collected during a given activity performed on a part. The Characteristics table, however, does not store actual values, because they can be of a different nature—strings, numbers or even more complex types. The Characteristics table simply is a collection of keys: one of them links the characteristics to its definition in the charDefinition table. Actual characteristics are kept in separate tables according to their type.
Our process has three data types: single floating-point numbers, triplets of numbers and strings. The length of a crystal, for example, is a single number and is stored in the Value table. Some measurements are taken at different points along the crystal axis and in different conditions. The optical transmission, for one, is measured every 2cm at different wavelengths. It constitutes a triplet, the first number representing the position, the second the wavelength and the third the transmission. Each triplet is stored as a record in the multiValue table. The same is true for strings: operators perform a visual inspection of each crystal before manipulation, and they may provide comments to illustrate possible defects. In Tables 3 through 7, we show the above-mentioned tables for characteristics representation. The part 33101000018045 has been measured for length and transmission (TTO). Length is 229.7815mm in table value. The char_id field is 134821 pointing in the Characteristics table to charDefinition_id=6 corresponding to crystal length. The TTO is a set of triplets in the multiValue table. The visual inspection of that crystal resulted in the comment nonhomogeneous in the charValue table.
Table 3. charDefinition
|2||result of visual inspection||VIS_I_OPER||2|
Table 4. Characteristics
Table 6. multiValue
Again, this method makes REDACLE qualified for different types of businesses; in a dairy it could be used to record the bacterial load for each batch of milk, besides the producer (a character string), as characteristics. In addition, a completely new data type, such as pictures or sounds, could be added to the database without disturbing the schema simply by defining a new table. Adding pictures, for instance, implies the creation of a table with three fields: primary key, picture data as a BLOB and the relation to the Characteristics table, which is expressed by an integer ID. The MySQL code to create such a table is:
CREATE TABLE picture ( id INT NOT NULL AUTO_INCREMENT, data BLOB, char_id INT, INDEX (char_id), PRIMARY KEY (id) );
In our application, humans interact with the database in a multitude of ways—with the MySQL client, C++ and Java programs, Perl scripts and PHP scripts through Web pages. The use of a Web browser to render a graphical user interface (GUI) provides considerable advantages. The GUI is portable and does not require installation of specific components, no time is wasted on graphics, and the Web browser environment is well known by now to both operators or customers.
Figure 2. A set of five crystals being measured to determine their dimensions. Instruments need to report their measurements to the work-flow management system.
Another significant feature of REDACLE is its interface to other machines. During the calorimeter construction process, automatic machines take measurements of crystals and other parts without any human support (Figure 2). These machines, then, must be able to interact with REDACLE to learn the right sequence of operations to perform, to inform it about the start and the end time of the operations and to provide data to be stored as characteristics.
Our goal was to create a system that would allow almost any device to interact with REDACLE. We avoided imposing a given programming language or providing libraries for all the possible devices, because it does not scale with the market. Furthermore, some devices can be embedded systems with proprietary software.
We developed a dæmon called the Instrument Agent (IA) to act as an interface between REDACLE and instruments. The IA is a process that connects to an Internet port and is able to read ASCII characters and write them to that port. Instruments are required only to be able to connect to the network and send strings over the connection.
The sequence of operations is as follows:
After connecting, an instrument declares the part on which it is operating.
The IA queries the REDACLE database and searches for the last completed activity for that part in the work flow. The command the instrument should execute is stored in the database as a description field in the activityDefinition table.
The IA sends the instrument the proper command.
Upon recognition of the command, the instrument executes it, and the IA inserts a new activity in the REDACLE database after acknowledgement.
At the end of the job, the IA updates the activity just inserted, marks it as FINISHED and gets the data from the instrument as XML-formatted strings.
The result of the activity may contain both multiValue and charValue fields. Single values are formatted as follows:
<RE><FI>field name<VA>field value</VA></FI>...</RE>
<RE> stands for result, <FI> is field and <VA> is value. From the field name, the instrument agent obtains the characteristics definition ID and fills in the appropriate table according to the field value format (value for numbers and charValue for strings). The multiValue table is populated if the result is of the form:
<RE><NT>ntuple name <FI>field name<VA>field value</VA></FI> ... </NT></RE>
<NT> here stands for n-tuple, a collection of n elements.
Instrument software developers need not have knowledge of the details of the REDACLE database; they simply have to be instructed on the string formats to be used. No libraries to link to the program are prescribed, nor files to be included. The programming language is not imposed. The only requirement is to be able to provide a network connection to the IA.
Besides the GUI and instrument interfaces, we developed a set of ancillary command-line scripts for administrators and coordinators. In addition, we created a small library to run C++ programs and Perl scripts over the database without needing to formulate SQL queries.
REDACLE was released in our laboratory four months after the first discussions of the project were held. The whole system contains about 10,000 lines of code in Perl, C++, PHP and Java. The resources needed to run the software are small compared to the ones requested by the former system, hosted on a dual 800MHz Pentium III server. That system saturated the CPU at about 100% and occupied almost all of the 512MB of RAM. We also needed to upgrade all the client PCs, doubling their memories to support Java GUIs. So we planned a server upgrade to a dual 1GHz Pentium III with 1GB of RAM to improve the performance of the previous system. When using REDACLE, we discovered, amazingly enough, that CPU load was negligible and the average memory usage was 140–200MB.
It became clear that we had a need for tools to import from or export to the previous database, which still was used in other labs. These tools were built in Perl quickly, to read or write XML files, and we were able to import all the old data into REDACLE in one day.
Currently, we have about 13,000 parts in the database. The stored characteristics are 97,000, each of which may be composed of several values, for a total database size of 50MB. Out of the 15 tables the multiValue table, containing more than 1,000,000 records, is the largest at 41MB.
But the most spectacular result was obtained by comparing the time spent by operators in the calorimeter assembling. Before the introduction of REDACLE, 25% of the operators' time was wasted in the interaction with the work-flow manager. Using REDACLE, the interaction between operators and the database takes a negligible amount of time, improving the overall detector assembling efficiency.
Moreover, operators soon familiarized themselves with REDACLE's flexibility and started requesting new tools and interfaces. What before required weeks to develop or might have been almost impossible to build, now can be implemented in a short time with REDACLE and LAMP—between a few hours and two or three days.
The extraordinary flexibility of the REDACLE database design makes it practical for many different business processes and industrial applications, clearly illustrating how open-source software can be superior to proprietary offerings. In the future, we plan to support even more complex work-flow models as well as a library of frequently used queries and functions to be employed in the development of other REDACLE-based projects.
Giovanni Organtini (G.Organtini@roma1.infn.it) teaches Computing and Programming for Physicists at the University of Rome. He has been a Linux user both for research and teaching for nine years, and he has been married to Federica for ten. He is involved in the design and realization of particle detectors and advanced computing for high-energy Physics.
Luciano M. Barone (Luciano.Barone@roma1.infn.it) is Associate Professor of Physics at Rome University La Sapienza. Born to physics as a high-energy experimentalist, he quickly turned to computing applications in the same field. He likes to tackle large problems and tame them to the finest detail.