ROOT: An Object-Oriented Data Analysis Framework

A report on a data analysis tool currently being developed at CERN.
Installation

The binaries and sources of ROOT can be downloaded from http://root.cern.ch/root/Version200.html. After downloading, uncompress and unarchive (using tar) the file root_v2.00.Linux.2.0.33.tar.gz in your home directory (or in a system-wide location such as /opt). This procedure will produce the directory /root. This directory contains the following files and subdirectories:

  • AA_README: read this file before starting

  • bin: directory containing executables

  • include: directory containing the ROOT header files

  • lib: directory containing the ROOT libraries (in shared library format)

  • macros: directory containing system macros (e.g., GL.C to load OpenGL libs)

  • icons: directory containing xpm icons

  • test: some ROOT test programs

  • tutorials: example macros that can be executed by the bin/root module

Before using the system, you must set the environment variable ROOTSYS to the root directory, e.g., export ROOTSYS=/home/rdm/root, and you must add $ROOTSYS/bin to your path. Once done, you are all set to start rooting.

First Interactive Session

In this first session, start the ROOT interactive program root. This program gives access via a command-line prompt to all available ROOT classes. By typing C++ statements at the prompt, you can create objects, call functions, execute scripts, etc. Go to the directory $ROOTSYS/tutorials and type:

bash$ root
root [0] 1+sqrt(9)
(double)4.000000000000e+00
root [1] for (int i = 0; i < 5; i++)<\n>
printf("Hello %d\n", i)
Hello 0
Hello 1
Hello 2
Hello 3
Hello 4
root [2] .q

As you can see, if you know C or C++, you can use ROOT. No new command-line or scripting language to learn. To exit, use .q, which is one of the few “raw” interpreter commands. The dot is the interpreter escape symbol. There are also some dot commands to debug scripts (step, step over, set breakpoint, etc.) or to load and execute scripts.

Let's now try something more interesting. Again, start root:

bash$ root
root [0] TF1 f1("func1", "sin(x)/x", 0, 10)
root [1] f1.Draw()
root [2] f1.Dump()
root [3] f1.Inspect()
 // Select File/Close Canvas
root [4] .q

Figure 1. Output of f1.Draw()

Here you create an object of class TF1, a one-dimensional function. In the constructor, you specify a name for the object (which is used if the object is stored in a database), the function and the upper and lower value of x. After having created the function object you can, for example, draw the object by executing the TF1::Draw member function. Figure 1 shows how this function looks. Now, move the mouse over the picture and see how the shape of the cursor changes whenever you cross an object. At any point, you can press the right mouse button to pop-up a context menu showing the available member functions for the current object. For example, move the cursor over the function so that it becomes a pointing finger, and then press the right button. The context menu shows the class and name of the object. Select item SetRange and put -10, 10 in the dialog box fields. (This is equivalent to executing the member function f1.SetRange(-10,10) from the command-line prompt, followed by f1.Draw().) Using the Dump member function (that each ROOT class inherits from the basic ROOT class TObject), you can see the complete state of the current object in memory. The Inspect function shows the same information in a graphics window.

Histogramming and Fitting

Let's start root again and run the following two macros:

bash$ root
root [0] .x hsimple.C
root [1] .x ntuple1.C
 // interact with the pictures in the canvas
root [2] .q

Note: if the above doesn't work, make sure you are in the tutorials directory.

Figure 2. Output of ntuple1.C

Macro hsimple.C (see $ROOTSYS/tutorials/hsimple.C) creates some 1D and 2D histograms and an Ntuple object. (An Ntuple is a collection of tuples; a tuple is a set of numbers.) The histograms and Ntuple are filled with random numbers by executing a loop 25,000 times. During the filling, the 1D histogram is drawn in a canvas and updated each 1,000 fills. At the end of the macro, the histogram and Ntuple objects are stored in a ROOT database.

The ntuple1.C macro uses the database created in the previous macro. It creates a canvas object and four graphics pads. In each of the four pads, a distribution of different Ntuple quantities is drawn. Typically, data analysis is done by drawing in a histogram with one of the tuple quantities when some of the other quantities pass a certain condition. For example, our Ntuple contains the quantities px, py, pz, random and i. The command:

ntuple->Draw("px", "pz < 1")

will fill a histogram containing the distribution of the px values for all tuples for which pz < 1. Substitute for the abstract quantities used in this example quantities such as name, sex, age, length, etc., and you can easily understand that Ntuples can be used in many different ways. An Ntuple of 25,000 tuples is quite small. In typical physics analysis situations, Ntuples can contain many millions of tuples. Besides the simple Ntuple, the ROOT system also provides a Tree. A Tree is an Ntuple generalized to complete objects. That is, instead of sets of tuples, a Tree can store sets of objects. The object attributes can be analyzed in the same way as the tuple quantities. For more information on Trees, see the ROOT HOWTOs at http://root.cern.ch/root/Howto.html.

During data analysis, you often need to test the data with a hypothesis. A hypothesis is a theoretical/empirical function that describes a model. To see if the data matches the model, you use minimization techniques to tune the model parameters so that the function best matches the data; this is called fitting. ROOT allows you to fit standard functions like polynomials, Gaussian exponentials or custom defined functions to your data. In the top right pad in Figure 2, the data has been fit with a polynomial of degree two (red curve). This was done by calling the Fit member function of the histogram object:

hprofs->Fit("pol2")

Moving the cursor over the canvas allows you to interact with the different objects. For example, the 3D plot in the lower-right corner can be rotated by clicking the left mouse button and moving the cursor.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Root

Leo Tilson's picture

I am often a little nervous when approaching a software package for the first time. This is exactly the sort of article to help me overcome my fears. From what I hear, ROOT is probably one of the most powerful databases out there - I am impressed for instance by the way in which the data can be highly compressed - but you also seem to have devoted a huge amount of effort into making it as user friendly as possible.

I have been reviewing my use of databases over the last few days because of some concern about MySQL, these concerns being prompted by the legal action being taken by Oracle against Google, and Oracle's reduction in support for OpenSolaris. I had never considered ROOT as an alternative to MySQL, until recently thinking of it as a rather specialised tool. I was pleased to note however that there are bindings for Ruby amongst other languages.

I wonder about your figures for OS shares? I have just installed ROOT, but instead of downloading it from its official site I have installed it from the Debian repository. If many people do this, then the numbers you record for Linux usage may underestimate its Linux usage.

Thank you.

Leo

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix