ROOT: An Object-Oriented Data Analysis Framework
ROOT is a system for large scale data analysis and data mining. It is being developed for the analysis of Particle Physics data, but can be equally well used in other fields where large amounts of data need to be processed.
After many years of experience in developing interactive data analysis systems like PAW and PIAF (see Resources), we realized that the growth and maintainability of these products, written in FORTRAN and using 20-year-old libraries, had reached its limits. Although still popular in the physics community, these systems do not scale up to the challenges offered by the next generation particle accelerator, the Large Hadron Collider (LHC), currently under construction at CERN, in Geneva, Switzerland. The expected amount of data produced by the LHC will be on the order of several petabytes (1PB = 1,000,000GB) per year. This is two to three orders of magnitude more than what is being produced by the current generation of accelerators.
Therefore, in early 1995, Rene Brun and I started developing a system, intending to overcome the deficiencies of these previous programs. One of the first decisions we made was to follow the object-oriented analysis and design methodology and to use C++ as our implementation language. Although all of our previous programming experience was in FORTRAN, we soon realized the power of OO and C++, and after some initial “throw-away” prototyping, the ROOT system began to take shape.
In November 1995, we gave the first public presentation of ROOT at CERN and, at the same time, version 0.5 was released via the Web. By then, Nenad Buncic and Valery Fine had joined our team.
Since the initial release, there has been a constantly increasing number of users. In response to comments and feedback, we've been regularly releasing new versions containing bug fixes and new features. In January 1997, version 1.0 was released and in March 1998 version 2.0. Since the release of version 1.0, more than 9,300 copies of the ROOT binaries have been downloaded from our web site, about 500 people have registered as ROOT users, and the web site gets up to 100,000 hits per month.
ROOT is currently being used in many different fields such as physics, astronomy, biology, genetics, finance, insurance, pharmaceuticals, etc.
The source and binaries for many different platforms can be downloaded from the ROOT web site (http://root.cern.ch/). The current version can be used and distributed freely as long as proper credit is given and copyright notices are maintained. For commercial use, the authors would like to be notified.
The main components of the ROOT system are:
A hierarchical object-oriented database (machine independent, highly compressed, supporting schema evolution and object versioning)
A C++ interpreter
Advanced statistical analysis tools (classes for multi-dimensional histogramming, fitting and minimization)
Visualization tools (classes for 2D and 3D graphics including an OpenGL interface)
A rich set of container classes that are fully I/O aware (list, sorted list, map, btree, hashtable, object array, etc.)
An extensive set of GUI classes (windows, buttons, combo-box, tabs, menus, item lists, icon box, tool bar, status bar and many others)
An automatic HTML documentation generation facility
Run-time object inspection capabilities
Client/server networking classes
Shared memory support
Remote database access, either via a special daemon or via the Apache web server
Ported to all known UNIX and Linux systems and also to Windows 95 and NT
The complete system consists of about 450,000 lines of C++ and 80,000 lines of C code. There are about 310 classes grouped in 24 different frameworks, each class represented by its own shared library.
One of the key components of the ROOT system is the CINT C/C++ interpreter. CINT, written by Masaharu Goto of Hewlett Packard Japan, covers 95% of ANSI C and about 85% of C++. Template support is being worked on, and exceptions are still missing. CINT is complete enough to be able to interpret its own 70,000 lines of C and to let the interpreted interpreter interpret a small program.
The advantage of a C/C++ interpreter is that it allows for fast prototyping, since it eliminates the typical time consuming edit/compile/link cycle. Once a script or program is finished, you can compile it with a standard C/C++ compiler (gcc) to machine code and enjoy full machine performance. Since CINT is very efficient (for example, for/while loops are byte-code compiled on the fly), it is quite possible to run small programs in the interpreter. In most cases, CINT outperforms other interpreters like Perl and Python.
Existing C and C++ libraries can easily be interfaced to the interpreter. This is done by generating a dictionary from the function and class definitions. The dictionary provides CINT with all necessary information to be able to call functions, create objects and call member functions. A dictionary is easily generated by the program rootcint that uses the library header files as input and produces a C++ file containing the dictionary as output. You compile the dictionary and link it with the library code into a single shared library. At run-time, you dynamically link the shared library, and then you can call the library code via the interpreter. This can be a very convenient way to quickly test some specific library functions. Instead of having to write a small test program, you just call the functions directly from the interpreter prompt.
The CINT interpreter is fully embedded into the ROOT system. It allows the ROOT command line, scripting and programming languages to be identical. The embedded interpreter dictionaries provide the necessary information to automatically create GUI elements like context pop-up menus unique for each class and for the generation of fully hyperized HTML class documentation. Furthermore, the dictionary information provides complete run-time type information (RTTI) and run-time object introspection capabilities.
|Designing Electronics with Linux||May 22, 2013|
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
- Designing Electronics with Linux
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Dynamic DNS—an Object Lesson in Problem Solving
- Using Salt Stack and Vagrant for Drupal Development
- Validate an E-Mail Address with PHP, the Right Way
- Tech Tip: Really Simple HTTP Server with Python
- Build a Skype Server for Your Home Phone System
- Why Python?
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Reply to comment | Linux Journal
28 min 6 sec ago
- Reply to comment | Linux Journal
1 hour 18 min ago
- Not free anymore
5 hours 20 min ago
9 hours 7 min ago
- Reply to comment | Linux Journal
9 hours 15 min ago
- Understanding the Linux Kernel
11 hours 30 min ago
13 hours 59 min ago
- Kernel Problem
1 day 2 min ago
- BASH script to log IPs on public web server
1 day 4 hours ago
1 day 8 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?