New Projects - Fresh from the Labs

SOFA—Statistics Open For All (

If statistics is your game, and you're chasing an easy-to-use and comprehensive package that outputs great-looking charts, look no further. According to the Web site: “SOFA is a user-friendly statistics, analysis and reporting program. It is free, with an emphasis on ease of use, learn as you go and beautiful output. SOFA lets you display results in an attractive format ready to share.”

SOFA Statistics provides a highly flexible visualization system for analyzing complex data.

A montage of some of SOFA's beautiful graphs and charts, generated on the fly.

My favorite SOFA feature is its ability to generate HTML pages of your work dynamically, which anyone with a browser can view.


Binary packages are available for Linux, Windows and Mac (with Linux at the top of the list). Sadly, the Linux binary is only for Ubuntu, but the obligatory source also is available. Ubuntu users can grab the .deb and work things out for themselves, but the source is a bit trickier. At the time of this writing, the installation process was in a state of flux, so project maintainer Grant Paton-Simpson will have some special instructions up at the Web site for LJ readers when this article is printed.

As for library requirements, here's what Grant told me you need:

  • python (>= 2.6.2).

  • wx-common (>=

  • python-wxversion (>=

  • python-wxgtk2.8 (>=

  • python-numpy (>= 1:1.2.1).

  • python-pysqlite2 (>= 1.0.1).

  • python-mysqldb (>= 1.2.2).

  • python-pygresql (>= 1:4.0).

  • python-matplotlib (>=

  • python-webkit (>= 1.0.0).
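
If you're building from source, a quick way to see which of these libraries Python can already find is to try importing each one. The following is only a minimal sketch: the module names are my assumption of what the packages above install (wx-common provides no Python module, so it's skipped), and the check says nothing about version numbers.

```python
# Assumed import names for the packages listed above; adjust for your distribution.
ASSUMED_MODULES = [
    "wx",          # python-wxgtk2.8
    "wxversion",   # python-wxversion
    "numpy",       # python-numpy
    "pysqlite2",   # python-pysqlite2
    "MySQLdb",     # python-mysqldb
    "pgdb",        # python-pygresql
    "matplotlib",  # python-matplotlib
    "webkit",      # python-webkit
]


def find_missing(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing
```

Run it with the same interpreter you intend to use for SOFA: calling `find_missing(ASSUMED_MODULES)` returns a list of whatever still needs installing, and an empty list means every import succeeded.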

Once the program is installed, you should be able to find SOFA Statistics in your menu; otherwise, you'll need to run it from a terminal. If you need to use the command line, enter:

$ python /usr/share/pyshared/sofa/

This path may be different on some distributions, and Grant may have made a link to a bin directory by the time this article is published (meaning you could start SOFA Statistics with a simple one-word command).


Grant has put a lot of effort into some excellent video tutorials, and there's no way I can improve upon them, so instead, I concentrate on highlighting cool features here. Again, Grant appears to be one step ahead of me: he's provided a default set of preloaded values you can use to explore the project with ease, rather than going through the laborious process of learning how to enter data and then making it display something meaningful. For now, let's look at the three main sections: Report Tables, Charts and Statistics.

Under Report Tables, choose some random settings under Table Type, provide names for Title and Subtitle, and choose some of the available data fields with the Add button. Now click Run, and a swank new table is presented to you. Don't like the aesthetics? No problem. The Style output using... drop-down box lets you change the border to something more pleasing—a nice touch.

The pièce de résistance is probably the Charts section. This is where you can play around with the charts you see here in the screenshots, and more and more chart types are being added over time. Whether you want a bar graph, pie chart, line graph or something like a Scatterplot configuration, chances are it's doable. Play with some values in the Variables section, choose a Chart Type, click Run and a beautiful chart appears.

The Statistics section is where the elegance of the design and data flow really comes into play. This section is a bit beyond me, but here you can run statistical tests on your data, with a focus on the kind of tests most users need, most of the time. You can choose from common tests, such as ANOVA or Chi Square, or run through a checklist of choices to find the test that's right for you. Click Configure Test on the right, and you'll be presented with the final screen.

From here, you can choose which variables and groupings you want to test against. And finally, click Run. This section gives you the most impressive of the readouts and provides a comprehensive bundle of tables and graphs of analyzed statistics.
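
To give a rough idea of what a test like Chi Square works out behind the scenes, here is a small sketch of the textbook Pearson chi-square statistic for a table of counts. It is not SOFA's code, the function name is made up for illustration, and the counts in the usage note are invented.

```python
def chi_square_statistic(observed):
    """Return (statistic, degrees_of_freedom) for a table of observed counts.

    observed is a list of rows, each row a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = float(sum(row_totals))

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom
```

For the made-up table `[[10, 20], [20, 10]]`, every expected count is 15, so the statistic comes out at about 6.67 with 1 degree of freedom. A package like SOFA then turns that figure into a p-value and a readable report for you.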

However, one of the most impressive and practical features under all three of these main sections is the Send output to... feature, with its View button. Here you actually can view each page of output in any Web browser in HTML format. This gives the project some instant credibility and practicality in that any work you do in SOFA can be opened instantly by anyone (like your coworkers) on their own computers, without needing to install SOFA Statistics. Plus, the information they see will be presented professionally with some impressive graphics to boot.
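
The reason HTML output travels so well is that a report is just text any browser can render. As a minimal sketch of the idea (this is not how SOFA itself builds its pages, and the function name and layout are hypothetical), a table can be turned into a standalone page like this:

```python
from html import escape


def render_html_table(title, headers, rows):
    """Return a standalone HTML page, as a string, containing one table."""
    head_cells = "".join(
        "<th>{}</th>".format(escape(str(h))) for h in headers
    )
    body_rows = "\n".join(
        "<tr>"
        + "".join("<td>{}</td>".format(escape(str(cell))) for cell in row)
        + "</tr>"
        for row in rows
    )
    safe_title = escape(title)
    return (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset='utf-8'>"
        "<title>{0}</title></head>\n"
        "<body><h1>{0}</h1>\n"
        "<table border='1'>\n<tr>{1}</tr>\n{2}\n</table>\n"
        "</body></html>\n"
    ).format(safe_title, head_cells, body_rows)
```

Write the returned string to a file ending in .html, and a coworker can open it by double-clicking, with no statistics package installed.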

Although SOFA Statistics is still in its slightly buggy developmental stage, project maintainer Grant Paton-Simpson has shown an impressive grasp of what needs to be included in SOFA, from the small touches to the big. My hope is that this program becomes an adopted industry standard of sorts, mentioned in everyday conversation by organization workers the world over. And, given its free and multiplatform nature, combined with a very canny coder and designer, this hope of mine may not be an unrealistic one.


John Knight is the New Projects columnist for Linux Journal.
