New Projects - Fresh from the Labs
If statistics is your game, and you're chasing an easy-to-use and comprehensive package that outputs great-looking charts, look no further than SOFA Statistics. According to the Web site: “SOFA is a user-friendly statistics, analysis and reporting program. It is free, with an emphasis on ease of use, learn as you go and beautiful output. SOFA lets you display results in an attractive format ready to share.”

My favorite feature of SOFA is its ability to generate HTML pages of your work dynamically, which anyone with a browser can view.
Installation
Binary packages are available for Linux, Windows and Mac (with Linux at the top of the list). Sadly, the Linux binary is only for Ubuntu, but the obligatory source also is available. Ubuntu users can grab the .deb and work things out for themselves, but the source is a bit trickier. At the time of this writing, the installation process was in a state of flux, so project maintainer Grant Paton-Simpson will have some special instructions up at the Web site for LJ readers when this article is printed.
As for library requirements, here's what Grant told me you need:
python (>= 2.6.2).
wx-common (>= 2.8.9.2).
python-wxversion (>= 2.8.9.2).
python-wxgtk2.8 (>= 2.8.9.2).
python-numpy (>= 1:1.2.1).
python-pysqlite2 (>= 1.0.1).
python-mysqldb (>= 1.2.2).
python-pygresql (>= 1:4.0).
python-matplotlib (>= 0.98.5.2).
python-webkit (>= 1.0.0).
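If you are building from source, a quick check of which Python modules are already importable can save some guesswork. The following is only a rough sketch: the mapping from package names to import names is my assumption rather than something Grant supplied, it does not check version numbers, and it is written for a current Python 3 interpreter rather than the 2.6 series listed above.

```python
import importlib.util

# Assumed mapping from the package names listed above to the top-level
# Python module each one provides. These import names are an assumption,
# not something stated by the project.
PACKAGE_TO_MODULE = {
    "python-wxgtk2.8": "wx",
    "python-numpy": "numpy",
    "python-pysqlite2": "pysqlite2",
    "python-mysqldb": "MySQLdb",
    "python-pygresql": "pgdb",
    "python-matplotlib": "matplotlib",
    "python-webkit": "webkit",
}


def missing_modules(module_names):
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = set(missing_modules(PACKAGE_TO_MODULE.values()))
    for package, module in sorted(PACKAGE_TO_MODULE.items()):
        status = "MISSING" if module in missing else "ok"
        print(f"{package:20} ({module}): {status}")
```

Anything reported as missing still needs to be installed through your distribution's package manager, and the version minimums above remain yours to verify.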
Once the program is installed, you should be able to find SOFA Statistics in your menu; otherwise, you'll need to run it from a terminal. If you need to use the command line, enter:
$ python /usr/share/pyshared/sofa/start.py
This path may be different on some distributions, and Grant may have made a link to a bin directory by the time this article is published (meaning you could start SOFA Statistics with a simple one-word command).
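Because the location of start.py can vary, a small helper that tries several candidate paths may be handy. This is a sketch under assumptions: only the first path comes from this article, the function names are hypothetical, and any further locations are ones you supply yourself on the command line. It prints the command rather than launching anything.

```python
import os
import sys

# Path quoted in the article; it may differ on other distributions.
ARTICLE_PATH = "/usr/share/pyshared/sofa/start.py"


def find_start_script(candidates):
    """Return the first candidate path that is an existing file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def launch_command(candidates, interpreter="python"):
    """Build the argument list that would start SOFA, or None if not found."""
    script = find_start_script(candidates)
    return [interpreter, script] if script else None


if __name__ == "__main__":
    # Extra locations to try can be passed as command-line arguments.
    command = launch_command([ARTICLE_PATH] + sys.argv[1:])
    if command is None:
        print("start.py not found; check where your distribution installed SOFA")
    else:
        print("Run:", " ".join(command))
```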
Usage
Grant has gone to a lot of effort making some excellent video tutorials, and there's no way I can improve upon them, so instead, I concentrate on highlighting cool features here. Again, Grant appears to be one step ahead of me in that he's provided a default set of preloaded values you can use to explore the project with ease, rather than going through the laborious process of first having to learn how to enter data and then making it display something meaningful. For now, let's look at the three main sections: Report Tables, Charts and Statistics.
Under Report Tables, choose some random settings under Table Type, provide names for Title and Subtitle, and choose some of the available data fields with the Add button. Now click Run, and a swank new table is presented to you. Don't like the aesthetics? No problem. The Style output using... drop-down box lets you change the border to something more pleasing—a nice touch.
The pièce de résistance is probably the Charts section. This is where you can play around with the charts you see here in the screenshots, and more and more chart types are being added over time. Whether you want a bar graph, pie chart, line graph or something like a Scatterplot configuration, chances are it's doable. Play with some values in the Variables section, choose a Chart Type, click Run and a beautiful chart appears.
The Statistics section is where the elegance of design and data flow really comes into play. This section is a bit beyond me, but here you can run statistical tests on your data, with a focus on the tests most users need most of the time. You can choose from common tests, such as ANOVA or Chi Square, or run through a checklist of choices to find the one that's right for you. Click Configure Test on the right, and you'll be presented with the final screen.
From here, you can choose which variables and groupings you want to test against. And finally, click Run. This section gives you the most impressive of the readouts and provides a comprehensive bundle of tables and graphs of analyzed statistics.
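For readers who, like me, are hazy on what a test such as Chi Square actually computes, here is a small sketch of the standard Pearson chi-square statistic for a table of counts. It is not SOFA's code, and the counts in the demo are invented purely for illustration.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic for a table (list of rows) of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(observed):
    """Degrees of freedom for a rows-by-columns table of counts."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


if __name__ == "__main__":
    # Invented counts: two groups (rows) by two outcomes (columns).
    table = [[10, 20], [20, 10]]
    print("chi-square:", chi_square_statistic(table))
    print("degrees of freedom:", degrees_of_freedom(table))
```

The larger the statistic relative to its degrees of freedom, the less plausible it is that the groupings and outcomes are unrelated. A package like SOFA takes care of turning that into a p-value and a readable report.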
However, one of the most impressive and practical features under all three of these main sections is the Send output to... feature, with its View button. Here you actually can view each page of output in any Web browser in HTML format. This gives the project some instant credibility and practicality in that any work you do in SOFA can be opened instantly by anyone (like your coworkers) on their own computers, without needing to install SOFA Statistics. Plus, the information they see will be presented professionally with some impressive graphics to boot.
Although SOFA Statistics is still in its slightly buggy developmental stage, project maintainer Grant Paton-Simpson has shown an impressive grasp of what needs to be included in SOFA, from the small touches to the big. My hope is that this program becomes an adopted industry standard of sorts, mentioned in everyday conversation by organization workers the world over. And, given its free and multiplatform nature, combined with a very canny coder and designer, this hope of mine may not be an unrealistic one.
John Knight is the New Projects columnist for Linux Journal.