The Scalable Test Platform
The Open Source Development Lab (OSDL) is a nonprofit company working to enhance Linux scalability and telco capabilities. OSDL sponsors (www.osdlab.org/sponsors) have financed a full-scale test and development lab, complete with terabytes of storage and an array of SMP servers with anywhere from 2 to 16 CPUs. At the lab we provide developers with full access to enterprise-class machines via remote login.
We have been working with developers on the creation and execution of their tests. During this process, we have noticed a number of things that have to be done again and again for each test that comes through the lab. We listed the tasks that went into running an average test sequence and found a great deal of the process involved human interaction that could be automated. The Scalable Test Platform (STP) is the result of our attempt to automate the testing process from request to report.
Benchmarking itself has inherent conceptual problems that are outside the scope of both this article and the Scalable Test Platform effort. There are, however, solvable problems with current testing practices, and those are what the STP attempts to address. Please keep in mind that the benchmarking we focus on is completely different from the methods used to get marketable benchmark numbers.
The configuration of a testing environment is rarely as well documented as it should be. Documentation on the setup of systems used in tests is usually limited to what the tester believes is relevant to their specific research goals. This lack of detail will cause problems later on, when other analysts are examining the report. It is not uncommon for an analyst to have to duplicate an entire test sequence to get the data required to answer questions that come up later. It is also common practice for a testing setup to be only partially automated. The resulting human interaction at undocumented moments will also affect the repeatability of the results.
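One way to reduce this documentation gap is to record the test environment automatically at the start of every run. The following is only a minimal sketch of that idea, not part of the STP itself: the function name `capture_environment`, the field names and the `notes` entry are all hypothetical, and a real test report would need far more detail (kernel configuration, patches applied, disk layout and so on).

```python
import json
import os
import platform
import sys
from datetime import datetime, timezone


def capture_environment(extra=None):
    """Return a dict describing the machine a test is running on.

    Hypothetical helper for illustration; the fields chosen here are
    assumptions, not a format defined by the STP.
    """
    uname = platform.uname()
    record = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "hostname": uname.node,
        "os": uname.system,
        "kernel_release": uname.release,
        "machine": uname.machine,
        "cpu_count": os.cpu_count(),  # may be None if it cannot be determined
        "python_version": sys.version.split()[0],
    }
    # Free-form notes, for example any manual steps the tester performed.
    record["notes"] = dict(extra or {})
    return record


if __name__ == "__main__":
    # Store this JSON next to the test results so later analysts can see
    # what the run was actually executed on.
    print(json.dumps(capture_environment({"manual_steps": "none"}), indent=2))
```

Saving a record like this alongside each result set gives a later analyst at least a baseline description of the system without rerunning the test.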
Performance testing can require massive resources, in both time and hardware. How many open-source developers can get access to 50 two-way client servers on a gigabit network in order to test a server farm made up of multiple 8-CPU servers and a 16-CPU server? Few companies would stretch to provide access to hardware like that, and then only with a full entourage of managers and a potential revenue return that justifies the expense. A good idea conceived by a developer without access to hardware like this is likely to remain unexplored.
Currently no central archive exists of well-documented results for performance, stability and standards compliance tests. Researchers are forced to run their own tests or pick and choose from mediocre results to come up with a less-than-accurate guess. System administrators have no central place to look for starter information on what combination of kernel, distribution and hardware tends to work well for a workload similar to what they anticipate. This lack of available research leads to confusion about the performance and reliability of the many Linux choices.
Linux kernel developers cannot spend the time and effort required to run long performance and stability tests on their patches. Even if a developer is willing to spend the time testing a patch, testing software often requires a great deal of knowledge and specialized hardware just to install and configure. Occasionally this situation leads to problems being introduced into both the stable and development kernel trees. It also can allow problems solved previously to recur in future development but go unnoticed because of a lack of regression testing.
A number of developers have spoken up on the Linux kernel mailing list requesting a standard testing procedure for new patches. Many users and developers agree that a simple procedure, including performance, stability, standards compliance and regression testing, would benefit Linux kernel development.
While you can't test for every bug out there, you can check for common types of problems. It's generally not too difficult to add a regression test case to your testing suite after a bug is found and fixed. The problem is not in the creation of these tests. Most developers realize that it's a good idea to have a few synthetic tests available, and very often they do have them. The problem is that most developers can't or won't take the time to configure a full range of verification tests. While coding can be fun, testing is often quite boring. If developers could easily request a full test of their code and then continue working while someone else does the dirty work, we think they would be more inclined to attempt verification runs on their patches.