Speeding Up the Scientific Process
As a research staff member at NASA Ames Research Center at Moffett Field, in the heart of Silicon Valley, California, I was a part of a team that used Linux for some interesting and advanced research. I worked in the Neuro Engineering Lab at Ames in support of the construction of a brain-computer interface, a system by which EEG (electroencephalogram—brain wave) signals can be used to control electronic systems and robotic devices. My job was to take ideas and prototype code from primary lab researchers and develop and evaluate efficient implementations of them for use in real-time data processing with human subjects. Often, I would be handed only a rough sketch of an algorithm or a fragment of code to see if it could be used on the brain wave data we had been collecting.
Matlab and the free software GNU Octave were great tools for doing this work; they allowed me to develop effective methods for data processing and data visualization that would have been a real pain to construct in C or, heaven forbid, Fortran. Ease of implementation is a great concern when dealing with large amounts of experimental code that may or may not end up as a finished product.
When a process did indeed fit the bill, and it was time to start thinking about using it in our real-time data processing system, it would become immediately apparent that the advantages of Matlab in programming ease did not come without cost. The cost was speed. Processing the data representing a single second, for instance, could take minutes or even hours. Obviously, this would not do in a real-time system. Also, any code deemed worthy would have to end up in C or C++ to fit into our existing code base. To address both of these issues, we rewrote much of our Matlab code in C.
Now, if you have some experience with Matlab, you might think, “but Matlab already exports to C on its own” or, “what about the new Matlab JIT compiler?” Although the new JIT compiler may speed up code in places (looking at the documentation, there are many exceptions to what it will try to optimize), it cannot equal the efficiency of well-written, compiled C code. As for the C export feature of Matlab, the exported code is as slow as the interpreted code running inside the Matlab environment and is fairly difficult to merge into existing projects without a bit of interface work. And none of this helps users of GNU Octave or those who can't keep up with expensive Matlab upgrades. In general, it seems the best way to turn something originally developed in Matlab into fast, production-level code is to do it by hand.
This article first offers a few tips on how to write somewhat more efficient Matlab code. Then it illustrates the process of integrating C code into a Matlab program using MEX functions, in order to speed up program execution while still tweaking and evaluating it in the Matlab environment. From there, it is a relatively short step to bring the entire project into C or C++. Most of the information here is available in different places on-line; this article is presented as a sort of HOWTO or personal account of bringing a piece of experimental Matlab code into the real world.
For this article, I use as an example a piece of code developed to isolate rapid changes of voltage measured on the surface of the head. The code uses an algorithm called multicomponent event-related potential estimation, or simply mcERP. I first looked into porting Matlab code to C when working on this algorithm. When testing the algorithm with different configuration parameters and input data sets, I usually would have to let it run overnight. No amount of optimization inside Matlab was able to drastically cut down its execution time.
After full conversion to C, the algorithm usually would take on the order of tens of seconds to execute with a large input data set. I consider this an extreme case of time savings, a result of the highly nested, looping nature of the algorithm (see Listing 1). I would not expect most algorithms to speed up this much. This performance still is not quite good enough for real-time operation, but it is close enough that we could start to look at data reduction techniques, parallelizing the code and other tricks to pare it down to something closer to the speed we need.
The main area in which the performance of Matlab suffers greatly is looping. Matlab abhors the loop; it was designed so that vectorizing the code, that is, applying functions over a range of data in a matrix, is more efficient than iterating through the data. Unfortunately, vectorization works only with certain kinds of operations, and with high-dimensional matrices it often produces code that is hard to read and understand. Looping happens to be an area in which C excels: iterating through a matrix using pointer arithmetic is an extremely efficient and sometimes more understandable way to do operations over large chunks of data. Most of the effort in porting Matlab code to C for speed is spent on optimizing nested loop structures.
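As a minimal sketch of the kind of loop meant here, the following C function walks a matrix held in one contiguous block of doubles using pointer arithmetic, with a conditional inside the loop. The function name sum_above and its threshold parameter are made up for illustration; they are not part of the mcERP code.

```c
#include <stddef.h>

/* Hypothetical helper (not from the mcERP code): add up every element
 * of an n_rows x n_cols matrix that is greater than `threshold`.
 * The matrix is assumed to live in one contiguous block of doubles,
 * so a single pointer can sweep the whole thing. */
double sum_above(const double *data, size_t n_rows, size_t n_cols,
                 double threshold)
{
    const double *p = data;                     /* current element      */
    const double *end = data + n_rows * n_cols; /* one past the last    */
    double total = 0.0;

    while (p < end) {
        if (*p > threshold)   /* per-element test inside the loop */
            total += *p;
        ++p;                  /* step to the next element in memory */
    }
    return total;
}
```

Because the pointer simply advances through memory, the storage order of the matrix does not matter for a whole-matrix operation like this one.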
Other ways to code in Matlab more efficiently include:
Preallocate every array, even moderately sized ones, using the zeros() function before assigning values to it, instead of having Matlab grow the array as values are assigned.
As mentioned in the Matlab documentation, store all of your code in functions instead of scripts. This offers about a two- to threefold speed increase.
Organize data such that operations over a range of a matrix operate in a column-major fashion. Matlab stores arrays like Fortran does, in that the data in a particular matrix column is contiguous in memory. This is unlike C, where the data in a matrix row is contiguous in memory. If you are going to apply functions over a range of data, store that data in a column rather than along a row in the matrix. This is completely anecdotal and may be false, but it seems to make sense.
Try to avoid internal type conversions that happen over and over. This is another one where I don't have hard proof, but as Matlab usually does not make you explicitly label the data types of variables, it is sometimes easy to end up with a loop of repeated implicit type conversions. It is better to convert to a common data type first, then do your repeated operations. The same holds when programming in C or C++, but the problem is harder to detect right away in Matlab, because variables are almost never explicitly typed.
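The column-major layout described in the list above matters again once the data reaches C, because a C function handed a Matlab array sees it in that order. Here is a small sketch, with the hypothetical names cm_index and column_sums, of indexing such a buffer so that the inner loop walks contiguous memory:

```c
#include <stddef.h>

/* Offset of element (row, col) in a column-major matrix that has
 * n_rows rows: all of column 0 comes first, then column 1, and so on.
 * Hypothetical helper name, for illustration only. */
static size_t cm_index(size_t row, size_t col, size_t n_rows)
{
    return row + col * n_rows;
}

/* Sum each column of a column-major n_rows x n_cols matrix into
 * out[0..n_cols-1]. The inner loop runs down a column, so consecutive
 * iterations touch adjacent doubles in memory. */
void column_sums(const double *a, size_t n_rows, size_t n_cols,
                 double *out)
{
    size_t row, col;

    for (col = 0; col < n_cols; ++col) {
        double s = 0.0;
        for (row = 0; row < n_rows; ++row)
            s += a[cm_index(row, col, n_rows)];
        out[col] = s;
    }
}
```

Swapping the two loops would give the same sums but stride through memory n_rows elements at a time. Whether that makes a measurable difference depends on the matrix size and the machine.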
With that out of the way, let's take a look at a code snippet from the mcERP algorithm (Listing 1). This represents one of the many nested loop structures within the code. The mcERP algorithm relies on a complicated process of iterative Bayesian waveform estimation. A number of the following loopy bits are in the code, all of which are run repeatedly to home in on waveform shapes present in the data.
Listing 1. Nested Loops in the mcERP Algorithm
One can see how this sort of structure would not run so quickly with an interpreter that does not perform well with loops. However, because of the inner if statement, the code cannot be vectorized without adding an inner function call—which can't be any better. This code, then, is a prime candidate for translation to C/C++. However, it is nice to have a foot in Matlab when developing the algorithm, because it is easy to produce pretty pictures like those in Figure 1. So, we write something called a MEX function. That way, we can have the core fast bits run quickly while retaining interface points around those parts in Matlab that tune and inspect the overall algorithm.
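To give a rough idea of the shape of a MEX function, here is a sketch in C. The kernel sum_within_limit is a made-up stand-in with a nested loop and an inner if; it is not the mcERP code from Listing 1. The gateway uses the standard mexFunction entry point and the mx* calls from mex.h. The #ifdef guard assumes the build defines MATLAB_MEX_FILE, as Matlab's mex script does, so the kernel can also be compiled and tested on its own without Matlab.

```c
#include <stddef.h>

/* Hypothetical kernel (a stand-in, not the mcERP code): for each column
 * of a column-major n_rows x n_cols matrix, sum the entries whose
 * magnitude is strictly below `limit`. */
void sum_within_limit(const double *x, size_t n_rows, size_t n_cols,
                      double limit, double *out)
{
    size_t row, col;

    for (col = 0; col < n_cols; ++col) {
        double s = 0.0;
        for (row = 0; row < n_rows; ++row) {
            double v = x[row + col * n_rows];
            if (v < limit && v > -limit)  /* the inner if that blocks vectorizing */
                s += v;
        }
        out[col] = s;
    }
}

#ifdef MATLAB_MEX_FILE  /* assumed to be defined by the mex build */
#include "mex.h"

/* Gateway, called from Matlab as:  out = sum_within_limit(x, limit) */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    size_t n_rows, n_cols;
    double limit;

    (void)nlhs;  /* a single output is always produced */
    if (nrhs != 2)
        mexErrMsgTxt("Usage: out = sum_within_limit(x, limit)");
    if (!mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]))
        mexErrMsgTxt("x must be a real double matrix");

    n_rows = mxGetM(prhs[0]);
    n_cols = mxGetN(prhs[0]);
    limit = mxGetScalar(prhs[1]);

    /* 1 x n_cols row vector that receives the column sums */
    plhs[0] = mxCreateDoubleMatrix(1, n_cols, mxREAL);
    sum_within_limit(mxGetPr(prhs[0]), n_rows, n_cols, limit,
                     mxGetPr(plhs[0]));
}
#endif
```

Saved as sum_within_limit.c and built with the mex command, this would be callable from Matlab under the file's name. Keeping the kernel free of any mx* calls is a design choice: the same function can later be dropped unchanged into a pure C or C++ code base.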
Figure 1 is an example output from the mcERP algorithm, showing estimates of the fundamental waveforms driving real-time potential readouts at scalp electrodes during simulated experimental trials. Each of these waveshapes is the result of many iterations of progressively more accurate Bayesian waveshape estimation, requiring many calculations per iteration. These results can take many hours to achieve with Matlab but take seconds or minutes if portions of the algorithm are rewritten in C.
The photograph in Figure 2 shows our setup for conducting brain-computer interface experiments with real-time feedback. With the three large displays, we have complete control over what the subject sees within most of his or her field of view. All of the number crunching and display software was developed in-house and runs on Linux.