An Introduction to Rlab: A High Level Language for Scientific and Engineering Applications

Rlab stands for “our lab”. It is available to almost everyone who needs a computational tool for scientific and engineering applications, because it is freely available, and it runs on many platforms.
Input/Output

The user end of the I/O system was designed to be as simple as possible, without restricting capability for those who need it. Files and processes are identified with strings. Each function capable of reading or writing will open and close files as necessary; however, open and close functions are provided for special circumstances. There are several variations of the read and write functions that understand all of Rlab's internal data structures and offer some degree of Matlab file compatibility. The getline and strsplt functions handle special ASCII input needs, fread handles binary input, and fprintf handles specially formatted output, rounding out the I/O functions.
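To make this concrete, here is a minimal sketch of ordinary file I/O; the filenames and data are made up, and the exact calling sequences should be checked against the Rlab reference manual:

    // Write a matrix to a file, then read it back.  write() and read()
    // open and close the file automatically; close() is only needed if
    // you want to force the file closed (for example, after appending
    // with fprintf).
    x = rand(3,3);
    write("x.dat", x);            // save x in Rlab's own format
    read("x.dat");                // load it back into the workspace

    // C-style formatted output with fprintf:
    fprintf("results.txt", "largest element of x = %g\n", max(max(x)));
    close("results.txt");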

As you might expect, the strings "stdin", "stdout", and "stderr" point to their Unix system counterparts. Any function that performs file I/O can also perform process I/O through pipes, by simply replacing the filename string with a process-name string. A process-name string is a string that has a | as its first character. The rest of the string is any shell command. Rlab will create a pipe, forking and invoking the shell. The direction of the pipe is inferred from usage. This facility makes getting data to and from other programs rather simple. A good example is the Rlab-Gnuplot interface, which is written entirely in the Rlab language, using the process I/O capability to get data and commands to Gnuplot.
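For instance, here is a minimal sketch of process I/O; the Gnuplot commands are only illustrative, and the invocation may need adjusting on your system:

    // The leading | turns the string into a process-name; Rlab opens a
    // pipe to the shell command that follows it.
    GP = "|gnuplot";

    fprintf(GP, "set title 'a simple test'\n");   // commands go down the pipe
    fprintf(GP, "plot sin(x)\n");

    close(GP);    // closing the pipe terminates the Gnuplot process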

As a demonstration, we will explore process-I/O with a simple interface to the X-Geomview program. X-Geomview is a powerful 3-dimensional graphics engine, with an interactive GUI. X-Geomview can read data from files, and it can also read data/commands from stdin. X-Geomview uses Motif, but statically linked Linux binaries are available (in addition to the sources) from www.geom.umn.edu/software/geomview/docs/geomview.html.

In this example I will generate the data for, and plot, the classic sombrero. The code is listed in Listing 2.

The data for the example is complete by line 14; from there on, we are simply sending the data to the X-Geomview process. The variable GM holds a string whose first character is a |, indicating that a process should be opened to the remainder of the string. The following statements (lines 16-21) send the object definition to X-Geomview, and lines 23-30 contain the nested for-loops that send the polygon vertex coordinates to X-Geomview. A snapshot of the X-Geomview window containing the result is presented in Figure 3. Of course, a much better way to make this type of plot is to create a function that automates the X-Geomview interface (this will be included in the next release of Rlab).
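Since Listing 2 is not reproduced here, the following sketch shows the same idea in compressed form. The line numbers mentioned above refer to the listing, not to this sketch, and the Geomview invocation and the MESH object syntax are written from memory, so check them against the Geomview documentation:

    // Generate the sombrero data on an nx-by-ny grid.
    nx = 33; ny = 33;
    x = zeros(1,nx); y = zeros(1,ny); z = zeros(nx,ny);
    for (i in 1:nx) { x[i] = -8 + 16*(i-1)/(nx-1); }
    for (j in 1:ny) { y[j] = -8 + 16*(j-1)/(ny-1); }
    for (i in 1:nx)
    {
      for (j in 1:ny)
      {
        r = sqrt(x[i]^2 + y[j]^2) + 1.0e-10;   // avoid division by zero
        z[i;j] = sin(r)/r;                     // the classic sombrero
      }
    }

    // Open a pipe to X-Geomview and send the object definition.
    GM = "|geomview -c -";
    fprintf(GM, "(geometry sombrero { MESH\n");
    fprintf(GM, "%g %g\n", nx, ny);

    // Send the mesh vertex coordinates, one vertex per line.
    for (j in 1:ny)
    {
      for (i in 1:nx)
      {
        fprintf(GM, "%g %g %g\n", x[i], y[j], z[i;j]);
      }
    }
    fprintf(GM, "})\n");    // close the MESH and the geometry command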

Manipulating the Workspace

High level languages are great for prototyping a procedure, but often fall just short of useful when it comes time to use the program in a “production” situation. In this example we will pretend that we have just developed a method for computing time-frequency distributions. Actually we are going to use Rene van der Heiden's tfd function, which is derived from the information in Choi and Williams' paper, “Improved Time Frequency Representation of Multicomponent Signals Using Exponential Kernels”.

Now we want to use tfd to process a large amount of data. Since tfd executes reasonably fast, we would hate to have to re-code it in another language just to handle the volume. Suppose you have many files of time-history data that you wish to “push” through tfd. Some of the files contain a single matrix of event data, while others contain several matrices. You would like to write a program that can process all such files with a minimum of user intervention. The difficulty for some languages is the inability to manipulate variable names and to isolate data.

Rlab addresses this problem with lists. Lists allow the creation and isolation of arbitrary data structures, and provide a mechanism for systematically manipulating workspace variables and variable names. I showed earlier how list elements can be accessed with strings. Lists can also be indexed with a string variable, or for that matter, any expression that evaluates to a string.
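A short, made-up example of list creation and string indexing:

    person = <<>>;                 // create an empty list
    person.name = "Ada";           // add members with the dot syntax
    person.mass = 58.5;

    which = "mass";
    person.["name"]                // same as person.name
    person.[which]                 // indexed with a string variable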

The interesting thing I have not disclosed yet is that the entire workspace can be treated as a list! Access to the workspace is granted through the special symbol $$. You can use $$ as the name of a list-variable. For example, you could use the cosine function like: $$.["cos"](pi), or: $$.cos(pi). The first method offers the most flexibility. Now that we know about this special feature, we can handle our problem with relative ease. The program will read each file that contains data (they match the pattern *.dat) one at a time, compute the time-frequency distribution, and save the distribution for each set of data in the workspace. When processing is complete, the new workspace will be saved in a single file for later use.
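A couple of made-up examples illustrate the idea; since $$ behaves like a list, everything said above about lists applies to it:

    $$.["cos"](pi)        // same as cos(pi)

    fname = "sin";        // choose a function by name at run-time
    $$.[fname](pi/2)      // same as sin(pi/2)

    a = 42;
    $$.["a"]              // same as the variable a
    members($$)           // the names of everything in the workspace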

The program just described is contained in Listing 3. There are several things I should point out (a rough sketch along the same lines follows these notes):

  • Line 1: The require statement checks the workspace for the named function. If the function is found, nothing happens. If the function is not found, it is loaded from disk.

  • Line 10: The getline function reads ASCII text files, and automatically parses each line into fields (sort of like awk). The fields (either strings or numbers) are returned in a list. When getline returns an empty list (as detected by the length function), the while-loop terminates.

  • Line 12: Each filename is stored in a string array.

  • Line 13: The readb function reads all the data from each file, and stores it in the list, $$.[filenm[i]]. This is a variable in the workspace that has the same name as the filename. For instance, if the first file is “x1.dat”, then a list-variable will be created called “x1.dat”.

  • Line 24: Now we are going to operate on the data we have read. The program will loop over the strings in the array filenm.

  • Line 26: For each file (i), the program will loop over all the data that was in each file. The members function returns a string array of a list's element names.

  • Line 28: This is it! $$.[i] is a list in the workspace (one for each data file). $$.[i].[j] is the jth variable in the ith list. So we are computing the time-frequency distribution for every matrix in every file we have read. $$.[i].[j+"y"] creates a new variable (same as the matrix name, but with a “y” tacked on the end) in each list for each time-frequency distribution that is performed.
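Since Listing 3 is not reproduced here either, the sketch below shows the sort of program the notes above describe. Its line numbers will not match the listing, and the output filename, the ls pattern, and the exact readb and writeb calling sequences are my assumptions, to be checked against the Rlab reference manual:

    require tfd                       // load tfd if it is not already loaded

    // Collect the names of the data files via a pipe to ls.
    LS = "|ls *.dat";
    i = 0;
    fld = getline(LS);
    while (length(fld) != 0)
    {
      i = i + 1;
      filenm[i] = fld.[1];                  // the filename is the first field
      readb(filenm[i], $$.[filenm[i]]);     // read the file into a workspace list
      fld = getline(LS);
    }

    // Compute a time-frequency distribution for every matrix in every file.
    for (i in filenm)
    {
      for (j in members($$.[i]))
      {
        $$.[i].[j + "y"] = tfd($$.[i].[j]);
      }
    }

    // Save the processed workspace in a single file for later use.
    writeb("tfd-results", $$);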
