Why Experix? | Linux Journal

by Bill McConnaughey

on February 24, 2003

In the late 1970s, while at Cornell University, I made our first "cell poker" for studying the mechanical properties of living cells. Another person made a data acquisition device that accepted simple commands over a serial line. For people to operate this device, we needed software that would translate a concept such as "poke" into machine actions. We needed the software to output a motor control signal while digitizing the sensor signals, make graphs of the data, derive the probe force from the position measurements and graph the results, and send the data to the department's mainframe for archiving and more analysis.

Later, we moved to Washington University. Over the years our data acquisition and control platform evolved from the VAX-terminal-signal-averager configuration to PCs with analog signal cards. I wrote what could be called the precursor to Experix in NS32000 assembly language, and it ran on a single-board computer added to an 8085 system with CAMAC modules. Then the CAMAC/8085 system was replaced with the PC and DAQ card, and it became apparent that the program itself needed to move to the PC. I wrote the first version of Experix in C with some assembly under extended DOS. Its command language is based on the stack and operator model, and much of that has carried over to the present version, with some changes and many additions.

The demise of DOS and advent of networking forced me to update the platform, and first I dabbled in Microsoft Windows. I quickly was put off by the unavailability of information on how to write kernel modules and the difficulty of graphics. I want to control everything, not rely on packages that I can't modify and that are guaranteed to become obsolete and unmaintainable. Then I found Linux, or it found me, and the program has been growing happily ever since. We still use the DOS program for the cell poker, the tissue stretcher, the fluorescence photo-bleaching and the pollen poker that I installed recently at the University of Montreal. It is long past time to move on.

I like command strings. I can't tolerate GUIs that fill the screen with icons intended to provide information about what they do by means of little pictures, but need to be equipped with more-or-less informative messages that pop up if I park the mouse there long enough. I lack the patience to search through a tree of pull-down menus when I could have typed a command in half a second. That's no way to run an experiment when you need to be thinking about the science and your sample is expiring. GUIs may be suitable for a system characterized and simplified to the point where it nearly all fits on one screen. They definitely do not work for a complicated, buggy, home-made rig where people always want to do something I didn't think about when I wrote the software. And don't get me started on the wiring-diagram concept of data collection/analysis. It's completely inaccessible to the users and not at all a natural way to approach most of the problems I have encountered.

The success we have had with the old DOS program proves the validity of my approach, but the platform is now seriously obsolete. I have built the command interpreter and stack management systems for Linux, copied or adapted most of the functions in the DOS program, and introduced concurrent execution of commands submitted through timers. The graphics, however, are still at a primitive stage, and the device driver interface is not really thought out. My purpose in releasing Experix at this point is to find programmers who want to help me fill in the missing pieces. You can download the Experix source and help files at biochem.wustl.edu/~elelab/bm.htm. It runs on Intel PCs, uses SVGAlib and can demonstrate most of the features discussed below.

Experix from the User's Point of View

Because Experix is not running yet in the real world, some of this information is still only theoretical or a description of how we operate with the DOS program.

The user goes to his directory and invokes a shell script that runs Experix and directs it to a startup command file. That file defines commands and variables needed for the task. The goal in writing this is to find out what is being done in the experiment; divide that into logically separate tasks, such as performing a measurement, analyzing and presenting the data, storing the data, reviewing old data and so on; and make up commands written in the Experix language that accomplish these tasks. The commands finally provided to the user should be as simple as possible and should have easily rememberable names and arguments, as well as useful help messages.

The program presents a command prompt. The user types a command string, using readline editing and history functions. The command string would typically be something short and simple, using a few of the commands defined in the command files, but in general, it may be arbitrarily long. Command execution proceeds in two phases. In the setup phase, some syntax checks are done, and jumps and jump labels are identified. If syntax errors are found or if there are jumps without corresponding labels, an error message is given and the command is aborted. In the second phase of command execution, tokens are identified and acted on sequentially. Command tokens cause actions in which data items from the Experix stack and Experix variables are used and changed. These are the general categories of command tokens:

help requests	present information
stack manipulators	manage the stack
jumps	change execution point in the command string
up-command-tail ops	examine command-tail arguments
command file exec	gets command strings from a file
numbers	push numbers onto the stack
numerical constants	push certain useful numbers onto the stack
arithmetic operators	do arithmetic operations on numerical stack objects
comparison operators	do arithmetic comparisons on numerical stack objects
math operators	do math operations on numerical stack objects
logical operators	do logical operations on integer stack objects
local variable ops	manipulate command string local variables
array element ops	work on elements, ranges and subspaces of arrays
array block ops	create, compose and decompose arrays
command sub-strings	"quoted strings", {bracketed strings}, ''names
complex/polar ops	manipulate complex and polar numbers
conversion ops	convert value in one number type to another
miscellaneous	other types of operators
names of variables	push the value of the variable onto the stack
names of functions	perform the function (compiled code)
names of commands	perform the command (Experix command string)
references	pointers to variables, functions or commands
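
The two phases described above can be sketched in a few lines of Python. This is only an illustration, not Experix code: the label syntax ":name" and the jump syntax ">name" are invented here, since the real jump tokens are not shown in this article, and the only other tokens handled are plain numbers.

```python
# Hypothetical syntax for illustration only: ":name" marks a label and
# ">name" jumps to it. The real Experix jump syntax is not given here.

def setup_phase(command_string):
    """Phase 1: find labels and jumps; refuse to run if a jump has no label."""
    tokens = command_string.split()
    labels = {tok[1:]: i for i, tok in enumerate(tokens) if tok.startswith(":")}
    for tok in tokens:
        if tok.startswith(">") and tok[1:] not in labels:
            raise SyntaxError(f"jump to undefined label: {tok[1:]}")
    return tokens, labels


def execute_phase(tokens, labels, max_steps=1000):
    """Phase 2: act on tokens in order; only numbers and jumps are handled."""
    stack, pc, steps = [], 0, 0
    while pc < len(tokens) and steps < max_steps:  # max_steps guards this demo
        tok = tokens[pc]
        steps += 1
        if tok.startswith(">"):
            pc = labels[tok[1:]]      # jump: move the execution point
            continue
        if not tok.startswith(":"):
            stack.append(float(tok))  # number: push onto the stack
        pc += 1
    return stack
```

Because the label check happens before any token is executed, a command string with a dangling jump is rejected without having touched the stack.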

The stack is a data structure used mainly but not exclusively in a last-in-first-out way. It holds many kinds of objects: numbers in different integer and floating-point formats; complex and polar floating-point numbers; multidimensional arrays of all kinds of numbers; strings; and references to variables, functions and commands. Operators, functions and commands may use any number of arguments from the stack as well as variables in the program, and they may alter the stack and variables. The manner in which they do this is described in help files that are accessed by the help commands. They are "overloaded" so that, where it makes sense, the same operator or function works on all data types and on arrays as well as single numbers. Some functions have "side effects" (which are really their raison d'être), such as displaying text and/or graphs, reading and writing files and operating special devices.
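
To make the idea of an overloaded operator concrete, here is a minimal Python sketch of a stack machine in which a single "+" token accepts integers, floating-point and complex numbers, and nested lists standing in for arrays. The rule that a single number is applied to every element of an array is an assumption made for the example; the Experix help files define what the real operators do.

```python
def add(a, b):
    """One '+' for single numbers and (nested) arrays."""
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            raise ValueError("array shapes differ")
        return [add(x, y) for x, y in zip(a, b)]
    if isinstance(a, list):
        return [add(x, b) for x in a]   # assumed rule: number applies to each element
    if isinstance(b, list):
        return [add(a, y) for y in b]
    return a + b  # int, float and complex all take this path


def run(tokens):
    """Evaluate a token list; anything that is not '+' is pushed as data."""
    stack = []
    for tok in tokens:
        if tok == "+":
            b = stack.pop()   # level 1, the most recently pushed item
            a = stack.pop()   # level 2
            stack.append(add(a, b))
        else:
            stack.append(tok)
    return stack
```

For example, run([[1, 2], [10, 20], "+"]) leaves the single array [11, 22] on the stack, and the same token adds two plain numbers.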

When a token is the name of a command string, the present command string is suspended and the named one is submitted for execution. When it finishes (without error), execution of the suspended command string resumes. Execution of command tokens continues until the end of the command string is reached or until an error condition arises. In case of an error, diagnostic messages show the kind of error and the execution point in the command string and also in the suspended command strings that led to the present one. The user has an opportunity to accept the error, retry or skip the token that caused the error. (A debug option is intended but is not done yet.) Accepting the error means that the present and all suspended command strings are aborted. I also intend to add an auto-correcting feature whereby the command string can examine the error and attempt to recover before it calls on the user to resolve the problem.

When a command finishes and when a display is requested during a command, Experix displays a few of the stack levels, showing the data type, string length or array dimensions; the value or an excerpt of that; and some other information for pointers, including the beginning of the help string if one was included in the item's definition. When Experix is ready for another command it displays the prompt.

Commands may be entered into a queue to be done during idle time, when Experix otherwise is waiting for input. Commands also may be associated with timers or signals. When the timer expires or the signal is given, the associated command is placed in the timer/signal queue. Experix checks this queue between tokens of commands from the keyboard and idle queues, and it executes the timer/signal commands atomically; that is, it does not permit the insertion of other commands between their tokens. This allows an automatic process to be set up to run while other tasks are being done from the keyboard.
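
The interleaving rule can be shown with a small Python simulation. It is a sketch under simplifying assumptions, not the Experix scheduler: commands are lists of token names, "executing" a token just records it in a trace, and the arrivals argument is a made-up stand-in for real timers and signals.

```python
from collections import deque


def run_command(tokens, timer_queue, trace, arrivals=None):
    """Run one keyboard or idle command, checking the timer/signal queue
    between its tokens.

    arrivals maps a token index to the timer commands that become due
    just before that token (a stand-in for real timers and signals).
    """
    arrivals = arrivals or {}
    for i, tok in enumerate(tokens):
        timer_queue.extend(arrivals.get(i, []))
        while timer_queue:
            timer_cmd = timer_queue.popleft()
            # Atomic: all tokens of the timer command run before
            # anything else is allowed in.
            for t in timer_cmd:
                trace.append(("timer", t))
        trace.append(("user", tok))
```

If a two-token timer command becomes due after the first token of a three-token user command, the trace shows the user's first token, then both timer tokens together, then the rest of the user command.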

Console messages from Experix are dispatched by a function that receives the message and a route code. There are route codes for prompt, errors, warnings, stack display, command string display, help and other things. On a text-only display the route may be ignored and the messages simply printed as they are delivered. On a graphics display, the route determines the font, colors and screen region for the message. Many messages include color and reverse-video escape sequences for clarity in the console display. The dispatch function records the messages in a log file. They can be reviewed later to find out what the user is doing, what the pitfalls are and how to improve commands and make new ones.
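
A dispatch function of this kind might look like the following Python sketch. The class name, the route names and the tab-separated log format are all invented for the example; the point is only that every message passes through one place that logs it and then either ignores the route (text-only) or uses it to pick a screen region (graphics).

```python
class Dispatcher:
    """Routes console messages and records them in a log."""

    def __init__(self, log, regions=None):
        self.log = log            # any object with a write() method
        self.regions = regions    # route -> list of lines; None means text-only
        self.console = []         # what a text-only display would show

    def dispatch(self, route, message):
        # Every message is logged with its route, whatever the display mode.
        self.log.write(f"{route}\t{message}\n")
        if self.regions is None:
            self.console.append(message)   # text-only: route is ignored
        else:
            self.regions.setdefault(route, []).append(message)
```

Keeping the log write in the dispatcher rather than in each caller means the record of a session is complete even when a new kind of message is added later.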

The Structure of Experix

This section discusses the operation of Experix in graphics mode, using SVGAlib. I know this is a red flag to many people, because SVGAlib needs root privileges and it doesn't get along well with the newer graphical interfaces. I am using it because it is relatively easy to make the transition from Borland graphics. In addition, its efficiency is an asset when graphics have to be presented in real time during an experiment. I definitely am willing to look at alternatives.

Experix is started with a command-line argument that directs it to a screen layout file. It executes a process called svgaserv, which accepts commands through a fifo. All SVGAlib calls are done in svgaserv. The screen layout file contains svgaserv commands that start SVGAlib in the desired graphics mode and define a video display region (VDR) for each route code used by Experix's message dispatch function. The VDR contains the screen-relative coordinates, print font, number of print lines, colors, various mode settings and VDR-relative coordinates for data plotting. All screen displays are made by writing svgaserv commands, which include the VDR number, to the fifo.

Experix uses a thread and a synchronization flag to gather command input from readline calls. This method allows Experix to run commands from the idle and timer/signal queues (discussed below) while it is receiving command input from the user. Readline displays its prompt, key echoes and cursor movements via stdout. In graphics mode, Experix directs stdout to a pipe, and it starts a thread that reads characters from that pipe and packages them into svgaserv commands. Thus, the command prompt and user input appear where they should, while other things may be going on in the display.

A command string is made of tokens, which are evaluated sequentially. A huge and ungainly block of code implements a decision tree designed to minimize the time needed to determine what each token is. This means the character sequences for operators are organized into trees. For example, stack operators begin with a backslash (\) and math functions begin with a period (.) (and are distinguished from numbers like .01 by having something other than a numeral after the period). This means that to calculate an exponential, one types .exp, which may take a little time to get used to. But it will be evaluated more quickly than would be the case if Experix had to search a long table of names, and its execution time is not affected by the length of the command table. This is an undisciplined command language and a confusing programming language, and it tends to suffer from the natural desire to include the kitchen sink and all that molders therein. But by the same token it is easily extensible, and it does not force the user to think too much about syntax.
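
The first level of such a decision tree can be sketched in Python as a test on the leading character. This is a toy version, assuming tokens are non-empty; the category names are mine, and the real interpreter goes on to examine further characters within each branch.

```python
def classify(token):
    """Decide what kind of token this is from its first character or two."""
    first = token[0]
    if first == "\\":
        return "stack operator"
    if first == ".":
        # ".01" is a number; ".exp" is a math function
        if len(token) > 1 and token[1].isdigit():
            return "number"
        return "math function"
    if first.isdigit():
        return "number"
    if first == "?":
        return "help request"
    if first == "'":
        return "reference"
    if first == '"':
        return "string"
    if first.isalpha():
        return "name"        # function, variable or command: needs a table search
    return "operator"
```

Only tokens that begin with a letter fall through to a table search, so the cost of recognizing .exp or a stack operator stays the same however many names have been defined.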

We use the term "operator" for tokens beginning with non-alphanumeric characters (except strings, .01 and the like), the term "function" for tokens that start with a letter and evaluate to a compiled code item, the term "variable" for tokens that start with a letter and evaluate to a data item, and the term "command" for tokens that start with a letter and evaluate to a command string. Functions, variables and commands are found by searching for the name in the command table, which may be arbitrarily long. Evaluation of a function causes the corresponding code to be executed. Evaluation of a variable causes its value to be pushed onto the stack. Evaluation of a command causes its command string to be submitted to the interpreter. If the name of a function, command or variable is prefixed with an apostrophe, a pointer to that item is pushed onto the stack. Such a pointer can be used subsequently to evaluate the item, avoiding the name lookup time. This is useful in a command loop that runs many times. If the name of a variable is postfixed with =, the value in stack level 1 is stored in that variable. Functions and commands use the stack and variables in a generally arbitrary way. Describing this is an essential part of documenting each function, command and variable.
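
The three kinds of named items, the apostrophe prefix and the = postfix can be modeled with a dictionary as the command table. The sketch below is illustrative Python, not the Experix data layout; the entry fields ("kind", "value", "code", "string") are invented, and it assumes that storing with = leaves the stack unchanged, which the text above does not specify.

```python
class Ref:
    """A pointer to a command-table entry, as pushed by 'name."""

    def __init__(self, entry):
        self.entry = entry


def apply_entry(entry, table, stack):
    """Evaluate an entry that has already been found (no name lookup)."""
    kind = entry["kind"]
    if kind == "variable":
        stack.append(entry["value"])       # push the variable's value
    elif kind == "function":
        entry["code"](stack)               # run compiled code
    elif kind == "command":
        for tok in entry["string"].split():
            evaluate(tok, table, stack)    # submit the command string


def evaluate(token, table, stack):
    if token.startswith("'"):              # 'name pushes a pointer
        stack.append(Ref(table[token[1:]]))
    elif token.endswith("="):              # name= stores stack level 1
        table[token[:-1]]["value"] = stack[-1]
    else:
        apply_entry(table[token], table, stack)
```

A loop that holds a Ref can call apply_entry on it directly each time around, which is the saving in name lookup time that the apostrophe prefix is meant to provide.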

Functions and operators are overloaded, so +, .exp and so on work on different data types and arrays as well as on single numbers. What they do is described in help files that are accessed with the ? operators. Thus, the token ?+ displays the file about binary operators, and ??+ spawns an edit session on that file. The editor is started in read-only mode to prevent accidents, but the user is encouraged to improve the help files and correct them where necessary. It is possible to change the path to the help files, so each user can have a private copy.

The command stack is a two-part data structure. Currently it is allocated at program start and is not extensible, and that setup seems to be adequate. Each stack entry consists of a 32-bit code in part A and a data item in part B. The code in part A shows the type of data and the amount of space that it occupies in part B. Part A grows upward in memory while part B grows downward, so a full stack uses the whole allocation independent of what kind of data is in it. Numbers are stored directly in the stack, and arrays and strings are stored by means of a pointer in part B to a structure that contains the length or dimensions and the data. A stack level is located by indexing directly to its code in part A and adding the data lengths of all higher levels in order to find the corresponding data in part B. We avoid using a pointer in part A to the data in part B, because it would have to be modified whenever the stack levels are rearranged.
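
The layout can be tried out with a Python bytearray standing in for the allocation. This is a sketch under several assumptions: the split of the 32-bit code into a type byte and a 24-bit length is invented, only two number types are supported, and "higher levels" is taken to mean the levels nearer the top of the stack, with level 1 being the most recently pushed item.

```python
import struct


class TwoPartStack:
    """Codes (part A) grow up from the start of one buffer; data (part B)
    grows down from its end."""

    TYPES = {"i": 1, "d": 2}   # hypothetical type codes: int32, float64

    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.a_top = 0         # next free byte of part A
        self.b_top = size      # lowest byte in use by part B

    def push(self, fmt, value):
        data = struct.pack("<" + fmt, value)
        if self.a_top + 4 > self.b_top - len(data):
            raise MemoryError("stack full")
        # 32-bit code: type in the high byte, data length in the low 24 bits
        code = (self.TYPES[fmt] << 24) | len(data)
        struct.pack_into("<I", self.mem, self.a_top, code)
        self.a_top += 4
        self.b_top -= len(data)
        self.mem[self.b_top:self.b_top + len(data)] = data

    def _code(self, level):
        # Part A is indexed directly: level 1 is the last code written.
        (code,) = struct.unpack_from("<I", self.mem, self.a_top - 4 * level)
        return code >> 24, code & 0xFFFFFF

    def peek(self, level):
        offset = self.b_top
        for above in range(1, level):
            offset += self._code(above)[1]   # skip data of levels nearer the top
        type_code, _length = self._code(level)
        fmt = {v: k for k, v in self.TYPES.items()}[type_code]
        return struct.unpack_from("<" + fmt, self.mem, offset)[0]
```

After pushing a 4-byte integer and then an 8-byte float, peek(2) finds the integer by starting at the low end of part B and stepping over the 8 bytes that belong to level 1. The stack is full exactly when the two parts meet, whatever mix of types it holds.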

Experiments often have a need for doing certain things at particular times or in response to external signals. Experix accomplishes this by means of a function that associates a command with a timer or signal. The timer/signal thread sleeps until it is awakened by a 1/10 second timer (using SIGALRM) or by one of several other signals. Then, for SIGALRM it queues any timer commands that have become due, and for the other signals it queues whatever command is associated with that signal. While the interpreter is running a command from the user, and while it is waiting for one, it checks the timer/signal queue and runs whatever it finds there. The user can submit a command string loop that takes a long time to run, and while it is running, timer/signal commands are done between its tokens. These are done atomically, that is, Experix does not insert another command between tokens of one from the timer/signal queue. Timer/signal commands are analogous to interrupt handlers, except that they work entirely within the Experix context. They should be designed to run quickly, and they must leave the stack unaltered. They can put commands in the idle queue, which is consumed when there is no user command to do (analogous to the bottom half of an interrupt handler). Timed commands are suitable for monitoring and servicing parts of an experiment where exact timing is not needed, because they have a 1/10 second limit on time resolution, plus latency due to the operating system and the execution time of whatever operator, function or timer/signal command may be running when they are queued.
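
The bookkeeping done by the timer/signal thread can be sketched in Python without any real signals. In this model, which is an illustration and not the Experix implementation, the methods on_tick and on_signal play the part of the thread being awakened, timers are one-shot, and the command names in the usage note are made up.

```python
import math
from collections import deque

TICK = 0.1  # seconds between SIGALRM wakeups, per the text above


class TimerSignalThread:
    """Stand-in for the sleeping thread: call on_tick or on_signal to wake it."""

    def __init__(self):
        self.now = 0              # elapsed ticks
        self.timers = []          # (due_tick, command) pairs
        self.signals = {}         # signal name -> command
        self.queue = deque()      # timer/signal queue read by the interpreter

    def after(self, seconds, command):
        # A delay can only expire on a tick boundary, so round up.
        ticks = max(1, math.ceil(seconds / TICK - 1e-9))
        self.timers.append((self.now + ticks, command))

    def on_signal_do(self, signame, command):
        self.signals[signame] = command

    def on_tick(self):
        """SIGALRM arrived: queue every timer command that has become due."""
        self.now += 1
        due = [cmd for t, cmd in self.timers if t <= self.now]
        self.timers = [(t, cmd) for t, cmd in self.timers if t > self.now]
        self.queue.extend(due)

    def on_signal(self, signame):
        """Another signal arrived: queue the command associated with it."""
        if signame in self.signals:
            self.queue.append(self.signals[signame])
```

A command requested with after(0.25, "read-temp") is queued on the third tick, that is, at 0.3 seconds, which illustrates the resolution limit: a timed command can be late by up to one tick before any interpreter latency is counted.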

Stuff to Work on

The current release of Experix can demonstrate nearly all of the features described above. The file dist/xpx/daq.xpx is a good one to try for a tour of the main program features. The highest priorities now are the device file interface, the data acquisition drivers and data graphing. Numerous other issues and bugs need to be resolved as well.

email: mcconnau@biochem.wustl.edu
