Linux Programming Hints

Perl is often considered a scripting language for systems administrators. Jim demonstrates that it is useful to applications and scientific programmers as well—as a prototyping tool.

If you are like many Linux users you may have heard of Perl, but have been reluctant to learn another language. This was my situation several months ago. A friend suggested I give Perl a try. Since I already knew C, Perl was a snap to learn. I soon found myself doing all sorts of text reformatting using Perl. My friends and coworkers were impressed, but skeptical. Could Perl cut the mustard on big files where a ton of data had to be read, massaged and written?

Their skepticism subsided, however, when I wrote a Perl program (I prefer to call them programs rather than scripts just to separate them from shell scripts) to load over three and a half million lines of US Postal data. Now I actually teach Perl (but I digress).

In this article I would like to suggest a use for Perl which is often overlooked—Perl as a prototyping tool. Most C programmers spend a fair amount of time managing memory, and I am no exception. Memory management is a necessary function, especially if you want to keep your C programs tight—not using more memory than necessary—and well behaved—not crashing with the resulting core dumps. The problem with managing memory yourself is that it can divert attention from the program's purpose, which is typically to get an algorithm running.

With Perl, not only do you get solid memory handling routines, you get them in an interpreted/compiled environment and, because Perl is freely distributed (under the GNU General Public License or the Artistic License), they will not cost you a penny. In fact, if you have Linux you probably already have Perl as well. Let me illustrate how Perl can be used as a prototyping tool with two examples, a simple Monte Carlo calculation and a more substantial geometric problem.

A Monte Carlo Estimate Of pi

Monte Carlo techniques use probabilistic methods to make estimates. Typically, one or more random numbers are substituted into a function and the resulting value is tested for validity. The program keeps track of both the number of satisfactory tests and the total number of tests. The result is the ratio of satisfactory to total tests and this ratio is monitored as the number of tests is increased.

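# Monte Carlo calculation of pi
srand(time | $$);                     # seed the random number generator
for ($i = 1, $inside = 0.0; $i <= 1000000; $i++) {
   $x = rand;
   $y = rand;
   # count the point if it falls inside the unit circle
   $inside++ if $x * $x + $y * $y < 1.0;
   # report the running estimate every 10,000 points
   printf "After %7d points pi ~= %9.7f\n", $i, 4.0 * $inside / $i if
      ($i % 10000) == 0;
}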

Figure 1. A Monte Carlo Calculation of pi

Let us get our feet wet with a short and simple Perl program. Figure 1 is a program which estimates pi using a Monte Carlo calculation. Consider a circle inscribed inside a square of side two centered at the origin. The area of the square is four and, since the circle has a radius of one, its area is pi. The ratio of the area of the circle to that of the square is thus pi / 4, and this is also the chance that a random point within the square is inside the circle. This program repeats a loop a million times, each time calling Perl's rand function twice, once for x and once for y. The square of the distance of the x, y pair from the origin is calculated and, if it is less than one, the point lies inside the circle and is counted as a successful test. Multiplying the fraction of successful tests by four then gives the estimate of pi. (Technically, since rand returns values between zero and one, this program uses only the parts of the circle and the square in the first quadrant for simplicity, but, by symmetry, the ratio of areas is the same as if the whole of both figures had been used.)
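The symmetry argument can be checked with a small variation on Figure 1. The following is only a sketch, not part of the original program: the subroutine name estimate_pi, its two arguments and the fixed seed are illustrative choices, and the variables are declared with my under use strict, which Figure 1 does not do. It samples the whole square from -1 to 1 in both x and y instead of the first quadrant alone.

```perl
#!/usr/bin/perl
# Sketch: estimate pi by sampling the whole square [-1, 1) x [-1, 1).
# estimate_pi and its arguments are illustrative names, not from Figure 1.
use strict;
use warnings;

sub estimate_pi {
    my ($points, $seed) = @_;
    srand($seed) if defined $seed;      # a fixed seed makes a run repeatable
    my $inside = 0;
    for my $i (1 .. $points) {
        my $x = 2 * rand() - 1;         # uniform in [-1, 1)
        my $y = 2 * rand() - 1;
        # same test as Figure 1: squared distance from the origin below one
        $inside++ if $x * $x + $y * $y < 1;
    }
    return 4 * $inside / $points;       # circle area / square area = pi / 4
}

printf "Full square, %d points: pi ~= %.4f\n", 100000, estimate_pi(100000, 42);
```

The estimate should settle near the same value as the first-quadrant version, since in both cases the fraction of points inside the circle approaches pi / 4.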

A Linux tip—if you give your Perl programs a unique extension, like .pl, it is easy to make them stand out in ls type listings. Adding the line:

.pl 01;33 # perl programs (yellow)

to the file /etc/DIR_COLORS will make the names of all files with .pl extensions in listings appear in yellow. See the man pages for ls and the file /etc/DIR_COLORS for details.

Note, first of all, how short the Perl program is. Also, note how much it resembles a C program, especially the for loop and the printf function. There are important differences, however. Variables, like $i, $x, etc. are used when needed without a specific declaration. It is not even clear to the casual observer what the types of the variables are. And what is with these if statements after the statements they control? And rand is used like a function, but there are no parentheses—maybe it is a keyword.

All of these differences are features of Perl. Perl keeps track of your variables for you. A scalar variable such as $x can hold either a string or a number, and Perl converts between the two when needed, such as when the distance of the random point from the origin is calculated and compared to one. You needn't concern yourself with any of these details, however. The if test at the end of a line is a handy equivalent to C's if block (the C style is OK in Perl also) but a lot shorter. You will find yourself using Perl's if style all the time once you get the hang of it. Perl's if has three relatives (unless, while and until) which also can be used before a block or at the end of a series of comma-separated statements. Finally, rand really is a function; in this case the parentheses are optional. This is the case with many functions, including the printf at the bottom of the for loop.
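The features just described can be tried out in a few lines. This is a minimal sketch with made-up variable names ($text, $sum, $n and so on are illustrative only), showing string-to-number conversion, the four trailing modifiers and the optional parentheses.

```perl
#!/usr/bin/perl
# Sketch of the Perl features described above; all names are illustrative.
use strict;
use warnings;

my $text = "3.5";               # a scalar holding a string
my $sum  = $text + 1.5;         # used as a number here
print "sum is $sum\n";          # prints "sum is 5"

my $n = 0;
$n++ while $n < 5;              # while modifier: count up to 5
$n-- until $n <= 3;             # until modifier: count back down to 3

print "n is small\n" if $n < 10;            # if modifier
print "n is not zero\n" unless $n == 0;     # unless modifier

# The same test as the if modifier, written as a C-style block
if ($n < 10) {
    print "n is still small\n";
}

# Parentheses are optional on many built-in functions
my $r1 = rand;                  # same function as rand()
my $r2 = rand();
printf "n ended at %d\n", $n;   # prints "n ended at 3"
```

The modifier forms and the block forms behave the same way; which one reads better is largely a matter of taste and of how long the controlled statement is.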

If you run the program in Figure 1, you will get an estimate for pi after every 10,000 tests. I call the program and run it by typing its name. The first line is the path to the Perl program on my system. Change this path if yours is different. You can also test a program for syntax errors with a -c command line switch, i.e.

perl -c

Perl will also provide warnings, such as when you assign to a variable and never use the variable again. Use the -w command line switch to turn on warnings. This is a handy way to uncover spelling errors which can easily crop up in an environment without explicit variable declarations. I usually use both tests simultaneously during development, i.e.

perl -cw
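To see why the warnings matter, consider a program with a misspelled variable. The listing below is a hypothetical sketch written for this purpose (the misspelling $insdie is deliberate, and the file it lives in can have any name). Like Figure 1, it declares nothing, so Perl quietly treats the misspelled name as a brand new variable.

```perl
# Sketch: a deliberate spelling error that -w is meant to catch.
# Variable names follow Figure 1; $insdie is an intentional typo.
$inside = 0;
for ($i = 1; $i <= 10; $i++) {
    $insdie++ if $i % 2 == 0;   # misspelled: $inside was intended
}
print "inside = $inside\n";     # prints "inside = 0", not the intended 5
```

Run normally, the program gives the wrong answer without complaint. Checked with the -w switch, Perl should report that the name main::insdie is used only once and is a possible typo, which points straight at the offending line.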