Shell Scripting with a Distributed Twist: Using the Sleep Scripting Language

by Raphael Mudge

on October 1, 2008

No one who isn't lazy writes scripts. Scripts save valuable system administrator time. In this article, I introduce the Sleep scripting language, which is a Perl-inspired language built on the Java platform. Although Java is sometimes a bad word in our community, Sleep can help you, because a Java-based language has several benefits. Scripts work on different platforms, data has the same form everywhere, and tools to solve any problem are available through the Java class library or open-source extensions.

With Sleep, you can save time on task automation and distributed computing. Sleep can help, whether you have one box or 10,000. Here, I introduce the language and its syntax, then show how to access the filesystem, talk to local and remote processes, and distribute tasks with mobile agents.

Getting Started

You can use Sleep right away if you already have Java installed. Make sure the Java you use is the Sun Java. Any version 1.4.2 or later will do. Sleep does not run with the GNU Java that some Linux distributions use by default:

$ java -version
java version "1.5.0_13"
Java(TM) 2 Runtime Environment, Standard Edition

Installation is easy. Visit the home page (see Resources), and download the sleep.jar file. This file has everything you need to execute Sleep scripts:

$ wget http://sleep.dashnine.org/download/sleep.jar

You can execute a script on the command line with the following:

$ cat >tryit.sl
println("I am $__SCRIPT__ with " . @ARGV);
$ java -jar sleep.jar tryit.sl "hello icecream" 34
I am tryit.sl with @('hello icecream', '34')

Sleep scripts also are happy to exist as UNIX script files:

#!/path/to/java -jar /path/to/sleep.jar
println("Hello Icecream!");

$ chmod +x script
$ ./script
Hello Icecream!

Sleep Basics

Sleep and Perl have a lot in common. Variables are scalars, and scalars store strings, numbers, functions or Java objects:


# Set some variables
$w = "foo";
$x = 3.14 * 12;
$y = &someFunction;
$z = [java.awt.Color RED];

Like Perl, Sleep comments begin with a # and end with a newline.

Variable names inside double-quoted strings are replaced with their value at runtime. For example, "this is a $x" will use the current value of $x. To avoid this behavior, prefix the variable with a backslash. Double-quoted strings also can format variables to a small degree. Use "$[20]x" to pad the value of $x with spaces until it is 20 characters wide. A negative number, as in "$[-20]x", prefixes the value with spaces instead. The $+ operator joins the values on its left and right inside a string. For example, "a $+ b" is "ab".
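As a quick illustration of these string rules, here is a minimal sketch. The variable names and values are made up for the example:

```sleep
# hypothetical values, chosen only for illustration
$user = "raffi";
$count = 42;

# $user and $count are replaced with their values
println("user $user owns $count files");

# the backslash keeps the variable name as literal text
println("the variable is named \$user");

# pad $user to 10 characters; the brackets make the padding visible
println("[$[10]user $+ ]");

# $+ joins its neighbors with no space between them
println("log $+ $count $+ .txt");
```

Each variable in these strings is followed by a space or the closing quote, which keeps it clear where the variable name ends.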

Like Perl, Sleep has arrays and hashes. An array refers to values by a numerical index:

@a = @("a", "b");
@a[2] = "c";
push(@a, "d");

println(@a);

@('a', 'b', 'c', 'd')

Hashes store and get values with a string key. Think of these as a dictionary. The keys are not kept in order:

%b = %(a => "apple", b => "bat");
%b["c"] = 'cat';

println(%b);

%(a => 'apple', c => 'cat', b => 'bat')

Scripts can create hashes of hashes, arrays of hashes, arrays of arrays, and any other combination you can imagine. These data structures offer a flexible way for storing data. And, these structures are more than hashes and arrays. Scripts can use arrays as sets, stacks, queues and lists. Combinations of arrays and hashes can make finite-state machines, graphs and trees. You can make nearly any data structure you'll need.
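To make this concrete, the sketch below uses an array as a stack and as a queue, then builds a hash of arrays. The &pop and &shift functions are part of Sleep's function library, although I don't use them elsewhere in this article. The group and host names are hypothetical:

```sleep
# a stack: push adds to the end, pop removes from the end
@stack = @();
push(@stack, "first");
push(@stack, "second");
$top = pop(@stack);          # $top is "second"

# a queue: push adds to the end, shift removes from the front
@queue = @("job1", "job2");
push(@queue, "job3");
$next = shift(@queue);       # $next is "job1"

# a hash of arrays: each (hypothetical) group name maps to its hosts
%groups = %(web => @("www1", "www2"), db => @("db1"));
push(%groups["web"], "www3");

foreach $group (keys(%groups))
{
   println("$group has " . size(%groups[$group]) . " hosts");
}
```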

Sleep provides a gamut of flow control options. The for loop, while loop and foreach loop are all here. If statements work as you would expect. Sleep differentiates strings and numbers for comparisons. Here, I use the Sleep console to show the difference:

$ java -jar sleep.jar
>> Welcome to the Sleep scripting language
> ? "3" eq 3.0
false
> ? "3" == 3.0
true

The assignment loop is found a lot in Sleep scripts. This loop evaluates a statement and assigns the result to a variable before executing the loop body. The loop keeps going while the result is not $null, which is the empty value—it is equal to an empty string, the number zero and a NULL reference all at once. Most functions return $null when they are finished. This script iterates over each line of a file:

$handle = openf("/etc/passwd");

while $entry (readln($handle))
{
   println($entry);
}

Sleep uses the same functions to work on files, processes and sockets. A scalar that holds a file, process or socket is a handle. The &readln function reads a line of text from a handle. The &println function prints a line of text. Likewise, &readb reads some bytes from a handle. And, &writeb writes bytes. The following is a Sleep version of the UNIX copy command:

global('$source $dest $handle $data');

($source, $dest) = @ARGV;

$handle = openf($source);
$data = readb($handle, -1);
closef($handle);

$handle = openf("> $+ $dest");
writeb($handle, $data);
closef($handle);

$ java -jar sleep.jar cp.sl a.txt b.txt

Notice the value @ARGV. This array holds the script's command-line arguments. The &closef function closes a handle.

Scripts declare named functions with the sub keyword. Arguments are available as $1 to $n:

sub foo
{
   println("$1 and $2");
}

foo("bar", "baz");

bar and baz

Sleep functions are first-class types. This means you can assign them to variables and pass them as arguments to functions. A script can refer to a named function with &functionName. Scripts also can use anonymous functions—anonymous functions? Yes. An anonymous function is a block of code enclosed in curly braces:

$var = { println("hi $1"); };

# call the function in $var
[$var: "mom"];

# call an anonymous function
[{ println("hi $1"); }: "dad"];

hi mom
hi dad
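Because functions are values, a script can hand them to other functions. The sketch below passes a named function and then an anonymous function to &map, which calls the function on each item of an array and collects the results. &map is part of Sleep's function library, but it is not used elsewhere in this article. The function name &twice is made up for the example:

```sleep
# a named function that doubles its argument
sub twice
{
   return $1 * 2;
}

# pass the named function to &map with the & prefix
@doubled = map(&twice, @(1, 2, 3));
println(@doubled);

# pass an anonymous function to &map
@labels = map({ return "host $+ $1"; }, @(1, 2, 3));
println(@labels);
```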

Sleep invokes functions and talks to Java through object expressions. An object expression encloses an object, an optional message and arguments in square brackets:

[$object message: arg1, arg2, ...];

The example below shows nested object expressions:

[[System out] println: "Hello World"];

which is equal to this Java statement:

System.out.println("Hello World");

When calling into Java, the message is the name of a method or field that belongs to the object. Arguments are converted to Java types as necessary, and some conversions are automatic. Nearly anything will convert to a string. However, a string will not convert to an int. Casting is possible, but I don't cover that topic here.
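Here is a small sketch of these rules, using classes from the standard Java class library. It creates an object, calls methods on it, and parses a string into an int with a static Java method. The values are arbitrary:

```sleep
# make the java.util classes available by their short names
import java.util.*;

# create an object and call a method with one argument
$list = [new LinkedList];
[$list add: "apple"];
[$list add: "bat"];

# call a method that takes no arguments
println("size: " . [$list size]);

# a string will not convert to an int on its own,
# so parse it with a static Java method first
$port = [Integer parseInt: "8888"];
println($port + 1);
```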

Now that you know a little about the Sleep language, it helps to see it in action. Next, I present several scenarios and Sleep-based solutions to them.

Filesystem Fun (the Biggest File)

My home directory has many files. I'm a digital packrat, and I'm always low on disk space. I really have no idea what is on my disk. To help, I wrote a script to find the largest files within a directory and its subdirectories:

global('$size $file @files %sizes');

sub processFile
{

This script creates a data structure of files and their sizes, sorts it, and presents the results to the user. The &processFile function does most of the work, and it expects a file as an argument:


   if (-isDir $1)
   {
      filter(&processFile, ls($1));
   }

If the argument is a directory, the &ls function will provide the contents of the directory as an array. &filter expects a function and an array as arguments. &filter calls the function on each item in the array. I use &filter to call &processFile on the argument's subdirectories and files:


   else if (lof($1) > (1024 * 1024))
   {
      %sizes[$1] = lof($1);
   }
}

The hash %sizes stores each filename and size. The key is the filename, and the size is the value. The &lof function returns the length of a file in bytes. I ignore files smaller than 1MB in size. I have so many files that recording every one of them exhausts Java's memory before the script finishes. I could set Java to use a larger heap size with java -Xmx1024M -jar sleep.jar, but I chose to fix my script instead:

processFile(@ARGV[0]);

I call &processFile on the first command-line argument to kick off the script. When this function returns, the %sizes hash will contain an entry for each file in the specified directory and its subdirectories:


@files = sort({ return %sizes[$2] <=> %sizes[$1]; },
                                  keys(%sizes));

The &sort function processes the keys of %sizes and places them in order from largest to smallest size. Much like Perl, Sleep's &sort can use any criteria given by an anonymous function:

foreach $file (sublist(@files, 0, 50))
{
   $size = lof($file);
   println("$[20]size $file");
}

This script ends with a foreach loop to print out the 50 largest files.

And, lo and behold! I solved my problem. I found four copies of a Christmas movie I made on my Macintosh three years ago. Thanks to the script, I recovered several gigabytes of disk space.

Local Processes (PS. I Love You)

Recently, I had to watch this movie about a guy who sent letters to his wife after he passed away. I'm not really into the romantic-morbid genre; however, I thought I could show the people in my life how much I care about them. Here is a script that sends a random fortune to someone every 24 hours:

include("sendemail.sl");

while (1)
{
   sendemail($to => "rsmudge@gmail.com",
      $from => "raffi@hick.org",
      $subject => "P.S. I love you",
      $message => "This made me think of you:\n\n" .
                  join("\n", `fortune`)
   ); 

   # sleep for 24 hours
   sleep(24 * 60 * 60 * 1000);
}

I use `fortune` to execute the fortune command and collect its output into an array. Then, I combine this with the rest of the message body to make a thoughtful message. This script uses the $variable => value syntax to pass named arguments to &sendemail.

Backticks are one way to execute a process. I show the other way in the sendemail.sl code.

Sending E-Mail

I use the sendmail program to send e-mail. The sendemail.sl file contents are:

sub sendemail
{
   local('$handle');
   $handle = exec("/usr/sbin/sendmail -t $to");

Sleep executes processes with the &exec function. Scripts interact with processes as if they were files. As an aside, you can pass arguments with spaces to &exec. Use an array instead of a string. For example, exec(@("/usr/sbin/sendmail", "-t", $to)) would work in this example:

   println($handle, 
"TO: $to
FROM: $from
SUBJECT: $subject
$message");

Here, I send the e-mail message to the sendmail process over STDIN. Later in this article, I cover how to use Sleep for distributed tasks. Don't combine this e-mail example with that—I don't like spammers:

   closef($handle);
}

The last step is to close the handle. Having successfully automated my personal life, let's turn our attention to work matters.
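The &sendemail function only writes to its process. Reading from a process works the same way as reading from a file. The sketch below is one way to do it, using the array form of &exec and the assignment loop from earlier. The command uname -a is only a placeholder:

```sleep
# run a command (uname -a is just a placeholder);
# the array form keeps each argument separate
$handle = exec(@("uname", "-a"));

# read the process output one line at a time
while $line (readln($handle))
{
   println("got: $line");
}

closef($handle);
```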

Remote Processes (Automate SSH)

System administration is all about reaching out and touching everything. And, doing that requires automation. Sleep can automate SSH sessions with ease. Here is the &ssh_cmd function in action:

debug(7);

include("ssh.sl");

global('@output');

@output = ssh_cmd($user => "root",
                  $pass => "123456",
                  $host => "foo.example.com",
                  $command => "cat /etc/shadow");

printAll(@output);

This script authenticates to foo.example.com via SSH, executes "cat /etc/shadow", and prints the result on the local machine. Before we go further, there is something you should know. Sleep doesn't have an &ssh_cmd function. We have to build it.

Adding SSH to Sleep

Perl has the CPAN for modules. Sleep scripts can take advantage of the Java class library to add functionality. Here, I walk you through the code for ssh.sl:

import com.trilead.ssh2.* from: 
                          trilead-ssh2-build213.jar;

Sleep uses import to get access to classes in another package. Unlike Java, Sleep can import directly from a third-party Java archive file at runtime. This is useful for trying things out quickly. Here I use the Trilead SSH for Java library to add SSH to Sleep:

sub ssh_cmd
{
   local('$conn $sess $data $handle @data');

   # create a connection
   $conn = [new Connection: $host, 22];
   [$conn connect];

This code creates a new com.trilead.ssh2.Connection object. Next, I call the connect method on this object to set up an SSH connection:

   # authenticate
   [$conn authenticateWithPassword: $user, $pass];

Then, I call the authenticateWithPassword method on the connection. The Java library expects two string parameters. Sleep is smart enough to convert scalars to Java types as necessary:

   # execute the command
   $sess = [$conn openSession];
   [$sess execCommand: $command];

Here, I create an SSH session from the connection with the openSession method. This method returns a com.trilead.ssh2.Session object. Sleep places the object into a scalar variable. If you want to execute more than one command, create a session for each command as I've done here:

   # wire up a Sleep I/O handle for STDOUT
   $handle = [SleepUtils getIOHandle: 
                      [$sess getStdout], $null];

The next thing to do is get the output from the session. Sleep has a class called SleepUtils with useful functionality. One of the methods constructs an I/O handle from Java input and output stream objects. Here, I made a readable I/O object from [$sess getStdout]. To write values, replace $null with the STDIN value for the session. This is available as [$sess getStdin]:

   # read output into an array
   @data = readAll($handle);

From this point, you can manipulate the remote process like any other handle. Here, I read the entire contents of the handle into the array @data:

   # close it all down
   closef($handle);
   [$sess close];
   [$conn close];

   return @data;
}

The last step is to close down the session and connection. The &ssh_cmd function returns the contents of @data.
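With &ssh_cmd in place, running one command on several machines is a short loop. This is a sketch only: the hostnames, credentials and command are placeholders, and like &ssh_cmd itself, it has no error handling:

```sleep
include("ssh.sl");

# hypothetical hosts; replace these with your own
@hosts = @("web1.example.com", "web2.example.com");

foreach $box (@hosts)
{
   println("== $box ==");

   # placeholder credentials and command
   @output = ssh_cmd($user => "root",
                     $pass => "123456",
                     $host => $box,
                     $command => "uptime");
   printAll(@output);
}
```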

Run This Example

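To execute this code, create ssh.sl from the example above, and download the Trilead SSH for Java library (see Resources). The import statement in ssh.sl names trilead-ssh2-build213.jar, so use that filename or adjust the import to match the file you download. Write your own script that calls &ssh_cmd, like the SSH automation code shown earlier, and place all these files in the same directory. Then, type: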

$ java -jar sleep.jar yourscript.sl

Distribute Tasks with Mobile Agents

Programs that move from computer to computer are mobile agents. Agent programming is a way of thinking about distributed computing. Some tasks fit very well into the mobile agent paradigm. For example, if you have to search all files in a network for some string, it makes no sense to download every single file and search it. It is much more efficient to move the search code to each computer and let the searching happen locally. Mobile agents make this possible.

Mobile agents also save you from the need to define a client and server protocol. You can place the entire interaction between two or more computers into a single function and let it start hopping around to complete the task.

So, what does a mobile agent look like? A mobile agent is a function that calls &move to relocate itself. Here is a syslog patrol agent. This agent patrols your network, checking the syslog dæmon on each box. If the dæmon is down, it tries to restart it. After each patrol, the agent starts over again:

debug(7);

include("agentlib.sl");

Before this script can do anything, I include the agent library file (I dissect this file in the next section):

sub syslog_patrol
{
   local('$host @computers @proc $handle');

   $handle = openf("computers.txt");
   @computers = readAll($handle);
   closef($handle);

The first task is to get a list of all computers. For this, I read in the contents of computers.txt. I assume each line has the hostname or IP address of a computer ready to receive my agents:

   $handle = $null;

When an agent moves, it takes its variables, call stack and program counter with it. Sleep has to serialize this data to move a function. Serialization is the process of converting data to bytes. Scripts cannot serialize I/O handles. To prevent a disaster, I set the handle to $null before moving:

   while (size(@computers) > 0)
   {
      $host = @computers[0];

The next task is to loop through each host. In this script, I use a list iteration approach. This approach removes the first item from @computers with each execution. @computers gets smaller and smaller until nothing is left. The item we want to work with always is at the front. I use list iteration here because foreach loops are not serializable:

      move($host);

This one function call is all it takes to relocate the agent. The statement after this function will execute from $host with its variables and state intact. In this example, I don't have any error handling. I assume the host is up and that the agent can move itself there. Error handling isn't hard to add, and the Sleep documentation provides more on this topic:

      @proc = filter({ 
              return iff("*syslogd*" iswm $1);
           }, `ps ax`);

This code gets a list of all processes that match the wild card "*syslogd*". &filter applies the anonymous function to each item in the array given by `ps ax`. And, &filter collects the non-$null return values of these operations and puts them into an array. This is Sleep's version of grep. I can use the size of the @proc array to check whether syslog is running:

      if (size(@proc) == 0)
      {
         chdir('/etc/rc.d/init.d');
         `./syslog start`;
      }

Here, I check whether syslog is running. If it is not, I change to the init script directory and run the syslog init script to start the dæmon:

      @computers = sublist(@computers, 1);
   }

The last step of the loop is to remove the first item from @computers. I use &sublist to do this:

   sendAgent($home, lambda($this, \$home));
}

At the end of the patrol, I send the agent back to the starting computer. I use &lambda to make a fresh copy of the agent function with no saved state. I pass the $home variable into the copy so it knows where to go when it restarts:


sendAgent(@ARGV[0], lambda(&syslog_patrol, 
                               $home => @ARGV[0]));

This code launches the agent into the system. I assume @ARGV[0] is the hostname of the home system with the computers.txt file.
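If you want to try the &filter idiom from the patrol on its own, here is a sketch that needs no agent or running processes. The sample lines stand in for the output of ps ax and are made up for this example. Unlike the patrol, this version returns the matching line itself, so the result holds the lines rather than true values:

```sleep
# sample lines standing in for `ps ax` output (made up)
@lines = @("    1 ?  Ss  0:01 init",
           " 1843 ?  Ss  0:00 syslogd -m 0",
           " 1902 ?  Ss  0:00 /usr/sbin/sshd");

# keep the lines that match the wild card;
# a $null return value drops the line from the result
@proc = filter({ return iff("*syslogd*" iswm $1, $1, $null); }, @lines);

println(size(@proc) . " match");
printAll(@proc);
```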

Adding Agent Support

It should be no surprise that Sleep doesn't have &move. Again, we have to build it. Isn't that half the fun? The agentlib.sl file has two functions: &move and &sendAgent:

inline move 
{ 
   callcc lambda({
      sendAgent($host, $1); 
   }, $host => $1); 
}

&move is an inline function. An inline function executes with the parent's variable scope, and commands, such as return, callcc and yield affect the parent. They are useful for hiding flow control tricks made possible with callcc. callcc is like a goto. It pauses the current function and calls the specified anonymous function with the current function as an argument. A paused function resumes execution the next time a script calls it. So, why is this exciting to us? Sleep's paused functions are serializable. This means a script can write a paused function to a socket or a file:

sub sendAgent 
{ 
   local('$handle'); 
   $handle = connect($1, 8888); 
   writeObject($handle, $2); 
   closef($handle); 
}

For example, the &sendAgent function writes a paused function to a socket. This function expects a hostname and a function as arguments. It connects to the host with &connect, writes the function with &writeObject, and closes the handle. One piece of magic is missing. It makes no sense to send agents without receiving them.
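&writeObject and &readObject are not limited to sockets or to functions. As a sketch of the same serialization step, the code below writes a hash to a file and reads it back. The filename state.bin and the hash contents are arbitrary:

```sleep
# a small data structure to store (hypothetical contents)
%state = %(host => "web1.example.com", tries => 3);

# write the hash to a file as serialized bytes
$handle = openf(">state.bin");
writeObject($handle, %state);
closef($handle);

# read it back into a new variable
$handle = openf("state.bin");
%copy = readObject($handle);
closef($handle);

println(%copy["host"] . " after " . %copy["tries"] . " tries");
```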

Receiving Agents

Middleware is software that receives agents. It sits between the operating system and the agents. The following code makes up middleware.sl:

include("agentlib.sl");

The agent middleware must include the agentlib.sl file. This gives it and the agents it executes access to &sendAgent and &move:

while (1) 
{ 
   local('$handle $agent');

   $handle = listen(8888, 0);

The middleware executes in an infinite loop listening for connections on port 8888. The &listen function waits for a new connection:

   $agent = readObject($handle); 
   closef($handle);

The &readObject function reads an object in from a handle. Here, I assume I am reading a function from the handle:

   fork({ [$agent]; }, \$agent); 
}

The last step is to execute the agent itself. &fork executes code in an isolated thread. I make the agent available in the thread by giving it to &fork. The code I use here executes the agent. When the thread starts, the agent resumes execution from where it left off.
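The middleware never looks at what the agent returns, but a script that forks its own work often wants the result. The sketch below assumes &wait, which blocks until a fork finishes and hands back its return value. &wait does not appear elsewhere in this article, so check the Sleep documentation for the details. The function in $job is made up for the example:

```sleep
# a function value to run in its own thread (hypothetical)
$job = { return "hello from $1"; };

# make $job available inside the isolated thread
$thread = fork({ return [$job: "the fork"]; }, \$job);

# block until the thread finishes, then print what it returned
println(wait($thread));
```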

Run This Example

To execute this example, place a copy of middleware.sl and agentlib.sl on each computer. Then, execute the middleware with:

$ java -jar sleep.jar middleware.sl

On the first computer, make a script with the &syslog_patrol agent. Create a computers.txt file that lists the IP address of each computer running the agent middleware, one address per line. Then, run your script with:

$ java -jar sleep.jar syslog_agent.sl [local ip address]

Now you have a syslog agent patrolling your network. Don't you feel safe?

What's Next?

Sleep is a language for the Java platform built with the UNIX programming philosophy. Sleep allows you to use existing tools to create solutions to problems. I've shown you how to solve a few system administration problems with Sleep. These examples offer a starting point for you to use the language.

When evaluating a new language, I look for how easily I can bring in external functionality, solve a problem or two and process data. Sadly, I wasn't able to cover data parsing in this article. That's okay, though, because Sleep supports all of it. You can read the documentation to get a feel for regular expressions, pack and unpack, and &parseDate.

To make the most of these examples, I recommend you run them. Links to the documentation and examples are available in the Resources section. Good luck, and enjoy the language.

Resources

Examples from This Article: sleep.dashnine.org/ljexamples.tgz

The Sleep Home Page: sleep.dashnine.org

The Sleep 2.1 Manual: www.amazon.com/dp/143822723X or sleep.dashnine.org/documentation.html

Trilead SSH for Java: www.trilead.com/Products/Trilead_SSH_for_Java

Raphael Mudge is an entrepreneur and computer scientist based out of Syracuse, New York. He also wrote Sleep. You can find links to his other work at www.hick.org/~raffi.
