The New KornShell—ksh93

The KornShell, written by David Korn at Bell Telephone Laboratories, combined the best features of both of these shells, and added the ability to edit and reenter the current and previous commands using the same keystrokes as either the vi or the Emacs editor as the user desired.
Expanded Name Space

In ksh93 a variable is defined by a name=value pair. The variable name space is hierarchical with . (dot) delimiters. The expanded name space permits an aggregate definition for a variable.

The lsc script will produce multi-column output. We visualize the output as a table consisting of rows and columns. A common definition for row and column is provided by the definition of a compound variable named cell.

cell=(
  # maximum number of cells
  integer maximum=0
  # maximum width based on entries
  integer width=0
  # current index within the cell
  integer index=0
  # content of the cell
  typeset entries
)

This defines the variable cell, with aggregate members maximum, width, index, and entries. A reference of ${cell.index} provides the value associated with the index aggregate. Using the eval command we can create additional variables with the same aggregates. We can, for example, define variables row and col to have the same definition as cell:

eval row="$cell"
eval col="$cell"
Internationalization Support

ksh93 provides support for internationalization. Double-quoted strings preceded by a $ are checked for message substitution. If the string appears in the message catalog, then ksh93 will substitute the string with the corresponding string from the message catalog. Otherwise, the string is unchanged.

In the lsc example, we display an error message of "not found" for any command line arguments that are not readable directories. The error message we provide is defined with internationalization support (see line 33 of Listing 1). If the shell variable LANG is defined to some locale other than POSIX, ksh will attempt to replace the error message using internationalization support. Otherwise, the message remains unchanged.

Executing ksh -D on a shell script will output all messages identified for internationalization. In the lsc script, for example, ksh -D will output the following message.

"${video[reverse]} not found ${video[reset]}"
KornShell Development Kit (KDK)

ksh93 is extensible through the KornShell Development Kit (KDK). You can write your own built-in functions in C and load them into the current shell environment through the builtin command. This feature is available on operating systems with the ability to load and link code into the current process at run time.

A built-in command is executed without creating a separate process. Instead, the command is invoked as a C function by ksh. If this function has no side effects in the shell process, then the behavior of this built-in is identical to that of the equivalent stand-alone command. The primary difference in this case is performance: the overhead of process creation is eliminated. For commands of short duration, the effect can be dramatic. For example, on SUN OS 4.1 wc on a small file of about 1000 bytes runs about 50 times faster as a built-in command than as a separate process.

In addition, built-in commands that have side effects on the shell environment can be written. Using the API, available through the KornShell Development Kit, you can extend the application domain for shell programming. For example, an X-Windows extension that makes heavy use of the shell variable namespace was added as a group of built-in commands. The result is a windowing shell that can be used to write X-Windows applications.

While there are definite advantages to adding built-in commands, there are some disadvantages as well. Since the built-in command and ksh share the same address space, a coding error in the built-in program may affect the behavior of ksh, perhaps causing it to core dump or hang. Debugging is also more complex since the built-in's code is now a part of a larger entity. The isolation provided by a separate process guarantees that all resources used by the command will be freed when the command completes; this guarantee does not apply to built-ins. Also, since the address space of ksh will be larger, this may increase the time it takes ksh to fork() and exec() a non-builtin command [though not on more advanced operating systems like Linux, which conserve memory and time by doing “copy-on-write” when they fork—ED]. It makes no sense to add a built-in command that takes a long time to run or that is run only once, since the performance benefits will be negligible. Built-ins that have side effects in the current shell environment have the disadvantage of increasing the coupling between the built-in and ksh making the overall system less modular and more monolithic.

Despite these drawbacks, in many cases extending ksh by adding built-in commands makes sense and allows reuse of the shell scripting capability in an application-specific domain.

In the lsc example, we need to determine the maximum string size within a list of strings. This is required to determine the initial number of columns in the multi-column display. We will also use this to determine the maximum width for a column of entries. A typical shell implementation would be given as:

(( max_stringSize = 0 ))
for fileName in *
do
if (( max_stringSize < ${#fileName} ))
then
   (( max_stringSize = ${#fileName} ))
fi
done

(See Arithmetic Expressions, below, for an explanation of (( and )).)

To improve performance, we can re-write this function in C. In a simple example, the shell equivalent function required 0.58 seconds of CPU. The C built-in function provided 0.08 seconds of CPU for the same task. The function name is preceded with “b_” to indicate that it is a built-in function. When compiled, the strlenList.o object is then archived into a shared library. To reference the strlenList function, we must load it into the current ksh environment through the builtin command (see line 29 of Listing 1).

#pragma prototyped
#include "shell.h"
#include "stdio.h"
int b_strlenList(int argc, char **argv,
                 void *extra)
{
    register int max, n = 0
    char **cp = NULL;
    cp=argv;
    while ( *(++cp) ) {
        n = strlen(*cp);
        max = max < n ? n : max;
    }
    fprintf(stdout,"%d\n", max);
    return(0);
}
______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState