Stuttgart Neural Network Simulator
In contrast to conventional computer systems which are programmed to perform a specific function, a neural network is trained. Training involves presenting the network with a series of inputs and the outputs expected in each case. The errors between the expected and actual outputs are used to adjust the weights so that the error is reduced. This process is typically repeated until the error is zero or very small.
Training methods vary greatly from one application to the next, and there is no single universal solution. As an example, we turn to a general discussion of what is known as gradient descent or steepest descent. Here, the error for each training pattern is quantified as:
where Ep is the error for pattern p and
With this in mind, minimizing Ep involves moving the weights in the direction of the negative gradient -
where nu is some constant between 0 and 1. The function F is typically chaotic and highly nonlinear. That being the case, the actual gradient component may be a very large value that may cause us to overshoot the solution. The constant nu can be used to suppress this.
I have included the above discussion mostly for the benefit of readers with some basic knowledge of multi-variable calculus. For others, it is really only important to know that, through an iterative process, a neural network is adapted to fit its problem domain. For this reason, neural networks are considered to be part of a larger class of computing systems known as adaptive systems.
Although the actual implementation and operation of a neural network can be accomplished on a variety of platforms ranging from dedicated special-purpose analog circuits to massively parallel computers, the most practical is a conventional workstation. Simulation programs could be written from scratch, but the designer can save much time by using one of several available neural network prototyping tools. One such tool is the Stuttgart Neural Network Simulator (SNNS).
The easiest way to get started with SNNS is to experiment with one of the example networks that comes with the distribution. One of these is a solution to the character recognition problem discussed at the beginning of this article.
Upon invoking SNNS, the manager (Figure 4) and banner (Figure 5) windows appear.
Selecting the file option from the manager window presents the user with a file selection box (Figure 6).
The file selector uses extensions of .net, .pat, etc. to filter the file names in the selected directory. We will load the letters_untrained.net file, since we want to see the training process in action. We will also load the letters.cfg configuration file and the letters.pat file which contains training patterns.
After the files are loaded, selecting the display option in the manager window will present the user with a graphical display of the untrained network (Figure 7).
This window shows the input units on the left, a layer of hidden units in the center and the output units on the right. The output units are labeled with the letters A through Z to indicate the classification made by the network. Note that in the Figure 7 display, no connections are showing yet because the network is untrained at this point.
Selecting the control option from the manager window presents the user with the control window (Figure 8). Training and testing are directed from the control window. Training basically involves the iterative process of inputting a training vector, measuring the error between the expected output and the actual output, and adjusting the weights to reduce the error. This is done with each training pattern, and the entire process is repeated until the error is reduced to an acceptable level. The button marked ALL repeats the weight adjustment process for the entire training data set for the number of times entered in the CYCLES window. Progress of the training can be monitored using the graph window (Figure 9).
In the graph window, the horizontal axis shows the number of training cycles and the vertical axis displays the error.
Figure 10 shows a partially trained network. In contrast to the untrained network in Figure 7, this picture shows some connections forming. This picture was taken at the same time as Figure 9. After enough training repetitions, we get a network similar to that shown in Figure 11. Notice that the trained network, when the letter A is input on the left, the corresponding output unit on the right is activated with a 1.
As a quick check to see if the network can generalize, a modified version of the training data set is tested with one of the dots in the A matrix set to zero instead of one, while an erroneous dot is set to one instead of zero. Figure 12 demonstrates that the distorted version of the letter A is still recognized.
As pointed out in the section on training, many different methods can be used to adjust the connection weights as part of the training process. The proper one to use depends on the application and is often determined experimentally. SNNS can apply one of many possible training algorithms automatically. The training algorithm can be selected from the drop-down menu connected to the control window.
SNNS reads network definition and configuration data from ASCII text files, which can be created and edited with any text editor. They can also be created by invoking the bignet option from the manager window. bignet enables the creation of a network by filling in general characteristics on a form. Refinements can be made by manually editing the data files with a text editor or by using other options within SNNS. Training and test data files are also plain ASCII text files.
Other notable features of SNNS include:
Remote Procedure Call (RPC)-based facility for use with workstation clusters
A tool called snns2c for converting a network definition into a C subroutine
Tools for both 2-D and 3-D visualization of networks