Stuttgart Neural Network Simulator
Conventional algorithmic solution methods require the application of unambiguous definitions and procedures. This requirement makes them impractical or unsuitable for applications such as image or sound recognition where logical rules do not exist or are difficult to determine. These methods are also unsuitable when the input data may be incomplete or distorted. Neural networks provide an alternative to algorithmic methods. Their design and operation is loosely modeled after the networks of neurons connected by synapses found in the human brain and other biological systems. One can also find neural networks referred to as artificial neural networks or artificial neural systems. Another designation that is used is “connectionism”, since it deals with information processing carried out by interconnected networks of primitive computational cells. The purpose of this article is to introduce the reader to neural networks in general and to the use of the Stuttgart Neural Network Simulator (SNNS).
In order to understand the significance of the ability of a neural network to handle data which is less than perfect, we will preview at this time a simple character-recognition application and demonstrate it later. We will develop a neural network that can classify a 7x5 rectangular matrix representation of alphabetic characters.
In addition to being able to classify the representation in Figure 1 as the letter “A”, we would like to be able to do the same with Figure 2 even though it has an extra pixel filled in. As a typical programmer can see, conventional algorithmic solution methods would not be easy to apply to this situation.
A neural network consists of an interconnected network of simple processing elements (PEs). Each PE is one of three types:
Input: these receive input data to be processed.
Output: these output the processed data.
Hidden: these PEs, if used in the given application, provide intermediate processing support.
Connections exist between selected pairs of the PEs. These connections carry the output in the form of a real number from one element to all of the elements to which it is connected. Each connection is also assigned a numeric weight.
PEs operate in discrete time steps t. The operation of a PE is best thought of as a two-stage function. The first stage calculates what is called the net input, which is the weighted sum of its input elements and the weights assigned to the corresponding input connections. For the jth PE, the value at time t is calculated as follows:
where j identifies the PE in question, xi(t) is the input at time t from the PE identified by i, and wi,j are the weights assigned to the connections from i to j.
The second stage is the application of some output function, called the activation, to the weighted sum. This function can be one of any number of functions, and the choice of which one to use is dependent on the application. A commonly used one is known as the logistic function:
which always takes on values between 0 and 1. Generally, the activation Aj for the jth PE at time t+1 is dependent on the value for the weighted sum netj for time t:
In some applications, the activation for step t+1 may also be dependent on the activation from the previous step t. In this case, the activation would be specified as follows:
In order to help the reader make sense out of the above discussion, the illustration in Figure 3 shows an example network.
This network has input PEs (numbered 1, 2 and 3), output PEs (numbered 8 and 9) and hidden PEs (numbered 4, 5, 6 and 7). Looking at PE number 4, you can see it has input from PEs 1, 2 and 3. The activation for PE number 4 then becomes:
If the activation function is the logistic function as described above, the activation for PE number 4 then becomes
A typical application of this type of network would involve recognizing an input pattern as being an element of a finite set. For example, in a character-classification application, we would want to recognize each input pattern as one of the characters A through Z. In this case, our network would have one output PE for each of the letters A through Z. Patterns to be classified would be input through the input PEs and, ideally, only one of the output units would be activated with a 1. The other output PEs would activate with 0. In the case of distorted input data, we should pick the output with the largest activation as the network's best guess.
The computing system just described obviously differs dramatically from a conventional one in that it lacks an array of memory cells containing instructions and data. Instead, its calculating abilities are contained in the relative magnitudes of the weights between the connections. The method by which these weights are derived is the subject of the next section.
In addition to being able to handle incomplete or distorted data, a neural network is inherently parallel. As such, a neural network can easily be made to take advantage of parallel hardware platforms such as Linux Beowulf clusters or other types of parallel processing hardware. Another important characteristic is fault tolerance. Because of its distributed structure, some of the processing elements in a neural network can fail without making the entire application fail.
|Happy Birthday Linux||Aug 25, 2016|
|ContainerCon Vendors Offer Flexible Solutions for Managing All Your New Micro-VMs||Aug 24, 2016|
|Updates from LinuxCon and ContainerCon, Toronto, August 2016||Aug 23, 2016|
|NVMe over Fabrics Support Coming to the Linux 4.8 Kernel||Aug 22, 2016|
|What I Wish I’d Known When I Was an Embedded Linux Newbie||Aug 18, 2016|
|Pandas||Aug 17, 2016|
- Happy Birthday Linux
- ContainerCon Vendors Offer Flexible Solutions for Managing All Your New Micro-VMs
- Updates from LinuxCon and ContainerCon, Toronto, August 2016
- New Version of GParted
- Puppet and Nagios: a Roadmap to Advanced Configuration
- Blender for Visual Effects
- What I Wish I’d Known When I Was an Embedded Linux Newbie
- NVMe over Fabrics Support Coming to the Linux 4.8 Kernel
- A New Project for Linux at 25
- Tor 0.2.8.6 Is Released
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide