Parallel Processing using PVM
PVM (Parallel Virtual Machine) is free software that lets a number of networked (TCP/IP) machines work together as a single parallel virtual machine on tasks for which parallelism is advantageous. We use PVM in one of our computer laboratories at Eastern Washington University (EWU), mixing Linux and Sun machines to form various virtual machines. PVM comes with a rich selection of examples and tutorial material, allowing a user to reach a reasonable level of proficiency in a relatively short time. There is no new programming language to learn, presuming one already knows C, C++ or FORTRAN. With two or more networked Linux boxes, one can run PVM and investigate parallel programming. With only one Linux box, one can still run applications and emulate the parallelism.
PVM was developed by Oak Ridge National Laboratory in conjunction with several universities, principal among them being the University of Tennessee at Knoxville and Emory University. The original intent was to facilitate high performance scientific computing by exploiting parallelism whenever possible. By utilizing existing heterogeneous networks (Unix at first) and existing software languages (FORTRAN, C and C++), there was no cost for new hardware and the costs for design and implementation were minimized.
A typical PVM consists of a (possibly heterogeneous) mix of machines on the network, one being the “master” host and the rest being “worker” or “slave” hosts. These hosts communicate by message passing. The PVM is started from the command line of the master, which in turn can start up the workers needed to reach the desired configuration of hosts. This configuration can be established initially via a configuration file. Alternatively, the virtual machine can be configured from the PVM command line (the master's console) or at run time from within the application program.
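As a rough illustration of the configuration-file route, a PVM hostfile lists one host per line, optionally followed by per-host options. Every host name, the login name and the paths below are hypothetical. The option keywords (lo=, dx=, ep=) and the leading & follow the PVM 3 hostfile conventions, but they should be checked against the documentation shipped with the release in use.

```text
# hostfile: one host per line; lines starting with # are comments
# (all host names, the login name and the paths are made up)
master.example.edu
linux1.example.edu
sun1.example.edu     lo=pvmuser  dx=/usr/local/pvm3/lib/pvmd
# a leading & records a host's options without starting it right away
&sun2.example.edu    ep=/home/pvmuser/pvm3/bin/SUN4
```

Under those assumptions, running "pvm hostfile" on the master would start the console and the listed hosts. From the console prompt, "add sun2.example.edu" brings in a host later, "conf" lists the current configuration and "halt" shuts the virtual machine down.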
A solution to a large task, suitable for parallelization, is divided into modules to be spawned by the master and distributed as appropriate among the workers. PVM consists of two software components, a resident daemon (pvmd) and the PVM library (libpvm), both of which must be available on each machine that is part of the virtual machine. The first component, pvmd, is the message-passing interface between the application program on each local machine and the network connecting it to the rest of the PVM. The second component, libpvm, provides the local application program with the message-passing functions it needs to communicate with the other hosts. A call to one of these library functions triggers corresponding activity by the local pvmd, which handles the details of transmitting the message. On the target node, that machine's pvmd receives the message and makes it available to the application module there through the matching library call in that program.
The PVM home page is at http://www.epm.ornl.gov/pvm/pvm_home.html. From there one can download the PVM software, obtain the tutorial, get current PVM news, etc. The software, tarred and zipped, comes to only about 600KB. The README file found therein is sufficient to get up and running, but the value of the tutorial must be stressed. The tutorial can be downloaded from the PVM home page and is also available in book form.
With the tutorial as guide, one can work through the selection of examples packaged with the software. The example source files are well documented internally so that one can become comfortable with the usual PVM library calls. The tutorial has a very nice chapter explaining each of the library calls clearly and in detail. This process was essentially how we started using PVM at EWU.
The network we used at EWU has a variety of machines, architectures and Unix flavors. The first challenge was learning how to configure such a mix of machines to form a PVM. It becomes less straightforward as the heterogeneity increases with the complicating factor being the diverse Unix flavors. One of the methods for configuring a PVM uses the Unix rsh (remote shell) command. This relies on the existence of a .rhosts file on the target machine. From the master one starts the PVM daemon (pvmd) on a slave in order to incorporate the slave into the virtual machine—the master uses rsh to do this. However, the target (slave) machine only allows the master access if the master is listed in the target's .rhosts file. The various flavors of Unix do not agree in the finer details for the syntax of the .rhosts file, nor could we always find pertinent documentation. There are several other methods for configuring PVM and, in most cases, we found it a black art. Nevertheless, we persevered and documented our experience on an embryonic web site for a PVM WebCourse (http://knuth.sirti.org/cscdx/) for those who discover similar problems.
It is instructive to look at an example (see Listings 1 and 2) to get the flavor of the programming. The executable for the source in Listing 1 is started at the command line of the master host which, in turn, spawns the second executable on a slave. For simplicity, no error checking is included. However, PVM provides easy ways to check for failure on such library calls as pvm_spawn. The example, adapted from one in the tutorial, merely measures the time it takes for a simple interchange of passed messages, master to slave and slave to master. In particular, the master sends an array to the slave, the slave doubles each of the elements and sends it back. The master prints out the initial array, the final array and the array's round trip time.
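The exchange described above can also be tried on a single Linux box without PVM installed. The sketch below is a stand-in, not the tutorial's listing: it uses fork() and two POSIX pipes in place of PVM's spawn, send and receive calls, and the function names (round_trip, slave_task, write_all, read_all) are made up for illustration. In a real PVM program the corresponding steps would go through library calls such as pvm_spawn, pvm_send and pvm_recv.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Write exactly len bytes, retrying on short writes; 0 on success. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, retrying on short reads; 0 on success. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Slave side: receive n ints, double each one, send them back. */
static void slave_task(int in_fd, int out_fd, size_t n)
{
    int *buf = malloc(n * sizeof *buf);
    if (buf != NULL && read_all(in_fd, buf, n * sizeof *buf) == 0) {
        for (size_t i = 0; i < n; i++)
            buf[i] *= 2;
        write_all(out_fd, buf, n * sizeof *buf);
    }
    free(buf);
}

/* Master side: start the slave, send data, wait for the reply.
 * On success data holds the doubled values and the elapsed time in
 * seconds is returned; on failure the return value is -1.0. */
double round_trip(int *data, size_t n)
{
    int to_slave[2], to_master[2];
    struct timespec t0, t1;

    if (pipe(to_slave) != 0)
        return -1.0;
    if (pipe(to_master) != 0) {
        close(to_slave[0]);
        close(to_slave[1]);
        return -1.0;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_slave[0]);
        close(to_slave[1]);
        close(to_master[0]);
        close(to_master[1]);
        return -1.0;
    }
    if (pid == 0) {                      /* child plays the slave */
        close(to_slave[1]);
        close(to_master[0]);
        slave_task(to_slave[0], to_master[1], n);
        _exit(0);
    }

    close(to_slave[0]);                  /* parent plays the master */
    close(to_master[1]);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    int ok = write_all(to_slave[1], data, n * sizeof *data) == 0 &&
             read_all(to_master[0], data, n * sizeof *data) == 0;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(to_slave[1]);
    close(to_master[0]);
    waitpid(pid, NULL, 0);

    if (!ok)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

A master built on this sketch would print its array, call round_trip on it, then print the array again along with the returned time. Because both processes share one machine, the figure reflects pipe and scheduling overhead only, not the network latency a real PVM exchange would show.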
PVM has found a regular place in our curriculum. It provides a way to investigate parallel programming in an inexpensive yet realistic way. Thinking through an involved parallel-programming exercise is a new and sometimes difficult adjustment for those accustomed only to sequential programming. Once the students have worked through the tutorial material, they move on to problems in which they take a design role and, finally, to substantive projects.