Getting Started with Condor

How to try Condor for multiplatform distributed computing.

Cluster computing emerged in the early 1990s when hardware prices were dropping and PCs were becoming more and more powerful. Companies were shifting from large mini-computers to small and powerful micro-computers, and many people realized that this would lead to a large-scale waste of computing power, as computing resources were being fragmented more and more. Organizations today have hundreds to thousands of PCs in their offices. Many of them are idle most of the time. However, the same organizations also face huge computation-intensive problems and thus require great computing power to remain competitive—hence the stable demand for supercomputing solutions that largely are built on cluster computing concepts.

Many vendors offer commercial cluster computing solutions. By using free and open-source software, it is possible to forego the purchase of these expensive commercial cluster computing solutions and set up your own cluster. This article describes such a solution, developed by University of Wisconsin, called Condor.

The idea behind Condor is simple. Install it on every machine you want to make part of the cluster. (In Condor terminology, a Condor cluster is called a pool. This article uses both terms interchangeably.) You can launch jobs from any machine, and Condor matches the requirements of the job with the capabilities offered by the idle computers currently available. Once it finds a suitable idle machine, it transfers the job to it, executes it and retrieves the results of the execution. One of the features of Condor is that it doesn't require programs to be modified to run on the cluster.

In practice, however, Condor is more complicated. Condor is installed in different configurations on each machine. Each Condor pool has a central manager. The central manager, as the name implies, is the central manager of the cluster. It manages the detection of new idle machines and coordinates the matchmaking between job requirements and available resources. Machines in a Condor pool also can have Submit and Full Install configurations. Submit machines are those machines that can only submit jobs, but can't run any jobs; Full Install machines are machines that can do both, submit and execute.

Requirements and Installation

Condor does not require the addition of any new hardware to the network; the existing network itself is sufficient. Condor runs on a variety of operating systems, including Linux, Solaris, Digital Unix, AIX, HP-UX and Mac OS X as well as MS Windows 2000 and XP. It supports various architectures, including Intel x86, PowerPC, SPARC and so on. However, jobs developed on one specific architecture, such as Intel x86, will run only on Intel x86 computers. So, it is best if all the computers in a Condor pool are of a single architecture. It is possible, however, for Java applications to run on different architectures.

In this article, we cover the installation from basic tarballs on Linux, although distribution/OS-specific packages also may be available from the official site or sources. (See the Condor Project site for more details, www.cs.wisc.edu/condor/downloads.)

Download the tarball from the Project site, and uncompress it with:

tar -zvf condor.tar.gz

The condor_install script, located in the sbin directory, is all you need to run to set up Condor on a machine. Before you run this script, add a user named condor. For security reasons, Condor does not allow you to run jobs as root; thus, it is advisable to make a new user to protect the system.

One of the first questions the script asks is how many machines are you setting up to be part of the pool? This is important if you have a shared filesystem. If you do, the installation script will prompt you for the names of those machines, and the installation of Condor on those machines will be handled by the software itself. If a shared filesystem does not exist, you have to install Condor manually on each system. Also, if you want to be able to use Java support, you need to have Sun's Java virtual machine installed prior to installing Condor. The install script provides plenty of help and annotation on each question it asks, and you always can turn to Condor's comprehensive user manual and its associated mailing lists for help.

The variable $CONDOR is used from now on to denote the root path where condor has been installed (untarred).

After the installation, start Condor by running:

$CONDOR/bin/condor_master

This command should spawn all other processes that Condor requires. On the central manager, you should be able to see five condor_ processes running after entering:

ps -aux | grep condor

On the central manager machine, you should have the following processes:

  • condor_master

  • condor_collector

  • condor_negotiator

  • condor_startd

  • condor_schedd

All other machines in the pool should have processes for the following:

  • condor_master

  • condor_startd

  • condor_schedd

And, on submit-only machines you will see:

  • condor_master

  • condor_schedd

After that, you should be able to see the central manager machine as part of your Condor cluster when you run condor_status:

$CONDOR/bin/condor_status
Name OpSys Arch State Activity LoadAv Mem ActvtyTime Mycluster
LINUX INTEL Unclaimed Idle 0.115 3567 0+00:40:04
Machines Owner Claimed Unclaimed Matched Preempting
INTEL/LINUX 1    0    0    1    0    0
Total 1   0   0   1   0   0

If you now run condor_master on the other machines in the pool, you should see that they are added to this list within a few minutes (usually around five minutes).

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Open source?

ewan's picture

The Condor FAQ is pretty clear that source is not distributed freely.

How is Condor in any way 'Open Source'?

You just send a request

Anonymous's picture

You just send a request email, and they give you access to a website with the source.

And the license allows you to do anything you want with the source (redistribute, make derivative works), ala BSD style.

Right source code is NOT

anonymous's picture

Right source code is NOT available from the website, however it is STILL opensource, because if you request it from them and have a good reason to do, like extending it for something, they will not deny the request.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState