Linux Clustering with Ruby Queue: Small Is Beautiful
Here, I walk through the actual sequence of rq commands used to set up an instant Linux cluster composed of four nodes. The nodes we are going to use are called onefish, twofish, redfish and bluefish. Each host is identified in its prompt, below. In my home directory on each of the hosts I have the symbolic link ~/nfs pointing at a common NFS directory.
The first thing we have to do is initialize the queue:
redfish:~/nfs > rq queue create
created <~/nfs/queue>
Next, we start feeder daemons on all four hosts:
onefish:~/nfs > rq queue feed --daemon -l=~/rq.log
twofish:~/nfs > rq queue feed --daemon -l=~/rq.log
redfish:~/nfs > rq queue feed --daemon -l=~/rq.log
bluefish:~/nfs > rq queue feed --daemon -l=~/rq.log
In practice, you would not want to start feeders by hand on each node, so rq supports being kept alive by way of a crontab entry. When rq runs in daemon mode, it acquires a lockfile that effectively limits it to one feeding process per host, per queue. Starting a feeder daemon simply fails if another daemon already is feeding on the same queue. Thus, a crontab entry like this:
*/15 * * * * rq queue feed --daemon --log=log
checks every 15 minutes to see if a daemon is running, and it starts a daemon if and only if one is not running already. In this way, an ordinary user can set up a process that is running at all times, even after a machine reboot.
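To make the lockfile behavior concrete, here is a minimal Ruby sketch of the general idea: take an exclusive, non-blocking lock and give up right away if another process already holds it. This is not rq's actual code, which this article does not show. The method name acquire_feeder_lock and the lock path are made up for illustration, and File#flock is only a stand-in that may not behave reliably on NFS.

```ruby
# Illustrative sketch only; rq's real locking code is not shown in the article.
# File#flock is an advisory lock used here as a stand-in (an assumption).

# Try to become the single holder of the lock at lock_path (hypothetical path).
# Returns the open File, with the lock held, on success.
# Returns nil if some other open handle already holds the lock.
def acquire_feeder_lock(lock_path)
  file = File.open(lock_path, File::RDWR | File::CREAT, 0644)
  # LOCK_NB makes flock return false immediately instead of waiting.
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    file
  else
    file.close
    nil
  end
end
```

A cron-started script built this way would call acquire_feeder_lock at startup and exit quietly when it gets nil, which is why launching it every 15 minutes is harmless. The lock is released when the file is closed or the process exits.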
Jobs can be submitted from the command line, from an input file or, in Linux tradition, from standard input as part of a process pipeline. When using an input file or stdin, the format is either YAML (such as that produced as the output of other rq commands) or a simple list of jobs, one job per line. The format is auto-detected. Any host that sees the queue can run commands on it:
onefish:~/nfs > cat joblist
echo 'job 0' && sleep 0
echo 'job 1' && sleep 1
echo 'job 2' && sleep 2
echo 'job 3' && sleep 3

onefish:~/nfs > cat joblist | rq queue submit
-
  jid: 1
  priority: 0
  state: pending
  submitted: 2004-11-12 20:14:13.360397
  started:
  finished:
  elapsed:
  submitter: onefish
  runner:
  pid:
  exit_status:
  tag:
  command: echo 'job 0' && sleep 0
-
  jid: 2
  priority: 0
  state: pending
  submitted: 2004-11-12 20:14:13.360397
  started:
  finished:
  elapsed:
  submitter: onefish
  runner:
  pid:
  exit_status:
  tag:
  command: echo 'job 1' && sleep 1
-
  jid: 3
  priority: 0
  state: pending
  submitted: 2004-11-12 20:14:13.360397
  started:
  finished:
  elapsed:
  submitter: onefish
  runner:
  pid:
  exit_status:
  tag:
  command: echo 'job 2' && sleep 2
-
  jid: 4
  priority: 0
  state: pending
  submitted: 2004-11-12 20:14:13.360397
  started:
  finished:
  elapsed:
  submitter: onefish
  runner:
  pid:
  exit_status:
  tag:
  command: echo 'job 3' && sleep 3
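The article does not say how rq auto-detects the input format, so the following Ruby sketch shows only one plausible approach: try to read the text as a YAML list of job records and fall back to one command per line. The method name parse_job_input and the rule "a YAML list whose entries all have a command field" are assumptions made for illustration.

```ruby
require "yaml"

# Hypothetical helper, not part of rq: turn submitted text into an
# array of command strings, whichever of the two formats it is in.
def parse_job_input(text)
  data = begin
    # Time is permitted so that timestamp fields such as "submitted" load.
    YAML.safe_load(text, permitted_classes: [Time])
  rescue Psych::Exception
    nil # not loadable as YAML, so treat it as a plain list
  end

  # Assumed rule: a job list is a non-empty array of hashes with a command.
  looks_like_job_list = data.is_a?(Array) && !data.empty? &&
                        data.all? { |job| job.is_a?(Hash) && job["command"].is_a?(String) }

  if looks_like_job_list
    data.map { |job| job["command"] }
  else
    # Plain list: one job per line, blank lines ignored.
    text.lines.map(&:strip).reject(&:empty?)
  end
end
```

Fed the contents of joblist, this sketch returns the four echo commands. Fed YAML records like those printed by the submit command, it returns the same commands taken from their command fields.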
The output of the submit command shows, in YAML format, all of the information about each job. When a job is complete, all of its fields are filled in. At this point, we check the status of the queue:
redfish:~/nfs > rq queue status
---
pending : 2
running : 2
finished : 0
dead : 0
From the status output, we see that two of the jobs have been picked up by nodes and are being run. We can find out which nodes are running our jobs with this command:
onefish:~/nfs > rq queue list running | egrep 'jid|runner'
jid: 1
runner: redfish
jid: 2
runner: bluefish
The record for a finished job remains in the queue until it's deleted, because a user generally would want to collect this information. At this point, we expect all jobs to be complete, so we check each one's exit status:
bluefish:~/nfs > rq queue list finished | egrep 'jid|command|exit_status'
jid: 1
exit_status: 0
command: echo 'job 0' && sleep 0
jid: 2
exit_status: 0
command: echo 'job 1' && sleep 1
jid: 3
exit_status: 0
command: echo 'job 2' && sleep 2
jid: 4
exit_status: 0
command: echo 'job 3' && sleep 3
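With four jobs, scanning egrep output by eye is enough. For longer lists, the same check can be scripted. The Ruby sketch below assumes the job list is YAML with the jid and exit_status fields shown in the submit output. The method name failed_jids is hypothetical, and the sketch assumes every record passed in belongs to a finished job.

```ruby
require "yaml"

# Hypothetical helper, not part of rq: given YAML text holding a list of
# finished job records, return the jids whose exit_status is not 0.
def failed_jids(finished_yaml)
  # An empty document loads as nil, so fall back to an empty list.
  jobs = YAML.safe_load(finished_yaml, permitted_classes: [Time]) || []
  jobs.reject { |job| job["exit_status"] == 0 }
      .map { |job| job["jid"] }
end
```

An empty result means every job in the list exited with status 0, which matches what the egrep output shows for this run.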
All of the commands have finished successfully. We now can delete any successfully completed job from the queue:
twofish:~/nfs > rq queue query exit_status=0 | rq queue delete
---
- 1
- 2
- 3
- 4
Ruby Queue can perform quite a few other useful operations. For a complete description, type rq help.
Making the choice to roll your own always is a tough one, because it breaks Programmer's Rule Number 42, which clearly states, "Every problem has been solved. It is Open Source. And it is the first link on Google."
Having a tool such as Ruby is critical when you decide to break Rule Number 42, and the fact that a project such as Ruby Queue can be written in 3,292 lines of code is testament to this fact. With only a few major enhancements planned, it is likely that this code line total will not increase much as the code base is refined and improved. The goals of rq remain simplicity and ease of use.
Ruby Queue set out to lower the barrier scientists had to overcome in order to realize the power of Linux clusters. Providing a simple and easy-to-understand tool that harnesses the power of many CPUs allows them to shift their focus away from the mundane details of complicated distributed computing systems and back to the task of actually doing science. Sometimes small is beautiful.