PyCon DC 2004
The Twisted talk essentially was its own mini-conference. The Twisted developers turned out in force, conducting several sprints on different Twisted topics, presenting a whole track's worth of session talks (including software that can be used both with Twisted and standalone) and drawing over thirty people to their BOF.
Divmod presented its Twisted-powered commercial Web mail site, whose source code is available at divmod.org if you'd like to compete with them. divmod.org distributes Atop (discussed above), Lupy (a full-text indexer), Nevow (discussed above), Pyindex (an indexer using Metakit), Quotient (the Web mail server), Reverend (a Bayesian filter) and Stoom (Voice over IP). In addition to Web mail, divmod.com provides POP/IMAP access, integrated spam filtering (you can override the Bayesian score), flexible searching (by phone number, URL, image, or "question" in the message), blogging, an image manager, contact list, calendar, to-do list and Internet phone (VoIP). They're working on blog-via-email and also multiplayer games. The cost will be somewhere under $10/month for Web mail and basic services, with additional fees for VoIP-to-landline, advanced games and the rest. As an open-source project, divmod.com also will be soliciting feature contributions from its users. A few attendees wondered what Chandler is trying to do that divmod.com isn't already doing. However, I couldn't find a link on divmod.com to sign up. Maybe signup-via-web is not quite ready yet, or maybe it needs a big "subscribe!" button.
Twisted is amazingly flexible and Pythonic, but that darn asynchronous event loop makes thread-happy people run in terror. You can't simply call functions that block, because blocking halts the entire server and makes everybody else's request wait. So you have to turn your program inside out to avoid blocking. Last conference I got excited about Twisted, and every few months thereafter I tried to understand this inside-out method of programming, but every time I hit a brick wall. Now, after talking with the developers at this conference (Donovan Preston, Nevow's inventor, was especially helpful) and reading the Deferred HOWTO three friggin' times, I think I'm finally getting somewhere.
First, there's probably a Twisted library for everything you need to do, so you can avoid the issue for a long time. Second, you mentally have to divide your code into chunks at the natural blocking points. Each chunk becomes a separate function, and you use callLater or addCallback to switch to the next function. This implies that for each blocking point you have to wrap two sections of code into functions. The first section is the part of the slow function that runs after the pause; let's call the new function the "deferred job". The second section is the part of the calling routine that depends on the slow function's return value; this is the "callback".
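Here is a rough sketch of that split in plain Python, with no Twisted at all. The names pending, TABLE, lookup, deferred_job and report are made up for illustration, and the pending list is only a stand-in for the event loop's schedule.

```python
# Stand-in for the event loop's schedule (an assumption for this sketch).
pending = []
TABLE = {"spam": 3}  # pretend this lives in a slow database

def lookup(key, on_result):
    # Everything before the blocking point runs right away; the rest is
    # scheduled instead of being waited for.
    pending.append((deferred_job, (key, on_result)))

def deferred_job(key, on_result):
    # The "deferred job": the part of the slow function after the pause.
    on_result(TABLE[key] * 2)

reports = []

def report(key):
    # The caller's code that needs the result becomes the "callback".
    def callback(result):
        reports.append("count=%d" % result)
    lookup(key, callback)
    # report() returns here without ever seeing the result.

report("spam")
before = list(reports)  # snapshot taken before the scheduled job has run

# A toy event loop: run scheduled jobs until none are left.
while pending:
    job, args = pending.pop(0)
    job(*args)
```

Note that report() has long since returned by the time the callback fills in reports; that is the "inside out" part.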
Let's say the slow function does a database lookup and the deferred job is the portion that comes after the I/O. The slow function uses callLater to schedule the deferred job and returns a Deferred instance (d) to the calling routine:
from twisted.internet import defer, reactor
# 'reactor' is the event loop instance that's running.
d = defer.Deferred()
# reactor.callLater(SECONDS_AS_FLOAT, FUNC, *args, **kw)
reactor.callLater(2.5, deferredAction, d, arg2, arg3, kwarg='value')
return d
callLater is like UNIX's at command: it schedules a background job to run at a future time. The job has no knowledge of its context, but because you can pass as many arguments as you want, it has all the info it needs. Of course, the job can be a bound method if you want to sneak in other information too. The job needs d, so pass it as an argument or sneak it in as an instance attribute.
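If callLater still feels abstract, the standard library's sched module has much the same shape, so here is a small sketch using it rather than Twisted. This is an analogy only: scheduler.run() below simply drains the queue, whereas the reactor keeps serving other requests in between. The fake clock and the name deferred_job are assumptions made so the example finishes instantly.

```python
import sched

# A fake clock so nothing really sleeps; a real program would pass
# time.monotonic and time.sleep instead.
now = [0.0]

def fake_time():
    return now[0]

def fake_sleep(seconds):
    now[0] += seconds

scheduler = sched.scheduler(fake_time, fake_sleep)
log = []

def deferred_job(label, detail, kwarg=None):
    # The job knows nothing about who scheduled it; everything it needs
    # arrives through its arguments.
    log.append((fake_time(), label, detail, kwarg))

# Roughly the shape of reactor.callLater(2.5, func, *args, **kw).
scheduler.enter(2.5, 1, deferred_job,
                argument=("second", "arg"), kwargs={"kwarg": "value"})
scheduler.enter(1.0, 1, deferred_job, argument=("first", "arg"))
scheduler.run()
```

The jobs run in time order, not in the order they were scheduled, and each one sees only what was passed to it.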
The calling routine gets d instead of the result. How lovely, just what we didn't want. The calling routine has to register callback function(s) that will operate on the result.
d.addCallback(myCallback)
return # Nothing more to do since I'll never get the result myself.
So d is the link between all the functions, which don't otherwise know about one another. Time passes and the deferred job is called. It produces a result and calls:
d.callback(result) # Oh, so *that's* why I needed 'd'.
Note: the deferred job does not return the result. d then calls all the callbacks that have been registered. The first one is called with one argument: the result. Any subsequent callbacks are called with the return value of the previous callback; this allows generic filter functions to serve as callbacks. The callbacks must completely dispose of the result by printing it or saving it somewhere, because it will be thrown away after the last callback finishes.
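To see why the chain behaves this way, here is a toy model of the bookkeeping. MiniDeferred is a made-up stand-in written for this sketch, not Twisted's class; the real Deferred also handles errbacks, callbacks added after the result arrives and much more. Only the method names addCallback and callback are borrowed from Twisted.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred: store callbacks, then run them in order."""

    def __init__(self):
        self.callbacks = []

    def addCallback(self, func):
        self.callbacks.append(func)
        return self

    def callback(self, result):
        # The first callback gets the result; each later one gets
        # whatever the previous callback returned.
        for func in self.callbacks:
            result = func(result)
        # Whatever the last callback returned is simply dropped here.

seen = []
d = MiniDeferred()
d.addCallback(str.strip)    # generic filter: trim whitespace
d.addCallback(str.upper)    # generic filter: upper-case it
d.addCallback(seen.append)  # final callback disposes of the result

# Later, the deferred job fires the chain:
d.callback("  row from the database  ")
```

Because str.strip and str.upper know nothing about Deferreds, any one-argument function can serve as a link in the chain.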
My two biggest hurdles with all this were, "What if the callback needs more info from the caller?" and "How does the slow function know when the result will be ready?" Let's look at these separately.
"What if the callback needs more info from the caller? I called the slow function because I wanted the result. I don't mind waiting, but why should I wait for somebody else to get the result? That's the biggest load of crap I've ever heard. I've got my local variables all nicely set, and the callback will need them." Because you can't pass arguments to the callback, you have to sneak the external data in as default values, nested-scope variables or instance attributes. If there's a lot of external data to sneak, using a bound method for the callback looks increasingly attractive.
"How does the slow function know when the result will be ready? It has to choose a certain number of seconds, but how does it know when a database query or socket read will be ready? Won't the deferred job just have to block anyway waiting for it?" The answer is so obvious I felt like a fool when I realized it. The deferred job does a non-blocking poll to see if the result is ready. If it's not, the job schedules itself to run again with all its same arguments after another interval. It avoids calling d.callback until the result is known. If the result is never known, the callbacks never are called, but that's what we want. There are optional errbacks and timeouts (which we won't discuss here) to handle exceptional situations.