PyCon DC 2004

Showing off new software, planning for the future and running sprints in DC.

Twisted

Twisted essentially was its own mini-conference. The Twisted developers turned out in force, conducting several sprints on different Twisted topics, presenting a whole track's worth of session talks (including software that can be used both with Twisted and standalone) and drawing over thirty people to their BOF.

Divmod presented its Twisted-powered commercial Web mail site, whose source code is available at divmod.org if you'd like to compete with them. divmod.org distributes Atop (discussed above), Lupy (a full-text indexer), Nevow (discussed above), Pyindex (an indexer using Metakit), Quotient (the Web mail server), Reverend (a Bayesian filter) and Shtoom (Voice over IP). In addition to Web mail, divmod.com provides POP/IMAP access, integrated spam filtering (you can override the Bayesian score), flexible searching (by phone number, URL, image or "question" in the message), blogging, an image manager, contact list, calendar, to-do list and Internet phone (VoIP). They're working on blog-via-email and also multiplayer games. The cost will be somewhere under $10/month for Web mail and basic services, with additional fees for VoIP-to-landline, advanced games and the rest. As an open-source project, divmod.com also will be soliciting feature contributions from its users. A few attendees wondered what Chandler is trying to do that divmod.com isn't already doing. However, I couldn't find a link on divmod.com to sign up. Maybe signup-via-web is not quite ready yet, or maybe it needs a big "subscribe!" button.

That Darn Asynchronous Interface

Twisted is amazingly flexible and Pythonic, but that darn asynchronous event loop makes thread-happy people run in terror. You can't simply call functions that block, because blocking halts the entire server and makes everybody else's request wait. So you have to turn your program inside out to avoid blocking. Last conference I got excited about Twisted, and every few months thereafter I tried to understand this inside-out method of programming, but every time I hit a brick wall. Now, after talking with the developers at this conference (Donovan Preston, Nevow's inventor, was especially helpful) and reading the Deferred HOWTO three friggin' times, I think I'm finally getting somewhere.

First, there's probably a Twisted library for everything you need to do, so you can avoid the issue for a long time. Second, you mentally have to divide your code into chunks at the natural blocking points. Each chunk becomes a separate function, and you use callLater or addCallback to switch to the next function. This implies that for each blocking point you have to wrap two sections of code into functions. The first section is the part of the slow function that runs after the pause; let's call the new function the "deferred job". The second section is the part of the calling routine that depends on the slow function's return value; this is the "callback".

Let's say the slow function does a database lookup and the deferred job is the portion that comes after the I/O. The slow function uses callLater to schedule the deferred job and returns a Deferred instance (d) to the calling routine:


from twisted.internet import defer, reactor
# 'reactor' is the event loop instance that's running.
d = defer.Deferred()
# reactor.callLater(SECONDS_AS_FLOAT, FUNC, *args, **kw)
reactor.callLater(2.5, deferredAction, d, arg2, arg3, kwarg='value')
return d

callLater is like UNIX's at command: it schedules a background job to run at a future time. The job has no knowledge of its context, but because you can pass as many arguments as you want, it has all the info it needs. Of course, the job can be a bound method if you want to sneak in other information too. The job needs d, so pass it as an argument or sneak it in as an instance attribute.

The calling routine gets d instead of the result. How lovely, just what we didn't want. The calling routine has to register callback function(s) that will operate on the result.


d.addCallback(myCallback)
return  # Nothing more to do since I'll never get the result myself.

So d is the link between all the functions, which don't otherwise know about one another. Time passes and the deferred job is called. It produces a result and calls:


d.callback(result)   # Oh, so *that's* why I needed 'd'. 

Note: the deferred job does not return the result. d then calls all the callbacks that have been registered. The first one is called with one argument: the result. Any subsequent callbacks are called with the return value of the previous callback; this allows generic filter functions to serve as callbacks. The callbacks must completely dispose of the result by printing it or saving it somewhere, because it will be thrown away after the last callback finishes.
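The chaining rule is easier to see in a toy model. The sketch below is not Twisted's Deferred: ToyDeferred and the two callbacks are hypothetical names, written only to show that the first callback gets the result and each later callback gets the previous one's return value. Real Deferreds also handle errbacks and callbacks added after the result has arrived; the toy ignores both.

```python
class ToyDeferred:
    """Hypothetical stand-in for a Deferred; shows callback chaining only."""

    def __init__(self):
        self.callbacks = []

    def addCallback(self, func):
        # Register a function to run once the result is available.
        self.callbacks.append(func)
        return self  # lets you write d.addCallback(a).addCallback(b)

    def callback(self, result):
        # Feed the result through the chain: each callback receives
        # the previous callback's return value.
        for func in self.callbacks:
            result = func(result)


seen = []

def strip_whitespace(row):
    # A generic filter: takes a value, returns a cleaned-up value.
    return row.strip()

def save(row):
    # Last link in the chain: dispose of the result by storing it.
    seen.append(row)

d = ToyDeferred()
d.addCallback(strip_whitespace).addCallback(save)
d.callback("  alice  ")   # what the deferred job does when it has the result
```

After the last line, seen holds the filtered value. Nothing was ever returned to the code that created d.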

My two biggest hurdles with all this were, "What if the callback needs more info from the caller?" and "How does the slow function know when the result will be ready?" Let's look at these separately.

"What if the callback needs more info from the caller? I called the slow function because I wanted the result. I don't mind waiting, but why should I wait for somebody else to get the result? That's the biggest load of crap I've ever heard. I've got my local variables all nicely set, and the callback will need them." Because you can't pass arguments to the callback, you have to sneak the external data in as default values, nested-scope variables or instance attributes. If there's a lot of external data to sneak, using a bound method for the callback looks increasingly attractive.

"How does the slow function know when the result will be ready? It has to choose a certain number of seconds, but how does it know when a database query or socket read will be ready? Won't the deferred job just have to block anyway waiting for it?" The answer is so obvious I felt like a fool when I realized it. The deferred job does a non-blocking poll to see if the result is ready. If it's not, the job schedules itself to run again with all its same arguments after another interval. It avoids calling d.callback until the result is known. If the result is never known, the callbacks never are called, but that's what we want. There are optional errbacks and timeouts (which we won't discuss here) to handle exceptional situations.

______________________

Comments

LeoN a pure Python SubEthaEdit clone

There is a previous pure-Python SubEthaEdit clone that has a full-featured alpha release

http://ryalias.freezope.org/souvenirs/leon

Re: PyCon DC 2004

Since Mike didn't mention it, I'll point out -- the VoIP software that Anthony Baxter presented is called Shtoom, and it's hosted at Divmod:

http://www.divmod.org/Home/Projects/Shtoom/index.html

-- Christopher Armstrong
http://radix.twistedmatrix.com/

Re: PyCon DC 2004

I've read half a dozen writeups on PyCon 2004. Mike's is the clear winner among them all. Thanks! I really feel like I've gotten some of the essence of an interesting event.
