Event-Driven Programming with Twisted and Python
In the beginning, there were forking servers and then came threaded servers. Although they manage a few concurrent connections well, when network sessions reach into the hundreds or even thousands, forking and threading servers spawn too many separate, resource-consuming processes to be efficient. Today, there is a better way, asynchronous servers. A new breed of frameworks for third-generation languages is taming the once complex world of event-driven programming.
A rising star in the Python community has been Twisted, which makes asynchronous programming simple and elegant while providing a massive library of event-driven utility classes. In this article, I discuss asynchronous event-driven programming and how it's done in Twisted. Because reading about code only gets you so far, I cite examples from a real Twisted application developed for this article: a simple proxy server that blocks unwanted cookies, images and connections. Instructions on how to get the complete source code are in the on-line Resources.
The Twisted Project has been gaining popularity as a powerful and increasingly stable way of implementing networked applications. At its core, Twisted is an asynchronous networking framework. But unlike other such frameworks, Twisted boasts a rich set of integrated libraries for handling common protocols and programming tasks, such as user authentication and even remote object brokering. One of the philosophies behind Twisted is breaking down traditional separations among toolkits, as the same server that serves Web content could resolve DNS lookups. Although the package itself is quite large, applications need not import all the components of Twisted, so run-time overhead is kept to a minimum.
As with Python, Twisted's user base has been expanding from its academic roots to the commercial and government sectors. At Zoto, we're using Twisted in a distributed photo storage and management application, because it enables us to develop scalable network software quickly in a famously productive language, Python. Programming day to day, I appreciate Twisted for its impressive toolkit and supportive community. And as with all community-oriented open-source projects, Twisted is a safe business bet, because its existence doesn't hinge on the continued support of any single company or institution.
Have you ever been standing in the express lane of a grocery store, buying a single bottle of water, only to have the customer in front of you challenge the price of an item, causing you and everyone behind you to wait five minutes for the price to be verified? Plenty of explanations of asynchronous programming exist, but I think the best way to understand its benefits is to wait in line with an idle cashier. If the cashier were asynchronous, he or she would put the person in front of you on hold and conduct your transaction while waiting for the price check. Unfortunately, cashiers are seldom asynchronous. In the world of software, however, event-driven servers make the best use of available resources, because there are no threads holding up valuable memory waiting for traffic on a socket. Following the grocery store metaphor, a threaded server solves the problem of long lines by adding more cashiers, while an asynchronous model lets each cashier help more than one customer at a time.
This isn't to say there aren't benefits to a threaded model. For instance, with microthreads, the amount of resources used by any particular thread is reduced substantially. There's an inherent complexity in asynchronous programming, especially when you need to do many blocking operations in succession. In Python, however, the benefits of threading are diminished by Python's Global Interpreter Lock (GIL). Threaded programming in Python is refreshingly simple, because all internal Python operations are thread-safe. To add an item to a list or set a dictionary key, no locks are required, so as to avoid race conditions among threads. Unfortunately, this is implemented through an interpreter-wide lock that Python's interpreter uses liberally. So, although two threads safely can append to the same list at the same time, if they're appending to two different lists, the same lock is used. Because threaded Python applications suffer a resulting performance hit, asynchronous single-thread programming is all the more desirable for a language such as Python.
- March 2015 Issue of Linux Journal: System Administration
- High-Availability Storage with HA-LVM
- DNSMasq, the Pint-Sized Super Dæmon!
- Localhost DNS Cache
- Real-Time Rogue Wireless Access Point Detection with the Raspberry Pi
- Days Between Dates: the Counting
- The Usability of GNOME
- You're the Boss with UBOS
- Multitenant Sites
- Linux for Astronomers