PyCon DC 2004

by Mike Orr

PyCon DC 2004 was held March 24-26, 2004 (plus four days of sprints beforehand), at George Washington University in Washington, DC. The 364 registrants were a 40% increase over previous conferences. One person even came on the last day and paid the full registration fee ($300) in spite of being offered a discount, so eager was he to support Python.

This was the second attendee-run conference put on by the DC crowd. They organized it using the ultimate in iconoclast project management tools, a wiki ("the people's organizer"). MoinMoin was supplemented by a mailing list and IRC. Steve Holden, who introduced himself by saying "my name is not important", also said, "I'm responsible for this mess". Behind this classic British understatement lies a capable leader, a veteran of PyCon DC 2003. The organizers burnt the midnight oil for several months doing the thousand and one little tasks necessary to make the conference run smoothly: making this year's food better than last year's (including options for vegetarians), providing Net access within GWU's wireless policy, approving papers and scheduling tracks, running a registration Web site, scouting out low-cost hotels and restaurants, coordinating with the sponsors and more.

A few things didn't go off as planned. The paper review schedule wasn't coordinated with the registration schedule, necessitating the extension of the early-bird registration discount. Insufficient attention was given to the Open Space sessions and Lightning Talks. The GWU caterers didn't return messages as responsively as last year. Nevertheless, the show started on time, enough registrars were on hand to prevent check-in from becoming swamped, the speakers were easy to see and hear, the schedule (printed on a color printer) was easy to read and two rooms were available throughout for sprinting and BOFing.

The Python Software Foundation (PSF) was responsible financially for the conference and ran it as a fundraiser. It was extremely successful; the preliminary estimate I heard was "five figures". The reason for this was the unexpected surge in registration during the last month, due to Trevor Toenjes' marketing efforts, which netted a hundred more registrants than anticipated. The PSF now is deciding how to spend this money to pursue its mission: holding Python's intellectual property and keeping it freely available, supporting Python development and related open-source projects and promoting Python to the unconverted. Possible ideas include grants toward more action-oriented events (for example, non-conference sprints, software-project meetings) and promoting Python to project managers (mid-level managers who are somewhat clueful technically). But it will take some time to decide because the PSF is run by volunteers with their own day jobs. One of their ideas already has been implemented, though: this year's sprints were underwritten by the PSF. Guido's time machine strikes again.

The Zen of Python

Last year, Guido presented Tim Peters' "Zen of Python". This year it was on the back of everybody's T-shirt. I also learned about this little-known module in the standard library:



$ python
Python 2.3.3 (#1, Apr  6 2004, 18:13:12)
[GCC 2.95.4 20011002 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>


Of course, this harkens back to the Import This! challenge of Python10 (2002). Now you can. this.py is a nice little ROT-13 puzzle for the YAPyH's out there. Those holding out for this to replace self someday, however, may be disappointed.
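
The puzzle is easy to reproduce. The sketch below uses the standard library's rot13 text codec to scramble and unscramble one line of the Zen. It is written for a current Python 3 interpreter and illustrates only the cipher, not the actual contents of this.py.

```python
import codecs

line = "Beautiful is better than ugly."

# ROT-13 shifts each letter 13 places, so applying it twice
# returns the original text.
scrambled = codecs.encode(line, "rot13")
restored = codecs.decode(scrambled, "rot13")

print(scrambled)
print(restored)
```

Because 13 is half of 26, the same operation both hides and reveals the text, which is what makes ROT-13 a puzzle rather than a secret.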

Sprints

So what is a sprint? A sprint is a group of people hacking together on the same software project. Some sprints require a minimum level of experience; others are open to anybody who wants to get involved. Sprinters can contribute in a wide variety of ways, not only coding (new features, troubleshooting, regression tests) but also user documentation, developer documentation, squashing bugs, brainstorming design ideas, doing a teach-in, preparing promotion materials and so on. Having several sprint groups in the same room means that whenever you need help on some esoteric topic you can shout, "Is there somebody here who's an expert on ___?", and likely there is.

This year had twice as many sprint groups as last year. There were sprint groups for the Python core, Zope, Twisted, Chandler, Plone, Docutils and Guido van Robot (a language for teaching programming fundamentals). One side benefit of sprinting is the opportunity to see Python luminaries at work, often on projects different from what they are known for.

I participated in the Docutils sprint. I had a longstanding grudge with reST: its inability to output an HTML fragment exactly corresponding to the input text, without the HTML header and style crap around it. David Goodger, Docutils maintainer and sprint coach, said this was a symptom of a larger problem: the inability to extract the parsed parts of a document individually for any custom application. He teamed me up with Reggie Dugard, a sprinter with no experience but a keen desire to get involved. I helped Reggie design a function to return the parts, and he later finished the implementation. Other Docutils sprinters worked on output formats, integration with Epydoc and MoinMoin and a syntax for flagging indexable entries in the text.

At least one sprint group already has their sights set on EuroPython and is essentially doing a revolving sprint. They'll reconvene at the next available conference and continue where they left off, then go to the next conference and so on. Some projects apparently are getting most of their development done in these sprints. That inspired those of us in Seattle to try to host a regional sprint later this year. Our wiki link is below if you'd like to participate.

Keynote #1: Mitch Kapor

Mitch Kapor designed Lotus 1-2-3, a spreadsheet that arguably was the killer application that brought PCs into the business world in the 1980s. Later, he founded the Electronic Frontier Foundation, a watchdog group for civil liberties in cyberspace. He now runs the Open Source Applications Foundation (OSAF), whose main project is Chandler, a Python-powered personal information manager (PIM).

In the 1980s Mitch was fascinated by the Apple II. But he saw a big gap between what the Apple II, TRS-80 and PET could do vs. what people wanted them to do. Software, hardware and the Internet later evolved, GUIs appeared and evolved, but the critical parts still remain missing: computers don't do what the users want and they still require the users to be sysadmins--even those systems that claim they don't.

Mitch started considering open source in the late 90s when working as a consultant; he was forced to recommend Microsoft Exchange on NT for a five-person office. Exchange was way overkill for the situation and required introducing an NT server where none was before, but there was no feasible alternative. "The road was littered with corpses of companies that had tried to compete with Microsoft's product", he said. Mitch was familiar with free software, because Richard Stallman had lived up the street from him in the 80s and had picketed his house. At the time Mitch was torn. He felt Stallman was right, but he also had to think about his company's interests.

By the 90s, however, he began to consider developing open-source applications himself. One factor was the unexpected success of Linux. Companies now are betting mission-critical applications on non-commercial software: "The business community still hasn't appreciated how revolutionary this will be." Open source will make software in general cheaper: "There will still be a lot of paid programmers but probably fewer billionaires." It can't be contained by any one company because the entry barrier is so low. Open source is catching on in the Third World faster than in the industrialized countries: "In five years, maybe ten at most, it will be a fait accompli." Remember that, "ten to fifteen years ago large corporations never thought mainframes and terminals would be replaced by little microprocessors."

Chandler is inspired by Lotus Agenda. It's an e-mail client, calendar groupware and a contact manager; it also can handle freeform notes and other types of information. Mitch focused on e-mail because that's the center of many people's organizational life--they use their inbox as a de facto to-do list.

Mitch chose Python over Java feeling that Java (particularly Swing) was not well-designed for end-user applications. He knew about Perl but chose Python because two people convinced him that Python has better developer productivity and its flexibility would allow users to add functionality themselves later. The latter proved correct when users came up with the idea of a spiral calendar view and implemented it.

Chandler uses wxPython, Berkeley DB, dbXML, Jakarta Lucene (full-text indexing) and OpenSSL. "Python is the glue holding all these together", he said. To use Lucene they converted its Java interface to Python.

Mitch identified two challenges Python needs to face:

  • Performance. You can never do enough to improve it, in spite of Moore's law.

  • Security. Now that the Internet is mainstream, people with too much time on their hands and not enough judgment are doing mischief. Security should be easy for the user. Sandboxes, e-mail filters and the like should be built into the infrastructure.

Pythoneers took these challenges to heart. Guido's subsequent keynote and several of the session talks acknowledged these goals and showed how their projects will be working on them during the coming year.

Mitch also had several challenges for open source in general:

  • Desktop. Mitch would like to use a Linux-based desktop that's not a poor second to Windows and the Mac. Now that overlapping projects exist, each with its own history, the project managers must coordinate and realize there's a bigger world outside their own projects.

  • Intellectual property reform. The laws have not caught up with software reality. Patent laws in particular must be overhauled or they will throw a monkey wrench into open-source's progress. The Open Source community is not well organized to defend itself. We have to hope it doesn't hit soon. We'll need foundations and friendly corporations to do the heavy lifting, so they should take the lead immediately and make sure we have a response prepared.

  • The free-rider problem. Too many companies depend on free software without contributing back. This puts a damper on what these projects can do, and it even threatens the sustainability of some projects. Companies don't necessarily have to contribute code; they can sponsor developers or support foundations.

  • Implications beyond software. The open-source model is applicable to other business practices besides software.

Keynote #2: Guido van Rossum

Python's creator said, "What will the future bring for Python? Darned if I know. Maybe somebody in the future can bring back the keys to my time machine." In the meantime Guido declared Python 2.2 an ex-release, said 2.3.3 is current and 2.4 will be "better, faster, and bigger". It will be faster thanks to new library modules and more C code. The latest summary of completed and planned changes is in PEP 320: Python 2.4 Release Schedule, and a gentler introduction is in "What's New in Python 2.4" (URLs below). The full details of completed changes are in the file Misc/NEWS in the Python source, and Misc/HISTORY details all the past changes since Python 0.9.0.

Guido said the current dilemmas for 2.4 are deciding on a function decorator syntax, getting people used to generator expressions and speeding up function calls by shrinking the stack frames.

A decorator is a function that takes another function as an argument and returns a modified version of that function. Python's classmethod and staticmethod builtins are decorators. Python needs a generic syntax to apply decorators in the def line, remembering that a function can have multiple decorators. Five syntaxes are under consideration:



def foo [classmethod] (cls, arg1, arg2):
    # Guido's favorite, and preferred by PyCon hand vote.
    # Disadvantage: args separated from function name.

[classmethod] def foo(cls, arg1, arg2):
    # Disadvantage: 'def' not at left column.

def [classmethod] foo(cls, arg1, arg2):
    # Disadvantage: function name not near left column.

def foo(cls, arg1, arg2) [classmethod]:
    # Disadvantage: decorator hidden behind long argument list.

def foo(cls, arg1, arg2):
    [classmethod]
    # Disadvantage: decorator not on 'def' line.
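
Whichever syntax wins, it would be shorthand for something that can already be written by hand: define the function, then rebind its name to the decorated result. Here is a minimal sketch of that, written for a current Python 3 interpreter. The Account class and the logged decorator are hypothetical names invented for illustration.

```python
def logged(func):
    """A decorator: takes a function, returns a modified version of it."""
    def wrapper(*args, **kwargs):
        wrapper.calls += 1              # count the call, then delegate
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


class Account(object):
    def create(cls, owner):
        return (cls.__name__, owner)
    # Two decorators applied by hand, innermost first. This is the
    # rebinding step that a decorator syntax would move up to the def line.
    create = classmethod(logged(create))


print(Account.create("guido"))
```

The drawback is visible even in this tiny class: the reader reaches the end of the method body before learning that create is a class method.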


Generator expressions are like list comprehensions, but they create an iterator rather than a list. That's good because usually you're going to iterate over it once anyway, so a list would waste memory. The syntax is quite unobtrusive; simply put the keywords in a normal parenthesized expression:

sum(x * 2 for x in range(10))

If there's a comma on either side, an extra pair of parentheses is required. This eliminates ambiguity in argument lists.
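
A short sketch of both points, the parentheses rule and the one-shot nature of the iterator. It is written for a current Python 3 interpreter, and the variable names are mine.

```python
# The list version builds all ten values in memory before summing.
total_from_list = sum([x * 2 for x in range(10)])

# The generator expression hands values to sum() one at a time.
total_from_gen = sum(x * 2 for x in range(10))

# With a second argument (a comma beside the expression), the extra
# pair of parentheses is required.
total_with_start = sum((x * 2 for x in range(10)), 100)

# A generator expression is an iterator: it can be consumed only once.
squares = (x * x for x in range(4))
first_pass = list(squares)
second_pass = list(squares)

print(total_from_list, total_from_gen, total_with_start)
print(first_pass, second_pass)
```

The second pass comes back empty, which is the price of not keeping the values in memory.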

The long-awaited Python 3.0 has been put off again, this time until his son Orlijn goes to college (he's now 1 1/2) or until Guido retires or goes on a sabbatical. His top wishlist items for 3.0 are:

  • convert functions returning lists into generators wherever feasible. ("Iterators are the force of the future.") Users will have to type list(range(10)) if they really need a list.

  • add interfaces.

  • make the module hierarchy in the standard library deeper.

  • learn from IronPython, Starkiller and PyPy.

IronPython is a rewrite of Python in C# for Microsoft's Common Language Runtime. Every port of Python to a different virtual machine tests the robustness of the language and standard libraries and often reveals some efficiency tricks that can be ported back to CPython. Starkiller is discussed below; it's a compiler that translates Python programs to C++ using static type inference. PyPy is a rewrite of Python in Python. (Yes, some people actually do this.) Q&A on a variety of topics is in the SubEthaEdit notes.

Homage should be paid to last year's most controversial topic, if-then-else expressions (PEP 308). Guido did not see it as important because the if statement exists, but he was willing to add it if the community overwhelmingly favored it and agreed on a syntax. The community could do neither, so Guido rejected it forevermore.

Keynote #3: Bruce Eckel

Bruce Eckel gave a humorous talk about static vs. dynamic typing. You can read more about it in the SubEthaEdit notes and in Bruce's Web log (see Resources). He quoted Andrew Dalke's comment about typing: "If you only care if it quacks, you don't need to check that it's a duck." Bruce maintains that even with static typing you still have to do checks at runtime for things the compiler can't catch, so why obscure the code with types and casts? He thinks one's time would be better spent writing a robust automated test suite (unittest) than on debugging type casts.
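
To make the argument concrete, here is a small sketch of my own (not taken from Bruce's talk) in which a function accepts anything that quacks and a unittest suite catches the mistake a compiler would otherwise have flagged. Duck, Robot and chorus are hypothetical names. The code targets a current Python 3 interpreter.

```python
import unittest


class Duck(object):
    def quack(self):
        return "Quack"


class Robot(object):            # unrelated to Duck, but it quacks too
    def quack(self):
        return "Beep-quack"


def chorus(animals):
    # No type declarations or casts: anything with a quack() method will do.
    return [a.quack() for a in animals]


class ChorusTest(unittest.TestCase):
    def test_mixed_types(self):
        self.assertEqual(chorus([Duck(), Robot()]), ["Quack", "Beep-quack"])

    def test_non_quacker_fails_at_runtime(self):
        # An int has no quack(); the error surfaces when the code runs.
        self.assertRaises(AttributeError, chorus, [42])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChorusTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```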

Nevow

To me, the three most interesting talks were those discussing Nevow, Quixote and Atop. Nevow is the successor to Twisted Woven, a high-level Web application framework. Because I've worked with Webware and Cheetah extensively, I was curious about the differences. Nevow, like most other non-PSP/PHP servers, is based on servlets. Something parses the URL and invokes a servlet instance somewhere, and a servlet method returns or writes the HTML output. In Webware, that something that parses the URL is Webware itself. In Nevow, as in twisted.web that I described last year, your application class registers itself as the manager of a certain URL part (=directory), and then your .locateChild(self, request, parts) method has to handle or delegate everything to the right of that part.

Nevow's most intriguing aspect is its built-in model-view-controller (MVC) framework, which strictly separates generic logic calculations (the data) from the specific view desired (the rendering). A servlet (a Page subclass) associates itself with an HTML template. The template might contain placeholder tags like this:



<span nevow:data="currentMonth" nevow:render="month">
<h1><nevow:slot name="label">Bogus label</nevow:slot></h1>
...
</span>


(Here and throughout I've used modified examples from the presentations.) Nevow sees "currentMonth" and calls a method .data_currentMonth in your servlet. That method returns some values needed in the <span>, possibly based on GET parameters or other criteria. Nevow then sees "month" and calls .render_month(self, context, data). That method receives the data returned by the previous method and fills the slot with its actual value:

context.fillSlots('label', monthName) 

This replaces the entire slot tag. Cheetah fans note that the template is not pulling values from a dictionary; rather, the method is pushing values into the template. The rendering method also can extract another type of tag, the pattern, which it can use in a for loop to generate repetitive sections, for example, table rows.
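
The push model can be imitated in a few lines of plain Python. The sketch below is a toy, not Nevow: Context, ToyPage, renderTag and the {label} slot marker are all stand-ins I made up, and only the data_/render_ naming and the fillSlots call follow the description above. It targets a current Python 3 interpreter.

```python
class Context(object):
    """Toy stand-in for the rendering context; not Nevow's class."""
    def __init__(self, template):
        self.template = template
        self.slots = {}

    def fillSlots(self, name, value):
        self.slots[name] = value

    def flatten(self):
        # Replace each {name} slot marker with the value pushed into it.
        text = self.template
        for name, value in self.slots.items():
            text = text.replace("{%s}" % name, str(value))
        return text


class ToyPage(object):
    def renderTag(self, template, data_name, render_name):
        data = getattr(self, "data_" + data_name)()             # gather data
        context = Context(template)
        getattr(self, "render_" + render_name)(context, data)   # push it in
        return context.flatten()


class CalendarPage(ToyPage):
    def data_currentMonth(self):
        return {"number": 3, "name": "March"}

    def render_month(self, context, data):
        context.fillSlots("label", data["name"])


page = CalendarPage()
html = page.renderTag("<h1>{label}</h1>", "currentMonth", "month")
print(html)
```

The template never looks anything up; the page decides what goes where, which is the reverse of the Cheetah approach.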

Nevow also has some Pythonic ways to generate HTML tags (what the W3C DOM should have been but isn't), a form handler (again MVC) and an abstraction for server-client Javascript interaction.

Quixote

Quixote is another Web application framework. Its servlet lookup technique is very Pythonic: you place your servlet hierarchy in an importable Python package. Quixote processes the URL parts from left to right, using getattr() to find each part. This allows wide flexibility: each part can be a submodule, class, instance or anything else that has attributes. Eventually Quixote should find something callable: a function, a method or an instance with a .__call__ method. It calls that with a request data structure, and the return value is the HTML string (or an instance of a streaming class). At each step four special attributes in the parent affect the behavior:

._q_public (list of strings, required) 

The parent's ._q_public must list the subattribute. If the subattribute is not listed or ._q_public is missing, Quixote pretends it couldn't find the subattribute. That's to prevent accidentally publishing private objects.

._q_access (function/method, optional) 

May raise AccessError to forbid the request.

._q_index (function/method, optional) 

Saves the day if Quixote falls off the end of the URL without finding something callable; akin to index.html.

._q_lookup (function/method, optional) 

Wildcard attribute if no specific attribute matches; akin to Python's .__getattr__().
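
The lookup rules above can be sketched as a short traversal function. This is a toy written from the description, not Quixote's own code: traverse, Site and Articles are hypothetical names, and the argument conventions for the special attributes are my assumption. It targets a current Python 3 interpreter.

```python
class AccessError(Exception):
    pass


def traverse(root, path, request=None):
    """Toy left-to-right URL traversal; not Quixote's implementation."""
    obj = root
    for part in [p for p in path.split("/") if p]:
        access = getattr(obj, "_q_access", None)
        if access is not None:
            access(request)                       # may raise AccessError
        if part in getattr(obj, "_q_public", []) and hasattr(obj, part):
            obj = getattr(obj, part)
        elif hasattr(obj, "_q_lookup"):
            obj = obj._q_lookup(request, part)    # wildcard fallback
        else:
            raise LookupError("not found: %s" % part)
    if not callable(obj):
        index = getattr(obj, "_q_index", None)    # fell off the end of the URL
        if index is None:
            raise LookupError("nothing callable at %s" % path)
        obj = index
    return obj(request)


class Articles(object):
    _q_public = ["latest"]

    def latest(self, request):
        return "<p>Latest article</p>"

    def _q_index(self, request):
        return "<p>Article index</p>"

    def _q_lookup(self, request, part):
        # Treat any other URL part as an article number.
        return lambda request: "<p>Article %d</p>" % int(part)


class Site(object):
    _q_public = ["articles"]
    articles = Articles()
    secret = "not listed in _q_public, so never published"


site = Site()
print(traverse(site, "/articles/latest"))
print(traverse(site, "/articles/"))
print(traverse(site, "/articles/42"))
```

Asking for /secret raises LookupError even though the attribute exists, which is the accidental-publishing protection at work.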

But the most interesting aspect of Quixote is its template system, PTL. It's useful not only in Web servlets but in a wide variety of applications. Unlike Nevow and most template systems that have placeholders in the text, PTL embeds the text as string literals in a function. For instance:


# example.ptl
def cell [html] (content):
    '<td>'
    content
    '</td>'

def ordinary():   # An ordinary Python function.
    return "Result."

To use it:


import quixote;  quixote.enable_ptl()   # Call it, or the import hook is never installed.
import example
print example.cell("Acme & Co.")   # Prints "<td>Acme &amp; Co.</td>".

enable_ptl installs an import hook, which tells import how to load *.ptl files, compile them and write *.ptlc files. [html] is a decorator as described in Guido's keynote above. Because Python doesn't yet have a decorator syntax built in, PTL has to fake it. The PTL compiler captures the literal result of each expression or string--what Python's interactive mode would have printed--and concatenates them into a return value. This is something I've often wished Python or Cheetah could do, and here it is. PTL seems more suited for templates with smallish blocks of text and a lot of calculations than for templates with multi-page static text and only a few placeholders.

The [html] decorator automatically HTML-escapes expression results and arguments but does not escape literals. This is usually what you want, because results may come from an untrusted source, but literals are presumably correct. The return value is a pseudo string, an htmltext instance, used to protect it from further escaping should it be passed to another [html] function. There's also another decorator, [plain], which does all the concatenation goodies without the escaping and is suitable for your non-HTML applications.
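
The "pseudo string" trick is worth a closer look. The sketch below is my own imitation in ordinary Python 3, not PTL and not Quixote's htmltext class: a str subclass marks text that is already safe, and a hypothetical quote helper escapes everything else exactly once.

```python
from html import escape   # Python 3 standard library


class htmltext(str):
    """Marker type: a string that is already safe HTML (toy version)."""


def quote(value):
    # Escape ordinary values; pass already-escaped htmltext through untouched.
    if isinstance(value, htmltext):
        return value
    return htmltext(escape(str(value)))


def cell(content):
    # The literals are trusted; the argument is escaped.
    return htmltext("<td>" + quote(content) + "</td>")


def row(first, second):
    # The cells are htmltext already, so quote() leaves them alone.
    return htmltext("<tr>" + quote(cell(first)) + quote(cell(second)) + "</tr>")


print(cell("Acme & Co."))
print(row("Acme & Co.", "<b>"))
```

Without the marker type, the ampersand in the first cell would be escaped a second time when the cell is placed inside the row.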

Atop

I went to the Atop talk because the summary said BSDDB. I thought, "Well, anything about Berkeley DB will be mildly interesting." It turned out to be majorly interesting, because Atop is an object database built on top of Berkeley DB. How did they know I recently had been looking for Python object databases besides ZODB?

The session paper is not on-line, but the SubEthaEdit notes are. All serializable objects must subclass or be Item. Every item has a unique numeric ID; there's no physical nesting of objects. However, a Pool acts like a list and gives the illusion of nesting. In reality it contains pointers to the various raw items. Pools can be queried, for instance:


pool = store.getItemByID(7)  # 'store' is an open database.
for item in pool.queryIndex('name', startKey='Bob'):
    # Loop through all elements whose 'name' attribute is >= 'Bob'.
    print item.name

Berkeley DB is reliable, fast, easy to install and fully integrated with Python. Several other projects use it, including MySQL (as an optional table format) and Subversion. However, it's extremely difficult to use correctly, and the dangers include data corruption. Fortunately, Atop takes care of these problems so you don't have to.

Atop currently is distributed as part of divmod.org's Quotient package, a Twisted server that's described next.

Twisted

The Twisted talk essentially was its own mini-conference. The Twisted developers turned out in force, conducting several sprints on different Twisted topics, presenting a whole track's worth of session talks (including software that can be used both with Twisted and standalone) and drawing over thirty people to their BOF.

Divmod presented its Twisted-powered commercial Web mail site, whose source code is available at divmod.org if you'd like to compete with them. divmod.org distributes Atop (discussed above), Lupy (a full-text indexer), Nevow (discussed above), Pyindex (an indexer using Metakit), Quotient (the webmail server), Reverend (a Bayesian filter) and Shtoom (Voice over IP). In addition to Web mail, divmod.com provides POP/IMAP access, integrated spam filtering (you can override the Bayesian score), flexible searching (by phone number, URL, image, or "question" in the message), blogging, an image manager, contact list, calendar, to-do list and Internet phone (VoIP). They're working on blog-via-email and also multiplayer games. The cost will be somewhere under $10/month for Web mail and basic services, with additional fees for VoIP-to-landline, advanced games and the rest. As an open-source project, divmod.com also will be soliciting feature contributions from its users. A few attendees wondered what Chandler is trying to do that divmod.com isn't already doing. However, I couldn't find a link on divmod.com to sign up. Maybe signup-via-web is not quite ready yet, or maybe it needs a big "subscribe!" button.

That Darn Asynchronous Interface

Twisted is amazingly flexible and Pythonic, but that darn asynchronous event loop makes thread-happy people run in terror. You can't simply call functions that block, because blocking halts the entire server and makes everybody else's request wait. So you have to turn your program inside out to avoid blocking. Last conference I got excited about Twisted, and every few months thereafter I tried to understand this inside-out method of programming, but every time I hit a brick wall. Now, after talking with the developers at this conference (Donovan Preston, Nevow's inventor, was especially helpful) and reading the Deferred HOWTO three friggin' times, I think I'm finally getting somewhere.

First, there's probably a Twisted library for everything you need to do, so you can avoid the issue for a long time. Second, you mentally have to divide your code into chunks at the natural blocking points. Each chunk becomes a separate function, and you use callLater or addCallback to switch to the next function. This implies that for each blocking point you have to wrap two sections of code into functions. The first section is the part of the slow function that runs after the pause; let's call the new function the "deferred job". The second section is the part of the calling routine that depends on the slow function's return value; this is the "callback".

Let's say the slow function does a database lookup and the deferred job is the portion that comes after the I/O. The slow function uses callLater to schedule the deferred job and returns a Deferred instance (d) to the calling routine:


from twisted.internet import defer, reactor
# 'reactor' is the event loop instance that's running.
d = defer.Deferred()
# reactor.callLater(SECONDS_AS_FLOAT, FUNC, *args, **kw)
reactor.callLater(2.5, deferredAction, d, arg2, arg3, kwarg='value')
return d

callLater is like UNIX's at command: it schedules a background job to run at a future time. The job has no knowledge of its context, but because you can pass as many arguments as you want, it has all the info it needs. Of course, the job can be a bound method if you want to sneak in other information too. The job needs d, so pass it as an argument or sneak it in as an instance attribute.

The calling routine gets d instead of the result. How lovely, just what we didn't want. The calling routine has to register callback function(s) that will operate on the result.


d.addCallback(myCallback)
return  # Nothing more to do since I'll never get the result myself.

So d is the link between all the functions, which don't otherwise know about one another. Time passes and the deferred job is called. It produces a result and calls:


d.callback(result)   # Oh, so *that's* why I needed 'd'. 

Note: the deferred job does not return the result. d then calls all the callbacks that have been registered. The first one is called with one argument: the result. Any subsequent callbacks are called with the return value of the previous callback; this allows generic filter functions to serve as callbacks. The callbacks must completely dispose of the result by printing it or saving it somewhere, because it will be thrown away after the last callback finishes.
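
The chaining rule is easier to see in a bare-bones imitation. ToyDeferred below is my own stand-in, not Twisted's Deferred. It keeps only the addCallback and callback names from the text and leaves out errbacks, timeouts and everything else. It runs on a current Python 3 interpreter with no Twisted installed.

```python
class ToyDeferred(object):
    """Bare-bones imitation of the callback chain; not Twisted's class."""
    def __init__(self):
        self.callbacks = []

    def addCallback(self, func):
        self.callbacks.append(func)
        return self

    def callback(self, result):
        # Feed the result to the first callback, its return value to the
        # second, and so on down the chain.
        for func in self.callbacks:
            result = func(result)


seen = []


def parse(raw):
    return int(raw)


def double(number):
    return number * 2


d = ToyDeferred()
d.addCallback(parse).addCallback(double).addCallback(seen.append)

# Later, the deferred job finishes and fires the chain.
d.callback("21")
print(seen)
```

The last callback stores the value in seen because nothing else will: once the chain ends, the result is gone.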

My two biggest hurdles with all this were, "What if the callback needs more info from the caller?" and "How does the slow function know when the result will be ready?" Let's look at these separately.

"What if the callback needs more info from the caller? I called the slow function because I wanted the result. I don't mind waiting, but why should I wait for somebody else to get the result? That's the biggest load of crap I've ever heard. I've got my local variables all nicely set, and the callback will need them." Because you can't pass arguments to the callback, you have to sneak the external data in as default values, nested-scope variables or instance attributes. If there's a lot of external data to sneak, using a bound method for the callback looks increasingly attractive.

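The three ways of sneaking data in look like this in practice. The sketch uses plain functions so that it runs on a current Python 3 interpreter without Twisted. The names (Reporter, make_report_callback and so on) are hypothetical.

```python
def make_report_callback(customer):
    # Nested scope: 'customer' is remembered by the inner function.
    def report(result):
        return "%s: %s" % (customer, result)
    return report


class Reporter(object):
    def __init__(self, customer):
        self.customer = customer          # instance attribute

    def report(self, result):
        return "%s: %s" % (self.customer, result)


customer = "Bob"

# Default value: the data is captured when the function is defined.
by_default = lambda result, customer=customer: "%s: %s" % (customer, result)
by_closure = make_report_callback(customer)
by_method = Reporter(customer).report     # bound method

# Whoever fires the callback supplies only the result.
outputs = [cb("3 rows") for cb in (by_default, by_closure, by_method)]
print(outputs)
```
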
"How does the slow function know when the result will be ready? It has to choose a certain number of seconds, but how does it know when a database query or socket read will be ready? Won't the deferred job just have to block anyway waiting for it?" The answer is so obvious I felt like a fool when I realized it. The deferred job does a non-blocking poll to see if the result is ready. If it's not, the job schedules itself to run again with all its same arguments after another interval. It avoids calling d.callback until the result is known. If the result is never known, the callbacks never are called, but that's what we want. There are optional errbacks and timeouts (which we won't discuss here) to handle exceptional situations.

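Here is the polling pattern in miniature. ToyReactor is a made-up scheduler with a simulated clock, standing in for the real event loop so the example runs instantly on a current Python 3 interpreter. Only the callLater name and argument order follow the text, and the "database" is a hard-coded ready time.

```python
import heapq
import itertools


class ToyReactor(object):
    """Minimal scheduler with a simulated clock; not Twisted's reactor."""
    def __init__(self):
        self.now = 0.0
        self.queue = []
        self.counter = itertools.count()   # tie-breaker for equal times

    def callLater(self, delay, func, *args, **kw):
        heapq.heappush(self.queue,
                       (self.now + delay, next(self.counter), func, args, kw))

    def run(self):
        while self.queue:
            self.now, _, func, args, kw = heapq.heappop(self.queue)
            func(*args, **kw)


reactor = ToyReactor()
results = []
polls = []
ready_at = 1.2            # pretend the database answers at t = 1.2 seconds


def poll(on_result, interval):
    polls.append(reactor.now)
    if reactor.now < ready_at:
        # Not ready: reschedule ourselves with the same arguments.
        reactor.callLater(interval, poll, on_result, interval)
    else:
        on_result("row data")   # plays the role of d.callback(result)


reactor.callLater(0.5, poll, results.append, 0.5)
reactor.run()
print(polls, results)
```

Each poll returns immediately, so other jobs in the queue get their turn between attempts.
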
Starkiller

The most entertaining talk was "Faster than C: Static Type Inference with Starkiller", by Michael Salib. That was only the fourth best title of the conference, however. The top three were "'Scripting Language' My Arse: Using Python for Voice over IP", by Anthony Baxter; "Flour and Water Make Bread", by David Ascher; and "Two Impromptus, or How Python Helped Us Design Our Kitchen", by Andrew Koenig.

The Starkiller talk was enjoyable because Mike is one of two speakers who don't pull any punches--he calls a spade a shovel. His justification for writing a Python-to-C++ compiler was, "For the 15% of applications where speed matters, Python is slow. Sure, you can write it in C++, but C++ sucks. That's why we're using Python." Which actually makes a lot of sense if you think about it.

He went on to explain how Python's dynamic features that we love so dearly are the reason Python is so hard to optimize. "Python has lots of runtime choice points, but thirty years of compiler optimization research depends on eliminating runtime choices." By "runtime choices" he's referring to the fact that a variable may change type, attributes may be added after startup and the like, and all this happens after a traditional optimizer would have finished and said sayonara. "But dynamic capability is good because Python kicks ass." So Starkiller follows the 80/20 rule by optimizing what it can and leaving the rest. In particular, it refuses to optimize functions that contain eval, exec or dynamic module loading. But that's okay because most users don't use them.

Once those cases are eliminated, we examine the assignment statements to determine the types:


x=3;  y=x;  z=y;  z=4.3   # x is int.  y is int.  z is int or float. 

The types can be traced similarly through function arguments and return values. What about polymorphic functions? Starkiller handles them the same way C++ does: by generating distinct same-name functions for all argument combinations (aka overloading). Mike offers a few benchmarks to demonstrate the speed of this approach but also cautions, "All benchmarks are lies."
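
Since Starkiller itself isn't available, here is a toy of my own that traces types through simple assignments using the standard ast module. It is emphatically not Starkiller's algorithm. It handles only constants and plain name-to-name assignments, ignores control flow and needs Python 3.8 or later for ast.Constant.

```python
import ast


def infer_types(source):
    """Toy inference: collect every type a name may hold (assignments only)."""
    assigns = [node for node in ast.walk(ast.parse(source))
               if isinstance(node, ast.Assign)]
    types = {}
    changed = True
    while changed:                        # repeat until nothing new is learned
        changed = False
        for node in assigns:
            if isinstance(node.value, ast.Constant):
                found = {type(node.value.value).__name__}
            elif isinstance(node.value, ast.Name):
                found = set(types.get(node.value.id, ()))
            else:
                continue                  # anything else is out of scope here
            for target in node.targets:
                if isinstance(target, ast.Name):
                    known = types.setdefault(target.id, set())
                    if not found <= known:
                        known |= found
                        changed = True
    return types


inferred = infer_types("x=3;  y=x;  z=y;  z=4.3")
for name in sorted(inferred):
    print(name, sorted(inferred[name]))
```

The set for z ends up with two members, which is the case where a compiler would have to generate more than one specialized version of the code.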

If you want to play with Starkiller, you're out of luck because there's no public download yet. There actually were several talks this conference presenting software that's not yet available, either because it's not robust enough or it's waiting for legal paperwork. I didn't see that in previous conferences. A few attendees commented, "Well, it's not as useful as a talk on something that's available, but on the other hand it's good to learn about cutting-edge research as soon as possible." The conference reviewers seemed to have done their job of approving pre-alpha talks only if they covered an area that was central to Python and long on Python's wishlist. Mike had his own reason for not revealing the code. "If you kill me now, you'll never get it."

Mike offered an obligatory acknowledgment. "Who owns Starkiller? MIT! Who paid for Starkiller's development? You did! Pat yourselves on the back! Thank you, taxpayers!!!" He then begged the audience not to tell DARPA that Starkiller is a Python-to-C++ converter rather than the sun-destroying weapon they think he's building. The presentation slides have a few more zingers too; they're available under the Session Papers link on the PyConDC2004 Aftermath link in Resources.

Finally, Mike ended with an indictment of the sun. "Destroy the sun! We hatessss it! It burns! The pale yellow face mocks us, keeps us from hearing the machine. It causes global warming. It causes sunburns. DARPA says the sun is bad, it warms our enemies. It weakens our dependence on foreign oil. There's only one logical conclusion: we must destroy the sun."

Guido had one question during Q&A. He asked, "Is it difficult to have so much attitude all the time?"

(There's a discrepancy about when Guido asked that question. My notes say it was during this talk. The SubEthaEdit notes say it was during Anthony Baxter's VoIP talk, which was almost as feisty. But of course my notes are right.)

Other Talks

Ah yes, that VoIP talk. Voice over IP may be Python's next killer application. Internet telephony currently is growing slowly, but it has the potential to become really big when it reaches critical mass. As VoIP becomes ubiquitous, people will need a new generation of applications and phones. Shtoom, which runs under Twisted, is one small step in that direction. It needs a better UI, but it proves that Python is up to the task, even though VoIP requires generating sound packets exactly 20 milliseconds apart.
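
For a feel of what the 20-millisecond requirement means in code, here is a blocking sketch that uses only the standard library. It is not how the talk's software does it; a Twisted application would schedule sends through the reactor rather than sleep. The names send_paced and PACKET_INTERVAL, and the 160-byte dummy payload, are assumptions made for the example.

```python
import time

PACKET_INTERVAL = 0.020  # seconds; the 20 ms spacing mentioned in the talk


def send_paced(packets, send, clock=time.monotonic, sleep=time.sleep):
    """Call send(packet) for each packet, one per PACKET_INTERVAL.

    Deadlines are computed from the start time rather than from the end
    of the previous send, so small delays do not accumulate as drift.
    """
    start = clock()
    for index, packet in enumerate(packets):
        deadline = start + index * PACKET_INTERVAL
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)
        send(packet)


if __name__ == "__main__":
    stamps = []
    # Dummy payloads stand in for encoded audio; "sending" just records a time.
    send_paced([b"\x00" * 160] * 5, lambda packet: stamps.append(time.monotonic()))
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    print(["%.1f ms" % (gap * 1000) for gap in gaps])
```

The clock and sleep parameters exist so the pacing logic can be exercised with a fake clock. On a real machine the printed gaps hover around 20 ms but are not exact, which is the difficulty the talk was pointing at.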

By the way, the title of the VoIP paper is slightly modified from the original. It's currently "'Scripting Language' My Arse: Using Python for Voice over IP." Originally it was "'Scripting Language' My Shiny Metal Arse: Using Python for Voice over IP".

Zope 3 hasn't changed much since last year; it's just further developed. It still aims to be friendlier to application developers than Zope 2, more Pythonic, more modularized, more specific in its API use (for example, less implicit acquisition), with better documentation early and a better integration of Web-based and filesystem-based application development methods. This will make applications more portable between Zope and other environments and make individual Zope features more accessible to non-Zope applications.

The PyPy talk showed that a Python virtual machine can be written in 16,000 lines of Python. The prompt looks like this: >>>>, with one extra > for each recursive level of PyPy. PyPy is markedly slower than Python and exponentially slower recursively: the innermost interpreter interprets the code, the next interpreter interprets the interpreter interpreting the code. But the PyPy developers have faith that they eventually can make it faster than CPython.
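
The "exponentially slower" remark is simple arithmetic: if one interpreter level multiplies running time by some factor, stacking levels multiplies the factors together. The sketch below uses a per-level factor of 100 purely as a placeholder; it is a hypothetical number, not a measurement from the talk, and nested_slowdown is a made-up helper name.

```python
def nested_slowdown(per_level_factor, depth):
    """Slowdown relative to the host Python when `depth` interpreters are stacked."""
    # Each level interprets the level below it, so the factors multiply.
    return per_level_factor ** depth


HYPOTHETICAL_FACTOR = 100  # placeholder only, not a measured PyPy figure

for depth in range(1, 4):
    print(depth, "level(s):", nested_slowdown(HYPOTHETICAL_FACTOR, depth), "times slower")
```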

Pyrex stands alongside ctypes as an indispensable part of the C extension writer's toolkit. Pyrex compiles ordinary Python to C, but for better optimization you can use the cdef statement to declare variables as certain C types. Expressions using those variables are compiled directly to C, bypassing Python's slow object infrastructure.

Several other talks are worth mentioning, but I don't have the space. Browse the Session Papers link on the PyConDC2004 Aftermath site and see which topics might interest you.

Macintosh / SubEthaEdit

A surprising number of Macintosh laptops were spotted at this conference. But then I remembered that there was a surprising number of Macintoshes at last year's conference too. Bob Ippolito gave a talk called "60 Minutes with MacPython". Because I haven't had a Mac since I sold my Classic in 1993 to get a machine that could run Linux, I'll spare you a clueless interpretation and refer you to the session paper and SubEthaEdit notes. Bob has some warnings about which Python to use for OS X 10.2 vs 10.3, the dreaded resource fork and more. MacPython also has libraries for Cocoa, the NeXT framework that was ported to the Mac. Bob says Cocoa is a good API; it's been around for 15 years. wxPython works on the Mac but it doesn't put the widgets in quite the correct place.

One thing I didn't realize until after the conference (when my friend Brian Dorsey pointed it out) is that many of those Mac laptops were running SubEthaEdit. SubEthaEdit is a distributed text editor; you may prefer to think of it like a multiplayer game, CVS on steroids or a real-time wiki. All the SubEthaEdit programs running in wireless range of one another get together and state which files they have open. Anybody can update anybody's document simultaneously in real time. The background color of the text shows who edited what portion. When done in a lecture hall (for instance, a talk at a Python conference), the result is the best of everybody's notes all in one. This is the next generation of notetaking and will no doubt be a staple at future Python conferences.

Some of these Mac addicts went to dinner together on the first day of the conference and started thinking, "Why don't we write an open-source equivalent to SubEthaEdit in Python?" The result is Fuse, which is partially working. The developers' mailing list stalled March 30, but a couple people checked in April 11 to say they're still working on it.

Plans for Next Year

Success has its drawbacks, and the biggest one is that we may outgrow GWU next year if attendance approaches 500. The organizers are shopping for a venue that can handle a thousand people, so we won't have to move again after a couple years. We may be too late for next year because conferences normally book 18 months in advance. If we do stay at GWU one more year, we'll have to cap attendance at its capacity. Priorities for a new location include:

  • proximity to public transportation, low-cost accommodations and a city to hang out in

  • facilities that include catering, network equipment and the like at a cost comparable to GWU

  • accessibility for off-hours activities, like BOFs

  • a place to hold the sprints before or after the conference

There are arguments both for and against remaining in DC. "lac" writes in the wiki that we can't assume that the number who attend because it's in DC is greater than the number who don't attend because it's in DC. We just don't know. The West Coast has OSCON, but OSCON attracts a different clientele due to its higher price and different focus. There's a possibility of two regional conferences down the road, PyCon East and PyCon West. Is that better for the community or worse? Europe already is hosting its own conferences (EuroPython and Python UK), and maybe Asia will too.

Steve Holden has said he's willing to chair one more conference but he wants to retire after that. He's hoping to train a vice-chair next year who can take over the reins the following year. PyCon cannot remain dependent on one person or it won't succeed long term.

Steve's greatest regret is that 50-100 people marked the volunteer box on the registration, but there was no mechanism in place to alert them when they were needed, so that manpower was lost.

There's discussion about whether to have the sprints before or after the conference, or both. The argument for after is the talks will draw new people to the same-topic sprints, and the sprinters already will know one another and maybe have decided their tasks so they can start running. The argument for before is that people will be burned out after the conference.

Open Space definitely needs better organization. Open Space, Lightning Talks and BOFs need to be planned into the schedule. Perhaps the first day's topics should be set before the conference to avoid losing that day. Also, what's the difference between an Open Space and a BOF anyway?

The brightest note is that Steve writes, "It's amazing how many people say that PyCon is the best conference they have [ever] attended, even people who are quite experienced conference-goers."

Oh and Steve, if you do step down from the chairman role, you'll still be there with your British understatement zingers, won't you?

Resources

Pythoninfo Wiki

PyConDC2004 Aftermath

SubEthaEdit Session Notes

PyCon DC 2004 Home Page

Python Home Page

Python Software Foundation

What's New in Python 2.4

Python 2.4 Release Schedule

Bruce Eckel's Weblog (the four entries for March 2004 are relevant to his keynote address)

Electronic Frontier Foundation: a watchdog group for civil liberties in cyberspace (non-Python)

PyCon DC 2003

Import This: the Tenth International Python Conference

It Fits Your Brain: the Ninth Annual International Python Conference

Chandler: a personal information manager (PIM).

divmod.com: a Twisted-powered commercial Web mail site, whose software is available open source at divmod.org.

Fuse: a Python replacement for SubEthaEdit (developers' mailing list).

Guido van Robot: a small programming language for teaching basic programming concepts. GvR was written by high school students. Not to be confused with Guido van Rossum (also GvR), the inventor of Python.

MoinMoin: a wiki wiki engine.

Nevow: a web application framework, successor to Twisted Woven.

Plone: a content management system (CMS) for Zope.

Python: the last programming language you'll have to learn.

Twisted: an asynchronous framework for all types of Internet applications.

Zope: a Web application universe, the Emacs of Web applications.

Upcoming Python Conferences

Python UK (April 16-17; Oxford, England)

EuroPython (June 7-9; Göteborg, Sweden)

OSCON (July 26-30; Portland, Oregon, USA)

Northwest Python Sprint (date TBD; Seattle, Washington, USA)

Mike has been a Python enthusiast since the mid 1990s. He worked for SSC from 1998 to 2003 doing Web application development and sysadmin stuff. Now he's working for a medical e-commerce site. He firmly believes the third article of the Zen of Python: "Simple is better than complex".
