Mailman, the GNU Mailing List Manager

You don't have to wait until pigs fly for good list management: just call the mailman.

The Message Pipeline

I've spent a lot of time improving the common path a mail message takes through the system. The biggest change has been to design a message pipeline, where each component in the pipeline does a little piece of the work necessary to deliver a message. For example, there are separate components to scan the message for potential spam, calculate the recipients of the message, archive it, gate it to Usenet, and deliver the message to an SMTP (Simple Mail Transfer Protocol) daemon.

Each component in the pipeline is really a Python module conforming to a specific API: the module must contain a function called “process” which takes a message object and a mailing list object. When a message is received by Mailman, it runs through a list of these modules, handing the message object off for each to process. If the module raises a Python exception, processing is stopped. This is used when messages must be held for the list administrator's approval (e.g., a posting to a moderated list).
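To make the handler contract concrete, here is a minimal sketch of such a pipeline. It is not Mailman's actual code: plain objects with a process function stand in for real modules, the argument order simply follows the description above, and every name in it (HoldMessage, run_pipeline, the three handlers, the attributes on the list object) is hypothetical.

```python
from email.message import EmailMessage
from types import SimpleNamespace


class HoldMessage(Exception):
    """Hypothetical exception a handler raises to stop the pipeline."""


def calc_recipients(msg, mlist):
    # Record who should receive the message (stored on the list object
    # purely for illustration).
    mlist.recipients = sorted(mlist.members)


def moderate(msg, mlist):
    # Hold postings to a moderated list for the administrator's approval.
    if mlist.moderated:
        raise HoldMessage("posting to a moderated list")


def deliver(msg, mlist):
    # Stand-in for the hand-off to the SMTP daemon.
    mlist.delivered.append(msg["Subject"])


# Each entry stands in for a module that exposes a process() function.
PIPELINE = [
    SimpleNamespace(process=calc_recipients),
    SimpleNamespace(process=moderate),
    SimpleNamespace(process=deliver),
]


def run_pipeline(msg, mlist, pipeline=PIPELINE):
    """Run each handler in order; return False if one held the message."""
    for module in pipeline:
        try:
            module.process(msg, mlist)
        except HoldMessage:
            return False
    return True
```

Because the handlers share nothing but the two objects passed to them, adding a step only means adding another entry to the list.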

This message pipeline means that Mailman is easily configurable and extensible in the way it handles incoming and outgoing messages. For example, there is a project contributor who has implemented a MIME attachment scanner module which can be dropped into the pipeline. This module can strip attachments from the message, post the attachments to an external archive (either the file system or a WebDAV server) and then rewrite the outgoing message to include a URL to the attachment instead of the attachment text. This module could also be used simply to discard messages with certain types of attachments (e.g., if you hate HTML mail as much as I do, you could just bounce or discard any message that contains a text/html MIME type), strip certain attachment types (e.g., binary attachments just get discarded) or scan attachments for potential viruses.
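As a rough illustration of the simplest case mentioned above, a handler that discards HTML mail might look like the following sketch. It is not the contributor's module; it relies only on Python's standard email package, and the DiscardMessage exception is a hypothetical name.

```python
from email.message import EmailMessage


class DiscardMessage(Exception):
    """Hypothetical signal that the message should be dropped."""


def process(msg, mlist):
    # Walk every MIME part and reject the message if any part is HTML.
    # The mailing list object is unused in this simple handler.
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            raise DiscardMessage("text/html part found")
```

A plain-text message passes through untouched, while a multipart message with an HTML alternative raises the exception and stops the pipeline.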

Currently, there is only one system-wide message pipeline for all Mailman lists at a site, but the plan is eventually to give individual list administrators the opportunity to configure their lists with optional modules. One application of this would be to run a “patches” mailing list which would have an optional module to scan a message for a context or unified diff, and if found, inject the diff into an issue-tracking system.

This streamlining of the message-delivery path has vastly improved the performance of Mailman. We're running the latest CVS snapshot on python.org and easily handling about 30,000 individual recipient deliveries per day, with an average of about 0.01 second per message through the system (from Mailman receipt to SMTP daemon hand-off). Since Mailman itself now adds so little overhead, the lesson is to choose your MTA wisely: it will have the biggest impact on throughput.

Bounce Pipeline

A similar pipeline architecture has been designed for bounce detection. Believe it or not, there's actually a standard for bounced messages, called Delivery Status Notification (DSN), described in RFC 1894. The problem is, of course, that it's complex, and many MTA authors disagree with or ignore this standard. This makes bounce detection (like spam detection) a black art. Mailman 1.1 comes with a hairy mess of regular expressions used to scan bounced messages, which get delivered to a different address than regular postings. If Mailman actually detects a bounce, and can extract the offending e-mail address from the bounced message, it increments a counter for that address. Enough bounces, and the address is automatically disabled or removed.

The problem was that updating the regular expressions was nearly impossible, so for Mailman 1.2 we now have a pipeline, similar in architecture to the delivery pipeline, in which each module attempts to recognize just one style of bounce. We currently recognize RFC 1894/DSN, Postfix, Qmail and Yahoo! bounces, plus a few other weirdos. Of course, we still recognize all the old bounce formats Mailman 1.1 recognized, and it's fairly easy to add new matchers, assuming the bounced message can actually be scanned intelligently. I recently added an Smail bounce detector in about five minutes and 20 lines of Python code.
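The idea can be sketched in a few lines. The two matchers below are deliberately simplified illustrations, not Mailman's real detectors, and the function names and the MAX_BOUNCES threshold are assumptions made for the example.

```python
import re
from collections import Counter

# Each detector recognizes one bounce style and returns the failing
# addresses, or an empty list if the message is not in that style.


def detect_dsn(text):
    # RFC 1894 reports name the failed address in a Final-Recipient field.
    return re.findall(r"(?im)^Final-Recipient:\s*rfc822;\s*(\S+)", text)


def detect_qmail(text):
    if "This is the qmail-send program" not in text:
        return []
    # qmail lists each failed address on its own line as "<addr>:".
    return re.findall(r"(?m)^<([^>\s]+)>:\s*$", text)


DETECTORS = [detect_dsn, detect_qmail]
MAX_BOUNCES = 3  # assumed threshold, not Mailman's actual setting


def scan_bounce(text):
    """Return the addresses found by the first detector that matches."""
    for detect in DETECTORS:
        addresses = detect(text)
        if addresses:
            return addresses
    return []


def register_bounce(text, counts, disabled):
    """Count bounces per address and disable those reaching MAX_BOUNCES."""
    for address in scan_bounce(text):
        counts[address] += 1
        if counts[address] >= MAX_BOUNCES:
            disabled.add(address)
```

Supporting another MTA then means writing one more small function and appending it to DETECTORS, with counts passed in as a collections.Counter and disabled as a set.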

Two other major improvements planned for the 1.2 release are internationalization and user databases.

Internationalization

We've had a large number of requests for making Mailman multi-lingual. Two contributors from Spain, Juan Carlos Rey Anaya and Victoriano Giralt, with help from Mads Kiilerich from Denmark, have sent me patches to accomplish this. The technical approach centers around gettext, where strings to be translated are marked in a special way. The developers then run a tool over the source tree and create template files which can be handed over to translators. Once their language-specific translation files are placed in the proper directory, the application can use these to look up the text string in the specified language.

For Mailman, a site administrator can install any language file they want to make available to their list administrators. It would be up to the list administrators to enable various languages for their lists and to choose a default language. When individual users are interacting with Mailman, they can choose their preferred language from those available to the list. In this way, mailing lists can support multiple languages through both their web and e-mail interfaces. Of course, messages posted to the list aren't translated (although a pipeline module could be implemented to feed the text through Babelfish if you were so inclined).

GNU gettext provides all the necessary tools to create multilingual C programs, but we had to adapt them a bit to work with Python. As with C, we mark Python strings to be translated with a wrapper function call. For example, if you wanted to make this line of code translatable,

subject = "You have been subscribed"

you would modify the line to look like this:

subject = _("You have been subscribed")

Most of the work of making an application like Mailman multilingual involves marking translatable text.
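What the _() wrapper does at run time can be shown with a small sketch built on Python's standard gettext module. A real application would load compiled catalog files; here a hypothetical DictTranslations class backed by a dictionary stands in for one, and the Spanish text is only an illustrative translation.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog backed by a dict instead of a compiled catalog file."""

    def __init__(self, strings):
        super().__init__()
        self._strings = strings

    def gettext(self, message):
        # Fall back to the original string when no translation exists.
        return self._strings.get(message, message)


spanish = DictTranslations({"You have been subscribed": "Ha sido suscrito"})
_ = spanish.gettext

subject = _("You have been subscribed")
untranslated = _("Welcome to the list")
```

Strings missing from the catalog come back unchanged, so marking text is safe even before any translation exists.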

Python has a further complication: there are actually eight ways to define a “string”:

  • 'This is a Python string'

  • "This is a Python string"

  • '''This is a Python triple-quoted string'''

  • """This is a Python triple-quoted string"""

  • r'This is a Python raw string'

  • r"This is a Python raw string"

  • r'''This is a Python triple-quoted raw string'''

  • r"""This is a Python triple-quoted raw string"""

Briefly, the '' style and "" style strings are interchangeable, and useful when you don't want to escape one delimiter or the other. The first two string styles are limited to a single line. Triple-quoted strings allow you to embed newlines in the string, serving roughly the same purpose in Python as Perl's HERE documents. Raw strings have different rules for embedded backslashes and are used primarily for regular expressions.
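A few assertions, offered as a quick check rather than a full tour of the language, show how these spellings behave:

```python
single = 'This is a Python string'
double = "This is a Python string"
assert single == double  # the two quote styles build the same value

# Each quote style lets the other quote character appear unescaped.
assert "don't" == 'don\'t'

# Triple-quoted strings may span lines; the newline is part of the value.
multi = """first line
second line"""
assert multi == "first line\nsecond line"

# In a raw string a backslash is kept literally instead of starting an escape.
assert len("\n") == 1
assert len(r"\n") == 2
assert r"\d+" == "\\d+"
```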

GNU gettext comes with a tool called xgettext which scans your C files for translatable strings. Unfortunately, it doesn't understand Python's various string spellings, and while a few different approaches have been put forward, I favor allowing _() marking of any valid Python string. To accomplish this, I wrote a tool called pygettext.py which scans Python source code, looking for _() wrappers around any type of Python string. The output of pygettext.py is a standard gettext .pot file, so from that point on, the GNU tools can be used. pygettext.py will be a standard part of Python 1.6 and is available via the Python CVS tree at http://cvs.python.org/.
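The core of the extraction step can be sketched with Python's standard ast module. This is not how pygettext.py itself is written; it only illustrates the idea that any string spelling inside a _() call can be collected once the source is parsed. The function names are hypothetical, and the output omits the header a real .pot file carries.

```python
import ast


def extract_marked_strings(source):
    """Return (line number, string) pairs for every _("...") call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "_"
            and len(node.args) == 1
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)


def pot_entry(text):
    """Format one string as a simplified msgid/msgstr entry."""
    escaped = (
        text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'msgid "{escaped}"\nmsgstr ""\n'
```

Because the parser has already resolved quoting and raw-string rules, the extractor never needs to know which of the eight spellings the programmer used.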

I expect to begin integrating and testing the internationalization patches to Mailman sometime within the next few weeks. Keep an eye on the Mailman CVS tree for details.
