Busting Spam with Bogofilter, Procmail and Mutt, Revisited

Adjusting the macros to reflect bogofilter's reversed switches.

In November 2002, I wrote an article called “Busting Spam with Bogofilter, Procmail and Mutt”. It contained a particularly handy configuration that took much of the work out of marking mail as spam or non-spam in bogofilter. The basic premise is that you configure your mailer to mark all mail you reply to or save as non-spam, because most people save or reply to only the good stuff.

The article gained quite a lot of Google juice and has become a number-one hit for several common search queries on the topic. This would be good news, except for one little detail listed in the bogofilter changelog:



----8<----
0.11.0   2003-03-03
        * Separated message registration options from unregistration
          options.  '-S' and '-N' have been changed and now just do
          unregistration.  To move a message from one wordlist to the
          other, use '-S -n' or '-N -s' (as appropriate)
----8<----


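This note means Eric Raymond changed bogofilter's command-line switches so that the uppercase ones now have nearly the opposite effect. Where -S used to move a message onto the spam list, it now only removes the message from that list, and -N does the same for the non-spam list. What was -S is now -Ns, and what was -N is now -Sn.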
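The mapping can be written down as a small shell function. This is only a sketch: `translate_flag` is a hypothetical helper name, not part of bogofilter, and it covers only the two switches named in the changelog entry.

```shell
#!/bin/sh
# Hypothetical helper: translate a pre-0.11.0 registration switch into
# the 0.11.0-or-later equivalent described in the changelog entry.
translate_flag() {
    case "$1" in
        -S) printf '%s\n' "-Ns" ;;  # old "move to spam": unregister ham, register spam
        -N) printf '%s\n' "-Sn" ;;  # old "move to non-spam": unregister spam, register ham
        *)  printf '%s\n' "$1"  ;;  # any other switch is passed through unchanged
    esac
}

translate_flag -S   # prints -Ns
translate_flag -N   # prints -Sn
```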

This means that people who stumble across my November 2002 article with a March 2003 or later version of bogofilter will find it a rather frustrating experience. The macros listed at the beginning of the article now mark saved mails as spam, and the X key deletes a piece of mail as non-spam.

This disastrous change violates the Rule of Least Surprise espoused in Raymond's recent publication, The Art of UNIX Programming. The section on the Rule of Least Surprise quotes Henry Spencer warning against programs that appear to do things in a familiar fashion when they actually do something very different. The bogofilter command accepts the same command-line switches now as it did in November 2002, but on March 3, 2003, it suddenly became opposite day.

Admittedly, bogofilter has not yet reached version 1.0, so a few changes here and there are to be expected. Regardless, this stealthy little switch almost certainly has caused people to miss legitimate mail that was mis-filed as spam.
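Before pasting either set of macros, it may help to check which side of the 0.11.0 change an installed bogofilter falls on. The function below is a minimal sketch. `uses_new_switches` is a hypothetical name, it assumes a purely numeric dotted version string, and how that string is obtained is left out (`bogofilter -V` should report it, but its exact output format is an assumption to verify locally).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the dotted version in $1 is 0.11.0 or
# later, that is, if the separated registration switches apply.
uses_new_switches() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split "0.11.0" into positional parameters 0 11 0
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    [ "$major" -gt 0 ] || [ "$minor" -ge 11 ]
}

for v in 0.10.3 0.11.0; do
    if uses_new_switches "$v"; then
        echo "$v: use -Sn and -Ns"
    else
        echo "$v: the original -N and -S macros apply"
    fi
done
```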

I added a second refinement to the bogofilter macros after my wife complained she no longer could tag a collection of messages to be saved all at once. The problem occurs because mutt macros do not propagate the state of the tag prefix (usually the semicolon key in a standard configuration) down to the macro's component commands. Thus, the tagged messages were filed in bogofilter, but only the message at the cursor was saved.

As a workaround, the following macros insert the tag prefix into their index versions. This means if you have not tagged any messages, these macros beep at you before they operate on the current message. Also, if you tag messages but try to save an individual message, they save all the tagged messages. This is not ideal, but the mutt developers have shown no interest in providing better tag prefix hooks in mutt macros.



----8<----
macro index s "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -MSn\n<enter-command>set wait_key\n<tag-prefix><save-entry>"
macro pager s "<enter-command>unset wait_key\n<pipe-entry>bogofilter -MSn\n<enter-command>set wait_key\n<save-entry>"

macro index r "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<tag-prefix><reply>"
macro pager r "<enter-command>unset wait_key\n<pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<reply>"

macro index g "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<tag-prefix><group-reply>"
macro pager g "<enter-command>unset wait_key\n<pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<group-reply>"

macro index l "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<tag-prefix><list-reply>"
macro pager l "<enter-command>unset wait_key\n<pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<list-reply>"

macro index X "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -MNs\n<enter-command>set wait_key\n<tag-prefix><delete-message>"
macro pager X "<enter-command>unset wait_key\n<pipe-entry>bogofilter -MNs\n<enter-command>set wait_key\n<delete-message>"
----8<----
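All ten macros follow one pattern, so they can also be generated rather than typed. The sketch below assumes that approach is acceptable for your setup. `gen_macros` is a hypothetical helper, and its three arguments (key, bogofilter switches, mutt function) are names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical generator for the index/pager macro pairs listed above.
# $1 = key, $2 = bogofilter switches, $3 = mutt function to run afterward
gen_macros() {
    pre='<enter-command>unset wait_key\n'      # literal backslash-n, as mutt expects
    post='\n<enter-command>set wait_key\n'
    # Index version: apply both the pipe and the final function to tagged messages.
    printf 'macro index %s "%s<tag-prefix><pipe-entry>bogofilter %s%s<tag-prefix><%s>"\n' \
        "$1" "$pre" "$2" "$post" "$3"
    # Pager version: operate on the message being viewed.
    printf 'macro pager %s "%s<pipe-entry>bogofilter %s%s<%s>"\n' \
        "$1" "$pre" "$2" "$post" "$3"
}

gen_macros s -MSn save-entry
gen_macros X -MNs delete-message
```

Redirecting the output to a file and loading that file from your mutt configuration would give the same macros as the hand-written listing.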


Fortunately, the section on procmail is still correct, and should cause no problems on a lightly loaded system. If you find that lots of bogofilter processes are weighing down your system, you may wish to change the first stanza to use a lockfile, like so:



----8<----
:0fw: bogofilter.lock
| bogofilter -u -e -p
----8<----


The other two stanzas in my original configuration can stay as they are.

______________________

Comments

Filtering in the background?

eyolf

Thanks for your two articles. I use your macros and they work excellently.
One question, though: If I tag-save many messages, it takes quite some time to get the screen back again. Would it be possible to send the bogofiltering part to the background? This would also require that choosing a folder to move to in mutt would come before and not after the filtering. This would have two major advantages:

1. if I accidentally press the "save" key instead of some other key, I may end up marking a lot of messages as spam or ham which shouldn't be marked. A changed order would give me one extra chance to abort.
2. I wouldn't have to sit and stare at the screen while Mr. Bayes-Bogo works before I can tell Mr. Mutt where to put the mails.

I may have a look at these macros myself some day, but if someone knows off the top of their head how to do this, it would save me some head-scratching.

Just a question!

Anonymous

Hi,

I've got just one question: how do you delete e-mails that bogofilter has marked as spam and filed in your zztrash folder?

In that case, is it better to use the normal delete command (the d key) in mutt, or is it better to hit the X key to reinforce bogofilter's detection?

Re: Busting Spam with Bogofilter, Procmail and Mutt, Revisited

akosmin

Please fix the HTML so viewers don't have to scroll left and right just to read the article.

Re: Busting Spam with Bogofilter, Procmail and Mutt, Revisited

Anonymous

That's caused by the long lines in the macros. Either they get split across lines, or the misformatting stays as it is.

Re: Busting Spam with Bogofilter, Procmail and Mutt, Revisited

Anonymous

Same here, it looks really broken and annoying (Mozilla 1.2.1 as packaged in Red Hat 9).
