Busting Spam with Bogofilter, Procmail and Mutt, Revisited

Adjusting the macros to reflect bogofilter's reversed switches.

In November 2002, I wrote an article called “Busting Spam with Bogofilter, Procmail and Mutt”. It contained a particularly handy configuration that took much of the work out of marking mail as spam or non-spam in bogofilter. The basic premise is that you configure your mailer to mark all mail you reply to or save as non-spam, because most people save or reply to only the good stuff.

The article gained quite a lot of Google juice and has become a number-one hit for several common search queries on the topic. This would be good news, except for one little detail listed in the bogofilter changelog:



----8<----
0.11.0   2003-03-03
        * Separated message registration options from unregistration
          options.  '-S' and '-N' have been changed and now just do
          unregistration.  To move a message from one wordlist to the
          other, use '-S -n' or '-N -s' (as appropriate)
----8<----


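This note means Eric Raymond changed the meaning of bogofilter's -S and -N command-line switches, so the old macros now have the exact opposite effect. Each switch used to move a message from one wordlist to the other; now it only unregisters the message. What was -S is now -Ns, and what was -N is now -Sn.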
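
For anyone updating old macros by hand, the mapping can be sketched as a small shell function. This is only an illustration: new_switch is a hypothetical helper name, not part of bogofilter, and it covers just the two switches named in the changelog entry.

```shell
#!/bin/sh
# Hypothetical helper (not part of bogofilter): translate a pre-0.11.0
# "move" switch into its equivalent for 0.11.0 and later.
new_switch() {
    case "$1" in
        -S) printf '%s\n' '-Ns' ;;   # unregister as non-spam, register as spam
        -N) printf '%s\n' '-Sn' ;;   # unregister as spam, register as non-spam
        *)  printf '%s\n' "$1" ;;    # anything else is passed through unchanged
    esac
}
```

Calling new_switch -S prints -Ns, and new_switch -N prints -Sn; any other argument comes back untouched.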

This means that people who stumble across my November 2002 article with a March 2003 or later version of bogofilter will find it a rather frustrating experience. The macros listed at the beginning of the article now mark saved mails as spam, and the X key deletes a piece of mail as non-spam.
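
Before copying either set of macros, it may help to check which side of the change an installed bogofilter falls on. The following is a rough sketch, not a tested tool: has_new_switches is a hypothetical helper name, and it assumes a plain dotted version string such as the ones in the changelog.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given bogofilter
# version string is 0.11.0 or later, i.e. it uses the new switch meanings.
has_new_switches() {
    major=${1%%.*}            # text before the first dot
    rest=${1#*.}              # text after the first dot
    minor=${rest%%.*}         # text between the first and second dots
    major=${major%%[!0-9]*}   # keep only the leading digits
    minor=${minor%%[!0-9]*}
    [ "${major:-0}" -gt 0 ] || [ "${minor:-0}" -ge 11 ]
}

if has_new_switches "0.11.0"; then
    echo "use -Sn and -Ns"
else
    echo "use the switches from the November 2002 article"
fi
```

The version string itself has to come from the installed program. Running bogofilter -V should report it, but the exact output format is an assumption to verify locally before parsing it.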

This disastrous change violates the Rule of Least Surprise espoused in Raymond's recent publication, The Art of UNIX Programming. The section on the Rule of Least Surprise quotes Henry Spencer warning against programs that appear to do things in a familiar fashion when they actually do something very different. The bogofilter command accepts the same command-line switches now as it did in November 2002, but on March 3, 2003, it suddenly became opposite day.

Admittedly, bogofilter has not yet reached version 1.0, so a few changes here and there are to be expected. Regardless, this stealthy little switch almost certainly has caused people to miss legitimate mail that was mis-filed as spam.

I added a second refinement to the bogofilter macros after my wife complained she no longer could tag a collection of messages to be saved all at once. The problem occurs because mutt macros do not propagate the state of the tag prefix (the semicolon key in a standard configuration) down to the macro's component commands. Thus, the tagged messages were filed in bogofilter, but only the message at the cursor was saved.

As a workaround, the following macros insert the tag prefix into their index versions. This means if you have not tagged any messages, these macros beep at you before they operate on the current message. Also, if you tag messages but try to save an individual message, they save all the tagged messages. This is not ideal, but the mutt developers have shown no interest in providing better tag prefix hooks in mutt macros.



----8<----
macro index s "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -MSn\n<enter-command>set wait_key\n<tag-prefix><save-entry>"
macro pager s "<enter-command>unset wait_key\n<pipe-entry>bogofilter -MSn\n<enter-command>set wait_key\n<save-entry>"

macro index r "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<tag-prefix><reply>"
macro pager r "<enter-command>unset wait_key\n<pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<reply>"

macro index g "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<tag-prefix><group-reply>"
macro pager g "<enter-command>unset wait_key\n<pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<group-reply>"

macro index l "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<tag-prefix><list-reply>"
macro pager l "<enter-command>unset wait_key\n<pipe-entry>bogofilter -Mn\n<enter-command>set wait_key\n<list-reply>"

macro index X "<enter-command>unset wait_key\n<tag-prefix><pipe-entry>bogofilter -MNs\n<enter-command>set wait_key\n<tag-prefix><delete-message>"
macro pager X "<enter-command>unset wait_key\n<pipe-entry>bogofilter -MNs\n<enter-command>set wait_key\n<delete-message>"
----8<----


Fortunately, the section on procmail is still correct, and should cause no problems on a lightly loaded system. If you find that lots of bogofilter processes are weighing down your system, you may wish to change the first stanza to use a lockfile, like so:



----8<----
:0fw: bogofilter.lock
| bogofilter -u -e -p
----8<----


The other two stanzas in my original configuration can stay as they are.

______________________

Comments


Filtering in the background?

eyolf

Thanks for your two articles. I use your macros and they work excellently.
One question, though: If I tag-save many messages, it takes quite some time to get the screen back again. Would it be possible to send the bogofiltering part to the background? This would also require that choosing a folder to move to in mutt would come before and not after the filtering. This would have two major advantages:

1. if I accidentally press the "save" key instead of some other key, I may end up marking a lot of messages as spam or ham which shouldn't be marked. A changed order would give me one extra chance to abort.
2. I wouldn't have to sit and stare at the screen while Mr. Bayes-Bogo works before I can tell Mr Mutt where to put the mails.

I may have a look at these macros myself some day, but if someone knows off the top of their head how to do this, it would save me some head-scratching.

Just a question!

Anonymous

Hi,

I've got just a question... How do you delete e-mails marked as spam by bogofilter and classified in your zztrash folder?

In that case, is it better to use the normal delete command (d key) in mutt, or is it better to hit the X key to reinforce bogofilter detection?

Re: Busting Spam with Bogofilter, Procmail and Mutt, Revisited

akosmin

Please fix the HTML so viewers don't have to scroll left and right just to read the article.

Re: Busting Spam with Bogofilter, Procmail and Mutt, Revisited

Anonymous

That's the long lines in the macros. Either they are split across lines or the "misformat" will stay as it is.

Re: Busting Spam with Bogofilter, Procmail and Mutt, Revisited

Anonymous

Same here, it looks really broken and annoying (Mozilla 1.2.1 as packaged in Red Hat 9).
