Getting Rid of Spam

Here's a way to filter unwanted mail using procmail.

If you regularly receive a high volume of mail (e.g., you subscribe to several mailing lists) or are the target of unsolicited bulk e-mail (UBE), mail filtering may have crossed your mind. procmail is a flexible tool that allows you to process incoming e-mail and perform user-definable actions, such as filtering, prioritizing or informing you of new mail. procmail is part of a larger toolset that also includes formail, a program that can handle tasks such as recognizing duplicate messages, digest bursting, header extraction/addition and the generation of auto-replies.

Setting It Up

If you do not have procmail on your system already, you can pick up the latest copy at ftp://ftp.informatik.rwth-aachen.de/pub/packages/procmail/procmail.tar.gz. The current version as of this writing is 3.11pre7. (Don't let the “pre” scare you; this version is very stable.) The package compiles out-of-the-box for Linux, but you may want to change a few compilation parameters (like installing into /usr/local instead of /usr). Read the INSTALL file included in the archive to get the complete installation process.

Once you have compiled and installed procmail, you can start using it. Since it is designed to process incoming mail, you need to modify your ~/.forward file (or create one if you do not have one) to include the following line that will automatically invoke procmail:

"|IFS=' '&&/usr/local/bin/procmail -f-||exit 75 #

Include all quotes and do change YOUR_LOGIN_NAME. The reason for doing this is described in the FAQ that is bundled with the archive. Next, you will need to create a recipe file that procmail will use to filter your mail.

procmail Usage

When executed, one of the things procmail does is look for a file called .procmailrc in your home directory. This file holds all of the commands, called recipes, that tell procmail what to do with the incoming mail. A recipe has three parts: the start (:0, followed by optional flags and local lockfile), the optional conditions and the action to be taken if the conditions evaluate out to true. Each part of a recipe is on a separate line.

The flags tell procmail which part of the message to look at (headers, body or both) and whether the recipe is a delivering (terminating) or non-delivering recipe. The local lockfile ensures that concurrently running procmail processes do not interfere with each other when writing out to a mailbox and should be used only for delivering recipes. The optional conditions start with an * (asterisk). There can be more than one condition or no conditions as you feel necessary. Everything proceeding the * is passed to an internal regular expression (regex) engine, which is compatible with egrep. All conditions are logically ANDed together. If you choose to have no conditions, procmail defaults to a true result (which is what you would expect). The action line tells procmail what action to take if the all the conditions match. The action line can start with a ! (to forward the message), a | (to pass the message to a program), or a { (to start a nested block). Anything else is taken to be a mailbox name.

Here is an example of a recipe that will filter all mail coming from the Whitehouse into a mailbox of the same name:

:0:
  * ^From: .*whitehouse\.gov
  whitehouse

Let's analyze this one line at a time. The first line tells procmail that we have started a recipe (:0) and that we want procmail to determine the local lockfile (the second :). The next line is the condition that procmail must match in order for this recipe to be true. By default, procmail scans only the headers, which is what we want. The last line is the action, which tells procmail to write the message to a file called whitehouse.

For contrast, here is a non-delivering recipe:

:0 f
  * ^X-Face:
  | formail "I X-Face"

This one uses formail to strip out an unwanted X-Face header. Notice the lack of a local lockfile. Since this is a filter, a lockfile is unnecessary. procmail will still work if you place one there, but it will complain.

Unsolicited Mail

At the beginning of this article, I mentioned that procmail is useful to filter unwanted mail. UBE (or spam as it is more commonly known) has become an annoying trend and a nuisance. The volume of spam is believed to have increased substantially on Usenet, where people excessively post the same message to various newsgroups. Usenet spam is perceived to be “cancellable”, meaning a posted message can be deleted by the moderator before being read by too many people. To get around this type of cancellation, someone got the idea to send the message to you directly rather than posting it to Usenet where it might be deleted before you read it. Hence, UBE started to infiltrate users' in-boxes. Pioneers of this form of marketing quickly found out that many users disliked spam in any form, and often found their own mailboxes full of flames. They started to obscure headers to make it hard to find out where the message really came from.

Why is spam considered the bane of the Internet at large? Unlike the junk mail you receive in your postal box at home, spammers rarely pay as much to send the spam as the recipient does to receive it. The fact that they pay less than you is called “cost-shifting”. Another form of this shifting is the use of third-party computer resources by the spammer for sending their bulk e-mail without permission. By doing so, the spammer is costing that innocent company both time and money spent to clean up after the spammer. Another tactic widely used is the munging of the headers in such a way that uneducated recipients may waste the time of innocent third parties who had nothing to do with the spam in the first place. This type of deceitfulness can also be considered cost-shifting and has been ruled illegal in the U.S. Courts.

Cost-shifting is not the only argument against spam. There is no single removal point as each spammer generally runs their own list. To that end, they are not required to honor a removal list. As more and more people send spam, you will never be able to remove your address from every single list. Why should you have to if you didn't ask for it in the first place? Finally, a great deal of the spam is or could be considered illegal, such as pyramid schemes, multi-level marketing schemes and lotteries.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState