Beating Spam and Viruses with amavisd-new and Maia Mailguard
With spam and e-mail worms on the rise, it's boom-time for the makers of antispam and antivirus solutions. New anti-spam laws in Europe and the US have done little to solve the problem, and this situation has sent many people shopping for technological solutions: spam and virus filters.
Scanning and filtering content at every desktop is expensive and impractical, however. Ideally, the spam and virus problem should be tackled as close to the source as possible, to shield everyone downstream. This strategy lets an organization focus its resources on one place, typically the mail gateway.
Server-based solutions rarely come cheap, however. Most of these products are licensed on a per-mailbox basis, whether as add-on software for mail servers or as standalone content-filtering appliances. These solutions can cost thousands of dollars and often require annual subscription fees for access to updated virus signatures and spam patterns.
In this article, we take a look at an open-source content-filtering solution, amavisd-new, and a powerful extension of this project called Maia Mailguard.
Conceptually, amavisd-new is a mail filter—it receives mail from your mail gateway, scans the mail for viruses and spam, quarantines, rejects or discards offending items, and relays the rest to another mail server downstream for delivery. In practice, amavisd-new often is sandwiched between two mail servers running on the same host, particularly at smaller sites where hosting the mail server and content filters on a single machine is practical. Larger sites may choose to install amavisd-new, SpamAssassin and virus scanners together on a separate content-filtering machine. Massive sites may want a load-balanced array of such machines.
amavisd-new was written in Perl, with security and reliability in mind, and works well on virtually all UNIX platforms. It is an RFC-compliant mail handler, designed never to lose any mail. To that end, amavisd-new does not accept responsibility for a mail item until the downstream mail server has done so. This means any errors that occur while filtering the mail do not cause the mail to be lost; it remains in the upstream mail server's queue. amavisd-new offers four types of filtering: virus/malware scanning, spam filtering, banning dangerous attachment types and invalid mail headers.
amavisd-new is not a virus scanner; rather it's a framework that calls one or more virus scanners. More than 30 popular virus scanners currently are supported, including proprietary products from such vendors as Sophos, Symantec and Network Associates, as well as the open-source Clam Antivirus.
Both command-line and dæmonized virus scanners are supported, though dæmonized scanners are much more efficient than their command-line cousins. If your mail server processes a lot of mail, you don't want to have to load a command-line scanner into memory for each mail item and unload it afterward. A virus scanner that runs as a dæmon gets loaded once and then stays in memory, making the process much faster.
If you have multiple virus scanners installed, you can arrange them in primary and secondary groups. The secondary group is consulted if none of the primary scanners is operational.
Spam filtering is handled by amavisd-new by integrating it with SpamAssassin. amavisd-new calls SpamAssassin once per mail item, no matter how many recipients there are, so mailing-list postings don't consume any more resources than does mail addressed to a single recipient.
SpamAssassin provides a broad-spectrum approach to spam filtering, including feature recognition, DNSBL and SPF lookups, collaborative reporting networks and Bayesian learning mechanisms. All of these tests contribute a numeric score to a total for each mail item, and each user can specify a threshold score for deciding whether an item is spam or ham. This is an effective combination, as the strengths of one method make up for the weaknesses of another.
Feature recognizers check the headers or the body of the e-mail looking for patterns that human beings have identified as markers of spam or ham (non-spam mail). The fact that the Date: header contains a time 12 hours in the future or that the mail contains an image but no text might qualify as spam symptoms, whereas a message containing more than a thousand words is more likely to be ham.
SpamAssassin also can check the IP address of the connecting mail server or client against a number of DNS-based block lists (DNSBLs) to determine whether that address is a known spam source. Unlike the traditional use of DNSBLs, however, SpamAssassin does not consider a listing to be damning by itself; it simply adds a value to the mail's total score. This is a much more flexible approach, one that lets you adjust the scores assigned to each DNSBL according to how much you trust that list and the policies of its maintainers. The upcoming SpamAssassin 3.0 also adds support for Sender Policy Framework (SPF) lookups, which try to verify that the connecting host has the authority to send mail for its domain.
Collaborative reporting networks, such as Vipul's Razor, Pyzor and the Distributed Checksum Clearinghouse (DCC) offer another kind of resource for SpamAssassin to consult. The idea is that because spam is broadcast to millions of recipients, by the time you receive your copy, a lot of other people have received more or less identical copies. If a lot of those people already have reported that particular mail as spam, your own spam filter should be able to use that fact in its own decision-making process.
Last, but certainly not least, SpamAssassin offers a Bayesian learning mechanism, which essentially is an automated feature recognizer. Although the manually designed feature recognizers listed above rely on human beings to point out features that indicate spam or ham, the Bayesian approach tries to pick out these features automatically, based on an analysis of the spam and ham you've received already.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- Stunnel Security for Oracle
- SourceClear Open
- SUSE LLC's SUSE Manager
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- My +1 Sword of Productivity
- Managing Linux Using Puppet
- Tech Tip: Really Simple HTTP Server with Python
- Non-Linux FOSS: Caffeine!
- Google's SwiftShader Released
- Doing for User Space What We Did for Kernel Space
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide