HEC Montréal: Follow-up on the Large-Scale Mail Installation
To produce useful statistics, two tools were used: Spamity and pflogsumm. The former is a complete solution for extracting information from the log files of a mail infrastructure based on Postfix and AMaViS. Spamity extracts all the relevant information and stores it in a database. A Web front end lets users log in to see the mail rejected by the filtering policies. This makes Spamity a valuable tool for examining spam and virus trends, so the infrastructure can be tuned over time to limit the delivery of UBE (unsolicited bulk e-mail) and viruses. Spamity gathers the information related to rejected messages and classifies it according to the following policies:
RBL: Message rejected by a real-time blackhole list.
RHSBL Client: Message rejected by a right-hand side block list.
Header Date: Message has a date from the distant past or future.
Header Subject: Message rejected by suspicious subject.
Header X-Mailer: Message rejected by suspicious mail user agent.
Header Content-Disposition: Message rejected by suspicious attachment.
Header Content-Type: Message rejected by suspicious attached file. The filter method specifies the file extension.
Body: Message rejected by suspicious body content.
Access Username: Message rejected by access username.
Virus: Message rejected by AMaViS together with the anti-virus solution used.
Spam: Message rejected by AMaViS together with SpamAssassin.
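The classification idea can be illustrated with a short sketch. This is not Spamity's code. It is a minimal Python example that assumes simplified Postfix-style and AMaViS-style log lines and a handful of hypothetical regular expressions. The sample lines, host names and patterns are all made up for illustration, and real log formats vary with software version and configuration.

```python
import re
from collections import Counter

# Assumed, simplified patterns (hypothetical, not taken from Spamity).
# The first matching rule wins.
RULES = [
    ("RBL", re.compile(r"reject: RCPT .*blocked using")),
    ("Header", re.compile(r"reject: header (?P<name>[\w-]+):")),
    ("Body", re.compile(r"reject: body ")),
    ("Virus", re.compile(r"Blocked INFECTED")),
    ("Spam", re.compile(r"Blocked SPAM")),
]


def classify(line):
    """Return a policy label for a log line, or None if no rule matches."""
    for policy, pattern in RULES:
        match = pattern.search(line)
        if match:
            if policy == "Header":
                # Keep the header name so Subject, X-Mailer, etc. stay distinct.
                return f"Header {match.group('name')}"
            return policy
    return None


def count_policies(lines):
    """Count rejected messages per policy label."""
    counts = Counter()
    for line in lines:
        policy = classify(line)
        if policy is not None:
            counts[policy] += 1
    return counts


# Made-up sample lines in an assumed, simplified format.
SAMPLE_LINES = [
    "Mar 21 10:15:01 mx1 postfix/smtpd[1234]: NOQUEUE: reject: RCPT from "
    "unknown[192.0.2.1]: 554 Service unavailable; Client host [192.0.2.1] "
    "blocked using rbl.example.org; from=<a@example.com> to=<b@example.org>",
    "Mar 21 10:15:02 mx1 postfix/cleanup[1235]: 3F2A1B: reject: header "
    "Subject: Free offer from unknown[192.0.2.2]; from=<c@example.com>",
    "Mar 21 10:15:03 mx1 postfix/cleanup[1235]: 3F2A1C: reject: header "
    "Content-Disposition: attachment; filename=doc.pif from unknown[192.0.2.3]",
    "Mar 21 10:15:04 mx1 postfix/cleanup[1236]: 4C5D6E: reject: body "
    "Click here to unsubscribe from unknown[192.0.2.4]",
    "Mar 21 10:15:05 mx1 amavis[2001]: (02001-01) Blocked INFECTED (Example.Worm)",
    "Mar 21 10:15:06 mx1 amavis[2002]: (02002-01) Blocked SPAM",
    "Mar 21 10:15:07 mx1 postfix/qmgr[900]: 5E6F70: removed",
]

for policy, total in sorted(count_policies(SAMPLE_LINES).items()):
    print(f"{policy}: {total}")
```

Keeping the rules in an ordered list means a new policy can be added as one more pattern, without touching the counting logic.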
pflogsumm, for its part, is a useful tool for providing a quick overview of Postfix activity. It allows an administrator to identify potential problems in a Postfix installation quickly. Among other things, pflogsumm reports:
Total number of received, delivered, forwarded, deferred, bounced and rejected messages
Per-day and per-hour message traffic and connection summaries
Various other summaries (warnings, fatal errors, panics) and more.
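To give a feel for this kind of summary, here is a minimal Python sketch that tallies message outcomes and per-hour activity. It is not how pflogsumm itself works (pflogsumm is a separate, far more complete script). The sample lines are made up, and the sketch assumes syslog-style timestamps and the `status=` and `reject:` markers shown in them.

```python
import re
from collections import Counter

# Assumed markers in simplified Postfix-style log lines.
STATUS_RE = re.compile(r"status=(sent|deferred|bounced)\b")
TIME_RE = re.compile(r"^\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2}\s")
LABELS = {"sent": "delivered", "deferred": "deferred", "bounced": "bounced"}


def summarize(lines):
    """Return (totals per outcome, events per hour) for Postfix log lines."""
    totals = Counter()
    per_hour = Counter()
    for line in lines:
        if "postfix/" not in line:
            continue
        status = STATUS_RE.search(line)
        if status:
            event = LABELS[status.group(1)]
        elif "reject:" in line:
            event = "rejected"
        else:
            continue  # Not a line this sketch knows how to count.
        totals[event] += 1
        stamp = TIME_RE.match(line)
        if stamp:
            per_hour[int(stamp.group(1))] += 1
    return totals, per_hour


# Made-up sample lines in an assumed, simplified format.
SAMPLE_LOG = [
    "Mar 21 09:59:58 mx1 postfix/smtp[1500]: 1A2B3C: to=<a@example.org>, "
    "relay=mail.example.org[192.0.2.10], delay=1, status=sent (250 Ok)",
    "Mar 21 10:02:11 mx1 postfix/smtp[1501]: 2B3C4D: to=<b@example.org>, "
    "relay=none, delay=30, status=deferred (connection timed out)",
    "Mar 21 10:05:42 mx1 postfix/smtpd[1502]: NOQUEUE: reject: RCPT from "
    "unknown[192.0.2.1]: 554 Service unavailable",
    "Mar 21 10:07:00 mx1 postfix/local[1503]: 3C4D5E: to=<c@example.org>, "
    "relay=local, delay=0, status=bounced (unknown user)",
    "Mar 21 10:08:00 mx1 postfix/qmgr[900]: 3C4D5E: removed",
]

totals, per_hour = summarize(SAMPLE_LOG)
for event, count in sorted(totals.items()):
    print(f"{event}: {count}")
for hour, count in sorted(per_hour.items()):
    print(f"{hour:02d}:00 {count}")
```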
Using those two tools and some custom Perl scripts, we produced the different figures found in this article.
Figure 2 shows the weekly total number of messages considered to be UBE or to contain viruses that have been blocked since the beginning of 2004. The figure also shows the efficiency of each rule.
As shown in Figure 2, the RBL policy is definitely the most effective one, followed by content analysis using SpamAssassin and message Subject header analysis. Note also that the virus policy numbers are not as high as might be expected. This is easily explained: the detection of viruses often is moved from AMaViS to Postfix's header checks (Content-Disposition, for example). This requires considerably fewer system resources, because for each received message we avoid both detailed analysis in SpamAssassin and a process fork for virus scanning using NAI VirusScan. The network analysts made such modifications after the week of 01-25, in response to the MyDoom e-mail worm.
Furthermore, Figure 3 shows the usage of services offered by the mailstore, during the busiest week of the first three months (March 21-27).
As shown in Figure 3, POP3 is the most heavily used service, followed by IMAP and the Web mail system, which also uses IMAP but is shown separately in the figure. During this week, peaks of 52 POP3 and 338 IMAP concurrent connections were observed, coming from a total of 11,000 different users. The mailstore also is responsible for delivering messages to the users' mailboxes using the Local Mail Transfer Protocol (LMTP). Peaks of 75 concurrent delivery processes often were seen.
Finally, Figure 4 shows the amount of mail exchanged using the four SMTP servers for the entire month of March 2004.
As shown in Figure 4, 40 to 60% (55,000 messages per day, on average) of all received mail was rejected by various UBE and virus filtering techniques. This number actually is down from 80% in December 2003. At that time, HEC Montréal was receiving more than 125,000 spam messages per day. Currently, the average number of messages sent per day is 57,000, while the average number of messages received (from external servers) and delivered per day is 35,000.
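The relationship between these daily averages can be checked with a little arithmetic. The Python sketch below rests on two assumptions that the article does not state outright: that the 35,000 received-and-delivered messages and the 55,000 rejected messages are disjoint and together make up all inbound mail, and that the 125,000 daily spam messages of December 2003 correspond to the 80% rejected share.

```python
def rejection_rate(rejected, accepted):
    """Share of inbound mail rejected, assuming the two counts are disjoint."""
    total = rejected + accepted
    if total == 0:
        raise ValueError("no inbound mail")
    return rejected / total


def implied_total(rejected, rate):
    """Total inbound mail implied by a rejected count and a rejection rate."""
    return rejected / rate


# March 2004 daily averages quoted in the article.
march_rate = rejection_rate(rejected=55_000, accepted=35_000)

# December 2003: 125,000 rejected per day at an assumed 80% rejection rate.
december_total = implied_total(rejected=125_000, rate=0.80)

print(f"March 2004 rejection rate: {march_rate:.1%}")
print(f"December 2003 implied inbound mail per day: {december_total:,.0f}")
```

Under these assumptions, the March averages land near the upper end of the 40 to 60% range, and the December figures imply roughly 156,000 inbound messages per day.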
As the figures show, the mail infrastructure is a key, heavily used component at HEC Montréal. Overall, it has been very fast and stable since it was deployed. The network analysts have made only minor updates, mainly to keep up with new e-mail worms.