HEC Montréal: Deployment of a Large-Scale Mail Installation
After putting the 11 servers for the new infrastructure in place, one of the remaining challenges was to migrate all users from the old infrastructure to the new one. About 35,500 users, 82,500 mailboxes and hundreds of thousands of messages (35GB of mail) had to be migrated. Furthermore, redirection scripts and vacation messages also had to be converted, and information such as preferences from the previous Webmail system had to be kept intact. In order to do this, we created a set of Perl scripts to take care of the entire migration in a way that would appear seamless for the users:
LDAP Init: populates the new LDAP server (based on OpenLDAP) using the values from the previous LDAP server (based on Netscape iPlanet). Included attributes are e-mail addresses and aliases, special folders and signature preferences for Webmail.
Create Users: creates all user accounts about to be migrated.
Load Sieve: creates Sieve scripts and uploads them to the mailstore by reading attributes from the previous LDAP server. Sieve scripts are used for automatic redirections and vacation messages.
Copy Mailboxes: copies all mailboxes for the users being migrated. All message flags are kept intact. The IMAP protocol is used a lot in this script. This script also updates the mailHost attribute on both LDAP servers so the mails are routed to the correct destination mailboxes.
Update Mailboxes: run the morning after the migration to move the remaining (if any) messages in the users' mailboxes. Mail could have been stuck in the queue of the SMTP servers, before the users' mailHost attributes were changed.
To minimize service interruptions for the users, we ran the scripts in the order listed once classes were finished at the end of the day. Few messages were rejected during the import process; those that were simply were retried by the source SMTP servers. In total, four nights were required to migrate all the information. Running the scripts took from four to seven hours, depending on the number of users located on each source server and the execution speed, which was mainly limited by the performance of the old AIX servers.
After the migration, we extensively monitored all services in order to discover any problems. As expected, we didn't have many. We mainly tuned the minimum preforks of Cyrus processes as well as their respective maximum children. We also tuned the SMTP servers for the default process limits and preforks for AMaViS. We also used temporary LDAP queries during the migration, so we had to replace them with optimized ones once the migration finished.
During a typical day, HEC Montréal receives over 125,000 e-mails, and 60% to 80% of the traffic is composed of UBEs. The internal SMTP servers also manage thousands of messages sent by users, distribution lists or other systems. About 300,000 POP3 connections (from 5,500 different users) and 60,000 IMAP connections (from 5,000 different users) are initiated every day on the main Cyrus server. Peaks of 225 concurrent IMAP connections and 50 concurrent POP3 connections frequently are encountered.
As mentioned earlier, the anti-UBE policies in place have proven to be effective. During the first week after the migration, the two mail exchangers blocked more than 600,000 unsolicited bulk e-mails. The week after, spammers were less aggressive and the systems blocked over a quarter of a million messages. The most effective policy is the RBL checks, followed by the content filtering checks (using SpamAssassin and Vipul's Razor) and, finally, the header and MIME header checks.
To extract those statistics, we installed Spamity, which parses mail logs from the four Postfix servers and updates a PostgreSQL database running on the test server. Thereafter, users or administrators can examine the mail that was blocked by anti-UBE policies by using a simple Web browser. Users also can perform searches for specific e-mail addresses or domain names and filter the results by anti-UBE policies.
As you have seen in this article, migrating from a proprietary solution to an open-source solution was a challenge. According to Emmanuel Vigne, Information Systems Director at HEC Montréal:
The key business benefits are huge, as we nearly eliminated UBE and greatly enhanced the architecture of our mail infrastructure. We moved from an architecture where all services were offered by four servers to an architecture where the services are offered by many servers. That allows us to minimize any potential outage and scale as the number of users grow. In case of a failure, only one specific service is affected, contrary to the situation before where thousands of users could no longer use the e-mail service in case of a single server failure.
Putting this new infrastructure in place allowed us to contribute to the Open Source community by developing a set of patches to correct bugs and/or add features to most components we installed.
As with any other system, this one will evolve over time. Interesting anti-UBE technologies are emerging, such as Sender Policy Framework (SPF) [see page 50] and Spamhaus Exploits Block List (XBL), and a new stable version of Cyrus is available with NNTP and mailbox annotations support. In addition, Postfix 2.1 is coming along nicely and should offer excellent connection/rate control with its new anvil server.
Finally, as this article was being written, a mirroring solution was being deployed for the SAN. This should offer storage redundancy and eliminate the single potential point of failure in the current infrastructure.
Resources for this article: /article/7456.
Ludovic Marcotte (email@example.com) holds a Bachelor's degree in Computer Science from the University of Montréal. He currently is a software architect for Inverse, Inc., an IT consulting company located in downtown Montréal.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- SUSE LLC's SUSE Manager
- My +1 Sword of Productivity
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- Managing Linux Using Puppet
- Non-Linux FOSS: Caffeine!
- Doing for User Space What We Did for Kernel Space
- SuperTuxKart 0.9.2 Released
- Google's SwiftShader Released
- Parsing an RSS News Feed with a Bash Script
- SourceClear Open
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide