Sending Mail via the Web

Mr. Lerner continues his look at building a simple, integrated mail system that can be accessed using a web browser.
Preventing Spam

The problem with the above form is it truly allows anyone to send mail to any address on the Internet. Furthermore, it allows the sender to pretend to be any address on the Internet. This is precisely the sort of tool spammers love to exploit. If you were to put our original version of send-mail.pl on your site, you would eventually discover someone was using your server and bandwidth to send their spam.

There are several ways to prevent this. One is to refuse to send mail to users or domains outside of an approved list. For instance, we can define a hash in which the keys are approved e-mail addresses:

my %approved_recipient = ('reuven@lerner.co.il' => 1,
  'ljeditor@linuxjournal.com' => 1);

Using a hash allows us to check the status of any e-mail address in roughly constant time, regardless of the number of addresses. If we were to use an array instead, we might have to search through the entire array before we could be sure of an address's status, so the time needed for the test would grow in proportion to the number of elements in the array.

We can thus check to see if an address is approved by inserting the following code:

if (!$approved_recipient{$recipient})
{
    die "Unapproved address \"$recipient\": Mail" .
        " was not sent.\n";
}

A version of send-mail.pl with the above code can be found in Listing 2 in the archive file (ftp://ftp.linuxjournal.com/pub/lj/listings/issue62/3449.tgz).
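
To make the difference between the hash and array approaches concrete, here is a minimal, self-contained sketch. The subroutine names is_approved_hash and is_approved_array are hypothetical and do not appear in send-mail.pl, and the address spammer@example.com is only a placeholder for an unapproved recipient.

```perl
use strict;
use warnings;

# Approved addresses as hash keys, as in the text
my %approved_recipient = ('reuven@lerner.co.il'       => 1,
                          'ljeditor@linuxjournal.com' => 1);

# The same addresses as a plain list, for comparison
my @approved_list = keys %approved_recipient;

# Hash version: a single lookup, however many addresses there are
sub is_approved_hash
{
    my ($recipient) = @_;
    return exists $approved_recipient{$recipient} ? 1 : 0;
}

# Array version: may compare against every element before giving up
sub is_approved_array
{
    my ($recipient) = @_;
    foreach my $address (@approved_list)
    {
        return 1 if $address eq $recipient;
    }
    return 0;
}

foreach my $recipient ('reuven@lerner.co.il', 'spammer@example.com')
{
    printf "%s: hash=%d array=%d\n", $recipient,
        is_approved_hash($recipient), is_approved_array($recipient);
}
```

Both subroutines give the same answers; the only difference is how much work each does as the list of approved addresses grows.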

We can similarly allow mail to be sent to any address within a particular domain by putting all of the approved domains inside an array:

my @approved_domains = ('lerner.co.il',
       'linuxjournal.com');

We then create a variable, $match_found, which defaults to 0:

my $match_found = 0;

$match_found will be set to 1 only if one of the approved domains matches the domain in $recipient. We check this with a short loop:

foreach my $domain (@approved_domains)
{
    # \Q...\E keeps the dots in $domain literal; [\@.] requires the
    # match to begin at the "@" or at a subdomain boundary
    if ($recipient =~ m/[\@.]\Q$domain\E$/i)
    {
        $match_found = 1;
        last;
    }
}

We use last to break out of the loop as soon as we find a match, in order to save some time. If you know certain domains will receive mail more often than others, you should put them at the beginning of @approved_domains, since the earlier an item appears in that array, the sooner the match will be found.

We then send mail only if $match_found is true (i.e., non-zero). If $match_found is 0, we print an error message:

if ($match_found)
{
    # Send the mail, as in the earlier version
}
# If the domain was not approved
else
{
    die "Mail was not sent: The recipient's domain " .
        "is not approved.\n";
}

The version of send-mail.pl in Listing 3 in the archive has these additions.
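
The domain check can also be combined with the earlier hash technique. The following is only a sketch under stated assumptions, not the code from the listings: the subroutine name domain_is_approved and the hash %approved_domain are hypothetical, and the check accepts only an exact match on the part of the address after the final "@", so subdomains of an approved domain are rejected.

```perl
use strict;
use warnings;

# Approved domains as hash keys (the same domains as in the text)
my %approved_domain = ('lerner.co.il'     => 1,
                       'linuxjournal.com' => 1);

# Return 1 if the part of $recipient after the last "@" is an
# approved domain, and 0 otherwise
sub domain_is_approved
{
    my ($recipient) = @_;

    # Capture everything after the final "@"; no "@" means no domain
    my ($domain) = $recipient =~ m/\@([^\@]+)$/;
    return 0 unless defined $domain;

    # Domain names are case-insensitive, so compare in lowercase
    return exists $approved_domain{lc $domain} ? 1 : 0;
}

foreach my $recipient ('reuven@lerner.co.il',
                       'someone@notlerner.co.il',
                       'no-at-sign')
{
    printf "%s: %d\n", $recipient, domain_is_approved($recipient);
}
```

Because the whole domain is compared as a hash key, an address such as someone@notlerner.co.il is not mistaken for one at lerner.co.il, and the cost of the check does not depend on the number of approved domains.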

Checking for Errors

If we want our program to be robust, we must do more than check for security violations. We must check for input from the user that might not affect security, but might lead to bugs or other unpleasant surprises.

For instance, if we invoke send-mail.pl directly from a URL such as

http://www.lerner.co.il/cgi-bin/send-mail.pl

the program will report that the mail was sent with a blank sender, recipient and message. This is bad for two reasons. First, no mail was sent, since necessary headers were not assigned any values, so the program is providing us with incorrect information. Second, we should never get to the point where blank data from the user is accepted as input for mail.

We can prevent this situation by ensuring that send-mail.pl is always invoked with POST, and that $sender, $recipient and $message are non-blank. If any of them is the empty string, we exit from the program early, telling the user that each must have a non-blank value. Once again, die is better suited to debugging environments than to production code, simply because of the style of error message it produces. There is no reason you could not forward the user to an error page, or print a nicely designed page describing what was missing, rather than simply dying.
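
One way to structure these checks is sketched below. The subroutine check_request and its field names are hypothetical, and treating whitespace-only values as blank is an assumption that goes slightly beyond the empty-string test described above. In the real program, the method and field values would come from the CGI query object rather than being passed in by hand.

```perl
use strict;
use warnings;

# Return a list of problems with the request; an empty list means
# the request is acceptable
sub check_request
{
    my ($method, %field) = @_;
    my @errors;

    push @errors, "send-mail.pl must be invoked with POST"
        unless defined $method && $method eq 'POST';

    foreach my $name (qw(sender recipient message))
    {
        my $value = $field{$name};

        # Treat undefined, empty and whitespace-only values as blank
        push @errors, "The $name field must not be blank"
            unless defined $value && $value =~ m/\S/;
    }

    return @errors;
}

# Simulate invoking the program directly from a URL (GET, no fields)
my @errors = check_request('GET');
print "$_\n" foreach @errors;
```

Collecting every problem in a list, rather than dying at the first one, makes it straightforward to show the user a single page describing everything that was missing.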

Competing with Hotmail

Between send-mail.pl and read-mail.pl, we have created a small system to send and receive e-mail. Is this enough to compete with Hotmail by creating our own small web-based mail service? The short answer is “no”, although the longer answer is that it is probably enough to suit most purposes.

Part of the problem is these two programs are run using CGI. While CGI is portable across platforms and languages, it is inherently slow, requiring the web server to create a new process each time the program is invoked. While this is more than adequate for lightly loaded machines, it quickly becomes a performance drain as the number of hits increases.

Each HTTP server has developed its own native interface that allows you to attach your program to the server process. Since Apache is free software, several such interfaces have been developed for it, including mod_perl and mod_php. The former allows you to write CGI-like programs in Perl, attaching them to the server process. This means your functionality becomes a subroutine within the server program, rather than an external program that must be invoked separately. The speed difference between a program running under mod_perl and the same functionality in a CGI program is staggering and should convince just about any die-hard CGI user to switch to mod_perl.

A site wishing to compete with Hotmail would probably want to use mod_perl or a similar server-specific API in order to get the maximum performance out of its hardware.

Aside from performance, another issue is where the mail is to be stored. The programs we have discussed, read-mail.pl and send-mail.pl, expect the user's mail to be stored on a POP server elsewhere on the Internet. Hotmail and similar services have their own POP servers for incoming mail, as well as their own MTAs (usually sendmail, although other MTAs are apparently better for high-volume mail servers) running on their systems.

However, Hotmail will allow you to retrieve mail only from their own POP servers, while read-mail.pl will allow you to retrieve mail from any POP server, including one that would normally not have a web interface. Whether you restrict mail checking by users to your own system, a number of servers within your organization or anywhere else is up to you.

In addition, services such as Hotmail survive due to advertising, and one of the most popular ways to advertise is to add a short note to the bottom of each message indicating which web-based mail service was used to send it. We can easily do that by concatenating our own footer to the message the user sends with these instructions:

my $footer = "-\nBrought to you by ReuvenMail\n";
my $message = $query->param("message");
$message .= $footer;
my %mail = (To => $recipient, From => $sender,
  Message => $message);

Now everyone will know which mail service you were using when you sent mail from your web-based system. This functionality is included in the final version of the program, Listing 3 in the archive file.
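
One detail worth guarding against is a message that does not end with a newline, in which case the footer would be glued to the last line of the user's text. The sketch below wraps the concatenation in a hypothetical subroutine, add_footer, which is not part of the listings; it inserts the missing newline where necessary.

```perl
use strict;
use warnings;

my $footer = "-\nBrought to you by ReuvenMail\n";

# Return $message with the footer appended, making sure the footer
# starts on a line of its own
sub add_footer
{
    my ($message) = @_;
    $message = '' unless defined $message;

    # Add a newline only if the message is non-empty and lacks one
    $message .= "\n" unless $message eq '' || $message =~ m/\n\z/;

    return $message . $footer;
}

print add_footer("See you at the conference.");
```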

Finally, Hotmail has millions of members, which means it is relying on more than a single computer running Linux for mail delivery. Operating a single system for sending and receiving mail is not nearly as hard as creating a large, scalable system. If you are interested in truly competing with Hotmail, you will need capital investment and a good knowledge of networking protocols, in addition to Linux, Apache and the above programs.
