Sending Mail via the Web

Mr. Lerner continues his look at building a simple, integrated mail system that can be accessed using a web browser.

Preventing Spam

The problem with the above form is that it allows anyone to send mail to any address on the Internet. Furthermore, it allows the sender to claim to be any address on the Internet. This is precisely the sort of tool spammers love to exploit. If you were to put our original version of send-mail.pl on your site, you would eventually discover that someone was using your server and bandwidth to send spam.

There are several ways to prevent this. One is to allow mail to be sent only to users or domains on an approved list. For instance, we can define a hash in which the keys are approved e-mail addresses:

my %approved_recipient = ('reuven@lerner.co.il' => 1,
                          'ljeditor@linuxjournal.com' => 1);

Using a hash allows us to check the status of any e-mail address in constant time, regardless of the number of addresses. If we were to use an array instead, we might have to search through the entire array before we could be sure of an address's status, so the time needed for the test would grow in proportion to the number of elements in the array.

We can thus check to see if an address is approved by inserting the following code:

if (!$approved_recipient{$recipient})
{
    die "Unapproved address \"$recipient\": Mail" .
        " was not sent.\n";
}

A version of send-mail.pl with the above code can be found in Listing 2 in the archive file (ftp://ftp.linuxjournal.com/pub/lj/listings/issue62/3449.tgz).
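
The hash lookup described above can also be wrapped in a small subroutine. The following is only a sketch: the name is_approved_recipient is our own invention and does not appear in the listings, and trimming whitespace and ignoring case before the lookup are assumptions about how forgiving we want to be with what the user typed.

```perl
use strict;
use warnings;

# Approved addresses, stored in lowercase so the lookup can ignore case
my %approved_recipient = ('reuven@lerner.co.il'       => 1,
                          'ljeditor@linuxjournal.com' => 1);

# Hypothetical helper (not part of send-mail.pl): returns 1 if
# $recipient is on the approved list, 0 otherwise
sub is_approved_recipient
{
    my ($recipient) = @_;
    return 0 unless defined $recipient;

    # Assumption: surrounding whitespace and letter case do not matter
    $recipient =~ s/^\s+|\s+$//g;

    # A single hash lookup, however many addresses are approved
    return exists $approved_recipient{lc $recipient} ? 1 : 0;
}

foreach my $address ('Reuven@Lerner.co.il', 'someone@example.com')
{
    print "$address: ",
        (is_approved_recipient($address) ? "approved" : "rejected"), "\n";
}
```

Keeping the test in one subroutine means the rest of the program does not need to know whether the list lives in a hash, a file or a database.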

We can similarly allow mail to be sent to any address within a particular domain by putting all of the approved domains inside an array:

my @approved_domains = ('lerner.co.il',
                        'linuxjournal.com');

We then create a variable, $match_found, which defaults to 0:

my $match_found = 0;

$match_found will be set to 1 only if one of the approved domains matches the domain in $recipient. We check this with a short loop:

foreach my $domain (@approved_domains)
{
    # \Q...\E keeps the dots in $domain literal, and [@.] makes sure
    # the match begins at the start of a domain or subdomain
    if ($recipient =~ m/[@.]\Q$domain\E$/i)
    {
        $match_found = 1;
        last;
    }
}

We use last to break out of the loop when we find a match, in order to save some time. If you know certain domains will receive mail more often than others, you should put them at the beginning of @approved_domains, since the earlier an item appears in that array, the sooner the match will be found.

We then send mail only if $match_found is true (i.e., non-zero). If $match_found is 0, we print an error message:

if ($match_found)
{
    # ... send the mail, as before ...
}
# If the domain was not approved
else
{
    die "Mail was not sent: The recipient's domain " .
        "is not approved.\n";
}

The version of send-mail.pl in Listing 3 in the archive has these additions.
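
The same domain test can be written without a pattern match, by comparing strings literally. What follows is a sketch rather than the code from the listings: the subroutine name domain_is_approved is hypothetical, and we assume that subdomains of an approved domain (for instance, mail.lerner.co.il) should be accepted as well.

```perl
use strict;
use warnings;

my @approved_domains = ('lerner.co.il', 'linuxjournal.com');

# Hypothetical helper (not part of the listings): returns 1 if the
# domain part of $recipient is an approved domain or one of its
# subdomains, 0 otherwise
sub domain_is_approved
{
    my ($recipient) = @_;
    return 0 unless defined $recipient;

    # Everything after the last "@" is treated as the domain
    my $at = rindex($recipient, '@');
    return 0 if $at < 1;
    my $host = lc substr($recipient, $at + 1);
    return 0 if $host eq '';

    foreach my $domain (@approved_domains)
    {
        my $approved = lc $domain;

        # Exact match, such as reuven@lerner.co.il
        return 1 if $host eq $approved;

        # Subdomain match: $host must end in "." followed by the domain
        return 1 if length($host) > length($approved)
            && substr($host, -(length($approved) + 1)) eq ".$approved";
    }

    return 0;
}

foreach my $address ('reuven@lerner.co.il', 'user@evil-lerner.co.il')
{
    print "$address: ",
        (domain_is_approved($address) ? "approved" : "rejected"), "\n";
}
```

Comparing strings this way avoids treating the dots in a domain name as pattern characters, and it refuses look-alike domains that merely end with the same letters as an approved one.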

Checking for Errors

If we want our program to be robust, we must do more than check for security violations. We must check for input from the user that might not affect security, but might lead to bugs or other unpleasant surprises.

For instance, if we invoke send-mail.pl directly from a URL, for example

http://www.lerner.co.il/cgi-bin/send-mail.pl

the program will report that the mail was sent with a blank sender, recipient and message. This is bad for two reasons. First, no mail was sent, since necessary headers were not assigned any values, so the program is providing us with incorrect information. Second, we should never get to the point where blank data from the user is accepted as input for mail.

We can prevent this situation by ensuring send-mail.pl is always invoked with POST, and that $sender, $recipient and $message are non-blank. If any of these is the empty string, we exit from the program early, telling the user that each must have a non-blank value. Once again, using die is better in debugging environments than in production code, simply because of the style of error message it produces. There is no reason why you could not forward the user to an error message page, or print a nicely designed page describing what was missing, rather than simply dying.
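
The checks just described might look something like the following sketch. The subroutine blank_fields is a hypothetical name of our own, and treating a value made up entirely of whitespace as blank is an assumption. In a real CGI program, the request method would come from $ENV{REQUEST_METHOD} and the field values from $query->param; here we pass them in directly, so the example can be run from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: dies unless the request method is POST, then
# returns the names of the fields whose values are missing or blank
sub blank_fields
{
    my ($method, %field) = @_;

    die "send-mail.pl must be invoked with POST\n"
        unless defined $method && uc($method) eq 'POST';

    # Assumption: a value containing only whitespace counts as blank
    return grep { !defined $field{$_} || $field{$_} !~ /\S/ }
           sort keys %field;
}

my @blank = blank_fields('POST',
                         sender    => 'reuven@lerner.co.il',
                         recipient => '',
                         message   => "   \n");

if (@blank)
{
    print "Please fill in the following fields: @blank\n";
}
```

Returning the list of blank fields, rather than dying at the first one, makes it easy to tell the user about everything that is missing at once.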

Competing with Hotmail

Between send-mail.pl and read-mail.pl, we have created a small system to send and receive e-mail. Is this enough to compete with Hotmail by creating our own small web-based mail service? The short answer is “no”, although the longer answer is that it is probably enough to suit most purposes.

Part of the problem is these two programs are run using CGI. While CGI is portable across platforms and languages, it is inherently slow, requiring the web server to create a new process each time the program is invoked. While this is more than adequate for lightly loaded machines, it quickly becomes a performance drain as the number of hits increases.

Each HTTP server has developed its own native interface that allows you to attach your program to the server process. Since Apache is free software, several such interfaces have been developed for it, including mod_perl and mod_php. The former allows you to write CGI-like programs in Perl, attaching them to the server process. This means your functionality becomes a subroutine within the server program, rather than an external program that must be invoked separately. The speed difference between a program running under mod_perl and the same functionality in a CGI program is staggering and should convince just about any die-hard CGI user to switch to mod_perl.

A site wishing to compete with Hotmail would probably want to use mod_perl or a similar server-specific API in order to get the maximum performance out of its hardware.

Aside from performance, another issue is where the mail is to be stored. The programs we have discussed, read-mail.pl and send-mail.pl, expect the user's mail to be stored on a POP server elsewhere on the Internet. Hotmail and similar services have their own POP servers for incoming mail, as well as their own MTAs (usually sendmail, although other MTAs are apparently better for high-volume mail servers) running on their systems.

However, Hotmail will allow you to retrieve mail only from their own POP servers, while read-mail.pl will allow you to retrieve mail from any POP server, including one that would normally not have a web interface. Whether you restrict mail checking by users to your own system, a number of servers within your organization or anywhere else is up to you.

Services such as Hotmail also survive due to advertising, and one of the most popular ways to advertise is to add a short note to the bottom of each message indicating which web-based mail service was used to send it. We can easily do that by concatenating our own footer to the message the user sends, with the following code:

my $footer = "-\nBrought to you by ReuvenMail\n";
my $message = $query->param("message");
$message .= $footer;
my %mail = (To => $recipient, From => $sender,
  Message => $message);

Now everyone will know which mail service you were using when you sent mail from your web-based system. This functionality is included in the final version of the program, Listing 3 in the archive file.
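
One detail worth considering is what happens when the user's message does not end with a newline, in which case the footer would be glued to the last line of text. The following sketch handles that case. The subroutine add_footer is a hypothetical helper of our own and is not taken from the listings.

```perl
use strict;
use warnings;

# Hypothetical helper: appends $footer to $message, making sure the
# footer begins on a line of its own
sub add_footer
{
    my ($message, $footer) = @_;
    $message = '' unless defined $message;

    # Add a newline only if the message is non-empty and lacks one
    $message .= "\n" unless $message eq '' || $message =~ /\n\z/;

    return $message . $footer;
}

my $footer = "-\nBrought to you by ReuvenMail\n";
print add_footer("See you at the meeting", $footer);
```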

Finally, Hotmail has millions of members, which means it is relying on more than a single computer running Linux for mail delivery. Operating a single system for sending and receiving mail is not nearly as hard as creating a large, scalable system. If you are interested in truly competing with Hotmail, you will need capital investment and a good knowledge of networking protocols, in addition to Linux, Apache and the above programs.
