Validate an E-Mail Address with PHP, the Right Way
Listing 5. Partial Test for Valid Local Part Content
if (!preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/',
                str_replace("\\\\","",$local)))
{
    // character not valid in local part unless
    // local part is quoted
    if (!preg_match('/^"(\\\\"|[^"])+"$/',
                    str_replace("\\\\","",$local)))
    {
        $isValid = false;
    }
}
The regular expression in the outer test looks for a sequence of allowable or escaped characters. Failing that, the inner test looks for a sequence of escaped quote characters or any other character within a pair of quotes.
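To experiment with the two tests outside the full validator, they can be wrapped in a small function. The following is only a sketch: the name localContentOk is hypothetical, and it covers just the character checks from Listing 5, not the length or dot-placement rules.

```php
<?php
// Hypothetical helper wrapping the two tests from Listing 5.
// Returns true when the local part is either a run of allowed or
// back-slash-escaped characters, or a quoted string.
function localContentOk($local)
{
    // Drop escaped back-slashes (\\) first, as Listing 5 does.
    $stripped = str_replace("\\\\", "", $local);
    if (preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/',
                   $stripped))
    {
        return true;
    }
    // Fall back to the quoted form: "..." with \" allowed inside.
    return preg_match('/^"(\\\\"|[^"])+"$/', $stripped) === 1;
}

var_dump(localContentOk('abc\@def'));      // escaped @ is accepted
var_dump(localContentOk('"Fred Bloggs"')); // quoted string is accepted
var_dump(localContentOk('hello world'));   // bare space is rejected
```

The strings are single-quoted here so that each back-slash in the source is one back-slash in the local part.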
If you are validating an e-mail address entered as POST data, which is likely, you have to be careful about input that contains back-slash (\), single-quote (') or double-quote (") characters. PHP may or may not escape those characters with an extra back-slash character wherever they occur in POST data. The name for this behavior is magic_quotes_gpc, where gpc stands for get, post, cookie. You can have your code call the function get_magic_quotes_gpc() and strip the added slashes on an affirmative response. You also can ensure that the php.ini file disables this “feature”. Two other settings to watch for are magic_quotes_runtime and magic_quotes_sybase.
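A minimal sketch of that slash-stripping step follows. The helper name undoMagicQuotes and the form field name email are assumptions made for illustration. Note that magic quotes were removed in PHP 5.4 and get_magic_quotes_gpc() itself was removed in PHP 8, so the sketch guards the call and simply returns the value unchanged on current versions.

```php
<?php
// Hypothetical helper: undo magic_quotes_gpc escaping when it is active.
function undoMagicQuotes($value)
{
    // get_magic_quotes_gpc() no longer exists in PHP 8, so check first.
    if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())
    {
        return stripslashes($value);
    }
    return $value;
}

// 'email' is an assumed form field name.
$email = isset($_POST['email']) ? undoMagicQuotes($_POST['email']) : '';
```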
The two regular expressions in Listing 5 are appealing because they are relatively easy to comprehend and don't require repetition of the allowable character group, [A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-]. Here's a test for you. Why does the character group require two back-slash characters before the forward slash and one back-slash character before the single quote?
One deficiency of the outer test of Listing 5 is that it passes local part strings that include dots anywhere in the string. Requirement number two states that dots can't start or end the local part, and they can't appear together two or more times. We could address this by expanding the outer regular expression into the form ^(a+(\.a+)*)$, where a is (\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~-]). We could, but that leads to a long, hard-to-read, repetitive expression that's difficult to believe in. It's clearer to add the simple checks shown in Listing 6.
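For the curious, here is roughly what the expanded expression would look like if it were assembled from a sub-pattern instead of typed out twice. This is a sketch only, with hypothetical variable names. It is not a drop-in replacement for Listings 5 and 6: it skips the str_replace step, and it treats an escaped dot differently from the last-character check in Listing 6.

```php
<?php
// "a": one escaped character or one allowed character, dot excluded.
$a = '(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~-])';
// Runs of "a" separated by single dots: ^(a+(\.a+)*)$
$dotAtom = '/^(' . $a . '+(\\.' . $a . '+)*)$/';

foreach (array('peter.piper', '.dot', 'dot.', 'two..dot') as $local)
{
    echo $local;
    echo preg_match($dotAtom, $local) ? " matches\n" : " does not match\n";
}
```

Even built this way, the final pattern contains the character group twice, which is the readability cost the simple checks avoid.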
Listing 6. Check for dot placement in the local part.
if ($local[0] == '.' || $local[$localLen-1] == '.')
{
    // local part starts or ends with '.'
    $isValid = false;
}
else if (preg_match('/\\.\\./', $local))
{
    // local part has two consecutive dots
    $isValid = false;
}
The local part is a wrap. The code now checks all local part requirements. Checking the domain will complete the e-mail validation. The code could check all of the labels in the domain separately, as does the whiskey-loving code shown in Listing 2, but, as hinted earlier, the solution presented here allows the DNS check to do most of the domain validation work.
Listing 7 makes a cursory check to ensure only valid characters in the domain part, with no repeated dots. It goes on to make DNS lookups for MX and A records. It makes the check for the A record only if the MX record check fails. The code in Listing 4 verified the length of the domain value.
Listing 7. Domain Checks
if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain))
{
    // character not valid in domain part
    $isValid = false;
}
else if (preg_match('/\\.\\./', $domain))
{
    // domain part has two consecutive dots
    $isValid = false;
}
else if (!(checkdnsrr($domain,"MX") || checkdnsrr($domain, "A")))
{
    // domain not found in DNS
    $isValid = false;
}
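Because the last check depends on live DNS, it can be awkward to exercise in a test. One possible arrangement, sketched here with a hypothetical function name domainOk and a hypothetical $lookup parameter, is to pass the lookup function in so that a stub can stand in for checkdnsrr().

```php
<?php
// Hypothetical wrapper around the checks in Listing 7. $lookup defaults
// to PHP's checkdnsrr() but can be replaced with a stub for testing.
function domainOk($domain, $lookup = 'checkdnsrr')
{
    if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain))
    {
        return false; // character not valid in domain part
    }
    if (preg_match('/\\.\\./', $domain))
    {
        return false; // domain part has two consecutive dots
    }
    // The A lookup runs only when the MX lookup fails.
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}

// Stub that pretends only example.com has an MX record.
$stub = function ($host, $type)
{
    return $host === 'example.com' && $type === 'MX';
};
var_dump(domainOk('example.com', $stub));
var_dump(domainOk('exa mple.com', $stub));
```

The cheap pattern checks still run first, so the lookup is reached only for domains that survive them.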
So, is it good? You decide. But, it would be nice to test the logic to ensure that it at least is correct. Listing 8 contains a series of e-mail address test cases that any e-mail validation should pass.
Listing 8. Test the e-mail validation function.
<?php
require("validEmail.php"); // your favorite here
function testEmail($email)
{
    echo $email;
    $pass = validEmail($email);
    if ($pass)
    {
        echo " is valid.\n";
    }
    else
    {
        echo " is not valid.\n";
    }
    return $pass;
}
$pass = true;
echo "All of these should succeed:\n";
$pass &= testEmail("dclo@us.ibm.com");
$pass &= testEmail("abc\\@def@example.com");
$pass &= testEmail("abc\\\\@example.com");
$pass &= testEmail("Fred\\ Bloggs@example.com");
$pass &= testEmail("Joe.\\\\Blow@example.com");
$pass &= testEmail("\"Abc@def\"@example.com");
$pass &= testEmail("\"Fred Bloggs\"@example.com");
$pass &= testEmail("customer/department=shipping@example.com");
$pass &= testEmail("\$A12345@example.com");
$pass &= testEmail("!def!xyz%abc@example.com");
$pass &= testEmail("_somename@example.com");
$pass &= testEmail("user+mailbox@example.com");
$pass &= testEmail("peter.piper@example.com");
$pass &= testEmail("Doug\\ \\\"Ace\\\"\\ Lovell@example.com");
$pass &= testEmail("\"Doug \\\"Ace\\\" L.\"@example.com");
echo "\nAll of these should fail:\n";
$pass &= !testEmail("abc@def@example.com");
$pass &= !testEmail("abc\\\\@def@example.com");
$pass &= !testEmail("abc\\@example.com");
$pass &= !testEmail("@example.com");
$pass &= !testEmail("doug@");
$pass &= !testEmail("\"qu@example.com");
$pass &= !testEmail("ote\"@example.com");
$pass &= !testEmail(".dot@example.com");
$pass &= !testEmail("dot.@example.com");
$pass &= !testEmail("two..dot@example.com");
$pass &= !testEmail("\"Doug \"Ace\" L.\"@example.com");
$pass &= !testEmail("Doug\\ \\\"Ace\\\"\\ L\\.@example.com");
$pass &= !testEmail("hello world@example.com");
$pass &= !testEmail("gatsby@f.sc.ot.t.f.i.tzg.era.l.d.");
echo "\nThe email validation ";
if ($pass)
{
    echo "passes all tests.\n";
}
else
{
    echo "is deficient.\n";
}
?>
Be sure to run the test to see the valid and rejected e-mail addresses, because the double-escaping (\\) inside the PHP strings tends to obfuscate them. You're challenged to subject your favorite e-mail validation code to this test. Be assured that the code in Listing 9 does pass!
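As a quick illustration of how the escaping in Listing 8 maps to the addresses actually tested, the short sketch below prints two of them. The variable names are hypothetical.

```php
<?php
// Each \\ in a double-quoted PHP string is one back-slash in the address.
$escapedAt = "abc\\@def@example.com";
echo $escapedAt, "\n";        // prints abc\@def@example.com

// \\\\ in the source is an escaped back-slash: two characters in the address.
$escapedBackslash = "abc\\\\@example.com";
echo $escapedBackslash, "\n"; // prints abc\\@example.com
```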
Listing 9 contains a complete function for validating an e-mail address. It isn't as concise as many—it certainly isn't a one-liner. But, it is straightforward to read and comprehend, and it correctly accepts and rejects e-mail addresses that many other published functions incorrectly reject and accept. The function orders the validation tests roughly according to increasing cost. In particular, the more complex regular expression and, certainly, the DNS lookup, both come last.
Comments
Great article, just a slight fix
This is terrific.
There is some sort of typo in the part of the code in Listing 9 where you check the A and MX DNS records, which makes this break as written.
Changing:
if ($isValid && !(checkdnsrr($domain,"MX") ||
↪checkdnsrr($domain,"A")))
To:
if ($isValid && !((checkdnsrr($domain,"MX")) ||
(checkdnsrr($domain,"A"))))
seems to make it work.
your fix works for me too
Thanks for the awesome script!
I ran into the same error with that line, and your fix made it work for me too!
Your format validation code
Your format validation code will inappropriately permit an all numeric TLD.
“There is an additional rule that essentially requires that top-level domain names not be all-numeric.” - RFC 3696 - 2
http://SimonSlick.com/VEAF/ValidateEmailAddressFormat.html
Sure, your DNS lookup would fail, but what is the point of validating the format if you are just going to do a DNS lookup anyway for a domain name that should have already been deemed invalid by the format validation code?
Format validation and existence verification (DNS lookup) serve different purposes, and just because a domain name does not exist does not mean the format is not valid.
There are so many holes in your code, whoever paid you for this write-up is highly deserving of a total refund. If you are going to title such an article as "... the Right Way", you could at least do it the Right Way.
The code at http://SimonSlick.com/VEAF/ValidateEmailAddressFormat.html is actually better, and even includes code for verifying actual existence of an eMailbox.
simonslick.com/veaf is busted
The code at simonslick.com is wrong -- it does not seem to match the RFC at all. Just try the examples given in this article as well as more common cases like:
foo+bar@example.com
foo%bar@example.com
foo <bar@example.com>
(foo) bar@example.com
Working Code & Extensive Regular Expressions
Working Code with Extensive use of Regular Expressions for validating email address format.
Check it out and see if you can find any faults.
http://SimonSlick.com/VEAF/ValidateEmailAddressFormat.html
Email address validation head-to-head
Yes, there are some faults with the Simon Slick code. It's also worth pointing out that both Simon Slick and Doug Lovell's code is copyright All Rights Reserved. You can't use it in your project.
I've written about some public-domain validation functions here: http://www.dominicsayers.com/isemail/
The Simon Slick code fails on some of the examples in RFC3696.
As some of these comments have pointed out, there are a lot of RFCs that cover this ground. For what it's worth, I believe my function complies with RFCs 1123, 2396, 3696, 4291, 4343, 5321 & 5322.
RFC Compliance
Backslash is not an RFC compliant component of a non-quoted email address local-part. It may have been in the past, but not anymore, and has not been since the publication of RFC 2822 (2001). Move on folks.
This is also reinforced by RFC 3696 (2004).
http://tools.ietf.org/html/rfc3696
3. Restrictions on email addresses
Without quotes, local-parts may consist of any combination of
alphabetic characters, digits, or any of the special characters
! # $ % & ' * + - / = ? ^ _ ` . { | } ~
period (".") may also appear, but may not be used to start or end the
local part, nor may two or more consecutive periods appear. Stated
differently, any ASCII graphic (printing) character other than the
at-sign ("@"), backslash, double quote, comma, or square brackets may
appear without quoting. If any of that list of excluded characters
are to appear, they must be quoted.
Also see the RFC3696 Errata
http://www.rfc-editor.org/cgi-bin/errataSearch.pl?rfc=3696
These are not RFC compliant:
Fred\ Bloggs@example.com
Joe.\\Blow@example.com
And should have read as:
"Fred\ Bloggs"@example.com
"Joe.\\Blow"@example.com
Also, "the upper limit on address lengths (local-part@domain-part) should normally be considered to be 256."
And as someone already alluded to, the domain name is now, for quite some time I might add, allowed to begin with a digit.
You need to update your code, test data and this article.
RFC Compliance
Also, the quoted string check appears to allow null (x00). According to RFC 2822 3.2.5. Quoted strings and 3.2.1. Primitive Tokens, the permitted NO-WS-CTL characters are x01-x08, x0B, x0C, x0E-x1F, x7F. This does not include the null character x00.
RFC 2822
4.1. Miscellaneous obsolete tokens
The obs-char and obs-qp elements each add ASCII value 0.
Appendix B. Differences from earlier standards
Items marked with an asterisk (*) below are items which
appear in section 4 of this document and therefore can no longer be
generated.
12. ASCII 0 (null) removed.*
Challenge
So should the e-mail address someone@3com.com be accepted or not? It fails to satisfy requirement #7 above but I guess the code in Listing 9 would accept it.
Tom
It is working good
I tested it with many options and its working fine :D