Validate an E-Mail Address with PHP, the Right Way

June 1st, 2007 by Douglas Lovell

Develop a working PHP function to validate e-mail addresses.

The Internet Engineering Task Force (IETF) document, RFC 3696, “Application Techniques for Checking and Transformation of Names” by John Klensin, gives several valid e-mail addresses that are rejected by many PHP validation routines. The addresses: Abc\@def@example.com, customer/department=shipping@example.com and !def!xyz%abc@example.com are all valid. One of the more popular regular expressions found in the literature rejects all of them:

"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$"

This regular expression allows only the underscore (_) and hyphen (-) characters, numbers and lowercase alphabetic characters. Even assuming a preprocessing step that converts uppercase alphabetic characters to lowercase, the expression rejects addresses with valid characters, such as the slash (/), equal sign (=), exclamation point (!) and percent (%). The expression also requires that the highest-level domain component has only two or three characters, thus rejecting valid domains, such as .museum.
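To see the rejections concretely, here is a minimal sketch that runs the pattern above against the three RFC 3696 addresses and against a .museum domain. It assumes the pattern is used with preg_match (the slash delimiters are added for that purpose), and popular_accepts is a hypothetical helper name chosen for this example.

```php
<?php
// The popular pattern quoted above, wrapped in PCRE delimiters for preg_match().
$popular = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';

// Valid addresses from RFC 3696, plus one with a long top-level domain.
$valid = array(
    'Abc\\@def@example.com',
    'customer/department=shipping@example.com',
    '!def!xyz%abc@example.com',
    'curator@example.museum',
);

// Hypothetical helper: lowercase first, as the text assumes, then match.
function popular_accepts($pattern, $address)
{
    return preg_match($pattern, strtolower($address)) === 1;
}

foreach ($valid as $address) {
    echo $address, ' => ',
        popular_accepts($popular, $address) ? 'accepted' : 'rejected', "\n";
}
```

Every address in the list is reported as rejected, even though all of them are valid.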

Another favorite regular expression solution is the following:

"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$"

This regular expression rejects all the valid examples in the preceding paragraph. It does have the grace to allow uppercase alphabetic characters, and it doesn't make the error of assuming a high-level domain name has only two or three characters. However, it allows invalid domain names, such as example..com.

Listing 1 shows an example from PHP Dev Shed (www.devshed.com/c/a/PHP/Email-Address-Verification-with-PHP/2). The code contains (at least) three errors. First, it fails to recognize many valid e-mail address characters, such as percent (%). Second, it splits the e-mail address into user name and domain parts at the at sign (@). E-mail addresses that contain a quoted at sign, such as Abc\@def@example.com, will break this code. Third, it fails to check for host address DNS records. Hosts with a type A DNS entry will accept e-mail and may not necessarily publish a type MX entry. I'm not picking on the author at PHP Dev Shed. More than 100 reviewers gave this a four-out-of-five-star rating.

One of the better solutions comes from Dave Child's blog at ILoveJackDaniel's (ilovejackdaniels.com), shown in Listing 2 (www.ilovejackdaniels.com/php/email-address-validation). Not only does Dave love good-old American whiskey, he also did some homework, read RFC 2822 and recognized the true range of characters valid in an e-mail user name. About 50 people have commented on this solution at the site, including a few corrections that have been incorporated into the original solution. The only major flaw in the code collectively developed at ILoveJackDaniel's is that it fails to allow for quoted characters, such as \@, in the user name. It will reject an address with more than one at sign, so that it does not get tripped up splitting the user name and domain parts using explode("@", $email). A subjective criticism is that the code expends a lot of effort checking the length of each component of the domain portion—effort better spent simply trying a domain lookup. Others might appreciate the due diligence paid to checking the domain before executing a DNS lookup on the network.

Requirements

IETF documents, RFC 1035 “Domain Names - Implementation and Specification”, RFC 2234 “Augmented BNF for Syntax Specifications: ABNF”, RFC 2821 “Simple Mail Transfer Protocol”, RFC 2822 “Internet Message Format”, in addition to RFC 3696 (referenced earlier), all contain information relevant to e-mail address validation. RFC 2822 supersedes RFC 822 “Standard for the Format of ARPA Internet Text Messages” and makes it obsolete.

Following are the requirements for an e-mail address, with relevant references:

  1. An e-mail address consists of local part and domain separated by an at sign (@) character (RFC 2822 3.4.1).

  2. The local part may consist of alphabetic and numeric characters, and the following characters: !, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, } and ~, possibly with dot separators (.), inside, but not at the start, end or next to another dot separator (RFC 2822 3.2.4).

  3. The local part may consist of a quoted string—that is, anything within quotes ("), including spaces (RFC 2822 3.2.5).

  4. Quoted pairs (such as \@) are valid components of a local part, though an obsolete form from RFC 822 (RFC 2822 4.4).

  5. The maximum length of a local part is 64 characters (RFC 2821 4.5.3.1).

  6. A domain consists of labels separated by dot separators (RFC 1035 2.3.1).

  7. Domain labels start with an alphabetic character, followed by zero or more alphabetic characters, numeric characters or hyphens (-), and end with an alphabetic or numeric character (RFC 1035 2.3.1). RFC 1123 2.1 later relaxed the first-character rule, so a label may also start with a digit.

  8. The maximum length of a label is 63 characters (RFC 1035 2.3.1).

  9. The maximum length of a domain is 255 characters (RFC 2821 4.5.3.1).

  10. The domain must be fully qualified and resolvable to a type A or type MX DNS address record (RFC 2821 3.6).

Requirement number four covers a now obsolete form that is arguably permissive. Agents issuing new addresses could legitimately disallow it; however, an existing address that uses this form remains a valid address.

The standard assumes a seven-bit character encoding, not multibyte characters. Consequently, according to RFC 2234, “alphabetic” corresponds to the Latin alphabet character ranges a–z and A–Z. Likewise, “numeric” refers to the digits 0–9. The lovely international standard Unicode alphabets are not accommodated—not even encoded as UTF-8. ASCII still rules here.

Developing a Better E-mail Validator

That's a lot of requirements! Most of them refer to the local part and domain. It makes sense, then, to start with splitting the e-mail address around the at sign separator. Requirements 2–5 apply to the local part, and 6–10 apply to the domain.

The at sign can be escaped in the local name. Examples are Abc\@def@example.com and "Abc@def"@example.com. This means an explode on the at sign, $split = explode("@", $email);, or another similar trick to separate the local and domain parts will not always work. We can try removing escaped at signs, $cleanat = str_replace("\\@", "", $email);, but that will miss pathological cases, such as Abc\\@example.com. Fortunately, such escaped at signs are not allowed in the domain part. The last occurrence of the at sign must definitely be the separator. The way to separate the local and domain parts, then, is to use the strrpos function to find the last at sign in the e-mail string.

Listing 3 provides a better method for splitting the local part and domain of an e-mail address. The return type of strrpos will be boolean-valued false if the at sign does not occur in the e-mail string.
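As a rough illustration of that approach, the split might look like the following. This is a sketch written for this discussion; the function name split_address and its return convention are assumptions.

```php
<?php
// Hypothetical helper: split an address at its LAST at sign.
// Returns array($local, $domain), or false when there is no at sign.
function split_address($email)
{
    $atIndex = strrpos($email, '@');
    // strrpos() returns 0 for a leading at sign and false for none,
    // so the comparison must be strict.
    if ($atIndex === false) {
        return false;
    }
    return array(
        substr($email, 0, $atIndex),
        substr($email, $atIndex + 1),
    );
}

// explode() sees three pieces here; the last-at-sign split sees two.
$address = 'Abc\\@def@example.com';
echo count(explode('@', $address)), "\n";
print_r(split_address($address));
```

The strict comparison matters because an address that begins with an at sign yields index 0, which a loose test would confuse with false.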

Let's start with the easy stuff. Checking the lengths of the local part and domain is simple. If those tests fail, there's no need to do the more complicated tests. Listing 4 shows the code for making the length tests.
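The length tests could be sketched as below. The upper bounds come from requirements 5 and 9; the lower bound of one character and the name lengths_ok are assumptions made for this example.

```php
<?php
// Hypothetical helper: check the length limits on parts already split.
function lengths_ok($local, $domain)
{
    $localLen = strlen($local);
    $domainLen = strlen($domain);
    // Local part: 1 to 64 characters. Domain: 1 to 255 characters.
    return $localLen >= 1 && $localLen <= 64
        && $domainLen >= 1 && $domainLen <= 255;
}

var_dump(lengths_ok('user', 'example.com'));
var_dump(lengths_ok(str_repeat('a', 65), 'example.com'));
```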

Now, the local part has one of two forms. It may have a begin and end quote with no unescaped embedded quotes. The local part, "Doug \"Ace\" L." is an example. The second form for the local part is (a+(\.a+)*), where a stands for a whole slew of allowable characters. The second form is more common than the first, so check for that first. Look for the quoted form after failing the unquoted form.

Characters quoted using the back slash (\@) pose a problem. This form allows doubling the back-slash character to get a back-slash character in the interpreted result (\\). This means we need to check for an odd number of back-slash characters quoting a non-back-slash character. We need to allow \\\\\@ and reject \\\\@.

It is possible to write a regular expression that finds an odd number of back slashes before a non-back-slash character. It is possible, but not pretty. The appeal is further reduced by the fact that the back-slash character is an escape character in PHP strings and an escape character in regular expressions. We need to write four back-slash characters in the PHP string representing the regular expression to show the regular expression interpreter a single back slash.
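A small experiment makes the four-back-slash rule concrete. The variable names are made up for this sketch; the pattern is meant to match a string that consists of exactly one back slash.

```php
<?php
// Four back slashes in the PHP source become two characters in the string,
// and the regular expression engine reads those two as one literal back slash.
$pattern = '/^\\\\$/';

$oneBackslash = '\\';       // PHP string holding a single back slash
$twoBackslashes = '\\\\';   // PHP string holding two back slashes

var_dump(strlen($oneBackslash));
var_dump(preg_match($pattern, $oneBackslash));
var_dump(preg_match($pattern, $twoBackslashes));
```

The first string has length one and matches; the second does not match, because the pattern allows only a single back slash between the anchors.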

A more appealing solution is simply to strip all pairs of back-slash characters from the test string before checking it with the regular expression. The str_replace function fits the bill. Listing 5 shows a test for the content of the local part.

The regular expression in the outer test looks for a sequence of allowable or escaped characters. Failing that, the inner test looks for a sequence of escaped quote characters or any other character within a pair of quotes.
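The two-stage test might be sketched as follows. The expressions are the ones that also appear in the reader-contributed code in the comments on this page; the wrapper function local_chars_ok is a hypothetical name for this example.

```php
<?php
// Hypothetical helper: check the characters of a local part.
function local_chars_ok($local)
{
    // Strip pairs of back slashes first, so that any back slash left over
    // is one that quotes the character following it.
    $stripped = str_replace('\\\\', '', $local);

    // Outer test: allowable characters or back-slash-quoted characters.
    if (preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/', $stripped)) {
        return true;
    }
    // Inner test: escaped quotes or any non-quote character inside quotes.
    return preg_match('/^"(\\\\"|[^"])+"$/', $stripped) === 1;
}

var_dump(local_chars_ok('Abc\\@def'));                  // quoted at sign
var_dump(local_chars_ok('"Abc@def"'));                  // quoted string
var_dump(local_chars_ok(str_repeat('\\', 5) . '@'));    // odd count quotes the @
var_dump(local_chars_ok(str_repeat('\\', 4) . '@'));    // even count leaves a bare @
```

With five back slashes, two pairs are stripped and the remaining one quotes the at sign, so the part is accepted. With four, nothing is left to quote the at sign, so it is rejected.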

If you are validating an e-mail address entered as POST data, which is likely, you have to be careful about input that contains back-slash (\), single-quote (') or double-quote characters ("). PHP may or may not escape those characters with an extra back-slash character wherever they occur in POST data. The name for this behavior is magic_quotes_gpc, where gpc stands for get, post, cookie. You can have your code call the function, get_magic_quotes_gpc(), and strip the added slashes on an affirmative response. You also can ensure that the PHP.ini file disables this “feature”. Two other settings to watch for are magic_quotes_runtime and magic_quotes_sybase.
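For code that has to run on both old and new PHP versions, a guarded helper along these lines is one option. It is only a sketch: undo_magic_quotes is a hypothetical name, and the function_exists test is there because get_magic_quotes_gpc() was removed in PHP 8.

```php
<?php
// Hypothetical helper: undo magic_quotes_gpc escaping when it is active.
function undo_magic_quotes($value)
{
    // Only call get_magic_quotes_gpc() where it still exists; the @ hides
    // the deprecation notice that PHP 7.4 raises for it.
    if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc()) {
        return stripslashes($value);
    }
    return $value;
}

// What stripslashes() does to an address that arrived with doubled slashes.
echo stripslashes('Abc\\\\@def@example.com'), "\n";
```

On installations where magic quotes are off or absent, the helper returns its input unchanged.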

The two regular expressions in Listing 5 are appealing because they are relatively easy to comprehend and don't require repetition of the allowable character group, [A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-]. Here's a test for you. Why does the character group require two back-slash characters before the forward slash and one back-slash character before the single quote?

One deficiency of the outer test of Listing 5 is that it passes local part strings that include dots anywhere in the string. Requirement number two states that dots can't start or end the local part, and they can't appear together two or more times. We could address this by expanding the outer regular expression into the form ^(a+(\.a+)*)$, where a is (\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~-]). We could, but that leads to a long, hard-to-read, repetitive expression that's difficult to believe in. It's clearer to add the simple checks shown in Listing 6.
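Such simple checks might be sketched as follows. The function name local_dots_ok is hypothetical, and plain string functions are used here in place of regular expressions.

```php
<?php
// Hypothetical helper: enforce the dot rules of requirement 2.
function local_dots_ok($local)
{
    if ($local === '') {
        return false; // nothing to check
    }
    if ($local[0] === '.' || $local[strlen($local) - 1] === '.') {
        return false; // starts or ends with a dot
    }
    return strpos($local, '..') === false; // no two dots in a row
}

var_dump(local_dots_ok('first.last'));
var_dump(local_dots_ok('dot.'));
var_dump(local_dots_ok('two..dots'));
```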

The local part is a wrap. The code now checks all local part requirements. Checking the domain will complete the e-mail validation. The code could check all of the labels in the domain separately, as does the whiskey-loving code shown in Listing 2, but, as hinted earlier, the solution presented here allows the DNS check to do most of the domain validation work.

Listing 7 makes a cursory check to ensure only valid characters in the domain part, with no repeated dots. It goes on to make DNS lookups for MX and A records. It makes the check for the A record only if the MX record check fails. The code in Listing 4 verified the length of the domain value.
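A sketch of the domain step follows. The $lookup parameter is an assumption added for this example so that the DNS step can be swapped out when no network is available; it defaults to PHP's checkdnsrr function. The name domain_ok is likewise hypothetical.

```php
<?php
// Hypothetical helper: cursory domain check followed by DNS lookups.
// $lookup is any callable taking (host, record type) and returning a boolean.
function domain_ok($domain, $lookup = 'checkdnsrr')
{
    // Only letters, digits, hyphens and dots may appear.
    if (!preg_match('/^[A-Za-z0-9.-]+$/', $domain)) {
        return false;
    }
    // No repeated dots.
    if (strpos($domain, '..') !== false) {
        return false;
    }
    // Try MX first; the A lookup runs only when no MX record is found.
    return call_user_func($lookup, $domain, 'MX')
        || call_user_func($lookup, $domain, 'A');
}

// Stand-in resolver for offline experiments: example.com has an A record only.
$fakeDns = function ($host, $type) {
    return $host === 'example.com' && $type === 'A';
};

var_dump(domain_ok('example.com', $fakeDns));
var_dump(domain_ok('example..com', $fakeDns));
```

Calling domain_ok('example.com') without the second argument performs real lookups through checkdnsrr.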

So, is it good? You decide. But, it would be nice to test the logic to ensure that it at least is correct. Listing 8 contains a series of e-mail address test cases that any e-mail validation should pass.

Be sure to run the test to see the valid and rejected e-mail addresses; the double-escaping (\\) inside the PHP strings tends to obfuscate the addresses. You're challenged to subject your favorite e-mail validation code to this test. Be assured that the code in Listing 9 does pass!
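If you want to try this on your own validator, a small harness in the following spirit will do. It is a sketch: the function failed_cases, the sample addresses (taken from the text of this article, not from Listing 8) and the deliberately naive stand-in validator are all assumptions made for illustration.

```php
<?php
// Hypothetical harness: run any validator callable against known cases
// and return the addresses it gets wrong.
function failed_cases($validator, array $shouldPass, array $shouldFail)
{
    $wrong = array();
    foreach ($shouldPass as $address) {
        if (!call_user_func($validator, $address)) {
            $wrong[] = $address; // a valid address was rejected
        }
    }
    foreach ($shouldFail as $address) {
        if (call_user_func($validator, $address)) {
            $wrong[] = $address; // an invalid address was accepted
        }
    }
    return $wrong;
}

$shouldPass = array(
    'Abc\\@def@example.com',
    '"Abc@def"@example.com',
    'customer/department=shipping@example.com',
    '!def!xyz%abc@example.com',
);
$shouldFail = array(
    'missing-at-sign.example.com',
    'two..dots@example.com',
    '.leading-dot@example.com',
);

// Stand-in validator: accepts anything that contains an at sign.
$naive = function ($address) {
    return strpos($address, '@') !== false;
};

print_r(failed_cases($naive, $shouldPass, $shouldFail));
```

The naive validator is caught accepting the two addresses with misplaced dots. Replace $naive with the name of your own function to see how it fares.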

Listing 9 contains a complete function for validating an e-mail address. It isn't as concise as many—it certainly isn't a one-liner. But, it is straightforward to read and comprehend, and it correctly accepts and rejects e-mail addresses that many other published functions incorrectly reject and accept. The function orders the validation tests roughly according to increasing cost. In particular, the more complex regular expression and, certainly, the DNS lookup, both come last.
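For readers who want to see the cost ordering in one place, here is a sketch assembled from the steps described in this article. It is not a copy of Listing 9: the name valid_email_sketch and the $lookup parameter (a replaceable DNS callable that defaults to checkdnsrr) are assumptions made for this example.

```php
<?php
// Sketch of a complete check, cheapest tests first and DNS last.
function valid_email_sketch($email, $lookup = 'checkdnsrr')
{
    $atIndex = strrpos($email, '@');
    if ($atIndex === false) {
        return false; // no at sign at all
    }
    // The casts guard against substr() returning false on older PHP.
    $local = (string) substr($email, 0, $atIndex);
    $domain = (string) substr($email, $atIndex + 1);

    // Length limits.
    if (strlen($local) < 1 || strlen($local) > 64) {
        return false;
    }
    if (strlen($domain) < 1 || strlen($domain) > 255) {
        return false;
    }
    // Dot placement in the local part; repeated dots in either part.
    if ($local[0] === '.' || $local[strlen($local) - 1] === '.') {
        return false;
    }
    if (strpos($local, '..') !== false || strpos($domain, '..') !== false) {
        return false;
    }
    // Cursory domain character check.
    if (!preg_match('/^[A-Za-z0-9.-]+$/', $domain)) {
        return false;
    }
    // Local part characters, after stripping pairs of back slashes.
    $stripped = str_replace('\\\\', '', $local);
    if (!preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/', $stripped)
        && !preg_match('/^"(\\\\"|[^"])+"$/', $stripped)) {
        return false;
    }
    // Most expensive test last: MX lookup, then A if MX fails.
    return call_user_func($lookup, $domain, 'MX')
        || call_user_func($lookup, $domain, 'A');
}

// Stand-in resolver so the example runs without a network.
$fakeDns = function ($host, $type) {
    return $host === 'example.com';
};

var_dump(valid_email_sketch('Abc\\@def@example.com', $fakeDns));
var_dump(valid_email_sketch('dot.@example.com', $fakeDns));
```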

Spread the word! There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard. If you want to fool the spambots, adopt an e-mail address like, {^c\@**Dog^}@cartoon.com. Unfortunately, you might fool some legitimate e-commerce sites as well. Which do you suppose will adapt more quickly?

Douglas Lovell is a software engineer with IBM Research, author of The XSL Formatting Objects Developer's Handbook published by Sams, and Web site editor for iac52.org.


need to update the link to ilovejackdaniels

On October 22nd, 2009 Anonymous (not verified) says:

ilovejackdaniels.com will die soon, the new link is here:

http://www.addedbytes.com/php/email-address-validation


I agree: too much DNS, and is too slow

On October 14th, 2009 Anonymous (not verified) says:

First off,
DNS lookups are quite slow, but without them this script will validate many obviously wrong addresses (like "a@b").

With a google search, I found this function which is much faster and it does a pretty good job, especially for not relying on the DNS at all...

function emailcheck($email) {
		return preg_match('/^(?:[\w\!\#\$\%\&\'\*\+\-\/\=\?\^\`\{\|\}\~]+\.)*[\w\!\#\$\%\&\'\*\+\-\/\=\?\^\`\{\|\}\~]+@(?:(?:(?:[a-zA-Z0-9_](?:[a-zA-Z0-9_\-](?!\.)){0,61}[a-zA-Z0-9_]?\.)+[a-zA-Z0-9_](?:[a-zA-Z0-9_\-](?!$)){0,61}[a-zA-Z0-9_]?)|(?:\[(?:(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])\.){3}(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])\]))$/', $email);
}

Also,
instead of:
preg_match('/\\.\\./', $local)

use:
strpos($local, '..') !== false

strpos() is much faster than preg_match.


Relies too much on DNS

On October 12th, 2009 Bill (not verified) says:

DNS checking is resource intensive. If you take out the DNS checks, all of the bugs in the logic start to appear.

This routine will accept the following as a valid email address:

a@b.c
a@b
a@b.

It is doing zero validation of the host and domain extension. Instead it relies on DNS to do these checks, which is a waste of system resources.

Since this article was posted over two years ago and no corrections have been made, I would suggest looking for something better.


Awesome script, just what I was looking for!!

On October 9th, 2009 Falstaff Computing (not verified) says:

Great script with comments and explanations so you can learn and understand what the code is doing!! Excellent!!


On September 24th, 2009 Anonymous (not verified) says:

Sorry, this code has a bug.

Just test: michael.good@gmail

Without any .COM or .ANYTHING, the function recognizes the email as valid!!!


On September 27th, 2009 VeNoMouS (not verified) says:

It is probably due to your DNS search suffix. Is it set to .com? If so, it's appending .com onto domains that are not fully qualified.


On August 28th, 2009 Anonymous (not verified) says:

Thanks mate, this is working just perfectly for me ;-)

Things don't have to be perfect in my opinion, as long as some scumbag spambots cannot spam with viagra_pills@for.free I'm happy :-D


This will reject 'postmaster'.

On August 25th, 2009 David Schwartz (not verified) says:

These validations will reject 'postmaster' which is, at least in some circumstances such as an SMTP RCPT line, required to be considered a valid email address.


I'm a little late, but I think I found a bug...

On August 6th, 2009 D'Arcy Flynn (not verified) says:

Hi there,
I am using this code to validate emails on the fly with the help of AJAX, on my site. I noticed that as I entered random email addresses the user could simply put
"myemail@a"
and it would consider it valid.
it didn't need the .____ attached.
so I added this to the middle of the validation:

else if (!preg_match('/\\./', $domain))
{
// domain has no dots
$isValid = false;
}

It fixed it.

Thanks for the code :)


On July 17th, 2009 Geoffrey Lee (not verified) says:

I wrote my own e-mail validation function after spending hours with RFC 2822. It passes all of the above test cases with NOYB's corrections. I would appreciate if you submit any bugs to: geoffreyj.lee at Gmail.

function validateEmail($input)
{
  $atom = '[a-zA-Z0-9!#$%&\'*+\-\/=?^_`{|}~]+';
  $quoted_string = '"[^"\\\\\r\n]*"';
  $word = "$atom(\.$atom)*";
  $domain = "$atom(\.$atom)+";
  return strlen($input) < 256
    && preg_match("/^($word|$quoted_string)@{$domain}\$/", $input);
}

On July 19th, 2009 Geoffrey Lee (not verified) says:

I realized that my quoted-string regexp allowed too many characters. Here's the corrected version:

function validateEmailAddress($input)
{
  $atom = '[a-zA-Z0-9!#$%&\'*+\-\/=?^_`{|}~]+';
  $quoted_string = '"([\x1-\x9\xB\xC\xE-\x21\x23-\x5B\x5D-\x7F]|\x5C[\x1-\x9\xB\xC\xE-\x7F])*"';
  $word = "$atom(\.$atom)*";
  $domain = "$atom(\.$atom)+";
  return strlen($input) < 256 && preg_match("/^($word|$quoted_string)@{$domain}\$/", $input);
}

Question

On July 7th, 2009 SidAhmed (not verified) says:

Hello guys

i have connection internet and i installed xampp last version i would like to test validation of email this is the form :

Enter your email :

but its not working plz can u help me

replay urgent


On June 21st, 2009 alexanderdickson (not verified) says:

just thought i'd leave a note to say your link to Dave Child's website, ilovejackdaniels.com has changed to addedbytes.com


Wee fix

On May 8th, 2009 Pawel B. (not verified) says:

This should check the top-level domain as well. I thought everything was fine until someone typed an email address like xxx@yyy.p instead of xxx@yyy.pl. DNS checking is switched off on the server, so I cannot use it to validate email addresses.

Let's add:


elseif (strlen(substr($domain, strrpos($domain, '.')+1)) < 2 || strlen(substr($domain, strrpos($domain, '.')+1)) > 6) {
$isValid = false;
}

Top level domain AFAIK is at least 2 char long and 'museum' is longest at the moment. This should do the trick.

Thanks for this code BTW! Very useful.


Wrong start!

On March 27th, 2009 Anonymous (not verified) says:

$isValid = true;
Should be $isValid = false;
If everything fails it should always return false... pfff, spread the word!


R U Stupid?

On May 18th, 2009 Mihai (not verified) says:

The beginning of the code is perfect, "$isValid = true;" and not what you said!
It starts with the idea that the email is valid, and the checks are made!
If it doesn't pass, then it will return false!


He has a good point and not stupidity...

On October 19th, 2009 Anonymous (not verified) says:

the first set should be false...

To prove that the email is valid, it has to pass the tests. Only after all tests have been run and passed should you set the flag and agree that the email is valid. That is more correct than assuming at first that the email is valid and then running the tests to prove it wrong.


What about other languages?

On February 28th, 2009 Johny Iversen (not verified) says:

Great article! Everything was neatly explained. I would say that Tom Burt is right though, the wording in the rule section seems to imply that a domain name can not begin with a number, which of course is wrong.

But what about if I wanted to do it in other languages like ASP.NET, or just plain javascript, is there any chance you will be working on examples for that too? :)


On February 20th, 2009 Anonymous (not verified) says:

This doesn't work with emails such as "someone@somewhere.co.uk" or "someone@somewhere.mn" ....


A validator that can tell back the exact nature of the anomaly.

On February 8th, 2009 Sacapuss (not verified) says:

Hello!

First, I want to thank and congratulate the author of this article for its quality and desirability.

I admit that, wishing to write a form that tests the submitted addresses, I searched the Web in vain for a long time for a document that clearly explains e-mail address syntax, and it seems I have found it here.

I want to write a mail address validator that can tell the visitor the exact nature of the anomaly. Furthermore, I don't want to use regular expressions, having often read bad things about them, and... not knowing how to use them.

So here, candidly, is the code I wrote, submitted to the fire of your criticism. It is not perfect, particularly in its handling of escaping.

Here you have:

<?php // testor_email_0.php

$testable_mail = html_entity_decode( $mail ) ;
$butee = strlen( $testable_mail ) ;
$aro_pos = strrpos( $testable_mail, $aro ) ;

$nout = array( "nom d'utilisateur", "user name" ) ;
$nodo = array( "nom de domaine", "domain name" ) ;
$car = array( "caractère", "character" ) ;
$et = "être" ;

if( ! $testable_mail )
$avertissement = array( "$viv $adel[$lang_index]", "$svps[$lang_index] include your $adel[$lang_index]" ) ;

else if ( $aro_pos === FALSE )
$avertissement = array( "Votre $adel[$lang_index] doit comporter une arobase", "Your $adel[$lang_index] must include the at sign" ) ;

else if ( $aro_pos == 0 )
$avertissement = array( "Votre $adel[$lang_index] doit comporter un $nout[$lang_index]", "Your $adel[$lang_index] must have a $nout[$lang_index]" ) ;

else if ( $aro_pos == $butee - 1 )
$avertissement = array( "Votre $adel[$lang_index] doit comporter un $nodo[$lang_index]", "Your $adel[$lang_index] must have a $nodo[$lang_index]" ) ;

else if( $testable_mail[0] == $dot )
$avertissement = array( "Un point ne peut pas débuter votre $adel[$lang_index]", "A dot cannot begin your $adel[$lang_index]" ) ;

else if( $testable_mail[$butee - 1] == $dot )
$avertissement = array( "Un point ne peut pas terminer votre $adel[$lang_index]", "A dot cannot end your $adel[$lang_index]" ) ;

else
{
$segments = explode( $dot, $testable_mail ) ;
foreach( $segments as $segment )
if( ! strlen( $segment ) )
{
$avertissement = array( "Deux points ne peuvent pas $et contigus dans votre $adel[$lang_index]", "Two dots can not be contiguous in your $adel[$lang_index]" ) ;
break ;
}

include_once "Data/hilite.php" ;
$numeri_cars = range( "0", "9" ) ;

if( ! $avertissement  ) include "testor_email_1.php" ;
if( ! $avertissement ) include "testor_email_2.php" ;
}

if( $avertissement ) $a_servir = "mail" ;

?>
<?php // testor_email_1.php

$testable_str = substr( $testable_mail, 0,  $aro_pos ) ;
$butee = strlen( $testable_str ) ;

$gui = '"' ;
$gui_nombre = substr_count( $testable_str, $gui ) ;
$dir = dir_extraire( __file__ ) ;

/* longueur maximum */

$max = 64 ;
if( $butee > $max )
$avertissement = array( "Le nombre de $car[$lang_index]s de votre $nout[$lang_index] ne peut excéder $max", "The number of $car[$lang_index]s in your $nout[$lang_index] can not exceed $max" ) ;


/* point a la fin  */

else if( $testable_str[ $butee - 1 ] == $dot )
$avertissement = array( "Un point ne peut $et contigu à l'arobase", "A dot cannot be contiguous to the at sign" ) ;


/* guillemets */

else if( $gui_nombre )
include "$dir/testor_email_guillemets.php" ;


/* defaut */

else
{
$zauts_str = "!, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, }, ~, $dot" ;
$zauts_list = explode( $vs, $zauts_str ) ;
$valides = array_merge( $lettres, $numeri_cars, $zauts_list ) ;

for( $i = 0; $i < $butee; $i++ )
{
$ze_car = $testable_str[$i] ;
if( ! in_array( $ze_car, $valides ) )
{
$ze_car = hilite( $ze_car ) ;
$avertissement = array( "Le $car[$lang_index] $ze_car ne peut pas figurer dans votre $nout[$lang_index]", "The $car[$lang_index] $ze_car cannot appear in your $nout[$lang_index]" ) ;
break ;
}
}
}

?>
<?php // testor_email_guillemets.php

if( $gui_nombre == 1 )
$avertissement = array( "Les guillemets doivent  se présenter par paire dans votre $nout[$lang_index]", "The double-quotes must show by pair in your $nout[$lang_index]" ) ;

else if( $gui_nombre == 2 )
{
if( $testable_str[0] != $gui || $testable_str[ $butee - 1 ] != $gui )
$avertissement = array( "Les guillemets doivent se présenter aux extrémités de votre $nout[$lang_index]", "The double-quotes must show at the ends of your $nout[$lang_index]" ) ;
}

else
$avertissement = array( "Votre $nout[$lang_index] ne peut pas avoir de guillemets ailleurs qu'aux deux extrémités", "Your $nout[$lang_index] cannot have double-quotes anywhere else but at both ends" ) ;

?>

<?php // testor_email_2.php

$testable_str = substr( $testable_mail, $aro_pos + 1 ) ;
$butee = strlen( $testable_str ) ;

$max = 255 ;
if( $butee > $max )
$avertissement = array( "Le nombre de $car[$lang_index]s de votre $nodo[$lang_index] ne peut excéder $max", "The number of $car[$lang_index]s in your $nodo[$lang_index] can not exceed $max" ) ;

else if( $testable_str{0} == $dot )
$avertissement = array( "Le premier $car[$lang_index] de votre $nodo[$lang_index] ne peut pas $et un point", "Your $nodo[$lang_index] can not begin with a dot" ) ;

else
{
$segments = explode( $dot, $testable_str ) ;
foreach( $segments as $segment )
{

$segment_len = strlen( $segment ) ;
$max = 63 ;
if( $segment_len > $max )
{
$avertissement = array( "Le nombre de $car[$lang_index]s entre deux points dans votre $nodo[$lang_index] ne peut excéder $max", "The number of $car[$lang_index]s between two dots in your $nodo[$lang_index] can not exceed $max" ) ;
break ;
}

$ze_car = $segment[0] ;
if( ! in_array( $ze_car, $lettres ) )
{
$ze_car = hilite( $ze_car ) ;
$avertissement = array( "Le $car[$lang_index] $ze_car ne peut pas figurer immédiatement après un point ou l'arobase dans le $nodo[$lang_index] de votre $adel[$lang_index]", "The $car[$lang_index] $ze_car cannot show just after a dot or the at sign in the $nodo[$lang_index] of your $adel[$lang_index]" ) ;
break ;
}

$valides = array_merge( $lettres, $numeri_cars ) ;
$ze_car = $segment[$segment_len-1] ;
if( ! in_array( $ze_car, $valides ) )
{
$ze_car = hilite( $ze_car ) ;
$avertissement = array( "Le $car[$lang_index] $ze_car ne peut figurer : ni immédiatement avant un point dans le, ni à la fin du, $nodo[$lang_index] de votre $adel[$lang_index]",  "The $car[$lang_index] $ze_car cannot show, neither just before a dot in, nor at the end of, the $nodo[$lang_index] of your $adel[$lang_index]" ) ;
break ;
}

$valides[] = $tiret ;
$butee = $segment_len - 1 ;
for( $i = 1; $i < $butee; $i++ )
{
$ze_car = $segment[$i] ;
if( ! in_array( $ze_car, $valides ) )
{
$ze_car = hilite( $ze_car ) ;
$avertissement = array( "Le $car[$lang_index] $ze_car ne peut pas figurer dans votre $nodo[$lang_index]", "The $car[$lang_index] $ze_car cannot appear in your $nodo[$lang_index]" ) ;
break 2 ;
}
}

}
}

if( ! $avertissement )
{
$dot_pos = strrpos( $testable_str, $dot ) ;
if( $dot_pos === FALSE )
$avertissement = array( "Veuillez indiquer un domaine de niveau supérieur à votre $adel[$lang_index]", "$svps[$lang_index] indicate a top level domain to your $adel[$lang_index]" ) ;

if( ! $avertissement )
{
$tld = substr( $testable_str, $dot_pos + 1 ) ;
include "../Data/tlds.php" ;
if( ! in_array( $tld, $tlds ) )
$avertissement = array( "Le domaine de niveau supérieur que vous avez indiqué ne figure pas dans notre liste de référence", "The top level domain you indicate is not in our list" ) ;

if( ! $avertissement && ! checkdnsrr( $testable_str ) && ! checkdnsrr( $testable_str, "A" ) )
$avertissement = array( "Le $nodo[$lang_index] que vous avez indiqué n'est pas reconnu par internet", "The $nodo[$lang_index] you indicate is not recognized by internet" ) ;

}
}

?>

Thank you for your contribution,

Sacapuss


On February 3rd, 2009 Misafir says:

I appreciate your efforts in producing a comprehensive email validation function for php.


Yet Another Email Address Validator

On January 28th, 2009 Dominic Sayers (not verified) says:

I've had a go at this too. One reason being that the code here is All Rights Reserved by Linux Journal, so I don't think you can use it in your project.

Here's my effort: RFC-compliant email address validator

I've done more checking of the domain part, particularly allowing the IP address format even though it's discouraged by the RFCs.

I believe my function respects RFCs 1123, 2396, 3696, 4291, 4343, 5321 & 5322. Please let me know if you find any problems with it.


PHP 4.0.0 Update

On January 15th, 2009 Archangel (not verified) says:

The line:

if (is_bool($atIndex) && !$atIndex) {

can now be updated to read:

if ($atIndex === false) {

This looks a little cleaner, but may be harder to read or confuse older PHP developers.


It can be better written this way:

On January 8th, 2009 John Kurlak (not verified) says:

<?php
# Offers methods for validating user input

class Validate
{
	static function email($email)
	{
		$isValid = true;
		$atIndex = strrpos($email, '@');

		if (is_bool($atIndex) && !$atIndex)
		{
			return false;
		}
		else
		{
			$domain = substr($email, $atIndex + 1);
			$local = substr($email, 0, $atIndex);
			$validLocalLength = Validate::length($local, 1, 64);
			$validDomainLength = Validate::length($domain, 1, 255);
			$validStartFinish = !($local[0] == '.' || $local[$localLen - 1] == '.');
			$validLocalDots = !preg_match('/\\.\\./', $local);
			$validDomainCharacters = preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain);
			$validDomainDots = !preg_match('/\\.\\./', $domain);
			$validLocalCharacters = !(!preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/', str_replace("\\\\","",$local)) && !preg_match('/^"(\\\\"|[^"])+"$/', str_replace("\\\\","",$local)));
			$validMailRecord = checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A');

			return $validLocalLength && $validDomainLength && $validStartFinish && $validLocalDots && $validDomainCharacters && $validDomainDots && $validLocalCharacters && $validMailRecord;
		}
	}

	static function length($input, $min, $max)
	{
		return isset($input[$min - 1]) && !isset($input[$max]);
	}
}
?>

fix

On January 27th, 2009 Anonymous (not verified) says:

Replace the line:
$validStartFinish = !($local[0] == '.' || $local[$localLen - 1] == '.');

with:
$validStartFinish = !($local[0] == '.' || $local[strlen($local) - 1] == '.');

since $localLen isn't defined


Failed Verification

On January 27th, 2009 Anonymous (not verified) says:

I like how you wrote the code. I did a test on the emails that were in the article and your script worked fine, except it said the following emails were valid when they should have been invalid:
dot.@example.com
Doug\ \"Ace\"\ L\.@example.com


function validEmail:

On November 13th, 2008 Anonymous (not verified) says:

I've also tried this function below to replace checkdnsrr because it doesn't work at all in windows, but still not working, page always keep on loading and nothing displayed:

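function myCheckDNSRR($hostName, $recType = '') {
	if (!empty($hostName)) {
		if ($recType == '') $recType = "MX";
		// escape both arguments so that input cannot inject shell commands
		exec("nslookup -type=" . escapeshellarg($recType) . " " . escapeshellarg($hostName), $result);
		// check each line to find the one that starts with the host
		// name. If it exists then the function succeeded.
		foreach ($result as $line) {
			// eregi() was removed in PHP 7; stripos() makes the same
			// case-insensitive "starts with" test on the literal name
			if (stripos($line, $hostName) === 0) {
				return true;
			}
		}
		// otherwise there was no mail handler for the domain
		return false;
	}
	return false;
}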

This is from PHP Mail Validator

On November 11th, 2008 Mostaaf (not verified) says:

Hei Yo


validEmail not trapping invalid domains correctly

On October 27th, 2008 HeidiR (not verified) says:

I appreciate your efforts in producing a comprehensive e-mail validation function for PHP. Unfortunately, when I tried to implement and test this function, it did not appear to reject invalid domains correctly. For example:

echo(_valid_email('autoit_heidi@yahoo.com')); (valid) Returns true
echo(_valid_email('autoit_heidi@yahoo.co')); (invalid domain) Returns true
echo(_valid_email('autoit_heidi@111111111111111111.com')); (invalid domain) Returns true

Is this the most current code?


quotation marks

On October 20th, 2008 Anonymous (not verified) says:

I used the function given in this article in a test case, and e-mail addresses with quotation marks, both with and without embedded quotes (the embedded ones had proper escape characters), failed verification. Either there must be an update, or someone is lying.


"abc@def"@example.com doesnt

On October 20th, 2008 Anonymous (not verified) says:

"abc@def"@example.com

doesn't work

and

"Fred \"quota\" Bloggs"@example.com

doesn't work. If it's supposed to, why isn't it?


bump

On October 20th, 2008 Anonymous (not verified) says:

bump.

But seriously, is this going to get an update for the problem where the domain part of an e-mail address is not allowed to start with a number, yet the function allows it?


domains are allowed to start with a digit

On May 7th, 2009 Giuliano (not verified) says:

Not sure exactly when this was allowed, but domains can start with a digit.
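
For reference, RFC 1123 (section 2.1) relaxed the older host name rule so that a label may begin with a digit. A rough sketch of a per-label check along those lines follows. The function name is invented for illustration, the 255-character cap mirrors the limit used in the article's function, and address literals and internationalized names are deliberately ignored.

```php
<?php
// Hypothetical helper: checks each dot-separated label of a domain.
function domainLabelsOk(string $domain): bool
{
    if ($domain === '' || strlen($domain) > 255) {
        return false;
    }
    foreach (explode('.', $domain) as $label) {
        // 1-63 characters, letters/digits/hyphens, no hyphen at either end.
        // A leading digit is allowed; an empty label (from "..") is not.
        if (!preg_match('/^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/', $label)) {
            return false;
        }
    }
    return true;
}
```

This only checks syntax. Whether a syntactically valid domain actually exists is a separate question for the DNS lookup.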


filter_var

On October 15th, 2008 Malaiac (not verified) says:

I suppose a simple
filter_var($email, FILTER_VALIDATE_EMAIL);
isn't enough?


FILTER_VALIDATE_EMAIL

On June 30th, 2009 Anonymous (not verified) says:

I believe this function only works in PHP 5 or above.


filter_var isn't perfect either

On July 17th, 2009 Geoffrey Lee (not verified) says:

Yes, this function was introduced in PHP 5.2, and it isn't as comprehensive. A test of filter_var in PHP 5.3 gives:

All of these should succeed:
dclo@us.ibm.com is valid.
abc\@def@example.com is not valid.
abc\\@example.com is not valid.
Fred\ Bloggs@example.com is not valid.
Joe.\\Blow@example.com is not valid.
"Abc@def"@example.com is valid.
"Fred Bloggs"@example.com is valid.
customer/department=shipping@example.com is not valid.
$A12345@example.com is valid.
!def!xyz%abc@example.com is valid.
_somename@example.com is valid.
user+mailbox@example.com is valid.
peter.piper@example.com is valid.
Doug\ \"Ace\"\ Lovell@example.com is not valid.
"Doug \"Ace\" L."@example.com is not valid.

All of these should fail:
abc@def@example.com is not valid.
abc\\@def@example.com is not valid.
abc\@example.com is not valid.
@example.com is not valid.
doug@ is not valid.
"qu@example.com is not valid.
ote"@example.com is not valid.
.dot@example.com is not valid.
dot.@example.com is valid.
two..dot@example.com is valid.
"Doug "Ace" L."@example.com is not valid.
Doug\ \"Ace\"\ L\.@example.com is not valid.
hello world@example.com is not valid.
gatsby@f.sc.ot.t.f.i.tzg.era.l.d. is not valid.

The email validation is deficient.
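
A listing like the one above can be produced with a small harness along these lines. This is only a sketch: the helper name checkAddresses is an assumption, only a handful of the addresses are included, and results for the more exotic addresses may differ between PHP versions.

```php
<?php
// Hypothetical helper: map each address to what filter_var reports for it.
function checkAddresses(array $addresses): array
{
    $results = [];
    foreach ($addresses as $address) {
        // filter_var returns the address on success and false on failure
        $results[$address] = filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
    }
    return $results;
}

$results = checkAddresses([
    'user+mailbox@example.com',
    'peter.piper@example.com',
    '@example.com',
    'doug@',
]);

foreach ($results as $address => $ok) {
    echo $address, ' is ', ($ok ? 'valid' : 'not valid'), ".\n";
}
```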


Updates?

On October 9th, 2008 Matt Kantor (not verified) says:

I'd really like to see this article updated in response to some of these comments. Particularly, NOYB and the concerns about IP address domains (even if the given examples are incorrect).

Other updates I'd like to see include:

For now, I'm using your function with a few modifications (including implementing a "trust scale" of 0.0-1.0 instead of an absolute true/false), but my quest for One Email Validator to Rule Them All continues. It'd be awesome if we could somehow get to a point where we didn't need to send any annoying confirmation emails. All in all, great work.


Great code

On September 24th, 2008 Anonymous (not verified) says:

This is a very nice routine, once and for all. I am currently testing on a Windows machine, and Windows doesn't support the checkdnsrr function, so I modified it in the following way to work on Windows.

      /* Following code should be activated if hosting is on linux.
      if ($isValid && !(checkdnsrr($domain,"MX") || checkdnsrr($domain,"A")))
      {  // domain not found in DNS
         $isValid = false;
      }
      Following code should be activated if hosting is on windows. */
      if ($isValid && !(myCheckDNSRR($domain,"MX") || myCheckDNSRR($domain,"A")))
      {  // domain not found in DNS
         $isValid = false;
      }

function myCheckDNSRR($hostName, $recType = '')
{
 if(!empty($hostName)) {
   if( $recType == '' ) $recType = "MX";
   // escape both arguments so a crafted host name cannot inject shell commands
   exec("nslookup -type=" . escapeshellarg($recType) . " " . escapeshellarg($hostName), $result);
   // check each line to find the one that starts with the host
   // name. If it exists then the function succeeded.
   foreach ($result as $line) {
     // stripos stands in for eregi, which was removed in PHP 7
     if(stripos($line, $hostName) === 0) {
       return true;
     }
   }
   // otherwise there was no mail handler for the domain
   return false;
 }
 return false;
}

Please pardon my limited knowledge. I am very new to programming and just trying to play with it. This is not my code; I found it elsewhere, but I thought it would help.

Thanks


3.4. Address Specification

On August 29th, 2008 Anonymous (not verified) says:

It was good to read that you want to do this formally right, once and for all. I really appreciate this. BUT. Let's take RFC 2822 and check what exactly an address is:

http://www.faqs.org/rfcs/rfc2822.html

3.4. Address Specification
Addresses occur in several message header fields to indicate senders
and recipients of messages. An address may either be an individual
mailbox, or a group of mailboxes.

address = mailbox / group

mailbox = name-addr / addr-spec

name-addr = [display-name] angle-addr

angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr

group = display-name ":" [mailbox-list / CFWS] ";" [CFWS]

display-name = phrase

mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list

address-list = (address *("," address)) / obs-addr-list

So I would really like to see your routine be OK with this definition. For example, angle-addr as part of mailbox is not really supported. Your routine does not even check for the right mailbox definition in an address. I have not checked whether groups are. Isn't this the right place to look for the definition of an e-mail address?


I think you are taking the

On May 7th, 2009 Giuliano (not verified) says:

I think you are looking at the wrong section. What it seems to describe is the way addresses are written in headers and such.
That is something like:
Julius Caesar
What the article is about is the addr-spec, that is, the part between the angle brackets.
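
To make the distinction concrete, here is a naive sketch that pulls the addr-spec out of a header-style mailbox before validating it. The function name and the sample address in the usage line are hypothetical, and the pattern ignores comments, groups, and quoted strings containing angle brackets, all of which RFC 2822 allows.

```php
<?php
// Hypothetical helper: return the text between the final pair of angle
// brackets, or the trimmed input when no angle brackets are present.
function extractAddrSpec(string $mailbox): string
{
    if (preg_match('/<([^<>]*)>\s*$/', $mailbox, $m)) {
        return $m[1];
    }
    return trim($mailbox);
}

// Usage with an invented mailbox:
echo extractAddrSpec('Julius Caesar <julius@example.com>'), "\n";
```

Only the returned string would then be passed to a validator such as the one in the article.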


Validate an E-Mail Address with PHP... (Javascript version)

On June 24th, 2008 marsibigo (not verified) says:

Please replace:
1. strEmail[j] with strEmail.charAt(j)
2. local[0] with local.charAt(0)
3. local[localLen-1] with local.charAt(localLen-1)
4. domain[domainLen-1] with domain.charAt(domainLen-1)

because "strEmail[j]" did not work in IE.


Validate an E-Mail Address with PHP... (Javascript version)

On June 24th, 2008 marsibigo (not verified) says:

//NOTE: use this line code :
//                       strEmail= fixBackSlash(strEmail); 
//      only if email address come from a textbox (form);

function isValidEmail(strEmail)
{
	this.strrpos=function( haystack, needle, offset){
		// http://kevin.vanzonneveld.net
		// +   original by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
		// *     example 1: strrpos('Kevin van Zonneveld', 'e');
		// *     returns 1: 16
	 
		var i = haystack.lastIndexOf( needle, offset ); // returns -1
		return i >= 0 ? i : false;
	}

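	this.fixBackSlash=function(strEmail)
	{
		var strEmailTemp="";
		var isBackSlash = false;
		// The loop that followed (it began "for(var j=0;j") and the rest
		// of this helper did not survive in the posted comment. The
		// placeholder below returns the input unchanged.
		return strEmail;
	}

	// Reconstructed from the article's PHP listing (an assumption, not
	// the poster's original lines): split at the last "@" and measure
	// both parts before running the checks below.
	var isValid = true;
	var atIndex = this.strrpos(strEmail, "@");
	if (atIndex === false)
	{
		isValid = false;
	}
	else
	{
		var domain = strEmail.substr(atIndex+1);
		var local = strEmail.substr(0, atIndex);
		var localsave = local.replace(/\\\\/g, "");
		var localLen = local.length;
		var domainLen = domain.length;
		if (localLen < 1 || localLen > 64)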
		{
			 // local part length exceeded
			 isValid = false;
		}
		else if (domainLen < 1 || domainLen > 255)
		{
			// domain part length exceeded
			isValid = false;
		}
		else if (local[0] == '.' || local[localLen-1] == '.')
		{
			// local part starts or ends with '.'
			isValid = false;
		}
		else if (local.match('\\.\\.'))
		{
			 // local part has two consecutive dots
			 isValid = false;
		}
		else if (!domain.match('^[A-Za-z0-9\\-\\.]+$')|| domain[domainLen-1] == '.')
		{
			// character not valid in domain part
			isValid = false;
		}
		else if (domain.match('\\.\\.'))
		{
			// domain part has two consecutive dots
			isValid = false;
		}
		else if(!localsave.match('^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$'))
		{
			// character not valid in local part unless 
			// local part is quoted
			if (!localsave.match('^"(\\\\"|[^"])+"$'))
			{
				isValid = false;
			}
		}
	}
	 return isValid;
}

Many other valid emails still fail

On May 18th, 2008 Dave (not verified) says:

Here's a few examples of valid email addresses that fail using this validator:

localhost
joe@localhost

ipv4
joe@123.456.7.89

ipv6
joe@2001:0db8::1428:57ab


Your Examples

On October 9th, 2008 Matt Kantor (not verified) says:

joe@123.456.7.89 is not valid, each byte of an IPv4 address can only range 0-255 decimal (and 456 is outside of this range).

Also, I didn't test your addresses, but the domains have to be registered, otherwise the DNS lookup (checkdnsrr) will fail.
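
The octet range can be checked with PHP's built-in IP filter rather than a hand-written pattern. A small sketch follows; the wrapper name isIpv4 is made up for illustration. Note that in an e-mail address an IP host is normally written as a bracketed address literal such as joe@[192.0.2.1], which the article's domain pattern does not accept either way.

```php
<?php
// Hypothetical wrapper: true for a dotted-quad IPv4 address whose
// octets are all within 0-255.
function isIpv4(string $host): bool
{
    return filter_var($host, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(isIpv4('123.456.7.89')); // bool(false): 456 is out of range
var_dump(isIpv4('192.0.2.1'));    // bool(true)
```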


Nice!

On May 11th, 2008 Marian M.Bida (not verified) says:

Excellent, I will use it in my systems.


Validate an E-Mail Address with PHP, the Right Way

On February 16th, 2008 herseybendevar (not verified) says:

Thank you, I used this on my page.


JavaScript conversion...

On April 14th, 2008 AlexCox (not verified) says:

Hi!

I've found your PHP script very effective, so I tried to convert it to JavaScript to check an address before it's sent to the server and if necessary warn the user.

It was very easy even though I'm not an expert programmer, but I have some problems with the last "else if" statement, because it fails to recognize the following addresses: abc\@def@example.com, Fred\ Bloggs@example.com, Doug\ \"Ace\"\ Lovell@example.com, "Doug \"Ace\" L."@example.com, abc\@example.com (this should fail but it doesn't)

The code i used is:

else if (!local.replace("\\\\","").match(/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/))
{
	// character not valid in local part unless
	// local part is quoted
	if (!local.match(/^"(\\\\"|[^"])+"$/))
	{
		isValid = false;
	}
}

What am I missing?
Thanks for any help!
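
One possible cause, offered as a guess rather than a confirmed diagnosis: in the PHP original the pattern sits inside a single-quoted string, where each pair of backslashes collapses to one before the regex engine sees it. A JavaScript regex literal is used as typed, so a group copied over as `\\\\.` demands two literal backslashes there. The PHP half of that can be seen with a simplified pattern (the shortened character class is only for illustration):

```php
<?php
// What the regex engine actually receives from the PHP source text.
$pattern = '/^(\\\\.|[a-z])+$/';
echo $pattern, "\n"; // prints /^(\\.|[a-z])+$/

// In PHP the group therefore matches ONE backslash followed by any
// character, so a backslash-escaped at sign is accepted.
$local = 'abc\\@def'; // the seven characters abc\@def
var_dump(preg_match($pattern, $local) === 1); // bool(true)
```

In a JavaScript regex literal the equivalent group would be written the way PHP prints it above. Separately, JavaScript's replace() with a string pattern replaces only the first occurrence, whereas PHP's str_replace() replaces all of them.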


Great article, just a slight fix

On February 2nd, 2008 David (not verified) says:

This is terrific.

There is some sort of typo in the part of the code in Listing 9 where you check the A and MX DNS records, which makes this break as written.

Changing:
if ($isValid && !(checkdnsrr($domain,"MX") ||
↪checkdnsrr($domain,"A")))

To:
if ($isValid && !((checkdnsrr($domain,"MX")) ||
(checkdnsrr($domain,"A"))))

seems to make it work.


your fix works for me too

On August 1st, 2009 cruzanmo (not verified) says:

Thanks for the awesome script!

I ran into the same error with that line, and your fix made it work for me too!
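
The ↪ in the quoted line looks like the print layout's line-continuation marker rather than PHP, which would explain why retyping the condition, with or without the extra parentheses, makes it parse. Below is a sketch of the same test written on one line, with the lookup passed in so the logic can be exercised without network access. The function name, the $lookup parameter, and the stub resolver are illustrative assumptions, not part of Listing 9.

```php
<?php
// $lookup has the same shape as checkdnsrr(): (host, type) => bool.
// When omitted, the real checkdnsrr() is used.
function domainAcceptsMail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}

// Stub resolver for demonstration: only example.com has an A record.
$stub = function (string $host, string $type): bool {
    return $host === 'example.com' && $type === 'A';
};

var_dump(domainAcceptsMail('example.com', $stub));     // bool(true)
var_dump(domainAcceptsMail('nowhere.invalid', $stub)); // bool(false)
```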
