Validate an E-Mail Address with PHP, the Right Way

Develop a working PHP function to validate e-mail addresses.

Requirements

The IETF documents RFC 1035 “Domain Names - Implementation and Specification”, RFC 2234 “Augmented BNF for Syntax Specifications: ABNF”, RFC 2821 “Simple Mail Transfer Protocol” and RFC 2822 “Internet Message Format”, in addition to RFC 3696 (referenced earlier), all contain information relevant to e-mail address validation. RFC 2822 supersedes RFC 822 “Standard for the Format of ARPA Internet Text Messages” and makes it obsolete.

Following are the requirements for an e-mail address, with relevant references:

  1. An e-mail address consists of local part and domain separated by an at sign (@) character (RFC 2822 3.4.1).

  2. The local part may consist of alphabetic and numeric characters, and the following characters: !, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, } and ~, possibly with dot separators (.), inside, but not at the start, end or next to another dot separator (RFC 2822 3.2.4).

  3. The local part may consist of a quoted string—that is, anything within quotes ("), including spaces (RFC 2822 3.2.5).

  4. Quoted pairs (such as \@) are valid components of a local part, though an obsolete form from RFC 822 (RFC 2822 4.4).

  5. The maximum length of a local part is 64 characters (RFC 2821 4.5.3.1).

  6. A domain consists of labels separated by dot separators (RFC 1035 2.3.1).

  7. Domain labels start with an alphabetic character followed by zero or more alphabetic characters, numeric characters or the hyphen (-), ending with an alphabetic or numeric character (RFC 1035 2.3.1).

  8. The maximum length of a label is 63 characters (RFC 1035 2.3.1).

  9. The maximum length of a domain is 255 characters (RFC 2821 4.5.3.1).

  10. The domain must be fully qualified and resolvable to a type A or type MX DNS address record (RFC 2821 3.6).
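As a rough illustration, requirements 6 through 9 can be checked with a few lines of PHP. This is a minimal sketch, not code from the article: the function name domainSyntaxOk is made up, and it follows the list above literally, so a label that begins with a digit is rejected even though later standards relax that rule. It checks syntax only and does no DNS lookup (requirement 10).

```php
<?php
// Hypothetical helper: syntax checks for requirements 6-9 only.
function domainSyntaxOk(string $domain): bool
{
    // Requirement 9: overall length limit.
    if ($domain === '' || strlen($domain) > 255) {
        return false;
    }
    // Requirement 6: labels separated by dots.
    foreach (explode('.', $domain) as $label) {
        // Requirement 8: 1 to 63 characters per label. An empty label
        // means a leading, trailing or doubled dot.
        if ($label === '' || strlen($label) > 63) {
            return false;
        }
        // Requirement 7: letter first, letters, digits or hyphens inside,
        // letter or digit last. The D modifier keeps $ from matching
        // before a trailing newline.
        if (!preg_match('/^[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?$/D', $label)) {
            return false;
        }
    }
    return true;
}

var_dump(domainSyntaxOk('xy-domain.com'));     // bool(true)
var_dump(domainSyntaxOk('some..thing.co.uk')); // bool(false)
var_dump(domainSyntaxOk('-bad.example'));      // bool(false)
```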

Requirement number four covers a now obsolete form that is arguably permissive. Agents issuing new addresses could legitimately disallow it; however, an existing address that uses this form remains a valid address.

The standard assumes a seven-bit character encoding, not multibyte characters. Consequently, according to RFC 2234, “alphabetic” corresponds to the Latin alphabet character ranges a–z and A–Z. Likewise, “numeric” refers to the digits 0–9. The lovely international standard Unicode alphabets are not accommodated—not even encoded as UTF-8. ASCII still rules here.

Developing a Better E-mail Validator

That's a lot of requirements! Most of them refer to the local part and domain. It makes sense, then, to start with splitting the e-mail address around the at sign separator. Requirements 2–5 apply to the local part, and 6–10 apply to the domain.

The at sign can be escaped in the local part. Examples are Abc\@def@example.com and "Abc@def"@example.com. This means an explode on the at sign, $split = explode("@", $email);, or another similar trick to separate the local and domain parts will not always work. We can try removing escaped at signs, $cleanat = str_replace("\\@", "", $email);, but that will miss pathological cases, such as Abc\\@example.com. Fortunately, such escaped at signs are not allowed in the domain part, so the last occurrence of the at sign must be the separator. The way to separate the local and domain parts, then, is to use the strrpos function to find the last at sign in the e-mail string.

Listing 3 provides a better method for splitting the local part and domain of an e-mail address. Note that strrpos returns the boolean value false if the at sign does not occur in the e-mail string.
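A minimal sketch of the strrpos approach follows. It is not the article's Listing 3; the function name splitEmail and the array keys are made up for illustration.

```php
<?php
// Hypothetical helper: split an address on the LAST at sign.
// Returns an array with 'local' and 'domain', or false if there is no @.
function splitEmail(string $email)
{
    $atIndex = strrpos($email, '@');
    // Compare with === because a match at position 0 is also "falsy".
    if ($atIndex === false) {
        return false;
    }
    return [
        'local'  => (string) substr($email, 0, $atIndex),
        'domain' => (string) substr($email, $atIndex + 1),
    ];
}

print_r(splitEmail('"Abc@def"@example.com'));
var_dump(splitEmail('no-at-sign'));
```

For the first address the local part comes back as "Abc@def" (quotes included) and the domain as example.com, which a plain explode on the at sign would get wrong.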

Let's start with the easy stuff. Checking the lengths of the local part and domain is simple. If those tests fail, there's no need to do the more complicated tests. Listing 4 shows the code for making the length tests.
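The length tests might look something like the sketch below. It is not the article's Listing 4; the function name lengthsOk is an assumption, and the limits are taken from requirements 5 and 9. Because the standard assumes seven-bit characters, strlen (which counts bytes) is good enough here.

```php
<?php
// Hypothetical helper: cheap length checks to run before anything else.
function lengthsOk(string $local, string $domain): bool
{
    $localLen  = strlen($local);
    $domainLen = strlen($domain);
    if ($localLen < 1 || $localLen > 64) {
        return false;   // local part empty or longer than 64 characters
    }
    if ($domainLen < 1 || $domainLen > 255) {
        return false;   // domain empty or longer than 255 characters
    }
    return true;
}

var_dump(lengthsOk('user', 'example.com'));                // bool(true)
var_dump(lengthsOk(str_repeat('a', 65), 'example.com'));   // bool(false)
```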

Now, the local part has one of two forms. It may have a begin and end quote with no unescaped embedded quotes. The local part "Doug \"Ace\" L." is an example. The second form for the local part is (a+(\.a+)*), where a stands for a whole slew of allowable characters. The second form is more common than the first, so check for that first. Look for the quoted form after failing the unquoted form.

Characters quoted using the back slash (\@) pose a problem. This form allows doubling the back-slash character to get a back-slash character in the interpreted result (\\). This means we need to check for an odd number of back-slash characters quoting a non-back-slash character. We need to allow \\\\\@ and reject \\\\@.

It is possible to write a regular expression that finds an odd number of back slashes before a non-back-slash character. It is possible, but not pretty. The appeal is further reduced by the fact that the back-slash character is an escape character in PHP strings and an escape character in regular expressions. We need to write four back-slash characters in the PHP string representing the regular expression to make the regular expression interpreter match a single back slash.
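The double layer of escaping is easy to see in a tiny experiment. The sketch below uses single-quoted strings and a made-up variable name; it only illustrates the escaping rule and is not part of the validator.

```php
<?php
// PHP turns '\\\\' into two characters: \\
// The regex engine then reads \\ as one literal back slash.
$pattern = '/^\\\\$/';                    // the engine sees: /^\\$/

var_dump(strlen('\\\\'));                 // int(2)
var_dump(preg_match($pattern, '\\'));     // int(1): subject is one back slash
var_dump(preg_match($pattern, '\\\\'));   // int(0): subject is two back slashes
```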

A more appealing solution is simply to strip all pairs of back-slash characters from the test string before checking it with the regular expression. The str_replace function fits the bill. Listing 5 shows a test for the content of the local part.
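Here is one way the strip-then-match idea could be written. It is a sketch under assumptions, not the article's Listing 5: the function name localPartOk is invented, and the character set is the one given in requirement 2.

```php
<?php
// Hypothetical helper: test the content of the local part.
function localPartOk(string $local): bool
{
    // Remove every pair of back slashes, so any back slash left over
    // is one that quotes the character after it.
    $stripped = str_replace('\\\\', '', $local);

    // Unquoted form: runs of allowed characters or quoted pairs,
    // joined by single dots (never at the start, end or doubled).
    $atom = '(?:\\\\.|[A-Za-z0-9!#$%&\'*+\\/=?^_`{|}~-])+';
    if (preg_match('/^' . $atom . '(?:\\.' . $atom . ')*$/D', $stripped)) {
        return true;
    }
    // Quoted form: begin and end quote, no unescaped quote in between.
    return preg_match('/^"(?:\\\\"|[^"])+"$/D', $stripped) === 1;
}

// Five back slashes before @ leave one after stripping, so @ is quoted.
var_dump(localPartOk('a' . str_repeat('\\', 5) . '@b'));  // bool(true)
// Four back slashes strip away entirely and leave a bare @.
var_dump(localPartOk('a' . str_repeat('\\', 4) . '@b'));  // bool(false)
var_dump(localPartOk('"Doug \\"Ace\\" L."'));             // bool(true)
var_dump(localPartOk('a..b'));                            // bool(false)
```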

______________________

Comments

Simple & effective

Objektive Mieten's picture

Simple & effective:

if(!filter_var($email, FILTER_VALIDATE_EMAIL))
{
exit("E-mail is not valid");
}

PHP

voiture sans permis's picture

If you want a PHP script to verify an email address then use this quick and simple PHP regular expression for email validation. This is also case-insensitive, so it will treat all characters as lower case. It is a really easy way to check the syntax and format of an email address

There are a lot short

Mike Crowl's picture

There are a lot of short descriptions like this one, but an application usually requires a lot of validation and needs a common solution. It would also be good to have an easy way to add that solution to a ready or live project. So far I have found only the Vitana validation solution, http://vitana-group.com/article/php/validation. It is good, but I would prefer to have a choice. Can anybody provide a link to another good validation library?

This article made me cry a river

Anonymous's picture

it doesn't even allow a gmail email address.

RFC 3696 has errata. This article is WRONG!

dominicsayers's picture

I can't believe LJ have allowed this article to remain here uncorrected for so long.

The examples given in this article are NOT VALID. John Klensin has corrected the RFC but you would only know this if you click the Errata link

I know of nothing that Doug Lovell or LJ have done to correct the misinformation on this page.

More information here: http://j.mp/isemail

filter_var()

Michael Rushton's picture

It might also be worth noting, for those who don't know, that PHP has an in-built function for validating email addresses:

filter_var($email_address, FILTER_VALIDATE_EMAIL)
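For anyone trying it out, a small usage sketch (the sample addresses are made up): filter_var() returns the filtered string on success and false on failure, so it is safest to compare the result against false explicitly.

```php
<?php
// Sample inputs chosen for illustration only.
$candidates = ['someone@example.com', 'not an address'];

foreach ($candidates as $email) {
    // !== false distinguishes "rejected" from any other return value.
    $ok = filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
    echo $email, ' => ', $ok ? 'accepted' : 'rejected', "\n";
}
```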

RFC

Oleg Gerasimenko's picture

filter_var($email_address, FILTER_VALIDATE_EMAIL) doesn't cover all RFC specifications, for example, umlauts.

Umlauts are not allowed.

Michael Rushton's picture

Umlauts are not allowed.

FSOCKOPEN

icemanza's picture

Hi,

What are the merits of adding an fsockopen check on the domain as an additional test at the end?

Regards
Mark

how can this be added to an

Anonymous's picture

how can this be added to an existing contact form/ contact php script?

This function is great but

Saint's picture

This function is great, but when I delete the DNS check it accepts e-mails like something@something, where the domain doesn't contain any "."

//PS. sorry for my english :P

.co.uk

Anonymous's picture

Hi, thanks very much for that! Just one question: how can I fix it to accept .co.uk e-mail addresses?

Thanks in advance!

function should work fine

Anonymous's picture

function should work fine with .co.uk addresses. As long as you give it something like someone@something.co.uk it will work fine. It just can't have a double dot, @some..thing.co.uk or start with a dot. So someone@.co.uk will fail because it isn't a properly formatted domain.

Validate an E-Mail Address with PHP, the Right Way

Anonymous's picture

Thanks for providing this article. So many Google searches only produce "this works for me in my environment/work" without any explanation of how it does its magic. This leads to others posting direct or modified copies, leading to errors being propagated.
On reading the comments one wonders if some actually read the full article.
I've little experience with regular expressions but it seems logical to me to search for the presence of the six or seven not allowed characters rather than for the non-presence of the other 120 odd character codes.
I will now soak myself in the explanation.

xy-domain.com is not passing - it doesn't like the dash

Anonymous's picture

Hi,

first thank you for this contribution. Finally someone who takes care of RFCs ;)

Now, I am not very good with regex, but when I test it with a domain that has a dash in it, it doesn't work.

At this time I don't know what to change of this code:

else if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain))

so if there is one person who can point out what to change this would be really appreciated.

Cheers

Try

Michael Rushton's picture

Try this:

'/^[A-Za-z0-9\.-]+$/'

Obviously there are problems with this; it allows consecutive dashes and periods, as well as allowing dashes and periods both next to each other and at the start and the end, and also allowing a TLD that starts with a number.

Without the slash before the

Michael Rushton's picture

Without the slash before the dot, of course...

'/^[A-Za-z0-9.-]+$/'

It looks like that particular

Anonymous's picture

It looks like that particular regular expression is broken. It allows backslashes in the domain name. The author forgot that this is a character class inside a single-tick PHP string and the rules that apply to the rest of a regular expression don't apply here. The part of the string '\\-\\' appears to be intended as an escaped backslash. However, PHP treats backslashes inside single-tick (') PHP strings differently from quoted strings ("). It actually doesn't do any escaping. So, the end result is the range of characters '\-\'. Now, the regular expression engine on my local PHP install appears to see that and then probably assumes the author screwed something up and allows '\' and '-'. However, it is also entirely possible on other PHP versions, such as yours, that the regular expression engine sees that as only allowing '\' characters.

That's the long-winded explanation. Now for the "solution"*:

else if (!preg_match('/^[A-Za-z0-9\-.]+$/', $domain))

Inside a character class, the '.' character is treated as a real '.' character. It does not need to be escaped. However, the '-' character is used for ranges of characters (e.g. 'A-Z'), so it has to be escaped.

* In general, regular expressions are used INCORRECTLY and are almost always broken in some fashion. Any time I see a preg_match(), I immediately assume that the code is broken and 99% of the time I'm right. Only preg_replace() is correct. Regular expressions are NOT for data validation but data FILTERING. preg_replace() is the only PHP function that meets that criteria. Barely. I still cringe when I see even that. So, the "solution" above is still technically broken. All I did was remove the weird issue documented at the start of my reply.

I see this and every other e-mail address validation routine as completely broken. Those with domain validation won't work on most Windows servers (checkdnsrr() is only available on Windows as of 5.3.x) and they all use preg_match() instead of preg_replace().

A good "validation" routine should clean up clearly broken addresses instead of just declaring them bad. Detecting a bad domain is easy - attempt to get the MX or A record for it (PEAR Net::DNS should work everywhere) and the number of characters in a domain is very limited. Detecting bad characters before the '@' is best left to the destination mail server. You can do some _obvious_ filtering but, other than that, just assume it will work.

Also, it appears that the maximum length of an e-mail address is 254 characters:

http://www.dominicsayers.com/isemail/

And near the bottom of that page, you can see that it appears the author there doesn't like this function either!

Of course, his function is only slightly better than yours - it too uses preg_match() and getdnsrr().

I once saw a regular expression to supposedly validate ALL e-mail addresses that was several pages long from some O'Reilly book. That alone should tell you that e-mail validation is HARD to IMPOSSIBLE. E-mail filtering, however, should be magnitudes easier.

I realize this must be hard to swallow that all these years people have been doing it all wrong, including yourself. But don't feel too bad! Everyone doing validation is in a similar-looking boat. They look similar because the boats completely missed the water!

Clarification and comments

Davnor's picture

The correction offered by "Anonymous" will work, but his/her explanation is a little off the mark.

The only actual problem with the original expression is that it incorrectly escapes the period inside the character class.

However, the issue of escaping a backslash inside a single-quoted string is entirely separate from the issue of escaping characters inside a regex character class. The evaluation of the escaped backslashes is performed by PHP's string processor, BEFORE the string is ever passed to the regex engine.

For example, consider the following two strings (not regular expressions):

'\\this is a backslash'
'\this is a backslash'

Both evaluate to:

\this is a backslash
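This is easy to confirm with a couple of lines (variable names are arbitrary):

```php
<?php
$a = '\\this is a backslash';   // escaped back slash
$b = '\this is a backslash';    // \t is not an escape in single quotes

var_dump($a === $b);            // bool(true)
echo $a, "\n";                  // prints: \this is a backslash
```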

However, it is still good practice to escape backslashes in single quoted strings, to avoid errors later. For example, let's say we decide to add some escaped single-quotes to the examples above:

'\\\'this is a backslash\''
'\\'this is a backslash\''

The second string now contains a syntax error, because the first backslash is now treated as the beginning of an escape sequence, instead of a literal backslash.

So, back to the regular expression. The original single quoted-string,

'/^[A-Za-z0-9\\-\\.]+$/'

gets resolved to the expression,

/^[A-Za-z0-9\-\.]+$/

which is then passed to the regex engine. Note that the dash is properly escaped, but the period is escaped too, which is not correct. So, we just need to take out the backslashes before the period,

'/^[A-Za-z0-9\\-.]+$/'

so it now resolves to,

/^[A-Za-z0-9\-.]+$/

which allows dashes and periods, but not backslashes, which is what we want. :-)

Again, the solution offered by Anonymous will work too, it's just not following the best practice, in my opinion.

Another option is put the dash at the end of the character class, which will cause the regex engine to treat it as a literal, so then nothing has to be escaped:

'/^[A-Za-z0-9.-]+$/'
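A quick way to compare the three spellings is to run them against a domain with a dash and a string with a backslash. This is only a sketch; the array keys and test strings are made up for the comparison.

```php
<?php
$patterns = [
    'original'     => '/^[A-Za-z0-9\\-\\.]+$/',
    'dash escaped' => '/^[A-Za-z0-9\\-.]+$/',
    'dash last'    => '/^[A-Za-z0-9.-]+$/',
];

foreach ($patterns as $name => $re) {
    // 'xy\\domain.com' is the nine-plus-four character string xy\domain.com
    printf("%-12s dash:%d backslash:%d\n", $name,
        preg_match($re, 'xy-domain.com'),
        preg_match($re, 'xy\\domain.com'));
}
```

Each line should report dash:1 and backslash:0, that is, all three accept the dash and reject the backslash.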

On a separate note, I take exception to Anonymous's generalizations about the use of regular expressions. It is true, the subtleties of regular expressions can certainly trip up programmers, but I think it's unrealistic to automatically assume that most of the expressions in use are incorrect.

I am also genuinely baffled by this assertion:

"Regular expressions are NOT for data validation but data FILTERING."

Where did this idea come from? If you want to be precise, regular expressions are not for filtering either; they are designed for one purpose: pattern matching.

But what good does it do to match (or not match) a pattern if you can't do anything useful with that information. Both preg_match() and preg_replace() use a regular expression in the same way - they look for matching pattern(s); the only difference is what the functions do after a match is found or not found.

As far as performing validation, this simply means "check if the data is valid". But then you have to define what is valid in a given context. In this article, a valid email address is defined as follows:

  1. It meets the formal standard for the format of an email address.
  2. The domain part of the address can be verified by a DNS lookup.

To me, that seems like a pretty reasonable definition of a valid email address. And, since criterion (1) involves matching patterns, regular expressions are the perfect tool for the job.

At least, those are my two cents. [Stepping off of soapbox]

Spurious Character

Anonymous's picture

Sorry if this seems like a dumb question, but is that weird character -- immediately to the left of the second checkdnsrr reference -- spurious or intended? Only when I paste the code into my code editor it's just showing up as a square box (like: □).

It's a right hooked arrow

Mitch Frazier's picture

That's a "&rarrhk;" HTML entity. It should look like: ↪

Those are used in the print magazine (which is where this was originally published) to indicate that the line was wrapped during print layout because it was too long.

If you're not seeing that in your browser then you may have the character encoding set incorrectly. Try setting it to UTF-8.

Mitch Frazier is an Associate Editor for Linux Journal.

I am having the same problem.

Anonymous's picture

I am having the same problem. I am a newbie and do not know how to change the character setting to UTF-8. Can you please show me how to do this?

Thank you.

Browser

Mitch Frazier's picture

What browser are you using?

Very cool

Anonymous's picture

This is an excellent article and filled a lot of holes in my knowledge of email validation. Thank you so much for this.

Andy

FALSE POSITIVES

Anonymous's picture

For some unknown reason, checkdnsrr() returns false positives on some systems, even for clearly nonexistent domains (for example, jjjjjjjjjjjjjjjjjjj.cc: the syntax is OK but the domain does not exist), so I recommend changing that part to dns_get_record() followed by count(), like this:


// here some basic rules check and local part check
// before we try querying dns

if ($isValid === true) // if still ok
{
    // dns_get_record() returns an array of records, or false on failure
    $records = dns_get_record($domain);
    // valid only if at least one record was found in DNS
    $isValid = is_array($records) && count($records) > 0;
}
return $isValid;

(IMHO this may be a little more expensive on high-load servers,
but as dns_get_record() will not find any records for nonexistent domains, this should be a very reliable check.)

Peminator

Always add a full stop to the

Anonymous's picture

Always add a full stop to the end of the domain, otherwise the server tries to make the domain valid by appending its own domain name.

http://www.php.net/manual/en/function.checkdnsrr.php#19508
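A small sketch of that advice, with made-up function names. The lookup itself needs a working resolver, so the second function is defined here but not called.

```php
<?php
// Hypothetical helper: make the name absolute by ending it with one dot.
function absoluteDomain(string $domain): string
{
    return rtrim($domain, '.') . '.';
}

// Hypothetical wrapper around checkdnsrr(); the result depends on live DNS.
function domainResolves(string $domain): bool
{
    $fqdn = absoluteDomain($domain);
    return checkdnsrr($fqdn, 'MX') || checkdnsrr($fqdn, 'A');
}

echo absoluteDomain('example.com'), "\n";   // prints: example.com.
```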

Thank You & Topic Suggestion

Anonymous's picture

Thank you for your quality journal piece and explanation of all the considerations that really go into this issue. Very, very helpful.

Through most of 2009 I was tasked weekly to fix websites that had gone Viral ( in the bad way ).

Cause: JavaScript and SQL injection to weak forms. AND previous programmer teams who spent too much time downloading videos and games, often infecting their computers with viruses that would actually "read their FTP" files and/or "snoop FTP" logins.
Effect: All sites they ever worked on, show up as malware downloader sites overnight.

That's when I get "The Call"... usually at midnight, yeah?

This year... I've had to fix dozens of sites whose forms were getting "pounded" by form spammer bots. 80 to 1000 emails per day per site.

I will be applying your "Complete Example" above to a new set of sites and landing pages that have to go live this week, ready to accept traffic from AdWords. I want to stop the "Form Input Spam" bots and the "JavaScript Field Injection" Form Violator/Malware bots in their tracks ahead of time. While, of course, making sure that real leads get through, properly formatted!

Next I will be searching for: phone number entry format validators, JavaScript field injection stoppers, and pre-submit human tests (i.e., Captcha). All of which I wish to wrap up into a single PHP+JS library I can re-use for all of my upcoming sites.

May I suggest another article covering this "complete set" of form-related considerations as a full-scope topic, for website builders such as myself?

Thanks again,

James Arvigo
Website Admin
http://www.ServWiz.com
http://www.VernonsLandEscapes.com

Syntax question

Anonymous's picture

I was wondering what the ? operator placed directly before a function meant. It wasn't used in the if-else ? : way so it's confusing me. The line of code I'm looking at is where he uses checkdnsrr() for the second time near the bottom of the code. I copied and pasted and got ?checkdnsrr($domain,"A") I tried running the code and got... Oh, wait. I wonder if checkdnsrr doesn't work with my version of Windows. But I'm at XP right now and I thought it was older versions that it didn't work with. Well, I shall be back if my OS isn't the problem.

*scratches head*

Anonymous's picture

Ok, so checkdnsrr was only implemented at php 5.3.0 which I am running.

Great stuff

Anonymous's picture

I can't seem to use the checkdnsrr function in my local version but other than that it's great. If you can reply - to pieter .at. cheerful .dot. .com. , I didn't think that 'me@home' was a valid email address?

Thank you

Checking Mail box validity

Anonymous's picture

Hi All,

Good Morning,

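function checkEmail($email) {
    echo $email; // debug output
    if (preg_match("/^([a-zA-Z0-9])+([a-zA-Z0-9\._-])*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/", $email))
    {
        // split() was removed in PHP 7; explode() does the same job here
        list($username, $domain) = explode('@', $email);

        // the domain must have at least one MX record
        if (!getmxrr($domain, $mxhosts)) {
            return false;
        }
        return true;
    }
    return false;
}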

The above code checks the syntax of the e-mail address and also whether
the domain exists. But I need to check the validity of the mailbox.

Example:
suresh981@gmail.com (valid mail address)
s@gmail.com (valid mail address, though the mailbox does not exist)
Can we validate that the mailbox exists? Help me.

Regards,
Suresh

Are emails like

Anonymous's picture

Are e-mails like somelocal@.domain valid? What about a dot at the beginning of the domain name? preg_match('/^[A-Za-z0-9\\-\\.]+$/' is not enough, I guess...

Modified regexp

Anonymous's picture

Hi, I modified the "favorite regular expression" to :

^[a-zA-Z0-9_.-]+@[a-zA-Z0-9][a-zA-Z0-9-.]+\.([a-zA-Z]{2,6})$

Actually, it now checks if the domain part begins with a letter and checks the extension (2 letters or more - I had many addresses ending with .f or .).

email testing list

Anonymous's picture

Hi, I was going through the list of emails given to test this validation function and when I entered abc\\@example.com the third email address down on the list of emails that should not be valid, it said it was valid and sent the email.

Is there something wrong? should this email address validate? Is there an error in the list of emails that should not pass validation?

just wondering :o/

where is the right address for this path http://www.actionscript

Nicolas's picture

Where is the right address for this path?

http://www.actionscript.tv/email.php

Regards.

A note about TLDs

Anonymous's picture

If you have added some non dns based TLD (top level Domain) checking - beware! It is highly probable that in the near future TLD's will be opened up and could get quite long.

email addresses like: myname@thisisareallylongtld could be valid, and under the spec they are legal (note the lack of .'s).

Assuming that all TLDs will be 6 chars or less, just because that is presently true, is short-sighted. I have seen people use email checkers with an internal list of TLDs to compare to, and they have been frustrated at how out of date such lists become.

Email addresses like;
me@a
me@a.b.c
me@jhsdfjhasdlfjkhasdfiuyerkjbhsdjkfasdjfbalskdjfhasdfasdfasdfasdf
me@lh-lkj.lkj-lkj
me@a-b-c-d-e-f-g

Could potentially be used in the future - While DNS may be expensive, if you want to be certain that the domain is valid, DNS is the only way this can be achieved.

Hey, great article, but the

John O'Connell's picture

Hey, great article, but the link to IloveJackDaniels is no longer valid. His site has been moved to http://www.addedbytes.com

Didn't know some of the characters you showed were valid.. Thanks for the tips!

RFC 5322 email-validation regex

Michael Rushton's picture

Your code doesn't seem to disallow consecutive hyphens in the domain name, which it should (unless internationalized, in which case the domain address may begin: xn--).

The following regex, I believe, complies completely with RFC 5322. And also does away with the need for any further functions (save for checking to see if the domain name actually exists):

/^(?=.{5,254})(?:(?:\"[^\"]{1,62}\")|(?:(?!\.)(?!.*\.[.@-])[a-z0-9!#$%&'*+\/=?^_`{|}~^.-]{1,64}))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\])|(?:(?!-)(?!.*-\$)(?!.*-\.)(?!.*[^n]--)(?!.*[^x]n--)(?!n--)(?!.*[^.]xn--)(?:[a-z0-9-]{1,63}\.){1,127}(?:[a-z0-9-]{1,63})))$/i

Another (and hopefully final) update

Michael Rushton's picture

After looking over my regex, I decided it could do with some tidying up. Here is the cleaner version, included in a function:

function email($email) {
return preg_match("/^(?!.{255,})(?!.{65,}@)(?:(?:\"[^\"]{1,62}\")|(?:[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})\])|(?:(?!.*[^.]{64,})(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*))$/i", $email);
}

Update

Michael Rushton's picture

I've noticed a small error in my regex. It disallowed an address of the form a.-a.com, which is legal. Here is the update:

/^(?=.{5,254})(?:(?:\"[^\"]{1,62}\")|(?:(?!\.)(?!.*\.[.@])[a-z0-9!#$%&'*+\/=?^_`{|}~^.-]{1,64}))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\])|(?:(?!-)(?!.*-\$)(?!.*-\.)(?!.*\.-)(?!.*[^n]--)(?!.*[^x]n--)(?!n--)(?!.*[^.]xn--)(?:[a-z0-9-]{1,63}\.){1,127}(?:[a-z0-9-]{1,63})))$/i

Not valid regex?

Anonymous's picture

I might be doing something stupidly wrong, but plugging your regex into PHP 5.2 and using Douglas's test list, it rejects these, which should pass:

abc\@def@example.com is int(0)
abc\\@example.com is int(0)
Fred\ Bloggs@example.com is int(0)
Joe.\\Blow@example.com is int(0)
Doug\ \"Ace\"\ Lovell@example.com is int(0)
"Doug \"Ace\" L."@example.com is int(0)

The code used
$isValid=preg_match("/^(?=.{5,254})(?:(?:\"[^\"]{1,62}\")|(?:(?!\.)(?!.*\.[.@])[a-z0-9!#$%&'*+\/=?^_`{|}~^.-]{1,64}))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\])|(?:(?!-)(?!.*-\$)(?!.*-\.)(?!.*\.-)(?!.*[^n]--)(?!.*[^x]n--)(?!n--)(?!.*[^.]xn--)(?:[a-z0-9-]{1,63}\.){1,127}(?:[a-z0-9-]{1,63})))$/i", $email);

While my above regex is

Michael Rushton's picture

While my above regex is actually quite lacking (the updated version is available on my website, which I'm hoping will be linked to via my name or somewhere in this post), those email addresses are obsoleted by RFC 5322. True, they ought to be accepted as valid, but only for backward compatibility's sake. My validation was intended to allow for semantically-visible (no comments or folding white space) and non-obsolete email addresses.

Clarification

Michael Rushton's picture

Note: I meant to say that each LABEL of the domain name may begin with xn-- if internationalized.

Another Else If

Nate D's picture

I added another else if to the test because I noticed that bob@yahoo would pass. So I added a check to see whether there is a dot four positions before the end of the domain section. See below:

else if(substr($domain,($domainLen - 4),1) != ".")
{
//no dot 4 positions before the end
$isValid = false;
}

make it check this twice one

Anonymous's picture

make it check this twice one time for position 2 then 3.

really?

Anonymous's picture

What about .nl, .ca, .ru, or more or less anything else? .com is way too limited, I'd guess.

It seems to have the grace to

greggy's picture

It seems to have the grace to allow uppercase alphabetic characters, and it doesn't make the error of assuming a high-level domain name has only two or three characters.

What do you think about it?

Aravak's picture

What do you think about the filter_var function? It's a standard PHP function for validating data.
You can see an example in my blog: Email validation without regexp.

In Russian: Проверка Email на php ("Email validation in PHP").

need to update the link to ilovejackdaniels

Anonymous's picture

ilovejackdaniels.com will die soon, the new link is here:

http://www.addedbytes.com/php/email-address-validation
