Validate an E-Mail Address with PHP, the Right Way
Several IETF documents contain information relevant to e-mail address validation: RFC 1035 “Domain Names - Implementation and Specification”, RFC 2234 “Augmented BNF for Syntax Specifications: ABNF”, RFC 2821 “Simple Mail Transfer Protocol”, RFC 2822 “Internet Message Format” and RFC 3696 (referenced earlier). RFC 2822 supersedes RFC 822 “Standard for ARPA Internet Text Messages” and makes it obsolete.
Following are the requirements for an e-mail address, with relevant references:
1. An e-mail address consists of local part and domain separated by an at sign (@) character (RFC 2822 3.4.1).
2. The local part may consist of alphabetic and numeric characters, and the following characters: !, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, } and ~, possibly with dot separators (.), inside, but not at the start, end or next to another dot separator (RFC 2822 3.2.4).
3. The local part may consist of a quoted string, that is, anything within quotes ("), including spaces (RFC 2822 3.2.5).
4. Quoted pairs (such as \@) are valid components of a local part, though an obsolete form from RFC 822 (RFC 2822 4.4).
5. The maximum length of a local part is 64 characters (RFC 2821 4.5.3.1).
6. A domain consists of labels separated by dot separators (RFC 1035 2.3.1).
7. Domain labels start with an alphabetic character followed by zero or more alphabetic characters, numeric characters or the hyphen (-), ending with an alphabetic or numeric character (RFC 1035 2.3.1).
8. The maximum length of a label is 63 characters (RFC 1035 2.3.1).
9. The maximum length of a domain is 255 characters (RFC 2821 4.5.3.1).
10. The domain must be fully qualified and resolvable to a type A or type MX DNS address record (RFC 2821 3.6).
Requirement number four covers a now obsolete form that is arguably permissive. Agents issuing new addresses could legitimately disallow it; however, an existing address that uses this form remains a valid address.
The standard assumes a seven-bit character encoding, not multibyte characters. Consequently, according to RFC 2234, “alphabetic” corresponds to the Latin alphabet character ranges a–z and A–Z. Likewise, “numeric” refers to the digits 0–9. The lovely international standard Unicode alphabets are not accommodated—not even encoded as UTF-8. ASCII still rules here.
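The domain-label rules (start with a letter, end with a letter or digit, at most 63 characters), read with the ASCII-only ranges just described, can be sketched roughly as follows. The function names isValidLabel and hasValidLabels are assumptions made for illustration, and the letter-first rule follows RFC 1035 as cited above; later practice (RFC 1123) also permits a leading digit, so treat this as a sketch rather than a definitive check.

```php
<?php
// Hypothetical helper: checks one domain label against the RFC 1035 rules
// quoted in the requirements list (ASCII letters, digits and hyphen only).
function isValidLabel(string $label): bool
{
    // An empty label is never valid, and 63 characters is the maximum.
    if ($label === '' || strlen($label) > 63) {
        return false;
    }
    // Letter first; optionally more letters, digits or hyphens,
    // ending with a letter or digit. \z anchors at the very end.
    return preg_match('/^[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/', $label) === 1;
}

// Hypothetical helper: applies the label check to every dot-separated label.
function hasValidLabels(string $domain): bool
{
    foreach (explode('.', $domain) as $label) {
        if (!isValidLabel($label)) {
            return false;
        }
    }
    return true;
}
```

Because explode() yields an empty string for a leading dot or a doubled dot, those cases fail the per-label check without any special handling.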
That's a lot of requirements! Most of them refer to the local part and domain. It makes sense, then, to start with splitting the e-mail address around the at sign separator. Requirements 2–5 apply to the local part, and 6–10 apply to the domain.
The at sign can be escaped in the local part. Examples are Abc\@def@example.com and "Abc@def"@example.com. This means that an explode on the at sign, $split = explode("@", $email);, or a similar trick to separate the local and domain parts will not always work. We can try removing escaped at signs, $cleanat = str_replace("\\@", "", $email);, but that will miss pathological cases, such as Abc\\@example.com. Fortunately, such escaped at signs are not allowed in the domain part, so the last occurrence of the at sign must be the separator. The way to separate the local and domain parts, then, is to use the strrpos function to find the last at sign in the e-mail string.
Listing 3 provides a better method for splitting the local part and domain of an e-mail address. strrpos returns the boolean value false if the at sign does not occur in the e-mail string, so the test uses a strict comparison to tell that apart from a match at index 0.
Listing 3. Splitting the Local Part and Domain
    $isValid = true;
    $atIndex = strrpos($email, "@");
    if ($atIndex === false)
    {
        // no at sign anywhere in the string
        $isValid = false;
    }
    else
    {
        $domain = substr($email, $atIndex + 1);
        $local = substr($email, 0, $atIndex);
        // ... work with domain and local parts
    }
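As a sketch of how the split in Listing 3 might be packaged for reuse, the same strrpos logic can be wrapped in a small function. The name splitAddress and the choice to return null for a missing at sign are assumptions made for this illustration, not part of the article's code.

```php
<?php
// Hypothetical wrapper around the strrpos() split shown in Listing 3.
// Returns [local, domain], or null when the address has no at sign.
function splitAddress(string $email): ?array
{
    $atIndex = strrpos($email, '@');
    if ($atIndex === false) {
        return null;
    }
    return [substr($email, 0, $atIndex), substr($email, $atIndex + 1)];
}

// The last at sign wins, so a quoted or escaped "@" stays in the local part.
$parts = splitAddress('"Abc@def"@example.com');
```

Returning both halves together keeps the caller from using $local or $domain when the split has failed.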
Let's start with the easy stuff. Checking the lengths of the local part and domain is simple. If those tests fail, there's no need to do the more complicated tests. Listing 4 shows the code for making the length tests.
Listing 4. Length Tests for Local Part and Domain
    $localLen = strlen($local);
    $domainLen = strlen($domain);
    if ($localLen < 1 || $localLen > 64)
    {
        // local part is empty or longer than 64 characters
        $isValid = false;
    }
    else if ($domainLen < 1 || $domainLen > 255)
    {
        // domain is empty or longer than 255 characters
        $isValid = false;
    }
Now, the local part has one of two forms. It may have a begin and end quote with no unescaped embedded quotes. The local part "Doug \"Ace\" L." is an example. The second form for the local part is (a+(\.a+)*), where a stands for any of a whole slew of allowable characters. The second form is more common than the first, so check for that first, and look for the quoted form only after the unquoted form fails.
Characters quoted using the back slash (\@) pose a problem. This form allows doubling the back-slash character to get a back-slash character in the interpreted result (\\). This means we need to check for an odd number of back-slash characters quoting a non-back-slash character. We need to allow \\\\\@ and reject \\\\@.
It is possible to write a regular expression that finds an odd number of back slashes before a non-back-slash character. It is possible, but not pretty. The appeal is further reduced by the fact that the back-slash character is an escape character in PHP strings and an escape character in regular expressions. We need to write four back-slash characters in the PHP string representing the regular expression to show the regular expression interpreter a single back slash.
A more appealing solution is simply to strip all pairs of back-slash characters from the test string before checking it with the regular expression. The str_replace function fits the bill. Listing 5 shows a test for the content of the local part.
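A minimal sketch of the strip-then-match idea follows. It is an illustration written for this discussion, not the article's own listing: the function name isValidLocalPart and the exact patterns are assumptions, and length limits are assumed to be checked separately.

```php
<?php
// Hypothetical helper illustrating "strip backslash pairs, then match".
function isValidLocalPart(string $local): bool
{
    // Remove every pair of backslashes, so any backslash that remains
    // is quoting the character that follows it.
    $stripped = str_replace('\\\\', '', $local);

    // Unquoted form: runs of allowed or backslash-quoted characters,
    // separated by single dots (no leading, trailing or doubled dot).
    $atom = '(?:\\\\.|[A-Za-z0-9!#$%&\'*+\\/=?^_`{|}~-])+';
    if (preg_match('/^' . $atom . '(?:\\.' . $atom . ')*\z/', $stripped) === 1) {
        return true;
    }

    // Quoted form: opening and closing quote, no unescaped quote inside.
    return preg_match('/^"(?:\\\\.|[^"\\\\])*"\z/', $stripped) === 1;
}
```

With this approach an odd number of backslashes before an at sign leaves one quoting backslash behind and passes, while an even number leaves a bare at sign and fails the unquoted form.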
Comments
Simple & effective
Simple & effective:
if(!filter_var($email, FILTER_VALIDATE_EMAIL))
{
exit("E-mail is not valid");
}
PHP
If you want a PHP script to verify an email address then use this quick and simple PHP regular expression for email validation. This is also case-insensitive, so it will treat all characters as lower case. It is a really easy way to check the syntax and format of an email address
There are a lot short
There are a lot of short descriptions like this one, but an application usually requires a lot of validation and needs a common solution. It would be good to have an easy way to add such a solution to a ready or live project. So far I have found only the Vitana validation solution, http://vitana-group.com/article/php/validation . It is good, but I would prefer to have a choice. Can anybody provide a link to another good validation library?
This article made me cry a river
it doesn't even allow a gmail email address.
RFC 3696 has errata. This article is WRONG!
I can't believe LJ have allowed this article to remain here uncorrected for so long.
The examples given in this article are NOT VALID. John Klensin has corrected the RFC but you would only know this if you click the Errata link
I know of nothing that Doug Lovell or LJ have done to correct the misinformation on this page.
More information here: http://j.mp/isemail
filter_var()
It might also be worth noting, for those who don't know, that PHP has an in-built function for validating email addresses:
filter_var($email_address, FILTER_VALIDATE_EMAIL)
RFC
filter_var($email_address, FILTER_VALIDATE_EMAIL) doesn't cover all RFC specifications, for example, umlauts.
Umlauts are not allowed.
Umlauts are not allowed.
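For readers who want to see what filter_var() reports for a given address, a small wrapper like the one below can help. The name passesFilter is an assumption made for illustration. How the filter treats non-ASCII characters such as umlauts depends on the PHP version and the flags passed, so it is worth testing on the target system rather than assuming.

```php
<?php
// filter_var() returns the address itself on success and false on failure,
// so compare strictly against false.
// passesFilter is a hypothetical wrapper, not a built-in function.
function passesFilter(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}
```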
FSOCKOPEN
Hi,
What are the merits of adding an fsockopen check on the domain as an additional test at the end?
Regards
Mark
how can this be added to an
how can this be added to an existing contact form/ contact php script?
Check if an email address is from a valid domain
Wanted, but not needed is a module that would function sort of like a MySpace area for each user. Pictures, Videos, Blog, profiles, maybe some games, etc...
This function is great but
This function is great but when I delete the DNS check it accept emails like something@something where the domain doesn't contain any "."
//PS. sorry for my english :P
.co.uk
Hi, thanks very much for that! Just one question: how do I fix it to accept .co.uk email addresses?
Thanks in advance!
function should work fine
function should work fine with .co.uk addresses. As long as you give it something like someone@something.co.uk it will work fine. It just can't have a double dot, @some..thing.co.uk or start with a dot. So someone@.co.uk will fail because it isn't a properly formatted domain.
Validate an E-Mail Address with PHP, the Right Way
Thanks for providing this article. So many google searches only produce "this works for me in my environment/work" without any explanation of how it does it's magic. This leads to others posting direct or modified copies, leading to errors being propagated.
On reading the comments one wonders if some actually read the full article.
I've little experience with regular expressions but it seems logical to me to search for the presence of the six or seven not allowed characters rather than for the non-presence of the other 120 odd character codes.
I will now soak myself in the explanation.
xy-domain.com is not passing - it doesn't like the dash
Hi,
first thank you for this contribution. Finally someone who takes care of RFCs ;)
I am not very good with regex, but when I test it with a domain that has a dash in it, it doesn't work.
At this time I don't know what to change in this code:
else if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain))
So if there is someone who can point out what to change, it would be really appreciated.
Cheers
Try
Try this:
'/^[A-Za-z0-9\.-]+$/'
Obviously there are problems with this; it allows consecutive dashes and periods, as well as allowing dashes and periods both next to each other and at the start and the end, and also allowing a TLD that starts with a number.
Without the slash before the
Without the slash before the dot, of course...
'/^[A-Za-z0-9.-]+$/'
It looks like that particular
It looks like that particular regular expression is broken. It allows backslashes in the domain name. The author forgot that this is a character class inside a single-tick PHP string and the rules that apply to the rest of a regular expression don't apply here. The part of the string '\\-\\' appears to be intended as an escaped backslash. However, PHP treats backslashes inside single-tick (') PHP strings differently from quoted strings ("). It actually doesn't do any escaping. So, the end result is the range of characters '\-\'. Now, the regular expression engine on my local PHP install appears to see that and then probably assumes the author screwed something up and allows '\' and '-'. However, it is also entirely possible on other PHP versions, such as yours, that the regular expression engine sees that as only allowing '\' characters.
That's the long-winded explanation. Now for the "solution"*:
else if (!preg_match('/^[A-Za-z0-9\-.]+$/', $domain))
Inside a character class, the '.' character is treated as a real '.' character. It does not need to be escaped. However, the '-' character is used for ranges of characters (e.g. 'A-Z'), so it has to be escaped.
* In general, regular expressions are used INCORRECTLY and are almost always broken in some fashion. Any time I see a preg_match(), I immediately assume that the code is broken and 99% of the time I'm right. Only preg_replace() is correct. Regular expressions are NOT for data validation but data FILTERING. preg_replace() is the only PHP function that meets that criteria. Barely. I still cringe when I see even that. So, the "solution" above is still technically broken. All I did was remove the weird issue documented at the start of my reply.
I see this and every other e-mail address validation routine as completely broken. Those with domain validation won't work on most Windows servers (checkdnsrr() is only available on Windows as of 5.3.x) and they all use preg_match() instead of preg_replace().
A good "validation" routine should clean up clearly broken addresses instead of just declaring them bad. Detecting a bad domain is easy - attempt to get the MX or A record for it (PEAR Net::DNS should work everywhere) and the number of characters in a domain is very limited. Detecting bad characters before the '@' is best left to the destination mail server. You can do some _obvious_ filtering but, other than that, just assume it will work.
Also, it appears that the maximum length of an e-mail address is 254 characters:
http://www.dominicsayers.com/isemail/
And near the bottom of that page, you can see that it appears the author there doesn't like this function either!
Of course, his function is only slightly better than yours - it too uses preg_match() and getdnsrr().
I once saw a regular expression to supposedly validate ALL e-mail addresses that was several pages long from some O'Reilly book. That alone should tell you that e-mail validation is HARD to IMPOSSIBLE. E-mail filtering, however, should be magnitudes easier.
I realize this must be hard to swallow that all these years people have been doing it all wrong, including yourself. But don't feel too bad! Everyone doing validation is in a similar-looking boat. They look similar because the boats completely missed the water!
Clarification and comments
The correction offered by "Anonymous" will work, but his/her explanation is a little off the mark.
The only actual problem with the original expression is that it incorrectly escapes the period inside the character class.
However, the issue of escaping a backslash inside a single-quoted string is entirely separate from the issue of escaping characters inside a regex character class. The evaluation of the escaped backslashes is performed by PHP's string processor, BEFORE the string is ever passed to the regex engine.
For example, consider the following two strings (not regular expressions):
'\\this is a backslash'
'\this is a backslash'
Both evaluate to:
\this is a backslash
However, it is still good practice to escape backslashes in single quoted strings, to avoid errors later. For example, let's say we decide to add some escaped single-quotes to the examples above:
'\\\'this is a backslash\''
'\\'this is a backslash\''
The second string now contains a syntax error, because the first backslash is now treated as the beginning of an escape sequence, instead of a literal backslash.
So, back to the regular expression. The original single quoted-string,
'/^[A-Za-z0-9\\-\\.]+$/'
gets resolved to the expression,
/^[A-Za-z0-9\-\.]+$/
which is then passed to the regex engine. Note that the dash is properly escaped, but the period is escaped too, which is not correct. So, we just need to take out the backslashes before the period,
'/^[A-Za-z0-9\\-.]+$/'
so it now resolves to,
/^[A-Za-z0-9\-.]+$/
which allows dashes and periods, but not backslashes, which is what we want. :-)
Again, the solution offered by Anonymous will work too, it's just not following the best practice, in my opinion.
Another option is put the dash at the end of the character class, which will cause the regex engine to treat it as a literal, so then nothing has to be escaped:
'/^[A-Za-z0-9.-]+$/'
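A quick way to check these points on a particular PHP install is to run the three spellings side by side. This is only a sketch: the array keys and the helper name allAgree are made up for illustration. On a standard PCRE build all three should accept a hyphenated domain and reject one containing a backslash, which suggests the escaped period inside the class is redundant rather than harmful.

```php
<?php
// The three character-class spellings discussed above, as PHP source strings.
$patterns = [
    'original'      => '/^[A-Za-z0-9\\-\\.]+$/',
    'escaped dash'  => '/^[A-Za-z0-9\\-.]+$/',
    'trailing dash' => '/^[A-Za-z0-9.-]+$/',
];

// Hypothetical helper: true when every pattern gives the expected verdict.
function allAgree(array $patterns, string $domain, bool $expected): bool
{
    foreach ($patterns as $pattern) {
        if ((preg_match($pattern, $domain) === 1) !== $expected) {
            return false;
        }
    }
    return true;
}

// Both single-quoted literals below hold the same characters.
$sameString = ('\\this is a backslash' === '\this is a backslash');
```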
On a separate note, I take exception to Anonymous's generalizations about the use of regular expressions. It is true, the subtleties of regular expressions can certainly trip up programmers, but I think it's unrealistic to automatically assume that most of the expressions in use are incorrect.
I am also genuinely baffled by this assertion:
"Regular expressions are NOT for data validation but data FILTERING."
Where did this idea come from? If you want to be precise, regular expressions are not for filtering either; they are designed for one purpose: pattern matching.
But what good does it do to match (or not match) a pattern if you can't do anything useful with that information. Both preg_match() and preg_replace() use a regular expression in the same way - they look for matching pattern(s); the only difference is what the functions do after a match is found or not found.
As far as performing validation, this simply means "check if the data is valid". But then you have to define what is valid in a given context. In this article, a valid email address is defined as follows:
To me, that seems like a pretty reasonable definition of a valid email address. And, since criterion (1) involves matching patterns, regular expressions are the perfect tool for the job.
At least, those are my two cents. [Stepping off of soapbox]
Spurious Character
Sorry if this seems like a dumb question, but is that weird character -- immediately to the left of the second checkdnsrr reference -- spurious or intended? Only when I paste the code into my code editor it's just showing up as a square box (like: □).
It's a right hooked arrow
That's a "↪" HTML entity. It should look like:
Those are used in the print magazine (which is where this was originally published) to indicate that the line was wrapped during print layout because it was too long.
If you're not seeing that in your browser then you may have the character encoding set incorrectly. Try setting it to UTF-8.
Mitch Frazier is an Associate Editor for Linux Journal.
I am having the same problem.
I am having the same problem. I am a newbie and do not know how to change the character setting to UTF-8. Can you please show me how to do this?
Thank you.
Browser
What browser are you using?
Mitch Frazier is an Associate Editor for Linux Journal.
Very cool
This is an excellent article and filled a lot of holes in my knowledge of email validation. Thank you so much for this.
Andy
FALSE POSITIVES
For some unknown reason, checkdnsrr() returns false positives on some systems, even for clearly nonexistent domains (for example jjjjjjjjjjjjjjjjjjj.cc: the syntax is OK but the domain does not exist), so I recommend changing that part to dns_get_record() followed by count(), like this:
    // here some basic rules check and local part check
    // before we try querying dns
    if ($isValid === true) // if still ok
    {
        $records = dns_get_record($domain, DNS_A | DNS_MX);
        // domain found in DNS only if at least one record came back
        $isValid = is_array($records) && count($records) > 0;
    }
    return $isValid;
(IMHO this may be a little more hungry on high-load servers, but as dns_get_record() will not find any records for nonexistent domains, this should be a very reliable check.)
Peminator
Always add a full stop to the
Always add a full stop to the end of the domain, otherwise the server tries to make the domain valid by appending its own domain name.
http://www.php.net/manual/en/function.checkdnsrr.php#19508
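A sketch of that advice follows. The function names absoluteDomain and domainHasMailRecord are assumptions made for illustration, and the second function needs a working resolver, so only the first can be tested offline.

```php
<?php
// Hypothetical helper: make the domain absolute by ensuring exactly one
// trailing dot, so the resolver does not retry with a local search suffix.
function absoluteDomain(string $domain): string
{
    return rtrim($domain, '.') . '.';
}

// Hypothetical wrapper around checkdnsrr(); requires network access.
function domainHasMailRecord(string $domain): bool
{
    $fqdn = absoluteDomain($domain);
    return checkdnsrr($fqdn, 'MX') || checkdnsrr($fqdn, 'A');
}
```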
Thank You & Topic Suggestion
Thank you for your quality journal piece and explanation of all the considerations that really go into this issue. Very, very helpful.
Through most of 2009 I was tasked weekly to fix websites that had gone Viral ( in the bad way ).
Cause: JavaScript and SQL injection to weak forms. AND previous programmer teams who spent too much time downloading videos and games, often infecting their computers with viruses that would actually "read their FTP" files and/or "snoop FTP" logins.
Effect: All sites they ever worked on, show up as malware downloader sites overnight.
That's when I get "The Call"... usually at midnight, yeah?
This year... I've had to fix dozens of sites whose forms were getting "pounded" by form spammer bots. 80 to 1000 emails per day per site.
I will be applying your "Complete Example" above to a new set of sites and landing pages that have to go live this week, ready to accept traffic from AdWords. I want to stop the "Form Input Spam" bots and the "JavaScript Field Injection" Form Violator/Malware bots in their tracks ahead of time. While, of course, making sure that real leads get through, properly formatted!
Next I will be searching for: phone number entry format validators, JavaScript field injection stoppers, and pre-submit human tests (i.e., Captcha). All of which I wish to wrap up into a single PHP+JS library I can reuse for all of my upcoming sites.
May I suggest another article covering this "complete set" of form-related considerations as a full-scope topic, for website builders such as myself.
Thanks again,
James Arvigo
Website Admin
http://www.ServWiz.com
http://www.VernonsLandEscapes.com
Syntax question
I was wondering what the ? operator placed directly before a function meant. It wasn't used in the if-else ? : way so it's confusing me. The line of code I'm looking at is where he uses checkdnsrr() for the second time near the bottom of the code. I copied and pasted and got ?checkdnsrr($domain,"A") I tried running the code and got... Oh, wait. I wonder if checkdnsrr doesn't work with my version of Windows. But I'm at XP right now and I thought it was older versions that it didn't work with. Well, I shall be back if my OS isn't the problem.
*scratches head*
Ok, so checkdnsrr was only implemented at php 5.3.0 which I am running.
Great stuff
I can't seem to use the checkdnsrr function in my local version but other than that it's great. If you can reply - to pieter .at. cheerful .dot. .com. , I didn't think that 'me@home' was a valid email address?
Thank you
Checking Mail box validity
Hi All,
Good Morning,
function checkEmail($email) {
    echo $email;
    if (preg_match("/^([a-zA-Z0-9])+([a-zA-Z0-9\._-])*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/", $email))
    {
        // split() was removed in PHP 7; explode() does the same job here
        list($username, $domain) = explode('@', $email, 2);
        if (!getmxrr($domain, $mxhosts)) {
            return false;
        }
        return true;
    }
    return false;
}
The above code checks the validity of the e-mail address and also whether the domain exists. But I need to check the validity of the mailbox.
Example:
suresh981@gmail.com (valid mail address)
s@gmail.com (valid mail address, though the mailbox does not exist)
Can we validate that the mailbox exists? Help me.
Regards,
Suresh
Are emails like
Are emails like somelocal@.domain valid? What about a dot at the beginning of the domain name?
preg_match('/^[A-Za-z0-9\\-\\.]+$/' is not enough, I guess...
Modified regexp
Hi, I modified the "favorite regular expression" to :
^[a-zA-Z0-9_.-]+@[a-zA-Z0-9][a-zA-Z0-9-.]+\.([a-zA-Z]{2,6})$
Actually, it now checks that the domain part begins with a letter or digit and checks the extension (2 to 6 letters; I had many addresses ending with .f or .).
email testing list
Hi, I was going through the list of emails given to test this validation function and when I entered abc\\@example.com the third email address down on the list of emails that should not be valid, it said it was valid and sent the email.
Is there something wrong? should this email address validate? Is there an error in the list of emails that should not pass validation?
just wondering :o/
where is the right address for this path http://www.actionscript
Where is the right address for this path?
http://www.actionscript.tv/email.php
Regards.
A note about TLDs
If you have added some non dns based TLD (top level Domain) checking - beware! It is highly probable that in the near future TLD's will be opened up and could get quite long.
email addresses like: myname@thisisareallylongtld could be valid, and under the spec they are legal (note the lack of .'s).
Assuming that all TLDs will be 6 chars or less, just because that is presently true, is short-sighted. I have seen people use email checkers with an internal list of TLDs to compare to, and they have been frustrated by how out of date such lists become.
Email addresses like;
me@a
me@a.b.c
me@jhsdfjhasdlfjkhasdfiuyerkjbhsdjkfasdjfbalskdjfhasdfasdfasdfasdf
me@lh-lkj.lkj-lkj
me@a-b-c-d-e-f-g
Could potentially be used in the future - While DNS may be expensive, if you want to be certain that the domain is valid, DNS is the only way this can be achieved.
Hey, great article, but the
Hey, great article, but the link to IloveJackDaniels is no longer valid. His site has been moved to http://www.addedbytes.com
Didn't know some of the characters you showed were valid.. Thanks for the tips!
RFC 5322 email-validation regex
Your code doesn't seem to disallow consecutive hyphens in the domain name, which it should (unless internationalized, in which case the domain address may begin: xn--).
The following regex, I believe, complies completely with RFC 5322. And also does away with the need for any further functions (save for checking to see if the domain name actually exists):
/^(?=.{5,254})(?:(?:\"[^\"]{1,62}\")|(?:(?!\.)(?!.*\.[.@-])[a-z0-9!#$%&'*+\/=?^_`{|}~^.-]{1,64}))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\])|(?:(?!-)(?!.*-\$)(?!.*-\.)(?!.*[^n]--)(?!.*[^x]n--)(?!n--)(?!.*[^.]xn--)(?:[a-z0-9-]{1,63}\.){1,127}(?:[a-z0-9-]{1,63})))$/i
Another (and hopefully final) update
After looking over my regex, I decided it could do with some tidying up. Here is the cleaner version, included in a function:
function email($email) {return preg_match("/^(?!.{255,})(?!.{65,}@)(?:(?:\"[^\"]{1,62}\")|(?:[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})\])|(?:(?!.*[^.]{64,})(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*))$/i", $email);
}
Update
I've noticed a small error in my regex. It disallowed an address of the form a.-a.com, which is legal. Here is the update:
/^(?=.{5,254})(?:(?:\"[^\"]{1,62}\")|(?:(?!\.)(?!.*\.[.@])[a-z0-9!#$%&'*+\/=?^_`{|}~^.-]{1,64}))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\])|(?:(?!-)(?!.*-\$)(?!.*-\.)(?!.*\.-)(?!.*[^n]--)(?!.*[^x]n--)(?!n--)(?!.*[^.]xn--)(?:[a-z0-9-]{1,63}\.){1,127}(?:[a-z0-9-]{1,63})))$/i"
Not valid regex?
I might be doing something stupidly wrong, but plugging your regex into PHP 5.2 and using Douglas's test list, it rejects these that should pass:
abc\@def@example.com is int(0)
abc\\@example.com is int(0)
Fred\ Bloggs@example.com is int(0)
Joe.\\Blow@example.com is int(0)
Doug\ \"Ace\"\ Lovell@example.com is int(0)
"Doug \"Ace\" L."@example.com is int(0)
The code used
$isValid=preg_match("/^(?=.{5,254})(?:(?:\"[^\"]{1,62}\")|(?:(?!\.)(?!.*\.[.@])[a-z0-9!#$%&'*+\/=?^_`{|}~^.-]{1,64}))@(?:(?:\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])\])|(?:(?!-)(?!.*-\$)(?!.*-\.)(?!.*\.-)(?!.*[^n]--)(?!.*[^x]n--)(?!n--)(?!.*[^.]xn--)(?:[a-z0-9-]{1,63}\.){1,127}(?:[a-z0-9-]{1,63})))$/i", $email);
While my above regex is
While my above regex is actually quite lacking (the updated version is available on my website, which I'm hoping will be linked to via my name or somewhere in this post), those email addresses are obsoleted by RFC 5322. True, they ought to be accepted as valid, but only for backward compatibility's sake. My validation was intended to allow for semantically-visible (no comments or folding white space) and non-obsolete email addresses.
Clarification
Note: I meant to say that each LABEL of the domain name may begin with xn-- if internationalized.
Another Else If
I added another else if to the test because i noticed that bob@yahoo would pass. So I added a check to see if there is a dot 4 positions before the end of the domain section. View below:
else if(substr($domain,($domainLen - 4),1) != ".")
{
//no dot 4 positions before the end
$isValid = false;
}
make it check this twice one
make it check this twice one time for position 2 then 3.
really?
Whatabout .nl, .ca, .ru, .more or less anything? .com is way too limited I'd guess
It seems to have the grace to
It seems to have the grace to allow uppercase alphabetic characters, and it doesn't make the error of assuming a high-level domain name has only two or three characters.
What do you think about it?
What do you think about the filter_var function? It's a standard PHP function for validating data.
You can see example in my blog - Email validation without regexp.
In Russian Проверка Email на php.
need to update the link to ilovejackdaniels
ilovejackdaniels.com will die soon, the new link is here:
http://www.addedbytes.com/php/email-address-validation