Validate an E-Mail Address with PHP, the Right Way

Develop a working PHP function to validate e-mail addresses.

Spread the word! There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard. If you want to fool the spambots, adopt an e-mail address like {^c\@**Dog^}@cartoon.com. Unfortunately, you might fool some legitimate e-commerce sites as well. Which do you suppose will adapt more quickly?

Douglas Lovell is a software engineer with IBM Research, author of The XSL Formatting Objects Developer's Handbook published by Sams, and Web site editor for iac52.org.

______________________

Comments

Great article, just a slight fix

David's picture

This is terrific.

There is some sort of typo in the part of the code in Listing 9 where you check the A and MX DNS records, which make this break as written.

Changing:
if ($isValid && !(checkdnsrr($domain,"MX") ||
↪checkdnsrr($domain,"A")))

To:
if ($isValid && !((checkdnsrr($domain,"MX")) ||
(checkdnsrr($domain,"A"))))

seems to make it work.
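
For readers who want to try the DNS step in isolation, here is a minimal sketch of the same MX-or-A test. The function name domainAcceptsMail and the injectable $lookup parameter are assumptions made for illustration and are not part of Listing 9. Only checkdnsrr() is a real PHP function, and it needs working DNS resolution at run time.

```php
<?php
// Hypothetical helper (not from Listing 9): returns true when the domain
// has an MX record or, failing that, an A record.
// $lookup is an assumed injection point so the logic can be exercised
// without network access; by default it is PHP's built-in checkdnsrr().
function domainAcceptsMail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    // Same logic as the corrected condition above, without the negation.
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}
```

In the article's terms, the validity flag would be cleared when this function returns false.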

your fix works for me too

cruzanmo's picture

Thanks for the awesome script!

I ran into the same error with that line, and your fix made it work for me too!

Your format validation code

Anonymous's picture

Your format validation code will inappropriately permit an all-numeric TLD.
"There is an additional rule that essentially requires that top-level domain names not be all-numeric." - RFC 3696, section 2

http://SimonSlick.com/VEAF/ValidateEmailAddressFormat.html

Sure, your DNS lookup would fail, but what is the point of validating the format if you are just going to do a DNS lookup anyway for a domain name that should already have been deemed invalid by the format validation code?

Format validation and existence verification (DNS lookup) serve different purposes, and just because a domain name does not exist does not mean the format is not valid.
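
The all-numeric TLD rule can be checked as a pure format test, with no DNS lookup involved. The following is only a sketch under that reading of RFC 3696; the function name hasAllNumericTld is made up for illustration and does not come from either piece of code discussed here.

```php
<?php
// Hypothetical helper: true when the last label of the domain consists
// only of digits, which RFC 3696 section 2 says a TLD must not be.
function hasAllNumericTld(string $domain): bool
{
    // Ignore a single trailing root dot such as "example.com."
    $labels = explode('.', rtrim($domain, '.'));
    $tld = end($labels);

    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[0-9]+$/D', $tld) === 1;
}
```

A domain such as example.123 is flagged by this check, while 3com.com is not, because only the final label is examined.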

There are so many holes in your code, whoever paid you for this write-up is highly deserving of a total refund. If you are going to title such an article as "... the Right Way", you could at least do it the Right Way.

The code at http://SimonSlick.com/VEAF/ValidateEmailAddressFormat.html is actually better, and even includes code for verifying actual existence of an eMailbox.

simonslick.com/veaf is busted

Anonymous's picture

The code at simonslick.com is wrong -- it does not seem to match the RFC at all. Just try the examples given in this article as well as more common cases like:

foo+bar@example.com
foo%bar@example.com
foo <bar@example.com>
(foo) bar@example.com

Working Code & Extensive Regular Expressions

Anonymous's picture

Working Code with Extensive use of Regular Expressions for validating email address format.

Check it out and see if you can find any faults.

http://SimonSlick.com/VEAF/ValidateEmailAddressFormat.html

Email address validation head-to-head

Dominic Sayers's picture

Yes, there are some faults with the Simon Slick code. It's also worth pointing out that both Simon Slick's and Doug Lovell's code are copyrighted, All Rights Reserved. You can't use either in your project.

I've written about some public-domain validation functions here: http://www.dominicsayers.com/isemail/

The Simon Slick code fails on some of the examples in RFC3696.

As some of these comments have pointed out, there are a lot of RFCs that cover this ground. For what it's worth, I believe my function complies with RFCs 1123, 2396, 3696, 4291, 4343, 5321 & 5322.

RFC Compliance

NOYB's picture

Backslash is not an RFC-compliant component of a non-quoted email address local-part. It may have been in the past, but not anymore, and has not been since the publication of RFC 2822 (2001). Move on, folks.

This is also reinforced by RFC 3696 (2004).
http://tools.ietf.org/html/rfc3696

3. Restrictions on email addresses

Without quotes, local-parts may consist of any combination of
alphabetic characters, digits, or any of the special characters

! # $ % & ' * + - / = ? ^ _ ` . { | } ~

period (".") may also appear, but may not be used to start or end the
local part, nor may two or more consecutive periods appear. Stated
differently, any ASCII graphic (printing) character other than the
at-sign ("@"), backslash, double quote, comma, or square brackets may
appear without quoting. If any of that list of excluded characters
are to appear, they must be quoted.
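
The unquoted rule quoted above can be sketched as a small format check. This is an illustration of that one paragraph of RFC 3696 only, not a full address validator, and the function name isValidUnquotedLocalPart is hypothetical.

```php
<?php
// Hypothetical helper: checks an UNQUOTED local-part against the RFC 3696
// section 3 text quoted above. Quoted local-parts are out of scope here.
function isValidUnquotedLocalPart(string $local): bool
{
    // Letters, digits, the listed specials, and period. Backslash, space,
    // at-sign, double quote, comma, and square brackets are not in the set.
    if (preg_match('/^[A-Za-z0-9!#$%&\'*+\/=?^_`{|}~.-]+$/D', $local) !== 1) {
        return false;
    }

    // A period may not start or end the local part.
    if ($local[0] === '.' || substr($local, -1) === '.') {
        return false;
    }

    // Two or more consecutive periods are not allowed.
    return strpos($local, '..') === false;
}
```

With this check, local parts containing a backslash or a space, such as Fred\ Bloggs, are rejected unless they are quoted and handled by separate quoted-string logic.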

Also see the RFC3696 Errata
http://www.rfc-editor.org/cgi-bin/errataSearch.pl?rfc=3696

These are not RFC compliant:
Fred\ Bloggs@example.com
Joe.\\Blow@example.com

And should have read as:
"Fred\ Bloggs"@example.com
"Joe.\\Blow"@example.com

Also, "the upper limit on address lengths (local-part@domain-part) should normally be considered to be 256."

And as someone already alluded to, the domain name is now, for quite some time I might add, allowed to begin with a digit.

You need to update your code, test data and this article.

RFC Compliance

NOYB's picture

Also, the quoted string check appears to allow null (x00). According to RFC 2822 3.2.5. Quoted strings and 3.2.1. Primitive Tokens, the permitted NO-WS-CTL characters are x01-x08, x0B, x0C, x0E-x1F, x7F. This does not include the null character x00.

RFC 2822
4.1. Miscellaneous obsolete tokens
The obs-char and obs-qp elements each add ASCII value 0.

Appendix B. Differences from earlier standards
Items marked with an asterisk (*) below are items which
appear in section 4 of this document and therefore can no longer be
generated.
12. ASCII 0 (null) removed.*
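
As a rough sketch of what a stricter quoted-string check could look like, the function below tests the text between the double quotes against the non-obsolete RFC 2822 grammar (qtext, quoted-pair, plus plain space and tab). The name isValidQuotedContent is hypothetical, and folding white space is simplified to space and tab, which is an assumption of this sketch.

```php
<?php
// Hypothetical helper: validates the content BETWEEN the double quotes of
// a quoted local-part, using RFC 2822 sections 3.2.1, 3.2.2 and 3.2.5
// without the obsolete forms from section 4.1, so NUL (x00) is rejected.
function isValidQuotedContent(string $content): bool
{
    // qtext: NO-WS-CTL plus printable ASCII except double quote and backslash.
    $qtext = '[\x01-\x08\x0B\x0C\x0E-\x1F\x7F\x21\x23-\x5B\x5D-\x7E]';

    // quoted-pair: a backslash followed by any "text" character
    // (ASCII 1-127 except CR and LF).
    $quotedPair = '\\\\[\x01-\x09\x0B\x0C\x0E-\x7F]';

    // Simplification: folding white space is reduced to space and tab.
    $pattern = '/^(?:' . $qtext . '|' . $quotedPair . '|[ \t])*$/D';

    return preg_match($pattern, $content) === 1;
}
```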

Challenge

Tom Burt's picture

So should the e-mail address someone@3com.com be accepted or not? It fails to satisfy requirement #7 above but I guess the code in Listing 9 would accept it.
Tom

It is working good

coderbari's picture

I tested it with many options and it's working fine :D