Validating an Email Address with Regular Expressions

hoangvu · Jul 26, 2012

These few lines of JavaScript go a long way to validate email addresses.

Code:

window.onload = initForms;

function initForms() {
     for (var i=0; i< document.forms.length; i++) {
         document.forms[i].onsubmit = function() {return validForm();}
     }
}

function validForm() {
     var allGood = true;
     var allTags = document.getElementsByTagName ("*");

     for (var i=0; i<allTags.length; i++) {
        if (!validTag(allTags[i])) {
           allGood = false;
        }
     }
     return allGood;

     function validTag(thisTag) {
        var outClass = "";
        var allClasses = thisTag.className.split (" ");

        for (var j=0; j<allClasses.length; j++) {
           outClass += validBasedOnClass(allClasses[j]) + " ";
        }

        thisTag.className = outClass;

        if (outClass.indexOf("invalid") > -1) {
           invalidLabel(thisTag.parentNode);
           thisTag.focus();
           if (thisTag.nodeName == "INPUT") {
              thisTag.select();
           }
           return false;
        }
        return true;

           function validBasedOnClass(thisClass) {
              var classBack = "";

              switch(thisClass) {
                 case "":
                 case "invalid":
                    break;
                 case "email":
                    if (allGood && !validEmail (thisTag.value)) classBack = "invalid ";
                    // no break: fall through so "email" itself is kept in the class list
                 default:
                    classBack += thisClass;
              }
              return classBack;
           }

           function validEmail(email) {
              var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

              return re.test(email);
           }

           function invalidLabel(parentTag) {
              if (parentTag.nodeName == "LABEL") {
                 parentTag.className += " invalid";
              }
          }
      }
}

The HTML for the email validation example.

Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
     <title>Email Validation</title>
     <link rel="stylesheet" href="script01.css" />
     <script language="Javascript" type="text/javascript" src="script01.js">
     </script>
</head>
<body>
     <h2 align="center">Email Validation</h2>
     <form action="someAction.cgi">
         <p><label>Email Address:
         <input class="email" type="text" size="50" /></label></p>
         <p><input type="reset" /> <input type="submit" value="Submit" /></p>
     </form>
</body>
</html>

To validate an email address using regular expressions:

Code:

var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

Yow! What on earth is this? Don't panic; it's just a regular expression in the validEmail() function. Let's break it apart and take it piece by piece. Like any line of JavaScript, you read a regular expression from left to right.

First, re is just a variable. We've given it the name re so that when we use it later, we'll remember that it's a regular expression. The line sets the value of re to the regular expression on the right side of the equals sign.

A regular expression literal always begins and ends with a slash, / (of course, there is still a semicolon here, to denote the end of the JavaScript line, but the semicolon is not part of the regular expression). Everything in between the slashes is part of the regular expression.

The caret ^ means that we're going to use this expression to examine a string starting at the string's beginning. If the caret was left off, the email address might show as valid even though there was a bunch of garbage at the beginning of the string.
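
To see what the caret buys us, here is a small sketch that runs the same pattern with and without it. The variable names and the sample input are made up for illustration; only the pattern comes from validEmail().

Code:

```javascript
// The pattern from validEmail(), with and without the leading caret.
const anchored = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
const unanchored = /\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

// Hypothetical input: garbage in front of an otherwise valid address.
const input = "!!! dori@example.com";

console.log(anchored.test(input));   // false: the string does not start with \w
console.log(unanchored.test(input)); // true: the match may begin at "dori"
```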

The expression \w means any one character, "a" through "z", "A" through "Z", "0" through "9", or underscore. An email address must start with one of these characters.

The plus sign + means one or more of whatever the previous item was that we're checking on. In this case, an email address must start with one or more of any combination of the characters "a" through "z", "A" through "Z", "0" through "9", or underscore.
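
As a quick check of \w+ on its own, the sketch below pulls the leading run of word characters off a few sample strings. The name leadingWord and the sample addresses are assumptions for illustration, not part of the original script.

Code:

```javascript
// Just the first piece of the pattern: one or more word characters at the start.
const leadingWord = /^\w+/;

// match() returns the matched text in element 0 when there is a match.
console.log("dori_99@example.com".match(leadingWord)[0]); // "dori_99"

// A string that starts with anything else fails right away.
console.log(leadingWord.test("@example.com"));      // false
console.log(leadingWord.test("-dori@example.com")); // false
```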

The opening parenthesis ( signifies a group. It means that we're going to want to refer to everything inside the parentheses in some way later, so we put them into a group now.

The brackets [] are used to show that we can have any one of the characters inside. In this example, the characters \.- are inside the brackets. We want to allow the user to enter either a period or a dash, but the period has a special meaning to regular expressions, so we need to preface it with a backslash \ to show that we really want to refer to the period itself, not its special meaning. Using a backslash before a special character is called escaping that character. Because of the brackets, the entered string can have either a period or a dash here, but not both. Note that the dash doesn't stand for any special character here, just itself: because it comes last inside the brackets, it can't be read as part of a range such as a-z.
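
The bracket rule is easy to try out in isolation. In the sketch below, dotOrDash is a made-up name for a pattern that accepts exactly one character from the set. Strictly speaking, a period inside brackets is already literal, so the backslash there is harmless rather than required; the last two lines show where escaping really matters, outside the brackets.

Code:

```javascript
// Exactly one character, and it must be a period or a dash.
const dotOrDash = /^[\.-]$/;

console.log(dotOrDash.test("."));  // true
console.log(dotOrDash.test("-"));  // true
console.log(dotOrDash.test(".-")); // false: the brackets match one character, not two
console.log(dotOrDash.test("a"));  // false

// Outside brackets, an unescaped period matches any single character.
console.log(/^.$/.test("a"));  // true
console.log(/^\.$/.test("a")); // false
```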

The question mark ? means that we can have zero or one of the previous item. So along with it being okay to have either a period or a dash in the first part of the email address (the part before the @), it's also okay to have neither.

Following the ?, we once again have \w+, which says that the period or dash must be followed by some other characters.

The closing parenthesis ) says that this is the end of the group. That's followed by an asterisk *, which means that we can have zero or more of the previous item, in this case, whatever was inside the parentheses. So while "dori" is a valid email prefix, so is "testing-testing-1-2-3".
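
Putting the pieces so far together, here is a sketch that tests only the part before the @. The name localPart and the added $ at the end are assumptions so the fragment can be tried on its own; the sample strings beyond "dori" and "testing-testing-1-2-3" are made up too.

Code:

```javascript
// Everything before the @ in the full pattern, closed off with $ for testing.
const localPart = /^\w+([\.-]?\w+)*$/;

console.log(localPart.test("dori"));                  // true: the group repeats zero times
console.log(localPart.test("testing-testing-1-2-3")); // true: the group repeats four times
console.log(localPart.test("first.last"));            // true

// Each period or dash must be followed by at least one word character.
console.log(localPart.test("a..b"));  // false: two periods in a row
console.log(localPart.test("dori-")); // false: nothing after the dash
```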

The @ character doesn't stand for anything besides itself, located between the email address and the domain name.

The \w+ once again says that a domain name must start with one or more of any character "a" through "z", "A" through "Z", "0" through "9", or underscore. That's again followed by ([\.-]?\w+)* which says that periods and dashes are allowed within the suffix of an email address.

We then have another group within a set of parentheses: \.\w{2,3} which says that we're expecting to find a period followed by characters. In this case, the numbers inside the braces mean either 2 or 3 of the previous item (in this case the \w, meaning a letter, number, or underscore). Following the right parenthesis around this group is a +, which again means that the previous item (the group, in this case) must exist one or more times. This will match ".com" or ".edu", for instance, as well as the ".ac.uk" in "ox.ac.uk".
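
The final group can be tried on its own as well. In this sketch, suffix is a made-up name and the anchors are added just for testing. One consequence worth knowing: because the braces allow only 2 or 3 characters, a longer ending such as ".info" does not pass.

Code:

```javascript
// The last group of the pattern, anchored at both ends for testing.
const suffix = /^(\.\w{2,3})+$/;

console.log(suffix.test(".com"));   // true
console.log(suffix.test(".edu"));   // true
console.log(suffix.test(".ac.uk")); // true: the group matches twice

console.log(suffix.test(".c"));     // false: only one character after the period
console.log(suffix.test(".info"));  // false: four characters after the period
```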

And finally, the regular expression ends with a dollar sign $, which signifies that the matched string must end here. This keeps the script from validating an email address that starts off properly but contains garbage characters at the end. The slash closes the regular expression. The semicolon ends the JavaScript statement, as usual.
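
The dollar sign can be checked the same way as the caret. The names withDollar and withoutDollar and the sample input are assumptions for illustration; only the pattern comes from the script.

Code:

```javascript
// The pattern from validEmail(), with and without the trailing dollar sign.
const withDollar = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
const withoutDollar = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+/;

// Hypothetical input: a valid address followed by garbage.
const trailing = "dori@example.com!!!";

console.log(withDollar.test(trailing));    // false: the match must reach the end of the string
console.log(withoutDollar.test(trailing)); // true: the match may stop after ".com"
```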

Code:

return re.test(email);

This single line takes the regular expression defined in the previous step and uses the test() method to check the validity of email.
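
To try validEmail() away from the form, it can be lifted out as a standalone function and run against a few strings. This is a sketch: the function body matches the script above, but the sample addresses are made up for illustration.

Code:

```javascript
// validEmail() from the script, usable without the page or the form.
function validEmail(email) {
  const re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;
  // test() returns true if the pattern matches the string, false otherwise.
  return re.test(email);
}

console.log(validEmail("dori@example.com"));               // true
console.log(validEmail("testing-testing-1-2-3@ox.ac.uk")); // true
console.log(validEmail("dori@example"));                   // false: no ".xx" or ".xxx" ending
console.log(validEmail("dori example.com"));               // false: no @
console.log(validEmail(""));                               // false
```

Since test() only ever returns true or false, its result can be handed straight back to the caller, which is exactly what the one-line return does.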
