Checking for valid IP addresses using JavaScript Regular Expressions

There are times when an Internet Protocol (IP) address needs to be validated. What do valid IP addresses look like?

127.0.0.1
1.2.3.4
192.168.1.100

The pattern is that we have 4 integer values (from 0 to 255) separated by periods. Can we use a really simple regular expression (RegExp) like this?

var ipRE = new RegExp( '^\d+\.\d+\.\d+\.\d+$' );

What does this RegExp mean, and how is it supposed to work?

^ - This symbol matches the beginning of the string.
\d - This meta-character matches a single digit (i.e., 0 to 9).
+ - This symbol says to repeat the preceding pattern or symbol 1 or more times (i.e., 1 or more digits).
\. - This sequence is needed to match a period. Since a period has special meaning in a RegExp, we have to precede it with the backslash to indicate that we don't want the special meaning, we really want to match a period.
$ - This symbol matches the end of the string.
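To see each of these pieces in isolation, here is a minimal sketch. It uses regular expression literals (the /.../ form) rather than the RegExp() constructor, and the sample strings are made up purely for illustration.

```javascript
// Each entry: [ pattern, sample string, whether a match is expected ].
var samples = [
  [ /^1/,    '127',  true  ],  // ^ anchors the match to the start of the string
  [ /^1/,    '212',  false ],  // there is a '1', but not at the start
  [ /\d/,    'a7b',  true  ],  // \d matches any single digit
  [ /^\d+$/, '2024', true  ],  // + repeats \d one or more times
  [ /^\d+$/, '',     false ],  // + needs at least one digit
  [ /\./,    '1x2',  false ],  // \. matches only a literal period
  [ /./,     '1x2',  true  ],  // an unescaped . matches any character
  [ /1$/,    '0.1',  true  ]   // $ anchors the match to the end of the string
];

// Compare what each pattern actually does with what we expected.
var results = samples.map( function ( s ) {
  return s[ 0 ].test( s[ 1 ] ) === s[ 2 ];
} );
console.log( results.every( Boolean ) );  // true when every expectation holds
```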

So, this RegExp will match a string containing 4 integer values, separated by periods. How can we check it out to see if it satisfies our requirements?

<html>
  <head>
    <title>IP address validation</title>
    <script type='text/javascript'>
      function validate( value ) {
        var ipRE = new RegExp( '^\d+\.\d+\.\d+\.\d+$' );
        alert( ( ipRE.test( value ) ? '' : 'in' ) + 'valid' );
      }
    </script>
  </head>
  <body>
    Address: <input type='text' onchange='validate(this.value)'>
  </body>
</html>


When we try this simple page, enter the simplest of IP addresses (i.e., "0.0.0.0"), and press the Tab key to have the validate() function execute, we will probably be surprised to see the alert dialog box say "invalid". What?!? How did we get this simple RegExp wrong? The answer is that the backslash is also the escape character inside a JavaScript string literal, so the backslashes that identify the RegExp meta-characters are consumed before the RegExp() constructor ever sees them. Huh?

Alright, if you have a trivial alert() string containing '\d', what gets displayed? A "d". In order to have the string being passed to the RegExp() constructor contain the text we want, we have to escape the backslash (i.e., '\\d'), for each and every backslash in the expression... (sigh). So, what we really needed to do was to have the RegExp assignment in the code be:

var ipRE = new RegExp( '^\\d+\\.\\d+\\.\\d+\\.\\d+$' );
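The effect of the missing backslashes can be seen without a browser. The following is a small sketch (the variable names broken, fixed and literal are just for illustration) that compares the pattern text each form actually produces.

```javascript
// In a string literal, '\d' is just the letter d; the backslash is dropped.
console.log( '\d' === 'd' );     // true
console.log( '\\d'.length );     // 2: a backslash followed by d

var broken  = new RegExp( '^\d+\.\d+\.\d+\.\d+$' );          // single backslashes
var fixed   = new RegExp( '^\\d+\\.\\d+\\.\\d+\\.\\d+$' );   // doubled backslashes
var literal = /^\d+\.\d+\.\d+\.\d+$/;                        // literal form, no string involved

console.log( broken.source );                    // ^d+.d+.d+.d+$
console.log( broken.test( '0.0.0.0' ) );         // false
console.log( fixed.test( '0.0.0.0' ) );          // true
console.log( fixed.source === literal.source );  // true
```

The regular expression literal avoids the doubling altogether, but it cannot be assembled from smaller strings, which is why the constructor form is still useful when a pattern is built up piece by piece.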

Does that fix it? Yes, but it's not quite that simple. What happens if we specify an octet value that is too big? What's an octet? It's one of those 4 integer values between periods. Each octet may only have a value from 0..255. So, let's try an obviously invalid address of 999.999.999.999. Unfortunately, the RegExp, as written, says that this is valid. Why? Because it allows each octet to be any non-negative integer, of any length. As long as each octet has 1 or more digits, it matches the RegExp pattern as written.
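A short sketch of that weakness, using the literal form of the same pattern (the name loose is only for illustration):

```javascript
// Four runs of digits separated by periods; no limit on the size of each run.
var loose = /^\d+\.\d+\.\d+\.\d+$/;

console.log( loose.test( '192.168.1.100' ) );    // true
console.log( loose.test( '999.999.999.999' ) );  // true, although no octet may exceed 255
console.log( loose.test( '1.2.3' ) );            // false: only three octets
```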

How do we fix that? Let's begin by figuring out how to validate an octet using a RegExp. We know that we have to have at least 1 digit, so this is simply '\d' (in order to make the regular expressions easier to read, we're going to leave out the double backslashes until we absolutely need them).

So far, we know that /^\d$/ can be used to validate that we have 1, and only 1 digit. That takes care of the values from 0-9. Do we want to allow a '0' in the tens position (i.e., is '01' ok)? No, so in order to check a valid 2 digit number (i.e., from 0..99), we need a slightly more complex RegExp:

var TwoDigits = /^[1-9]\d|\d$/;

Does this work? Let's test it.

<html>
  <head>
    <title>0..99 validation</title>
    <script type='text/javascript'>
      function validate( value ) {
        var TwoDigits = new RegExp( '^[1-9]\\d|\\d$' );
        alert( TwoDigits.test( value ) );
      }
    </script>
  </head>
  <body>
    0..99: <input type='text' onchange='validate(this.value)'>
  </body>
</html>


When we enter single digit values, the result is true, as expected. What is unexpected is that 100 is also "valid". What is going on? Additional testing shows that any two digit value from 10..99 followed by anything (e.g., 10x), and anything followed by a single digit (e.g., A0), are also seen as valid.

This means that the or operator (i.e., the '|') has lower precedence than everything else in the pattern, including the ^ and $ anchors. So, when we thought that the RegExp was being interpreted as this:

/^([1-9]\d|\d)$/

(i.e., the start of string followed by either two digits (the first of which can't be a zero), or a single digit, followed by the end of string). It was, in fact, being interpreted as this:

/(^[1-9]\d)|(\d$)/

(i.e., the start of string followed by two digits, or a single digit followed by the end of string) which is a totally different expression, and not what we intended, or wanted. So, in order to fix it, we need to add parentheses, which changes our RegExp definition to:

var TwoDigits = new RegExp( '^([1-9]\\d|\\d)$' );

When we test this, we get the values we expected. So, this RegExp can be used to verify a number from 0..99.
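The difference between the two interpretations can be checked side by side. This is a small sketch (the names ungrouped and grouped, and the list of inputs, are only for illustration):

```javascript
var ungrouped = /^[1-9]\d|\d$/;    // read as (^[1-9]\d) or (\d$)
var grouped   = /^([1-9]\d|\d)$/;  // the anchors apply to both alternatives

var inputs = [ '7', '42', '100', '10x', 'A0', '01' ];
var report = inputs.map( function ( s ) {
  return s + ': ungrouped=' + ungrouped.test( s ) + ', grouped=' + grouped.test( s );
} );
console.log( report.join( '\n' ) );
// 7:   ungrouped=true, grouped=true
// 42:  ungrouped=true, grouped=true
// 100: ungrouped=true, grouped=false
// 10x: ungrouped=true, grouped=false
// A0:  ungrouped=true, grouped=false
// 01:  ungrouped=true, grouped=false
```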

Next, we have to figure out how to get the RegExp to allow only values from 100..255. Building on the two digit expression leads us to the following ranges of values to be matched:

250..255 = 25[0-5]
200..249 = 2[0-4][0-9] or 2[0-4]\d
100..199 = 1[0-9][0-9] or 1\d\d
10..99 = [1-9][0-9] or [1-9]\d
0..9 = [0-9] or \d

I don't know about you, but I find switching back and forth between bracketed groups and escaped meta sequences a little hard to read. Additionally, by only using bracket groups, we don't have to worry about, or remember to double the backslashes. Because of that, I prefer this regular expression for these 5 ranges:

var octet = '^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])$';

and we can verify this RegExp using the following page.

<html>
  <head>
    <title> octet </title>
  </head>
  <body>
    <script type='text/javascript'>
      var octet = /^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])$/;
      for ( var i = -1; i < 257; i++ ) {
        if ( ! octet.test( '' + i ) ) {
          document.write( i + '<br>' );
        }
      }
    </script>
  </body>
</html>


The output shows that -1 and 256 are the only invalid values from -1..256. This is exactly what we want. One thing to note, however, is that by using the parentheses, we are creating a capturing group. While we're creating a regular expression to validate an IP address, it is unlikely that we will want, or need, the individual octets matched. So, how do we change this into a non-capturing group? By replacing the simple parentheses (...) with this syntax: (?:...). So, the RegExp becomes:

var octet = /^(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])$/;
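What the change buys us can be seen in the array returned by exec(). The sketch below (variable names are only for illustration) checks that both forms accept and reject the same values, and that only the capturing form stores the octet as a separate entry.

```javascript
var capturing    = /^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])$/;
var nonCapturing = /^(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])$/;

var a = capturing.exec( '255' );
var b = nonCapturing.exec( '255' );
console.log( a.length, a[ 1 ] );  // 2 '255'  (whole match plus one captured group)
console.log( b.length );          // 1        (whole match only)

// Both forms agree on every value from -1..256.
var same = true;
for ( var i = -1; i < 257; i++ ) {
  if ( capturing.test( '' + i ) !== nonCapturing.test( '' + i ) ) {
    same = false;
  }
}
console.log( same );  // true
```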

And the big question that remains is:

How do we build a RegExp to recognize an IP address that is composed of 4, dot (period) separated octets?

One way is to build it from the existing octet definition, written as a string and without the ^ and $ anchors:

var ip = '(?:' + octet + '\\.){3}' + octet;

Put into a complete script, it looks like this:

<html>
  <head>
    <title> octet </title>
    <script type='text/javascript'>
      var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
      var ip    = '(?:' + octet + '\\.){3}' + octet;
      var ipRE  = new RegExp( '^' + ip + '$' );

      function validate( value ) {
        alert( ( ipRE.test( value ) ? '' : 'in' ) + 'valid' );
      }
    </script>
  </head>
  <body>
    IP Address: <input type='text' onchange='validate(this.value)'>
  </body>
</html>
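For a quick check without typing values into the page, the same three definitions can be run against a handful of sample addresses. This is only a sketch; the list of cases is made up for illustration.

```javascript
var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
var ip    = '(?:' + octet + '\\.){3}' + octet;   // three "octet." groups, then a final octet
var ipRE  = new RegExp( '^' + ip + '$' );

var cases = [
  '0.0.0.0',          // true
  '127.0.0.1',        // true
  '255.255.255.255',  // true
  '256.1.1.1',        // false: first octet is out of range
  '1.2.3',            // false: only three octets
  '1.2.3.4.5',        // false: five octets
  '01.2.3.4'          // false: leading zero is not allowed
];
cases.forEach( function ( value ) {
  console.log( value + ' -> ' + ipRE.test( value ) );
} );
```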


This is looking really good. Can we make it just a little more complex by allowing a valid IP address to be surrounded by square brackets (e.g., [127.0.0.1])? Our first inclination might be to use something like this:

var ipRE = new RegExp( '^\\[?' + ip + '\\]?$' );

This means that the leading and trailing square brackets are each optional. Testing shows that all of the following are considered valid:

0.0.0.0 - valid, which is good.
[127.0.0.1] - valid, which is good.
123.1.2.3] - valid, which is bad (No opening bracket).
[192.168.1.101 - valid, which is also bad (No closing bracket).
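These four results can be reproduced with the sketch below. It assumes the pattern is built with doubled backslashes ('\\[' and '\\]'), as any backslash passed through a string must be; the name optional is only for illustration.

```javascript
var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
var ip    = '(?:' + octet + '\\.){3}' + octet;

// Each bracket is optional on its own, so nothing forces them to come in pairs.
var optional = new RegExp( '^\\[?' + ip + '\\]?$' );

console.log( optional.test( '0.0.0.0' ) );         // true
console.log( optional.test( '[127.0.0.1]' ) );     // true
console.log( optional.test( '123.1.2.3]' ) );      // true, although the opening bracket is missing
console.log( optional.test( '[192.168.1.101' ) );  // true, although the closing bracket is missing
```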

What happened? Well, the surrounding brackets are considered optional. So, zero, or one occurrence of a bracket at either end of the IP address are considered valid by that RegExp. How do we fix this? Well, essentially, we have to say that in order to be valid, we can have an IP address with or without surrounding matching brackets. How do we do this? The most reasonable way is to use something like this:

var quad = '(\\[' + ip + '\\])|(' + ip + ')';

This means that the whole thing looks like:

<html>
  <head>
    <title> quad </title>
    <script type='text/javascript'>
      var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
      var ip    = '(?:' + octet + '\\.){3}' + octet;
      var quad  = '(\\[' + ip + '\\])|(' + ip + ')';
      // Wrap quad in a non-capturing group so that ^ and $ apply to both
      // alternatives; '|' has lower precedence than the anchors.
      var ipRE  = new RegExp( '^(?:' + quad + ')$' );

      function validate( value ) {
        if ( ipRE.test( value ) ) {
          alert( '1: "' + RegExp.$1 + '"\n2: "' + RegExp.$2 + '"' );
        } else {
          alert( 'invalid' );
        }
      }
    </script>
  </head>
  <body>
    IP Address: <input type='text' onchange='validate(this.value)'>
  </body>
</html>
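The same check can be sketched outside the browser. Instead of the RegExp.$1 and RegExp.$2 properties, this version reads the groups from the array returned by exec(), which is an alternative way to get at the same values. The whole of quad is wrapped in (?:...) so that the ^ and $ anchors apply to both alternatives, because of the same '|' precedence issue seen earlier.

```javascript
var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
var ip    = '(?:' + octet + '\\.){3}' + octet;
var quad  = '(\\[' + ip + '\\])|(' + ip + ')';   // group 1: bracketed, group 2: bare
var ipRE  = new RegExp( '^(?:' + quad + ')$' );

var bracketed = ipRE.exec( '[127.0.0.1]' );
console.log( bracketed[ 1 ], bracketed[ 2 ] );   // '[127.0.0.1]' undefined

var bare = ipRE.exec( '127.0.0.1' );
console.log( bare[ 1 ], bare[ 2 ] );             // undefined '127.0.0.1'

console.log( ipRE.exec( '[127.0.0.1' ) );        // null: brackets must be paired
console.log( ipRE.exec( '127.0.0.1]' ) );        // null
```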


This is really, really close, but it has the problem that the match is put either in RegExp.$1 or in RegExp.$2, which means that our code needs an additional check to see which pattern was matched. Can this be fixed?

Sure. One way is to recognize that, so far, we have only matched values that span the entire string, from beginning to end. If we also want to be able to find this pattern in the middle of a longer string, we can't simply use the input value as the result: we have to drop the ^ and $ anchors and let a single capturing group tell us what was matched.

So, what we need to do is to change the existing capturing groups (i.e., the simple open/close parentheses) in the definition of quad, into non-capturing groups like this:

var quad = '(?:\\[' + ip + '\\])|(?:' + ip + ')';

and surround the quad pattern with a single capturing group, like this:

var ipRE = new RegExp( '(' + quad + ')' );

This makes the script look like this:

<html>
  <head>
    <title> IP address validation </title>
    <script type='text/javascript'>
      var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
      var ip    = '(?:' + octet + '\\.){3}' + octet;
      var quad  = '(?:\\[' + ip + '\\])|(?:' + ip + ')';
      var ipRE  = new RegExp( '(' + quad + ')' );

      function validate( value ) {
        if ( ipRE.test( value ) ) {
          alert( '"' + RegExp.$1 + '"' );
        } else {
          alert( 'invalid' );
        }
      }
    </script>
  </head>
  <body>
    IP Address: <input type='text' onchange='validate(this.value)'>
  </body>
</html>


One potential source of confusion is that a value like [127.0.0.1 (note the missing closing bracket) matches the pattern. Look closely, though: the leading opening bracket is not considered part of the IP address. Because the pattern is no longer anchored, it finds the first text that looks like a valid address anywhere in the input and ignores whatever surrounds it. If you add a closing bracket, and try again, you will see that the brackets are considered part of the address if they are both present.
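The behaviour of the unanchored pattern can be explored with a small helper. This is a sketch only: the find() function is a hypothetical wrapper introduced here for illustration, and it reads the captured text from exec() rather than from RegExp.$1.

```javascript
var octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])';
var ip    = '(?:' + octet + '\\.){3}' + octet;
var quad  = '(?:\\[' + ip + '\\])|(?:' + ip + ')';
var ipRE  = new RegExp( '(' + quad + ')' );   // no ^ or $: match anywhere in the input

// Hypothetical helper: returns the captured address, or null if none is found.
function find( value ) {
  var m = ipRE.exec( value );
  return m ? m[ 1 ] : null;
}

console.log( find( '[127.0.0.1]' ) );        // '[127.0.0.1]'  (brackets kept when paired)
console.log( find( '[127.0.0.1' ) );         // '127.0.0.1'    (lone bracket is skipped)
console.log( find( 'ping 10.0.0.1 now' ) );  // '10.0.0.1'
console.log( find( '256.1.1.1' ) );          // '56.1.1.1'     (a valid address inside an invalid one)
console.log( find( 'no address here' ) );    // null
```

As the 256.1.1.1 case suggests, a pattern without anchors is better suited to finding an address inside longer text than to deciding whether the whole input is a valid address; for strict validation, the anchored form shown earlier is the safer choice.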

Hopefully you will find this article interesting, and helpful. I decided to write it after taking a close look at the e-mail address RegExp that was described here.
