• Status: Solved

Regular Expression FAKE Phone Number handling

I need a somewhat complex phone number regular expression.  I have a TextBox control on a form which gets validated using JavaScript regular expressions.  I use .NET, so I am using field validators and don't actually have access to the input programmatically at the validator level.  Simply put, I must validate the input using a regular expression.

The phone number field accepts phone numbers, in a number of formats, such as:
9205551212
920-555-1212
(920) 555-1212
etc...

Here is the regular expression I already use to get the above results:
^\s*([\(]?)\[?\s*\d{3}\s*\]?[\)]?\s*[\-]?[\.]?\s*\d{3}\s*[\-]?[\.]?\s*\d{4}$

Simply put, there must always be 10 numbers, however, there may or may not be some separators in the number.
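As a quick illustration of the format check described above, here is a minimal sketch that runs the expression from the question against the sample inputs. The helper name isValidFormat is hypothetical and used only for this example; the expression itself is copied unchanged.

```javascript
// The format-checking expression from the question, copied verbatim.
const formatRegex = /^\s*([\(]?)\[?\s*\d{3}\s*\]?[\)]?\s*[\-]?[\.]?\s*\d{3}\s*[\-]?[\.]?\s*\d{4}$/;

// isValidFormat is a hypothetical helper name, not part of the original post.
function isValidFormat(input) {
  return formatRegex.test(input);
}

// The three sample formats from the question, plus one 9-digit input.
const samples = ['9205551212', '920-555-1212', '(920) 555-1212', '920555121'];
for (const sample of samples) {
  console.log(sample, isValidFormat(sample));
}
```

The first three samples pass and the 9-digit one fails, because the expression requires exactly 3 + 3 + 4 digits with optional separators in between.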

Now, something that commonly happens is that I get junk input (fake phone numbers) from users.  There are a few common patterns they use:
1)  All 10 digits same number:  1111111111   OR   2222222222   OR   etc...
2)  Repeating pattern type 1:    1212121212   OR   2323232323   OR   etc...
3)  Repeating pattern type 2:    1231231231   OR   3213213213   OR   etc...
4)  Incrementing numbers:        1234567890   OR   0123456789    

I would like to create a NEW regular expression that works with JavaScript and only accepts numbers which do NOT match any of the 4 common patterns above.  This NEW expression should not be built into my existing expression above, which validates the "format" of the number.

Can someone help?
Asked by: JohnyStyles577
 
ozo commented:
1)  All 10 digits same number:  1111111111   OR   2222222222   OR   etc...
^(?!(.)\1{9})
2)  Repeating pattern type 1:    1212121212   OR   2323232323   OR   etc...
^(?!(..)\1{4})
3)  Repeating pattern type 2:    1231231231   OR   3213213213   OR   etc...
^(?!((.)..)\1{2}\2)
4)  Incrementing numbers:        1234567890   OR   0123456789
^(?!1234567890|0123456789)
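A minimal sketch of how the four negative lookaheads above could be tried out on digit-only input. Each expression is anchored with ^ here so that the lookahead is evaluated only at the start of the string; the function name looksReal is hypothetical and exists only for this example.

```javascript
// One negative lookahead per junk pattern; all assume a 10-digit, digits-only string.
const allSame = /^(?!(.)\1{9})/;              // 1111111111
const pairRepeat = /^(?!(..)\1{4})/;          // 1212121212
const tripleRepeat = /^(?!((.)..)\1{2}\2)/;   // 1231231231
const sequential = /^(?!1234567890|0123456789)/;

// looksReal is a hypothetical name: true only if no junk pattern matches.
function looksReal(digits) {
  return [allSame, pairRepeat, tripleRepeat, sequential].every((re) => re.test(digits));
}

for (const n of ['1111111111', '1212121212', '1231231231', '1234567890', '9205551212']) {
  console.log(n, looksReal(n));
}
```

Because a negative lookahead consumes no characters, each test succeeds exactly when the unwanted pattern is NOT found at the start of the input.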
 
KhoiNqq commented:
I think these patterns can be valid phone numbers. I don't know the rules in your country, but in Vietnam, my country, some taxi companies use these patterns (e.g. 8 21 21 21, 8 26 26 26, 8 111 111) because they are easier to remember, so those patterns are valid. Some businessmen also use such patterns for their numbers.

I think the best way to validate your number is to check the prefix (each telco always has its own prefix), the number of digits, and the possible positions of separators. That is enough.
 
Eddie Shipman (All-around developer) commented:
You can write another script to look up the area code based on the zip code entered,
call it using an AJAX call, and disallow the number if the area code doesn't match.
 
ddrudik commented:
No matter how you make the rule, a visitor determined to not provide a valid phone number will get around any regex rule.
 
Eddie Shipman (All-around developer) commented:
My solution of looking up the area code for the zip code is the way to go.
 
JohnyStyles577 (author) commented:
All very good points.  Amazing the fast response on this!  

Ozo,
I did not test your regular expressions, but they all seem valid.  One key thing I was missing was how to complement the match (a NOT operator), and I see how you used the negative lookahead to accomplish this.  One thing is still missing: I wanted an answer that handles phone numbers in the formats accepted by my original regex.  This is the complex part.  I wrote the example phone numbers as:
1111111111
But they can also be entered as:
(111) 111-1111 or
111-111-1111 or
( 111 ) 111 - 1111 or
111 111 1111
etc...

Please look at my original post to see the regex used to accept a validly "formatted" phone number.  I would like the patterns you gave me to take all of this into account (essentially, skip over non-digit characters while parsing, but continue to do the pattern matching).  So all of the above cases should return FALSE for your first regex.

KhoiNqq,
You are right.  But seeing how my application is typically used by only consumers, not businesses, I would expect that the number of instances where a valid phone number was rejected due to these rules would be extremely low, and for my application, this is acceptable.

EddieShipman,
I like your solution too, and I do have a zip code database with associated area codes.  However, I think this would cause more rejections of phone numbers than necessary.  For instance, someone lives in ZipCode A but works in ZipCode B, and provides their work number as contact info (fairly common).  Also, in the US you now have the right to keep your phone number even if you move to a new location.  I do have a lot of past data I can use for analysis, so I may try running some stats at some point to see what percentage of my data would actually pass that validation.

ddrudik,
True, but I just want to lower the percentage of junk data I get.  You are right in that it is impossible to get 100% clean data, but if I could improve from 90% to 98%, this would make a huge difference.
 
ozo commented:
Can you strip the non-digits before the test?
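A minimal sketch of what stripping the non-digits first could look like. It only applies where a custom script can run before the test, which the asker says is not available at the validator level. The names digitsOnly and isAllSameDigit are hypothetical.

```javascript
// Remove everything that is not a digit, e.g. "(111) 111-1111" becomes "1111111111".
function digitsOnly(input) {
  return input.replace(/\D/g, '');
}

// With separators gone, the junk check no longer has to skip over them.
const allSameDigit = /^(\d)\1{9}$/;

// Hypothetical helper: true when the input is one digit repeated ten times.
function isAllSameDigit(input) {
  return allSameDigit.test(digitsOnly(input));
}

console.log(digitsOnly('( 111 ) 111 - 1111'));
console.log(isAllSameDigit('( 111 ) 111 - 1111'));
console.log(isAllSameDigit('(920) 555-1212'));
```

The trade-off is that the junk patterns stay short and readable, at the cost of needing a preprocessing step outside the regular expression.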
 
ozo commented:
2)  Repeating pattern type 1:    1212121212   OR   2323232323   OR   etc...
^(?!(.)\D*(.)(\D*\1\D*\2){4})
 
JohnyStyles577 (author) commented:
ozo,

I have almost exactly what I need.  Thanks for the example above which includes the parsing for other non-digit characters.  I have only one problem left with this.  I morphed your regex from above into this for testing:
^(?!\D*(\d)\D*(\d)(\D*\1\D*\2){2})

This checks for a pattern of 2 digits repeated at least 3 times (instead of 5 times).  So this would also be invalid:
1212125555    (because of the repeating 121212 in the beginning)

However, I want to make sure this pattern is not seen anywhere in the string (not just in the beginning), so I removed the caret at the beginning of the regex, thinking this should do it:
(?!\D*(\d)\D*(\d)(\D*\1\D*\2))

The idea was that the following should all return false:
5121212555
or
5555121212

But once I remove the caret at the beginning of the regex, it no longer seems to work.  Any ideas?
 
ozo commented:
The problem was that it found a match at
55(55)121212
One way to handle that, if you don't want to change your test to match what you don't want instead of what you want, could be
^(?!.*\D*(\d)\D*(\d)(\D*\1\D*\2){2})
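To make the difference concrete, here is a small sketch comparing three variants: anchored at the start only, unanchored, and anchored with .* inside the lookahead. The variable names are hypothetical. Without the caret, the engine is free to retry the lookahead at every later position (including the very end of the string), so some position always satisfies the negative lookahead and the test always succeeds.

```javascript
// Rejects the pair pattern only when it begins at the start of the input.
const startOnly = /^(?!\D*(\d)\D*(\d)(\D*\1\D*\2){2})/;
// No caret: the engine may try the lookahead at any position, so it always finds one that passes.
const unanchored = /(?!\D*(\d)\D*(\d)(\D*\1\D*\2){2})/;
// Caret plus .* inside the lookahead: one check at the start that scans the whole string.
const anywhere = /^(?!.*\D*(\d)\D*(\d)(\D*\1\D*\2){2})/;

for (const n of ['1212125555', '5121212555', '5555121212', '9205551212']) {
  console.log(n, startOnly.test(n), unanchored.test(n), anywhere.test(n));
}
```

Only the last variant returns false for all three junk inputs while still returning true for 9205551212.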
 
JohnyStyles577 (author) commented:
OK, that works perfectly!  There is one thing I am still waiting for, which is case 4 from my list.  I would like to disallow incrementing / decrementing numbers.  How would you do this with a regex?

BTW, here are the regexes I've decided on, which cover cases 1, 2, and 3, and a bit more:

1212123333  - no pair of repeating digits 3 times or more
^(?!.*(\d)\D*(\d)(\D*\1\D*\2){2})

1231233333  - no group of 3 repeating digits 2 times or more
^(?!.*\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3))

1234123433  - no group of 4 repeating digits 2 times or more
^(?!.*\D*(\d)\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3\D*\4))

1234512345  - no group of 5 repeating digits 2 times or more
^(?!.*\D*(\d)\D*(\d)\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3\D*\4\D*\5))
 
ozo commented:
Besides
^(?!1234567890|0123456789)
you might do something like
(?=0[^1]|1[^2]|2[^3]|3[^4]|4[^5]|5[^6]|6[^7]|7[^8]|8[^9]|9[^0])(?:0[^9]|1[^0]|2[^1]|3[^2]|4[^3]|5[^4]|6[^5]|7[^6]|8[^7]|9[^8])
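A sketch of the pair-wise idea, assuming digit-only input. The second alternation is wrapped in a non-capturing group so that the lookahead applies to every alternative; that grouping is an assumption about the intended precedence, and the name hasNonSequentialPair is hypothetical.

```javascript
// Matches when some adjacent pair is neither "digit, digit+1" nor "digit, digit-1".
// Assumption: the alternation is grouped with (?:...) so the lookahead covers all of it.
const nonSequentialPair =
  /(?=0[^1]|1[^2]|2[^3]|3[^4]|4[^5]|5[^6]|6[^7]|7[^8]|8[^9]|9[^0])(?:0[^9]|1[^0]|2[^1]|3[^2]|4[^3]|5[^4]|6[^5]|7[^6]|8[^7]|9[^8])/;

// Hypothetical helper: a number is accepted if at least one pair breaks the sequence.
function hasNonSequentialPair(digits) {
  return nonSequentialPair.test(digits);
}

for (const n of ['1234567890', '0123456789', '9876543210', '9205551212']) {
  console.log(n, hasNonSequentialPair(n));
}
```

Unlike the fixed list of literals, this form also rejects decrementing runs such as 9876543210, since every adjacent pair in them steps by exactly one.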
 
JohnyStyles577 (author) commented:
Thanks very much for all of the help.  I have everything I need now.

Here is what I've decided to use for phone number validation.  I realize that these will reject some valid phone numbers, but for me it is more important to limit the number of bad numbers allowed through, even at the cost of a small minority of users (likely fewer than 1 in 1,000, or maybe even 1 in 10,000) not being able to submit their valid numbers.  Here are all of my solutions:

1212123333  - no pair of repeating digits 3 times or more
^(?!.*(\d)\D*(\d)(\D*\1\D*\2){2})

1231233333  - no group of 3 repeating digits 2 times or more
^(?!.*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3))

1234123433  - no group of 4 repeating digits 2 times or more
^(?!.*(\d)\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3\D*\4))

1234512345  - no group of 5 repeating digits 2 times or more
^(?!.*(\d)\D*(\d)\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3\D*\4\D*\5))

1234533333 or 5432133333 - no incrementing groups of 1 through 5 or 5 through 1.
^(?!.*1\D*2\D*3\D*4\D*5|.*5\D*4\D*3\D*2\D*1)

9205551212 - no 555-1212 numbers allowed
^(?!.*5\D*5\D*5\D*1\D*2\D*1\D*2)
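For illustration, the six expressions above can be tried together in one small sketch. The array and the function name passesJunkRules are hypothetical; in the asker's setup each expression would presumably sit in its own validator instead.

```javascript
// The six expressions listed above, copied unchanged.
const junkRules = [
  /^(?!.*(\d)\D*(\d)(\D*\1\D*\2){2})/,                                      // pair repeated 3+ times
  /^(?!.*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3))/,                             // group of 3 repeated
  /^(?!.*(\d)\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3\D*\4))/,                 // group of 4 repeated
  /^(?!.*(\d)\D*(\d)\D*(\d)\D*(\d)\D*(\d)(\D*\1\D*\2\D*\3\D*\4\D*\5))/,     // group of 5 repeated
  /^(?!.*1\D*2\D*3\D*4\D*5|.*5\D*4\D*3\D*2\D*1)/,                           // 12345 or 54321
  /^(?!.*5\D*5\D*5\D*1\D*2\D*1\D*2)/,                                       // 555-1212
];

// Hypothetical helper: true only if the input passes every rule.
function passesJunkRules(input) {
  return junkRules.every((re) => re.test(input));
}

for (const n of ['1212123333', '(111) 111-1111', '9205551212', '(920) 867-5309']) {
  console.log(n, passesJunkRules(n));
}
```

Since every rule skips separators with \D*, the same check works on raw input such as (111) 111-1111 without stripping anything first.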

Again thanks so much for all of the help!  Especially ozo, thank you!