Solved

JavaScript RegEx Idiosyncrasy

Posted on 2010-08-18
Last Modified: 2013-11-18
I'm using regular expressions in JavaScript to do on-the-fly validation of form input.  As far as I can tell, the following expression should match only valid email addresses, but it's matching an obviously invalid address too.

Expression:
^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+\.)+[A-Za-z]{2,4}$

Incorrectly matches:  user@domain OR user@domain.toolongextension

The problem is the validation after the @ sign.  The regex there isn't complex.  It requires one or more alphanumeric (optionally hyphenated) domain/subdomain names of various lengths, each followed by a period, then ending with an alphabetic 2-, 3-, or 4-letter extension (e.g. .cc, .com, .info).  But it's not working.  And this RegEx works perfectly fine, as expected, in the .NET framework.

Points for anyone who can show me, most importantly, a workable fix.  Points also for showing me whether I'm doing anything wrong OR for online documentation of this as a known bug in the JS implementation.
Question by:Bobaran98
19 Comments
 
LVL 8

Author Comment

by:Bobaran98
ID: 33468249
LOL.  After an hour messing with this, I think I just solved my own issue... mere moments after posting it here.  Looking at the code again, I wondered if maybe JavaScript was ignoring the escape slash before the period-- if so, that period would match any single character, which would make pretty much anything to the right of the @ sign valid.
I put the \. inside character-class brackets, even specifying (unnecessarily) a single repetition, and it now works as expected.  Like so:
[code]^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+[\.]{1})+[A-Za-z]{2,4}$[/code]
I will still grant points to anyone who can show me why (a) my original code was wrong and this is valid, or (b) online documentation of this as a known issue.  Because my understanding of RegEx says this is silly.
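One possible explanation, which nothing in this thread confirms, is that the pattern was reaching JavaScript as a string (for example through new RegExp(...) or server-generated markup) rather than as a regex literal. In a JavaScript string literal, \. is not a recognized escape, so the backslash is dropped and the regex engine sees a bare . wildcard, whereas [\.] degrades to [.], which is still a literal period. A minimal sketch under that assumption (all variable names are illustrative):

```javascript
// Assumption: the pattern is supplied as a string, not as a regex literal.
// Written as a literal, the backslash before the period survives.
var fromLiteral = /^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+\.)+[A-Za-z]{2,4}$/;

// Written as a string, "\." collapses to "." before the regex is compiled.
var fromString = new RegExp("^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+\.)+[A-Za-z]{2,4}$");

// The workaround from this post: "[\.]" collapses to "[.]", still a literal period.
var fromStringFixed = new RegExp("^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+[\.]{1})+[A-Za-z]{2,4}$");

// Doubling the backslashes keeps the escapes intact inside a string.
var fromStringDoubled = new RegExp("^([\\-\\.0-9A-Z_a-z]+)@([\\-0-9A-Za-z]+\\.)+[A-Za-z]{2,4}$");

console.log(fromString.source);                    // ^([-.0-9A-Z_a-z]+)@([-0-9A-Za-z]+.)+[A-Za-z]{2,4}$
console.log(fromLiteral.test("user@domain"));       // false
console.log(fromString.test("user@domain"));        // true
console.log(fromStringFixed.test("user@domain"));   // false
console.log(fromStringDoubled.test("user@domain")); // false
```

If the pattern is shared with .NET as a plain string, checking the compiled pattern via the .source property on the JavaScript side is a quick way to see whether any escapes were lost along the way.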
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33468251
A good regex here : http://www.marketingtechblog.com/programming/javascript-regex-emailaddress/



/^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+.)+([a-zA-Z0-9]{2,4})+$/

0
 
LVL 8

Author Comment

by:Bobaran98
ID: 33468270
Sorry, let's try this again
^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+[\.]{1})+[A-Za-z]{2,4}$



With the operative part being:  [\.]{1}



Instead of simply:  \.

0
 
LVL 82

Expert Comment

by:leakim971
ID: 33468288
no problem Bobaran98, if you're satisfied by your regex just accept your last post as answer to close the question
0
 
LVL 8

Author Comment

by:Bobaran98
ID: 33468406
@leakim, can you answer any of my questions?  Your expression works too, obviously.  But I notice your period is outside of brackets and has no escape character.  In any other framework I've worked within (.NET, PHP, etc.), such a period would be treated as a wildcard, and escaping it would make it a standard period.  But I'm seeing now that in JavaScript the behavior appears to be the exact opposite.
Can you comment on this?  It would certainly explain my original issue.  My problem is I would like to use the same expression for both JavaScript and .NET validation (the JS check happens on the fly, with the .NET happening again, just to be sure, before any data is written to the DB).
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33468782
As in .NET and PHP, the period is a wildcard in JS: http://www.w3schools.com/jsref/jsref_regexp_dot.asp
A good link : http://www.w3schools.com/jsref/jsref_obj_regexp.asp

(([a-zA-Z0-9-])+.)

I read this as: one or more alphanumeric characters or -, followed by any single non-alphanumeric character (for example: -), not only a dot/period.

Something like bad@expert?excff is valid with this regex.
Your regex is better.
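To make the point concrete, here is a small sketch that runs the unescaped-dot pattern quoted earlier in the thread against a few inputs, including the bad@expert?excff example. The variable name and the extra sample addresses are illustrative only:

```javascript
// The pattern from the linked article, with an unescaped "." after each domain label.
var loosePattern = /^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+.)+([a-zA-Z0-9]{2,4})+$/;

// Because "." matches any character, malformed addresses get through.
console.log(loosePattern.test("bad@expert?excff")); // true
console.log(loosePattern.test("user@domain"));      // true
console.log(loosePattern.test("user@domain.com"));  // true

// Too short after the @ to satisfy the pattern at all.
console.log(loosePattern.test("user@x"));           // false
```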
0
 
LVL 8

Author Comment

by:Bobaran98
ID: 33471817
@mods-- I'd like to accept my own post (#33468270) as the solution, but award leakim971's post (#33468782) 50 points.  I've tried doing this using "Accept and Award Points," but it gives me the error message that the minimum point split is 20 points (which is a bogus message).  Thanks!
-----------
@leakim-- Thanks for your willingness to help, but I was really looking for some discussion or links about the differences between JavaScript regex and other implementations.  At least a reason why the escape character gets ignored in front of the period.  Your links were very basic.  No worries-- obviously you're very busy on here; I just wanted you to know why I'm not awarding any more points than this.
Have a good day!
0
 
LVL 8

Author Comment

by:Bobaran98
ID: 33471819
@mods-- oh, and a grade of A.  Thanks!
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 33476467
>>  (([a-zA-Z0-9-])+.)
>>  I read this as: one or more alphanumeric characters or -, followed by any single
>>  non-alphanumeric character (for example: -), not only a dot/period.

That pattern actually says one-or-more alphanumerics followed by ANY character, not just a character which is "not alphanumeric."


Can you explain what wasn't working about the pattern? I tested it locally and it appeared to work fine for me. The only changes I made are that inside character classes ( [] ), you do not need to escape hyphens or periods. With regard to literal hyphens, you only need to make sure that the hyphen is at either the beginning or end of the class (i.e. the first character after the opening bracket OR the last character just prior to the closing bracket). Here is what I tested with:
var x = "user@domain.com"; // sample input (assumed; the original post did not show how x was set)
alert(x.match(/^([-.0-9A-Z_a-z]+)@([-0-9A-Za-z]+\.)+[A-Za-z]{2,4}$/));

0
 
LVL 82

Expert Comment

by:leakim971
ID: 33476527
>That pattern actually says one-or-more alphanumerics followed by ANY character, not just a character which is "not alphanumeric."
  That pattern actually says one-or-more alphanumerics or - followed by ANY character, not just a character which is "not alphanumeric."
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 33476696
I'm sorry, but where do you see an "or" in that pattern?
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33476800
> where do you see an "or" in that pattern?

after the 9 in (([a-zA-Z0-9-])+.)
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33476873
Not this << or >>: |

but << one of these >>: alphanumerics and -

Bad wording on my part, sorry.
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 33476953
Ah. I missed the dash in your explanation myself  :)
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33479500
@Bobaran98 said :
>At least a reason why the escape character gets ignored in front of the period

Look again : http://www.w3schools.com/jsref/jsref_obj_regexp.asp
the Brackets part

So inside brackets, a backslash is treated as an ordinary character and not as an escape character
0
 
LVL 82

Expert Comment

by:leakim971
ID: 33479513
@kaufmed

I was unable to find the word << dash >>
Shame on me brother...

Have fun, see you later.
0
 
LVL 9

Accepted Solution

by:
Derek Jensen earned 500 total points
ID: 38536449
No...what??

Okay. Let me clear the air a little bit here...

In my experience, I have never known JS to ignore a backslash in front of a period. JS regex is closer to Perl regex than PCRE is.

Looking at the original regex, the period was *not* matching any character. The problem was elsewhere in the regex...but we'll get to that in a minute. ;-)

Regarding the difference between \. and [\.], what Leakim said:
"So inside brackets, a backslash is treated as an ordinary character and not as an escape character"
is absolutely incorrect, at least as far as JS/PCRE regex is concerned.

The truth of the matter is,
\. === [.] === [\.]

period.
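A small sketch of that equivalence, written as regex literals. The a...b wrapper and the sample strings are only there to give the period something to sit between:

```javascript
// Three spellings of "a literal period".
var escaped = /^a\.b$/;
var inClass = /^a[.]b$/;
var escapedInClass = /^a[\.]b$/;

// "a.b" matches all three; "axb" and "a\b" match none of them,
// so [\.] contains neither a wildcard nor a literal backslash.
var samples = ["a.b", "axb", "a\\b"];
samples.forEach(function (s) {
  console.log(s, escaped.test(s), inClass.test(s), escapedInClass.test(s));
});
```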

That isn't to say the backslash inside brackets serves no purpose; on the contrary, a backslash is always an escape character *unless immediately followed by another backslash*.
\\ === [\\] !== [\]

[\] will produce an error in every regex tester I know of.

For example, try this regex out in the regex tester of your choice, using this post as the haystack:
/[a-z\]-]/

*That* is the purpose a backslash serves inside brackets.
But not just that. ;-)
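For anyone without a regex tester handy, the same experiment can be sketched in plain JavaScript. The short haystack and variable names are illustrative, and the unescaped variant relies on JavaScript's non-Unicode mode treating a lone ] as a literal:

```javascript
// With the backslash, "]" is a member of the class, alongside a-z and "-".
var bracketClass = /[a-z\]-]/g;
console.log("X]Y-Z a".match(bracketClass)); // [ ']', '-', 'a' ]

// Without it, the class ends at the first "]" and "-]" becomes literal text.
var unescaped = /[a-z]-]/;
console.log(unescaped.test("b-]")); // true
console.log(unescaped.test("]"));   // false

// A class containing only a backslash never closes, so compiling it throws.
var threw = false;
try {
  new RegExp("[\\]"); // the pattern text is [\]
} catch (e) {
  threw = e instanceof SyntaxError;
}
console.log(threw); // true
```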

Now, about this regex...
^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+\.)+[A-Za-z]{2,4}$

The problem lies in the last plus. Try this regex out:
^([-.0-9A-Z_a-z]+)@[-.0-9A-Za-z]+\.[A-Za-z]{2,4}$

;-)
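A quick way to compare the two patterns is to run both as regex literals against a handful of addresses. This is only a sketch: the inputs and variable names are illustrative, and it assumes the patterns are used as literals rather than built from strings. In this run both patterns reject the two addresses from the question, and they differ on empty domain labels such as user@..com, which the rewritten pattern accepts because the period is now part of the domain character class:

```javascript
var original = /^([\-\.0-9A-Z_a-z]+)@([\-0-9A-Za-z]+\.)+[A-Za-z]{2,4}$/;
var suggested = /^([-.0-9A-Z_a-z]+)@[-.0-9A-Za-z]+\.[A-Za-z]{2,4}$/;

var inputs = [
  "user@domain.com",              // original: true,  suggested: true
  "user@mail.domain.info",        // original: true,  suggested: true
  "user@domain",                  // original: false, suggested: false
  "user@domain.toolongextension", // original: false, suggested: false
  "user@..com"                    // original: false, suggested: true
];

inputs.forEach(function (address) {
  console.log(address, original.test(address), suggested.test(address));
});
```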

One last note about brackets (as I noticed you'd escaped the dash also):

Inside brackets, a dash serves as a range indicator *unless immediately following the opening or immediately preceding the closing bracket*.
[-a-z] === [a-z-] === [a\-b-z]
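The three spellings can be checked side by side with a short sketch (the anchors, the + quantifier, and the sample strings are added here only for the test):

```javascript
// Leading dash, trailing dash, and escaped dash in the middle of the class.
var classes = [/^[-a-z]+$/, /^[a-z-]+$/, /^[a\-b-z]+$/];

// Every class accepts lowercase letters and dashes, and nothing else.
["a-z", "x-y-z", "-", "A-Z", "a_z"].forEach(function (s) {
  var results = classes.map(function (re) { return re.test(s); });
  console.log(s, results.join(" "));
});
```

The first three strings should match all three classes, while "A-Z" and "a_z" should match none.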

Thus, although there was an error in your regex, the \. was not it. :-)
0
