purplesoup

Can anyone explain this email validation regex

This regex is often used to validate email addresses:

\w+([-+!$%&*/=?{|}.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*


Can anyone break it down and explain exactly what it is doing?
ASKER CERTIFIED SOLUTION
stergium
ASKER

Thanks - I'm not clear about capturing groups - I just tried reading this

http://www.regular-expressions.info/named.html

but I couldn't link it to what you had - can you make it any clearer for me?

Sorry!
What part of this expression is not clear to you? Please explain.
I didn't understand what capturing groups were, as I think I mentioned, and the initial link I looked up didn't seem to explain it very well. However, I did some more searching and I think I have the hang of it.

This is what I made of it:

([-+!$%&*/=?{|}.']\w+)*

The bits between ( and ) are the group. The * at the end means the whole group can be repeated zero or more times; if it had + on the end it would be one or more times.

So now looking at the contents of the group, we have

[-+!$%&*/=?{|}.']\w+

Well the \w+ at the end is easy enough - a word character, one or more times.

So what of

[-+!$%&*/=?{|}.']

?

I believe this is a character class: it matches exactly one character, which can be any one of those listed between the square brackets.
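
To check that reading, here is a small sketch in Python (my choice of language, the thread does not name one). It assumes Python's re module and uses fullmatch so that the whole test string has to match; the name special is just for illustration.

```python
import re

# The character class from the group, on its own.
special = re.compile(r"[-+!$%&*/=?{|}.']")

# Every character listed in the class matches as one literal character.
for ch in "-+!$%&*/=?{|}.'":
    assert special.fullmatch(ch)

# A class matches exactly one character, so two in a row do not match.
print(special.fullmatch("%&"))  # None

# Characters that are not listed do not match either.
print(special.fullmatch("a"))   # None
print(special.fullmatch("@"))   # None
```

Inside square brackets, characters such as +, *, ?, | and . lose their special meaning and stand for themselves, and the - is literal because it comes first in the class.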

So valid matches might be

(nothing)

since the capturing group has * at the end, zero repetitions of the group (an empty string) is a valid match.

+a

the plus (+) character is one of the allowed characters, but it has to be followed by at least one word character (in this case "a")

&abcd

the ampersand (&) character is one of the allowed characters, and it must be followed by one or more word characters, "abcd" is therefore acceptable.

This wouldn't be allowed:

%&*

because each special character must be followed by one or more word characters, and here none of them is.
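
The examples above can be tried out with a short Python sketch. It assumes Python's re module and fullmatch (so the entire string must be consumed); whole_match is a helper name made up for this illustration.

```python
import re

# The group followed by *, exactly as it appears in the local part.
group = re.compile(r"([-+!$%&*/=?{|}.']\w+)*")

def whole_match(text):
    """Return True if the entire string is matched by the repeated group."""
    return group.fullmatch(text) is not None

print(whole_match(""))         # True: zero repetitions of the group
print(whole_match("+a"))       # True: one special character, then a word character
print(whole_match("&abcd"))    # True: one special character, then several word characters
print(whole_match("+a&abcd"))  # True: the group repeated twice
print(whole_match("%&*"))      # False: no word character follows the special characters

# A repeated capturing group keeps only the text of its last repetition.
print(group.fullmatch("+a&abcd").group(1))  # &abcd
```

The last line is the "capturing" part: the parentheses both group the pieces so that * applies to all of them and remember what the group matched most recently.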

That was the sort of explanation I was looking for.
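
Applying the same reading to the whole pattern, here is a rough sketch that writes it out piece by piece using Python's verbose mode. The addresses are made-up test values, and the use of fullmatch is my assumption: the pattern itself has no ^ or $ anchors, so how strictly it validates depends on how it is called.

```python
import re

# The same pattern as in the question, split up with a comment per piece.
EMAIL = re.compile(r"""
    \w+                        # local part: one or more word characters
    ([-+!$%&*/=?{|}.']\w+)*    # zero or more chunks of: one special character, then word characters
    @                          # a literal at sign
    \w+                        # domain: one or more word characters
    ([-.]\w+)*                 # zero or more chunks of: a hyphen or dot, then word characters
    \.                         # a literal dot (escaped, so it is not the any-character wildcard)
    \w+                        # one or more word characters
    ([-.]\w+)*                 # zero or more chunks of: a hyphen or dot, then word characters
""", re.VERBOSE)

def is_match(address):
    """Return True if the whole address matches the pattern."""
    return EMAIL.fullmatch(address) is not None

print(is_match("john.smith@example.com"))        # True
print(is_match("first+tag@mail.example.co.uk"))  # True
print(is_match("user@localhost"))                # False: no dot after the at sign
print(is_match(".user@example.com"))             # False: must start with a word character
print(is_match("no-at-sign.example.com"))        # False: no at sign
```

Note that \w covers letters, digits and the underscore, so the pattern is only an approximation of what a real email address may contain.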
[-+!$%&*/=?{|}.']    -> one of these characters.
The link I posted, which I also use as a reference myself, explains and breaks down every regular expression.
If you are not satisfied with the answer, you can request the help of a moderator.