Solved

Regular Expression

Posted on 2004-08-06
Last Modified: 2008-01-09
I have the following regular expression for email addresses that works pretty well.

^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$

How do I modify it to allow only email addresses within a specific domain and its subdomains?

For example:

email@mydomain.com - OK
email@test.mydomain.com - OK
email@test1.test2.test3.mydomain.com - OK
email@yahoo.com - BAD

So it must end with mydomain.com

Thanks.
Question by:mrichmon
28 Comments
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11737902
Why not use a second regex?
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11737928
Why, when the above can be modified?

I am very close.  I now have :

^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.com)+$

Which only allows ones ending in .com

I also tried

^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(mydomain\.com)+$

Which recognizes all of the subdomains, but not the main domain....
0
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11737929
something like:
^.+@.+mydomain\.com$
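
As a rough check of this suggestion, here is a minimal Python sketch. Python is used only for illustration (the thread does not name it), and `accepts` is a hypothetical helper. Because `.+` after the `@` needs at least one character, the bare domain appears to be rejected, while any host that merely ends in "mydomain.com" passes.

```python
import re

# Pattern from the comment above, unchanged.
LOOSE = re.compile(r"^.+@.+mydomain\.com$")


def accepts(address: str) -> bool:
    """Return True if the whole address matches LOOSE."""
    return LOOSE.fullmatch(address) is not None


# One entry per sample address: does the loose pattern accept it?
results = {
    "email@mydomain.com": accepts("email@mydomain.com"),
    "email@test.mydomain.com": accepts("email@test.mydomain.com"),
    "test@testmydomain.com": accepts("test@testmydomain.com"),
}
```

So this pattern would still need a way to make the subdomain part optional and to require a separator before `mydomain`.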
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11737940
^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(mydomain\.com)+$ also has the problem of recognizing

test@testmydomain.com as OK when it should be BAD
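
The two symptoms reported for this attempt (main domain rejected, look-alike domain accepted) can be reproduced with a short sketch. It is written in Python purely for illustration, and `accepts` is a made-up helper name. The `\w+` directly after `@` must consume at least one character before `mydomain\.com`, and nothing requires a separator between the two.

```python
import re

# The asker's second attempt, copied as posted.
ATTEMPT = re.compile(r"^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(mydomain\.com)+$")


def accepts(address: str) -> bool:
    """Return True if the whole address matches ATTEMPT."""
    return ATTEMPT.fullmatch(address) is not None


# "\w+" after the @ cannot be empty, so nothing is left for "mydomain.com".
main_domain_ok = accepts("email@mydomain.com")

# "\w+" can match just "test", leaving "mydomain.com" for the last group.
lookalike_ok = accepts("test@testmydomain.com")
```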
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11737957
Okay I think this does it :

^\w+([\.-]?\w+)*@(\w+[\.-])*(mydomain\.com)+$
0
 
LVL 6

Expert Comment

by:ren_b
ID: 11740131
You might want to change it to: ^\w+([\.-]?\w+)*@((\w+[\.-])+\w+\.)?mydomain\.com$
unless bob@mydomain.commydomain.commydomain.com or someone@blah-mydomain.com are valid.
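
To compare how the two patterns treat the addresses named here, a small Python sketch follows. It is an illustration rather than part of the original answers, and the variable names are hypothetical.

```python
import re

# The asker's pattern from the previous comment.
CURRENT = re.compile(r"^\w+([\.-]?\w+)*@(\w+[\.-])*(mydomain\.com)+$")
# The stricter alternative proposed in this comment.
STRICTER = re.compile(r"^\w+([\.-]?\w+)*@((\w+[\.-])+\w+\.)?mydomain\.com$")

samples = [
    "bob@mydomain.commydomain.commydomain.com",  # "(mydomain\.com)+" repeats
    "someone@blah-mydomain.com",                 # "[\.-]" lets "-" act as a separator
]

# Addresses each pattern accepts in full.
current_hits = [a for a in samples if CURRENT.fullmatch(a)]
stricter_hits = [a for a in samples if STRICTER.fullmatch(a)]
```

Under these assumptions the asker's pattern accepts both samples and the stricter one rejects both, so which to use depends on whether hyphen-joined hosts should count as valid.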
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11740699
someone@blah-mydomain.com is valid

bob@mydomain.commydomain.commydomain.com   should not be valid
0
 
LVL 6

Accepted Solution

by:
ren_b earned 500 total points
ID: 11740717
Then all you have to do is delete three characters from yours :)
^\w+([\.-]?\w+)*@(\w+[\.-])*mydomain\.com$
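
One way to confirm this pattern against every address mentioned so far is a table-driven check. The sketch below uses Python's `re` module purely for illustration (the asker targets other languages), and the expected values are taken from the asker's own OK/BAD statements. `fullmatch` is used so the whole string must match.

```python
import re

# Accepted pattern: the group and "+" around "mydomain\.com" are gone.
ACCEPTED = re.compile(r"^\w+([\.-]?\w+)*@(\w+[\.-])*mydomain\.com$")

# Address -> whether the asker wants it accepted.
expected = {
    "email@mydomain.com": True,
    "email@test.mydomain.com": True,
    "email@test1.test2.test3.mydomain.com": True,
    "email@yahoo.com": False,
    "test@testmydomain.com": False,
    "someone@blah-mydomain.com": True,
    "bob@mydomain.commydomain.commydomain.com": False,
}

actual = {addr: ACCEPTED.fullmatch(addr) is not None for addr in expected}

# Any address whose result differs from what the asker asked for.
mismatches = {addr for addr in expected if actual[addr] != expected[addr]}
```

An empty `mismatches` set means the pattern agrees with every requirement stated in the thread.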
0
 
LVL 8

Expert Comment

by:adg080898
ID: 11741082
If you really want to enforce "@mydomain.com":

^\w+([\.-]?\w+)*@mydomain\.com$

Is this perl? Watch out for the @ if it is. (Put a backslash before it)

Why worry so much about every character being a word character, or about a period not being at the beginning or the end? Don't worry about the username so much. Your regex won't accept my email address, which contains digits. Why not this?

^([^@: ]*)@mydomain\.com$

It enforces that the username does not contain spaces, or colons or @ characters and captures the username. Then it enforces that it is followed by the exact string "@mydomain.com" which must be followed by the end of the string.
You might want to discard and ignore leading and trailing spaces:

^\s*([^@: ]*)@mydomain\.com\s*$

If your program is not going to need to capture the username from the string, you can remove the "(" and ")".

^\s*[^@: ]*@mydomain\.com\s*$
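
The capturing and whitespace handling described above can be sketched as follows. This is Python for illustration only; the sample addresses are invented.

```python
import re

# Capturing variant with optional surrounding whitespace, as posted above.
WITH_CAPTURE = re.compile(r"^\s*([^@: ]*)@mydomain\.com\s*$")

# Leading and trailing whitespace is skipped; group 1 holds the username.
padded = WITH_CAPTURE.fullmatch("  joe123@mydomain.com \t")
username = padded.group(1) if padded else None

# A space inside the username is excluded by the character class.
spaced = WITH_CAPTURE.fullmatch("joe smith@mydomain.com")
```

Dropping the parentheses, as in the last pattern above, changes only what is captured, not which strings match.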

0
 
LVL 8

Expert Comment

by:adg080898
ID: 11741137
I just noticed that you wanted subdomains to work, so:

^\s*[^@: ]*@([^.]+(\.[^.]+)*\.mydomain\.com|mydomain\.com)\s*$

Which language are you using? You may need non-greedy ("ungreedy") matching.
0
 
LVL 6

Expert Comment

by:ren_b
ID: 11741210
adg: yours will match ``i'm*~*preciou$,*~*&*~*i*~*love*~*every*~*boy@mydomain.com'', or even just ``@mydomain.com'' which is of no use, unless he's just searching a list of company emails.
0
 
LVL 84

Expert Comment

by:ozo
ID: 11741778
mrichmon, fred&barney@stonehenge.com is a perfectly valid email address which your regular expression would not match
0
 
LVL 6

Expert Comment

by:ren_b
ID: 11741921
this will match all valid email addresses... :)

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\]
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\]
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)

it may be a valid email, but it may not fit his context.
0
 
LVL 8

Expert Comment

by:adg080898
ID: 11742548
To everybody:
Remember, the asker wants to match addresses ****ending in mydomain.com****, NOT any valid email address!!

To ren_b:

Yes, my regexp would match that ridiculous address. It is easy to add restrictions though, just add more characters to the first character class. The question is, EXACTLY which characters are not allowed? Just because the characters look invalid, doesn't mean they are. The RFC 822 specifications say to pass usernames through unvalidated (unless you can point out something in there that says something different).

You do have a very good point with the "@mydomain.com" comment. That's easy to fix, just change the "*" to a "+":

^\s*[^@: ]+@([^.]+(\.[^.]+)*\.mydomain.com|mydomain\.com)\s*$
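
The effect of changing `*` to `+` can be checked with a short Python sketch (illustrative only). The domain part is written here with every literal dot escaped, which is an editorial assumption about the intent, and the constant names are hypothetical.

```python
import re

# Shared domain part: any subdomain chain, or the bare domain.
DOMAIN = r"@([^.]+(\.[^.]+)*\.mydomain\.com|mydomain\.com)\s*$"

STAR = re.compile(r"^\s*[^@: ]*" + DOMAIN)  # username may be empty
PLUS = re.compile(r"^\s*[^@: ]+" + DOMAIN)  # username needs one character

empty_user = "@mydomain.com"
star_accepts = STAR.fullmatch(empty_user) is not None
plus_accepts = PLUS.fullmatch(empty_user) is not None

# Ordinary addresses are unaffected by the change.
plus_accepts_subdomain = PLUS.fullmatch("joe@test.mydomain.com") is not None
```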
0
 
LVL 4

Expert Comment

by:kolpdc
ID: 11752522
^\w+([\.-]?\w+)*@(?:\w*\.)*mydomain(\.\w{2,3})+$
I did not read the whole thread, but I think this should work. Try the little tool Expresso from http://www.ultrapico.com; it is very helpful.
0
 
LVL 4

Expert Comment

by:kolpdc
ID: 11752556
Sorry, I pasted the wrong expression.
The right one, which also matches the domain, is ^\w+([\.-]?\w+)*@(?:\w*\.)*mydomain\.com$
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11753420
Wow  a lot of comments over the weekend.

Let me try to address some of them.

So far I like ren_b's solution of ^\w+([\.-]?\w+)*@(\w+[\.-])*mydomain\.com$

adg is correct in that I do not want any valid email address - just those ending in mydomain.com
adg is incorrect in that my regex WILL match email expressions with digits - he/she said it would not

ozo : fred&barney@stonehenge.com is not valid in our domain so I don't need to worry about that.  But I have never seen a site that allows an & as part of the email address.  It is considered invalid email address everywhere I have ever seen.  Can you point out somewhere that accepts this email address?  And by somewhere - I mean a major site.

kolpdc - what is the point of the ?: that you added?


One general question - what does the + mean in the regular expression?

Thanks for all the comments.
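
The points about digits and `&` can be checked against the original pattern from the question. The sketch is in Python for illustration, and `user123@example.com` is an invented address. `\w` matches letters, digits, and the underscore (plus other Unicode word characters in Python 3), but not `&`.

```python
import re

# Original general-purpose pattern from the question.
ORIGINAL = re.compile(r"^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$")

# Digits are word characters, so they pass.
digits_ok = ORIGINAL.fullmatch("user123@example.com") is not None

# "&" is not covered by \w, "." or "-", so the username stops matching there.
ampersand_ok = ORIGINAL.fullmatch("fred&barney@stonehenge.com") is not None
```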
0
 
LVL 6

Expert Comment

by:ren_b
ID: 11753507
+ means one or more; it is equivalent to {1,}
0
 
LVL 4

Expert Comment

by:kolpdc
ID: 11753743
+ one or more
? zero or one

Do you mean after the @? It's a non-capturing group, "(?:)". It takes all alphanumerics plus "." (the subdomains).
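
The quantifiers discussed here can be compared side by side. This is a minimal Python sketch with invented test strings; each list holds the results for "a", "ab", and "abbb" in that order.

```python
import re

tests = ("a", "ab", "abbb")

# "+": one or more b's
plus = [bool(re.fullmatch(r"ab+", s)) for s in tests]      # [False, True, True]
# "?": zero or one b
optional = [bool(re.fullmatch(r"ab?", s)) for s in tests]  # [True, True, False]
# "*": zero or more b's
star = [bool(re.fullmatch(r"ab*", s)) for s in tests]      # [True, True, True]
# "{1,}": the long form of "+"
braces = [bool(re.fullmatch(r"ab{1,}", s)) for s in tests]
```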
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11753829
Ah, thanks.  I knew about ? and * but now + makes sense too.

However, I am still confused about the ?:

What would

^\w+([\.-]?\w+)*@(?:\w*\.)*mydomain\.com$

catch that

^\w+([\.-]?\w+)*@(\w+[\.-])*mydomain\.com$

would not?


0
 
LVL 8

Expert Comment

by:adg080898
ID: 11754937
I think ?: means "don't capture". Don't put it in $1 or $2 etc...
0
 
LVL 8

Expert Comment

by:adg080898
ID: 11754947
Are you using Perl? Regular expressions do vary in dialect.
0
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11755017
maybe my suggestion of using 2 regex is still valid.
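
The thread never spells out what the two regexes would be, so the following Python sketch is only one possible reading: the first pattern is the general one from the question, and `DOMAIN_ONLY` and `allowed` are hypothetical names for a second, domain-only check.

```python
import re

# Step 1: general address shape (pattern from the question).
GENERAL = re.compile(r"^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$")
# Step 2 (assumed): host is mydomain.com, optionally behind "label." or "label-" parts.
DOMAIN_ONLY = re.compile(r"@(\w+[\.-])*mydomain\.com$")


def allowed(address: str) -> bool:
    """Accept only addresses that pass both checks."""
    return bool(GENERAL.fullmatch(address)) and bool(DOMAIN_ONLY.search(address))


subdomain_ok = allowed("email@test.mydomain.com")
other_domain_ok = allowed("email@yahoo.com")
lookalike_ok = allowed("test@testmydomain.com")
```

Splitting the checks keeps the general pattern reusable, at the cost of running two matches per address.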
0
 
LVL 6

Expert Comment

by:ren_b
ID: 11755183
Well, ^\w+([\.-]?\w+)*@(?:\w*\.)*mydomain\.com$ would match blah.lbah@.mydomain.com, whereas the second pattern would reject it. It would only put .lbah into $1. It generally doesn't matter whether you use ?: if you're not capturing parts of the string.
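
The difference described here can be reproduced with a short Python sketch (illustrative only; Python exposes `$1` as `group(1)`, and the variable names are hypothetical). It also suggests one difference not mentioned above: `\w*\.` has no hyphen, so a hyphen-joined host fails the non-capturing pattern.

```python
import re

NON_CAPTURING = re.compile(r"^\w+([\.-]?\w+)*@(?:\w*\.)*mydomain\.com$")
ACCEPTED = re.compile(r"^\w+([\.-]?\w+)*@(\w+[\.-])*mydomain\.com$")

# "\w*" may be empty, so a lone "." after the @ is accepted by the first pattern.
odd = "blah.lbah@.mydomain.com"
loose_match = NON_CAPTURING.fullmatch(odd)
strict_match = ACCEPTED.fullmatch(odd)

# Only the username group captures; "(?: )" adds no group.
loose_groups = loose_match.groups() if loose_match else None

# The accepted pattern has two capturing groups.
normal = ACCEPTED.fullmatch("blah.lbah@sub.mydomain.com")
normal_groups = normal.groups() if normal else None

# Hyphen-joined host, which the asker considers valid.
hyphen = "someone@blah-mydomain.com"
hyphen_loose = NON_CAPTURING.fullmatch(hyphen) is not None
hyphen_strict = ACCEPTED.fullmatch(hyphen) is not None
```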
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11755334
I am not using perl.

I am writing a regex that can be used across multiple languages.  Right now the one I have works across ASP, ASP.NET, JavaScript, and Cold Fusion.
0
 
LVL 6

Expert Comment

by:ren_b
ID: 11755437
JavaScript and ASP should be fine with Perl-style regexes; not sure about Cold Fusion.
0
 
LVL 35

Author Comment

by:mrichmon
ID: 11755672
I know that all of the aforementioned ones can accept the same regular expressions.

0
 
LVL 4

Expert Comment

by:kolpdc
ID: 11760071
No problem. I think all of the ones you mentioned will interpret regexes the same way. It does not matter whether you are using Perl, PHP, or something from .NET.
0
