

Regex Explanation

Posted on 2013-12-09
Medium Priority
Last Modified: 2013-12-10
Can anybody please explain what this regex accepts? Please break it down.

Question by:MacroShadow
LVL 11

Expert Comment

ID: 39705646
This looks email-related to me, although I'm not convinced it's a correct email validation.

From left to right:
- Any word
- A dot followed by a word (any number of times)
- The @ sign

...and then it gets a bit funky... I think the top right part is essentially:

- Any word followed by a dot (any number of times)
- a 2-4 len word

...and the bottom right is an IP address

- 1-3 digits followed by a dot (3 times)
- 1-3 digits
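
The left-hand side described above ("any word", then "a dot followed by a word" any number of times) can be sketched in Python as follows. The character class is copied from the annotated breakdown later in this thread; the names ATOM, LOCAL_PART and is_local_part are only illustrative and not part of the original pattern.

```python
import re

# Character class taken from the annotated breakdown later in this thread:
# word characters (letters, digits, underscore) plus some punctuation.
ATOM = r"[\w!#$%&'*+\-/=?\^_`{|}~]+"

# "Any word", then "a dot followed by a word" zero or more times.
LOCAL_PART = re.compile(ATOM + r"(\." + ATOM + r")*")


def is_local_part(text: str) -> bool:
    """Return True if the whole string matches the part before the @."""
    return LOCAL_PART.fullmatch(text) is not None


for candidate in ["john", "john.smith", "j_s+tag", ".john", "john.", "john..smith"]:
    print(f"{candidate!r}: {is_local_part(candidate)}")
```

The last three candidates are rejected because a dot is only allowed between two "words", never at the start, at the end, or doubled.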
LVL 11

Expert Comment

ID: 39705647
LVL 11

Expert Comment

ID: 39705652
This editor allows you to enter test data and visualise the match.

...it doesn't seem to understand the line-start and line-end anchors in the regex though (the leading ^ and trailing $). It works well with those stripped, i.e.:


LVL 10

Assisted Solution

ienaxxx earned 400 total points
ID: 39705669
Anything starting with " (double quotes), immediately followed (^) by a word of one or more chars (the + at the end) that CAN also contain any of the chars following \w.

The first part CAN, any number of times (not MUST; this is given by the * after the closing ")"), be followed by a DOT (which must be escaped with a backslash "\" because it has a special meaning in regexps) and another word as before.

Must be followed by a "@"

Must be followed by something that can be:
(([\-\w]+\.)+[a-zA-Z]{2,4})  = one or more words (word chars or dashes), each followed by a dot, then two to four UPPERCASE or lowercase letters (example: ".com")
| = OR
(([0-9]{1,3}\.){3}[0-9]{1,3})) = an IP address (a number of one to three digits followed by a dot, exactly 3 times, followed by another number of one to three digits)

Must then END with " (double quotes)
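
To see what the part after the @ accepts, here is a small Python sketch that tests the two alternatives quoted above on their own. The names DOMAIN_NAME, IP_ADDRESS and matches_after_at are made up for this example.

```python
import re

# The two alternatives as quoted in the breakdown above.
DOMAIN_NAME = r"(([\-\w]+\.)+[a-zA-Z]{2,4})"
IP_ADDRESS = r"(([0-9]{1,3}\.){3}[0-9]{1,3})"

# Wrapped in an outer group and joined with | as in the full pattern.
AFTER_AT = re.compile("(" + DOMAIN_NAME + "|" + IP_ADDRESS + ")")


def matches_after_at(text: str) -> bool:
    """Return True if the whole string matches either alternative."""
    return AFTER_AT.fullmatch(text) is not None


samples = [
    "example.com",         # domain alternative
    "mail.example.co.uk",  # several "word." pieces, then 2-4 letters
    "192.168.0.1",         # IP alternative
    "999.999.999.999",     # also accepted: only digit counts are checked
    "example.museum",      # rejected: last part is longer than 4 letters
    "localhost",           # rejected: no dot at all
]
for sample in samples:
    print(f"{sample!r}: {matches_after_at(sample)}")
```

Note that the IP alternative only counts digits, so it does not check that each number is in the 0-255 range.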

Hope this helps
LVL 11

Expert Comment

ID: 39705782
hi.. it is your answer

Expected one of *, +, ?, {, {,, (, [, ., \, $, |, ) at line 1, column 3 (byte 3) after ("
LVL 11

Accepted Solution

Angelp1ay earned 1200 total points
ID: 39706157
@samirbhogayta - I think you've just dumped this into an editor and copied an error message.

@ienaxxx - I'm assuming this regex is a parameter for a function and the (" ") bits are just the function call wrapping it. The ^ at the start and $ at the end are too convenient to be anything else: these are the start-of-line and end-of-line anchors.

Let me extend my answer more fully:

- start of line (or of the string you're comparing to)

- any of the items between the square brackets, with the plus meaning 1 or more times (i.e. a string at least 1 char long)
- "\w" stands for "word character", usually equivalent to this set of chars [A-Za-z0-9_]

- the outer brackets define a group with the * meaning the whole group can repeat 0 or more times
- the first item in this group, "\." is just an escaped "."
- the inner part is exactly as before, basically any string inc. those symbols

- the @ sign, exactly once

- this one is tricky because of the pipe in the middle, it's essentially 2 patterns, either one must match, i.e. it means:


- the first is a dash or word char, one or more times, followed by a dot...
- ...with this whole piece being repeated one or more times...
- ...and finally 2-4 alpha chars
(I think this is meant to represent a web domain e.g. abc-123.mydomain.com)

- the second is any numeric digit, repeated 1-3 times (e.g. 1, 12 or 123)...
- ...followed by a dot...
- ...with this whole piece repeated exactly 3 times...
- and one last sequence of digits without a dot
(I think this is meant to represent an IP e.g.

- end of line (or of the string you're comparing to)
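
The pieces walked through above can be put back together in Python to show what the ^ and $ anchors contribute. This is only a sketch: the pieces are reconstructed from the annotated breakdown in the next answer, and the names (ATOM, LOCAL, DOMAIN, IP, BODY) are made up for illustration.

```python
import re

# Pieces reconstructed from the annotated breakdown in the next answer.
ATOM = r"[\w!#$%&'*+\-/=?\^_`{|}~]+"            # 1+ allowed chars
LOCAL = ATOM + r"(\." + ATOM + r")*"            # word(.word)*
DOMAIN = r"(([\-\w]+\.)+[a-zA-Z]{2,4})"         # name.name.tld
IP = r"(([0-9]{1,3}\.){3}[0-9]{1,3})"           # n.n.n.n
BODY = LOCAL + "@" + "(" + DOMAIN + "|" + IP + ")"

ANCHORED = re.compile("^" + BODY + "$")   # whole string must be an address
UNANCHORED = re.compile(BODY)             # address may appear anywhere

text = "contact: jane.doe@example.com, thanks"

# With the anchors, surrounding text makes the match fail.
print(bool(ANCHORED.search(text)))

# Without them, the address is found inside the longer string.
print(UNANCHORED.search(text).group(0))

# A bare address satisfies the anchored pattern.
print(bool(ANCHORED.search("jane.doe@example.com")))
```

This is why the pattern is useful for validating a single input field with the anchors, and for finding addresses inside larger text without them.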

Assisted Solution

by:Derek Jensen
Derek Jensen earned 400 total points
ID: 39706457
Ok, so I had to do a little creative interpretation of it, but yes, at first glance it does seem to be email-related. Any regexes that have the @ sign in them are almost always email-related.

^                                 -- Beginning of string
[\w!#$%&'*+\-/=?\^_`{|}~]+        -- Look for one or more word chars (letters, digits, underscore) or any of the listed symbols
(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*   -- Look for any number of optional(*) strings consisting of a period followed by one or more(+) of those same chars (and store the last one found in capture group 1)
@                                 -- Find the @ symbol
(                                 -- Capture Group 2
    (                             -- Capture group 3
        ([\-\w]+\.)+              -- Look for at least one group of one or more word or dash chars, followed by a period (store the last string found in capture group 4)
        [a-zA-Z]{2,4}             -- Find between two and four alpha chars
    )                             -- End group 3
    |                             -- Find the above capture group 3, OR:
    (                             -- Capture group 5
        ([0-9]{1,3}\.){3}         -- Find exactly 3 groups of between one and 3 digits, each followed by a period (store the last one in capture group 6)
        [0-9]{1,3}                -- Find between one and 3 digits
    )                             -- End group 5 (Note: this group does not appear to have anything to do with validating emails, so I don't immediately see the relevancy in this expression)
)                                 -- End group 2
$                                 -- End of string


I may have confused the order of the groupings inside group 2; if so, I apologize. Some flavors handle the order of encountered parentheses differently.
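
For what it's worth, the group numbering can be checked in Python, which numbers groups by the position of their opening parenthesis. The pattern below is assembled from the breakdown above; the sample addresses and the variable names are made up for illustration.

```python
import re

# Pattern assembled line by line from the annotated breakdown above.
ATOM = r"[\w!#$%&'*+\-/=?\^_`{|}~]+"
FULL = re.compile(
    "^" + ATOM
    + r"(\." + ATOM + r")*"                 # group 1
    + "@"
    + "("                                   # group 2
    + r"(([\-\w]+\.)+[a-zA-Z]{2,4})"        # groups 3 and 4
    + "|"
    + r"(([0-9]{1,3}\.){3}[0-9]{1,3})"      # groups 5 and 6
    + ")$"
)

by_name = FULL.match("jane.q.doe@mail.example.com")
by_ip = FULL.match("root@10.0.0.1")

# groups() returns groups 1..6 in order; groups that did not take part
# in the match come back as None, and repeated groups keep only their
# last repetition.
print(by_name.groups())
print(by_ip.groups())
```

For the domain-name address, groups 5 and 6 are None and group 4 holds only the last "word." piece ("example."); for the IP address, groups 1, 3 and 4 are None.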
LVL 35

Expert Comment

by:Terry Woods
ID: 39707170
@bigdogman, that's an excellent explanation.

It's worth adding that, given that it looks like we're dealing with an email address, the domain of the address can be an IP address (IPv4, not IPv6) or just a standard domain name, though @ienaxxx already mentioned this.
LVL 28

Author Closing Comment

ID: 39707285
Wow! I wasn't expecting so many detailed explanations. Thank you all.

Expert Comment

by:Derek Jensen
ID: 39709039
@Terry, interesting, I wasn't aware of that. Thanks for the explanation. :-)
