Solved

Pattern matching help - various entries

Posted on 2010-11-14
Last Modified: 2012-05-10
I am looking through some regex docs, trying to come up with an expression that matches any one of the entries below:

3N-2N-4N
6N-5N
2N-7N
2N-7N2A

Any ideas on how I can do that?


Thanks
0
21 Comments
 
LVL 2

Expert Comment

by:cap2501
ID: 34133316
[362]N-[257]N-*[24]*[NA]*

would match any of those

All regex is very similar; as such, I use the Java API docs for reference no matter which language I am using the regex in (http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html)

[]s designate character classes, so [362]N matches a 3, 6, or 2 followed by an N.

* designates zero or more of the preceding element, so -* matches zero, one, two, or more hyphens.

Depending on your entire dataset you may need to refine this pattern to prevent false positives.
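
To make the false-positive caveat concrete, here is a minimal JavaScript sketch. The `junk` string is a made-up input, not something from the question; it only illustrates how an unanchored pattern built from `*` quantifiers can accept more than intended.

```javascript
// The loose character-class pattern from this comment, left unanchored.
const loose = /[362]N-[257]N-*[24]*[NA]*/;

// The four entries from the original question.
const samples = ["3N-2N-4N", "6N-5N", "2N-7N", "2N-7N2A"];
const allSamplesMatch = samples.every((s) => loose.test(s));

// Hypothetical bad input: leading garbage plus a run of trailing hyphens.
const junk = "xx3N-7N---";
const junkMatches = loose.test(junk);

console.log(allSamplesMatch, junkMatches); // both are true
```

Because nothing ties the match to the start or end of the string, any input that merely contains something like `3N-7N` passes.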
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34133328
Or just match against pattern:
^(3N-2N-4N|6N-5N|2N-7N|2N-7N2A)$
0
 
LVL 2

Expert Comment

by:cap2501
ID: 34133340
Regex models a finite state machine one char at a time; as such, trying to match WHOLE_STRING|WHOLE_STRING doesn't work well.


From what I understand, your regex would logically end up as something like: 3N-2N-4(N or 6)N-5(N or 2)N-7N2A

To match the whole string you would need to use lookahead logic in the regex, which is a bit more complicated.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34133371
>>  All regex is very similar ...

Alas young padawan, nothing could be further from the truth. There are many different engines in the wild.

Regex is all about patterns. Are you defining your pattern to be those characters above, or should this be more abstract (e.g. alpha num - alpha num - alpha num)?
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34133400
^ and $ in my regex match the start and end of the string, respectively. That technique might not be the most efficient, but it definitely works and is very easy to understand and thus easy to maintain.

However, kaufmed's comment is right on the mark - you might be wanting to match a pattern rather than just that list of values.
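
For anyone who wants to check the anchored alternation, a rough JavaScript sketch follows. The rejected inputs are invented for illustration; the pattern itself is the one posted earlier in the thread.

```javascript
// Anchored alternation: the whole string must equal one of the listed values.
const exact = /^(3N-2N-4N|6N-5N|2N-7N|2N-7N2A)$/;

// The four entries from the question should all be accepted.
const accepted = ["3N-2N-4N", "6N-5N", "2N-7N", "2N-7N2A"].map((s) => exact.test(s));

// Hypothetical near-misses: truncated, prefixed, or incomplete values.
const rejected = ["2N-7N2", "x6N-5N", "3N-2N"].map((s) => exact.test(s));

console.log(accepted, rejected);
```

The parentheses matter: they make ^ and $ apply to every alternative rather than only the first and last.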
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34136765
Thank you all for the responses. I should have been more specific. These are the formats I am looking to match:

333-24-2352 (SSN)
123456-12345
12-1234567
12-1234567AA (two alpha at end)

@cap thanks for the explanation, it helped break it down

Thoughts?
0
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 250 total points
ID: 34136822
That looks like a combination of SSN, FEIN, and some other codes. For that group, I'd suggest:
\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([A-Z]{2})?


0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34136967
Hey kaufmed, I tried that and my validation doesn't seem to be passing.

I'm going to check my code again. I am using a jQuery validation custom method with the regex above.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34137261
Perhaps you are passing "aa" rather than "AA"? I made it case-sensitive going by your previous data sample, but you could make it insensitive by expanding the range in the last group:
\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?


0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34137293
Thanks! I keep trying the SSN format as well and it's not working:

123-12-1231

Maybe it's user error? haha

Just for the future where do you specify case? Here?

[a-zA-Z]
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34137347
You can define a range in a bracket expression by inserting a hyphen between the start and end characters. In the above, there are actually two ranges: one from "a" to "z" and one from "A" to "Z". The combination of these two ranges makes the expression case-insensitive. Alternatively, you could use pattern modifiers to turn on case-insensitivity, but they usually apply to the entire expression (which wouldn't really affect this particular pattern). Here is an example of a pattern modifier:
// The "i" at the end of the pattern (after the closing slash)
//  turns on case-insensitivity so either "a-z" or "A-Z" 
//  would match alpha characters of any case.
"12-1234567AA".match(/\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-z]{2})?/i);

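As a quick way to compare the two approaches, here is a small JavaScript sketch. It isolates the last alternative and adds ^ and $ anchors, which is my assumption so the optional suffix actually decides the result; the variable names are made up for the example.

```javascript
// Last alternative only, anchored (an assumption) so the suffix matters.
const sensitive = /^\d\d-\d{7}([A-Z]{2})?$/;   // uppercase suffix only
const widened = /^\d\d-\d{7}([a-zA-Z]{2})?$/;  // two ranges in one bracket expression
const flagged = /^\d\d-\d{7}([a-z]{2})?$/i;    // "i" modifier on the whole pattern

const lower = "12-1234567aa";
const results = {
  sensitive: sensitive.test(lower),
  widened: widened.test(lower),
  flagged: flagged.test(lower),
};

console.log(results);
```

With a lowercase suffix, only the case-sensitive version should fail; the widened range and the `i` modifier behave the same for this pattern.
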

0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34137398
You may also need to bound the pattern as TerryAtOpus demonstrated. If you are passing simply one of the above strings, bounding the pattern with ^ (start of line/string) and $ (end of line/string) might suffice. If you are passing this as part of some larger string, you can use word boundaries (\b) to bound the pattern. Here are examples of both:
// Whole string, use ^ and $
^\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?$

// As a substring, use \b
\b\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?\b


0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34137431
Correction:

Grouping (parentheses) is needed for the bounds to function correctly:
// Whole string, use ^ and $
^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$

// As a substring, use \b
\b(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)\b
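
To see why the grouping is needed, here is a short JavaScript sketch comparing the ungrouped and grouped whole-string patterns. The `padded` input is a made-up string with junk on both sides.

```javascript
// Without parentheses, ^ binds only to the first alternative and $ only to the last.
const ungrouped = /^\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?$/;
// With parentheses, both anchors apply to every alternative.
const grouped = /^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$/;

// Hypothetical input: a valid 6-5 code surrounded by junk.
const padded = "xx123456-12345xx";
const ungroupedResult = ungrouped.test(padded); // middle alternative is unanchored
const groupedResult = grouped.test(padded);

// The four formats from the question.
const valid = ["333-24-2352", "123456-12345", "12-1234567", "12-1234567AA"];
const validResults = valid.map((s) => grouped.test(s));

console.log(ungroupedResult, groupedResult, validResults);
```

The ungrouped version accepts the padded string because its middle alternative can match anywhere, while the grouped version rejects it and still accepts all four formats.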


0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34164377
Thanks for this information.

That code will validate for:

333-24-2352 (SSN)
123456-12345
12-1234567
12-1234567AA (two alpha at end)

Right?

Thanks for the quick explanation. I have always been confused about regex.
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34164615
Just tried this reg ex at this site: http://www.regular-expressions.info/javascriptexample.html

REG EX: ^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$

It tested true for: 12-1234567AA - which is correct?

What if I'm trying to validate for all the types above?

Thanks again
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34165573
>>  What if I'm trying to validate for all the types above?

I don't understand the question. The pattern supplied will return true if the source data matches any of the alternatives, but you won't actually get an indication of which sub-pattern matched. For that, AFAIK, you will need to create a separate regex search for each different sub-pattern.
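
One way the separate-search idea could look in JavaScript is sketched below. The format names (`ssn`, `six-five`, and so on) and the `classify` helper are hypothetical labels invented for this example, not anything from the asker's code.

```javascript
// One anchored pattern per format; the names are made-up labels.
const formats = [
  { name: "ssn", pattern: /^\d{3}-\d\d-\d{4}$/ },
  { name: "six-five", pattern: /^\d{6}-\d{5}$/ },
  { name: "two-seven", pattern: /^\d\d-\d{7}$/ },
  { name: "two-seven-alpha", pattern: /^\d\d-\d{7}[a-zA-Z]{2}$/ },
];

// Returns the label of the first format that matches, or null if none do.
function classify(input) {
  const hit = formats.find((f) => f.pattern.test(input));
  return hit ? hit.name : null;
}

console.log(classify("333-24-2352"), classify("12-1234567AA"), classify("nope"));
```

Since the four formats cannot overlap once anchored, the order of the list does not change the result here.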
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34168628
I understand now. I made changes to my validation method to run an if else.

I have:

^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$ = 12-1234567AA
^\d{3}-\d{2}-\d{4}$ = 333-35-1361 (SSN)

Now for these last two:

123456-12345
12-1234567

For this one: 12-1234567 - would it be:

^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7})$
0
 
LVL 35

Assisted Solution

by:Terry Woods
Terry Woods earned 250 total points
ID: 34168645
The last pattern you specify will also match:
123-12-1234
unless you reduce it to:
^(\d{6}-\d{5}|\d\d-\d{7})$
which just matches the formats:
123456-12345
12-1234567
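
A quick JavaScript check of the difference between the two patterns, as a sketch (the variable names are made up):

```javascript
// The asker's last pattern, which still includes the SSN alternative.
const withSsn = /^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7})$/;
// The reduced pattern with the SSN alternative dropped.
const withoutSsn = /^(\d{6}-\d{5}|\d\d-\d{7})$/;

const ssnLike = "123-12-1234";
const comparison = {
  withSsn: withSsn.test(ssnLike),
  withoutSsn: withoutSsn.test(ssnLike),
};

// The two remaining formats should still pass the reduced pattern.
const remaining = ["123456-12345", "12-1234567"].map((s) => withoutSsn.test(s));

console.log(comparison, remaining);
```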
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34168713
Oh nice so: ^(\d{6}-\d{5}|\d\d-\d{7})$

Will match?

123456-12345
12-1234567

I will give this a run.

Thank you!
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34168730
Yes, exactly
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34168770
Works like a charm! I finally understand regex a lot better after this question. Thanks, all.

0
