Solved

Pattern matching help - various entries

Posted on 2010-11-14
Last Modified: 2012-05-10
I have been looking through some regex docs, trying to come up with an expression that matches any one of the entries below:

3N-2N-4N
6N-5N
2N-7N
2N-7N2A

Any ideas on how I can do that?


Thanks
 
LVL 2

Expert Comment

by:cap2501
ID: 34133316
[362]N-[257]N-*[24]*[NA]*

would match any of those

All regex is very similar, so I use the Java API docs for reference no matter which language I am writing the regex in (http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html)

[]s designate character classes, so [362]N matches a 3, 6, or 2 followed by an N.

* means zero or more of the previous item, so -* matches 0, 1, 2, ... hyphens.

Depending on your entire dataset you may need to refine this pattern to prevent false positives.
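
To make the false-positive warning concrete, here is a minimal sketch that runs the pattern above against the four requested entries and against a few extra inputs. The extra inputs are made up for illustration and are not from the question. Because the pattern is not anchored, it also accepts strings that merely contain a match.

```javascript
// The pattern exactly as posted above (no ^ or $ anchors).
const pattern = /[362]N-[257]N-*[24]*[NA]*/;

// The four entries from the question.
const wanted = ["3N-2N-4N", "6N-5N", "2N-7N", "2N-7N2A"];
const wantedResults = wanted.map((s) => pattern.test(s));

// Hypothetical inputs that are NOT in the list but still pass,
// because the classes mix digits freely and nothing bounds the match.
const unwanted = ["3N-7N", "6N-2N---", "xx2N-5Nyy"];
const unwantedResults = unwanted.map((s) => pattern.test(s));

console.log(wantedResults);   // every entry is true
console.log(unwantedResults); // every entry is also true
```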
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34133328
Or just match against pattern:
^(3N-2N-4N|6N-5N|2N-7N|2N-7N2A)$
0
 
LVL 2

Expert Comment

by:cap2501
ID: 34133340
Regex models a finite state machine one char at a time; as such, trying to match WHOLE_STRING|WHOLE_STRING doesn't work well.


From what I understand, your regex would logically end up as something like: 3N-2N-4(N or 6)N-5(N or 2)N-7N2A

To match the whole string you would need to use "look-ahead" logic in the regex, which is a bit more complicated.
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 34133371
>>  All regex is very similar ...

Alas young padawan, nothing could be further from the truth. There are many different engines in the wild.

Regex is all about patterns. Are you defining your pattern to be those characters above, or should this be more abstract (e.g. alpha num - alpha num - alpha num)?
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34133400
^ and $ in my regex match the start and end of the string, respectively. That technique might not be the most efficient, but it definitely works and is very easy to understand and thus easy to maintain.

However, kaufmed's comment is right on the mark - you might be wanting to match a pattern rather than just that list of values.
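
For anyone following along, a quick sketch of the anchored alternation in JavaScript. The rejected inputs are invented for illustration. Each alternative inside the parentheses is tried as a whole, and ^ and $ force the match to cover the entire string.

```javascript
// Anchored alternation from the earlier comment: one literal per allowed value.
const exact = /^(3N-2N-4N|6N-5N|2N-7N|2N-7N2A)$/;

// All four listed values are accepted.
const accepted = ["3N-2N-4N", "6N-5N", "2N-7N", "2N-7N2A"].every((s) => exact.test(s));

// Hypothetical near-misses: none of these is accepted.
const anyRejectedPasses = ["3N-7N", "2N-7N2AX", "x6N-5N", ""].some((s) => exact.test(s));

console.log(accepted);          // true
console.log(anyRejectedPasses); // false
```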
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34136765
Thank you all for the responses. I should have been more specific. I am looking to match:

333-24-2352 (SSN)
123456-12345
12-1234567
12-1234567AA (two alpha at end)

@cap thanks for the explanation, it helped break it down

Thoughts?
0
 
LVL 74

Accepted Solution

by:
käµfm³d   👽 earned 250 total points
ID: 34136822
That looks like a combination of SSN, FEIN, and some other codes. For that group, I'd suggest:
\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([A-Z]{2})?

0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34136967
Hey kaufmed, I tried that and my validation doesn't seem to be passing.

Gonna check my code again, I am using jQuery validation custom method and this regex above
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 34137261
Perhaps you are passing "aa" rather than "AA"? I made it case-sensitive going by your previous data sample, but you could make it insensitive by expanding the range in the last group:
\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?

0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34137293
Thanks! I keep trying the SSN format as well and it's not working:

123-12-1231

Maybe it's user error? haha

Just for the future, where do you specify case? Here?

[a-zA-Z]
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 34137347
You can define a range in a bracket expression by inserting a hyphen between the start and end characters. In the above, there are actually two ranges: one from "a" to "z" and one from "A" to "Z". The combination of these two ranges makes the expression case-insensitive. Alternatively, you could use pattern modifiers to turn on case-insensitivity, but they usually apply to the entire expression (which wouldn't really affect this particular pattern, since digits and hyphens have no case). Here is an example of a pattern modifier:
// The "i" at the end of the pattern (after the closing slash)
//  turns on case-insensitivity, so either "a-z" or "A-Z"
//  would match alpha characters of any case.
"12-1234567AA".match(/\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-z]{2})?/i);

0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 34137398
You may also need to bound the pattern as TerryAtOpus demonstrated. If you are passing simply one of the above strings, bounding the pattern with ^ (start of line/string) and $ (end of line/string) might suffice. If you are passing this as part of some larger string, you can use word boundaries (\b) to bound the pattern. Here are examples of both:
// Whole string, use ^ and $
^\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?$

// As a substring, use \b
\b\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?\b

0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 34137431
Correction:

Grouping (parentheses) is needed for the bounds to function correctly:
// Whole string, use ^ and $
^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$

// As a substring, use \b
\b(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)\b
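
To show why the grouping matters, here is a small sketch comparing the ungrouped and grouped whole-string patterns. The junk inputs are made up for illustration. Without parentheses, ^ binds only to the first alternative and $ only to the last, so the middle alternative is not bounded at all.

```javascript
// Ungrouped: parsed as (^A) | (B) | (C$), so the anchors do not apply to every alternative.
const ungrouped = /^\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?$/;

// Grouped: parsed as ^(A|B|C)$, so every alternative must span the whole string.
const grouped = /^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$/;

// Hypothetical bad inputs with extra characters around a valid-looking core.
const junk = ["x123456-12345x", "123-12-1231 extra", "xx12-1234567"];
const ungroupedOnJunk = junk.map((s) => ungrouped.test(s));
const groupedOnJunk = junk.map((s) => grouped.test(s));

// The four sample values from the question.
const valid = ["333-24-2352", "123456-12345", "12-1234567", "12-1234567AA"];
const groupedOnValid = valid.map((s) => grouped.test(s));

console.log(ungroupedOnJunk); // [true, true, true]  (junk slips through)
console.log(groupedOnJunk);   // [false, false, false]
console.log(groupedOnValid);  // [true, true, true, true]
```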

0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34164377
Thanks for this information.

That code will validate for:

333-24-2352 (SSN)
123456-12345
12-1234567
12-1234567AA (two alpha at end)

Right?

Thanks for the quick explanation. I have always been confused about regex.
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34164615
Just tried this reg ex at this site: http://www.regular-expressions.info/javascriptexample.html

REG EX: ^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$

It tested true for: 12-1234567AA - which is correct, right?

What if I'm trying to validate for all the types above?

Thanks again
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 34165573
>>  What if im trying to validate for all the types above?

I don't understand the question. The pattern supplied will return true if the source data matches any of the alternatives, but you won't actually get an indication of which sub-pattern matched. For that, AFAIK, you will need to create a separate regex search for each different sub-pattern.
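
The separate-search idea could be sketched roughly as follows. The `formats` table, its labels, and the `classify` function are hypothetical names invented for this example, not part of any library. Each format gets its own anchored regex, and the first one that matches determines the label.

```javascript
// Hypothetical table: one anchored regex per format from the question.
const formats = [
  { name: "3-2-4 digits", regex: /^\d{3}-\d\d-\d{4}$/ },
  { name: "6-5 digits", regex: /^\d{6}-\d{5}$/ },
  { name: "2-7 digits plus 2 letters", regex: /^\d\d-\d{7}[a-zA-Z]{2}$/ },
  { name: "2-7 digits", regex: /^\d\d-\d{7}$/ },
];

// Returns the label of the first format that matches, or null if none does.
function classify(value) {
  const hit = formats.find((f) => f.regex.test(value));
  return hit ? hit.name : null;
}

console.log(classify("333-24-2352"));  // "3-2-4 digits"
console.log(classify("12-1234567AA")); // "2-7 digits plus 2 letters"
console.log(classify("not an id"));    // null
```

Keeping the patterns in a list means a new format is a one-line addition, and the validation code can branch on the returned label instead of re-testing the input.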
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34168628
I understand now. I made changes to my validation method to run an if else.

I have:

^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$ = 12-1234567AA
^\d{3}-\d{2}-\d{4}$ = 333-35-1361 (SSN)

Now for these last two:

123456-12345
12-1234567

For this one: 12-1234567 - would it be:

^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7})$
0
 
LVL 35

Assisted Solution

by:Terry Woods
Terry Woods earned 250 total points
ID: 34168645
The last pattern you specify will also match:
123-12-1234
unless you reduce it to:
^(\d{6}-\d{5}|\d\d-\d{7})$
which just matches the formats:
123456-12345
12-1234567
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34168713
Oh nice so: ^(\d{6}-\d{5}|\d\d-\d{7})$

Will match?

123456-12345
12-1234567

I will give this a run.

Thank you!
0
 
LVL 35

Expert Comment

by:Terry Woods
ID: 34168730
Yes, exactly
0
 
LVL 1

Author Comment

by:catonthecouchproductions
ID: 34168770
Works like a charm! I am finally understanding regex a lot more after this question. Thanks all

0
