  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 248

Pattern matching help - various entries

I have been looking at some regex docs, trying to come up with an expression that matches any one of these:

3N-2N-4N
6N-5N
2N-7N
2N-7N2A

Any ideas on how I can do that?


Thanks
catonthecouchproductions
2 Solutions
 
cap2501 Commented:
[362]N-[257]N-*[24]*[NA]*

would match any of those

All regex is very similar, so I use the Java API docs for reference no matter which language I am using the regex in (http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html).

[]s designate character classes, so [362]N would be either 3, 6, or 2 followed by an N.

* designates zero or more of the previous item, so -* means 0, 1, 2, ... hyphens.

Depending on your entire dataset you may need to refine this pattern to prevent false positives.
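
To make the false-positive caveat concrete, here is a minimal JavaScript sketch that runs the pattern above against the four values from the question and against two strings that are not in the list. The two extra strings are made up for illustration only.

```javascript
// Pattern from the comment above, used unanchored.
const loose = /[362]N-[257]N-*[24]*[NA]*/;

// The four values from the question.
const wanted = ["3N-2N-4N", "6N-5N", "2N-7N", "2N-7N2A"];

// Hypothetical strings that are NOT in the list.
const unwanted = ["6N-2N", "xx3N-5N-yy"];

const wantedResults = wanted.map((s) => loose.test(s));
const unwantedResults = unwanted.map((s) => loose.test(s));

console.log(wantedResults);   // all four are true
console.log(unwantedResults); // both are also true, i.e. false positives
```

Both extra strings pass because the character classes allow digit combinations that never appear in the list, and because nothing ties the match to the start or end of the input.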
 
Terry Woods (IT Guru) Commented:
Or just match against pattern:
^(3N-2N-4N|6N-5N|2N-7N|2N-7N2A)$
 
cap2501 Commented:
Regex models a finite state machine one character at a time; as such, I think trying to match WHOLE_STRING|WHOLE_STRING doesn't work well.


From what I understand, your regex would logically end up as something like: 3N-2N-4(N or 6)N-5(N or 2)N-7N2A

To match the whole string you would need to use "look-ahead" logic in the regex, which is a bit more complicated.

 
käµfm³d 👽 Commented:
>>  All regex is very similar ...

Alas young padawan, nothing could be further from the truth. There are many different engines in the wild.

Regex is all about patterns. Are you defining your pattern to be those characters above, or should this be more abstract (e.g. alpha num - alpha num - alpha num)?
 
Terry Woods (IT Guru) Commented:
^ and $ in my regex match the start and end of the string, respectively. That technique might not be the most efficient, but it definitely works and is very easy to understand and thus easy to maintain.

However, kaufmed's comment is right on the mark - you might be wanting to match a pattern rather than just that list of values.
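
As a quick check of the anchored alternation, here is a small JavaScript sketch. The near-miss strings are hypothetical and only serve to show that ^ and $ force the whole input to equal one of the listed entries.

```javascript
// Anchored alternation from the comment above: the whole string must equal one entry.
const exact = /^(3N-2N-4N|6N-5N|2N-7N|2N-7N2A)$/;

const listed = ["3N-2N-4N", "6N-5N", "2N-7N", "2N-7N2A"];

// Hypothetical near-misses, none of which are in the original list.
const nearMisses = ["2N-7N2", "3N-2N-4N-", "6N-2N"];

const listedOk = listed.every((s) => exact.test(s));
const nearMissHits = nearMisses.filter((s) => exact.test(s));

console.log(listedOk);     // true
console.log(nearMissHits); // []
```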
 
catonthecouchproductions (Author) Commented:
Thank you all for the responses. I should have been more specific. I am looking for:

333-24-2352 (SSN)
123456-12345
12-1234567
12-1234567AA (two alpha at end)

@cap thanks for the explanation, it helped break it down

Thoughts?
 
käµfm³d 👽 Commented:
That looks like a combination of SSN, FEIN, and some other codes. For that group, I'd suggest:
\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([A-Z]{2})?

 
catonthecouchproductions (Author) Commented:
Hey kaufmed, I tried that and my validation doesn't seem to be passing.

I'm going to check my code again. I am using a jQuery validation custom method with the regex above.
 
käµfm³d 👽 Commented:
Perhaps you are passing "aa" rather than "AA"? I made it case-sensitive going by your previous data sample, but you could make it insensitive by expanding the range in the last group:
\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?

 
catonthecouchproductions (Author) Commented:
Thanks! I keep trying the SSN format as well and it's not working:

123-12-1231

Maybe it's user error? haha

Just for the future, where do you specify case? Here?

[a-zA-Z]
 
käµfm³d 👽 Commented:
You can define a range in a bracket expression by inserting a hyphen between the start and end characters. In the above, there are actually two ranges: one from "a" to "z" and one from "A" to "Z". The combination of these two ranges makes the expression case-insensitive. Alternatively, you could use pattern modifiers to turn on case-insensitivity, but they usually apply to the entire expression (which wouldn't really affect this particular pattern). Here is an example of a pattern modifier:
// The "i" at the end of the pattern (after the closing slash)
//  turns on case-insensitivity so either "a-z" or "A-Z"
//  would match alpha characters of any case.
"12-1234567AA".match(/\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-z]{2})?/i);

 
käµfm³d 👽 Commented:
You may also need to bound the pattern as TerryAtOpus demonstrated. If you are passing simply one of the above strings, bounding the pattern with ^ (start of line/string) and $ (end of line/string) might suffice. If you are passing this as part of some larger string, you can use word boundaries (\b) to bound the pattern. Here are examples of both:
// Whole string, use ^ and $
^\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?$

// As a substring, use \b
\b\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?\b

 
käµfm³d 👽 Commented:
Correction:

Grouping (parentheses) is needed for the bounds to function correctly:
// Whole string, use ^ and $
^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$

// As a substring, use \b
\b(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)\b
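
To illustrate why the grouping matters, here is a minimal JavaScript sketch comparing the ungrouped and grouped whole-string patterns. Alternation binds more loosely than the anchors, so without parentheses ^ applies only to the first alternative and $ only to the last. The padded input string is a made-up example.

```javascript
// Ungrouped: ^ applies only to the first alternative, $ only to the last.
const ungrouped = /^\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?$/;

// Grouped: ^ and $ wrap the whole alternation.
const grouped = /^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$/;

// Hypothetical input with junk around a valid-looking middle format.
const padded = "xx123456-12345yy";

console.log(ungrouped.test(padded));       // true: the middle alternative has no anchors
console.log(grouped.test(padded));         // false
console.log(grouped.test("123456-12345")); // true
```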

 
catonthecouchproductions (Author) Commented:
Thanks for this information.

That code will validate for:

333-24-2352 (SSN)
123456-12345
12-1234567
12-1234567AA (two alpha at end)

Right?

Thanks for the quick explanation. I have always been confused about regex.
 
catonthecouchproductions (Author) Commented:
I just tried this regex at this site: http://www.regular-expressions.info/javascriptexample.html

REG EX: ^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$

It tested true for: 12-1234567AA - which is correct?

What if I'm trying to validate for all the types above?

Thanks again
 
käµfm³d 👽 Commented:
>>  What if I'm trying to validate for all the types above?

I don't understand the question. The pattern supplied will return true if the source data matches any of the alternatives, but you won't actually get an indication of which sub-pattern matched. For that, AFAIK, you will need to create a separate regex search for each different sub-pattern.
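
One way to sketch the separate-search approach in JavaScript is shown below. The `formats` table, its labels, and the `classify` helper are all hypothetical names invented for this example; only the sub-patterns themselves come from the thread.

```javascript
// One anchored pattern per format. The labels are made up for this sketch.
const formats = [
  { label: "ssn", pattern: /^\d{3}-\d\d-\d{4}$/ },
  { label: "six-five", pattern: /^\d{6}-\d{5}$/ },
  { label: "two-seven", pattern: /^\d\d-\d{7}$/ },
  { label: "two-seven-alpha", pattern: /^\d\d-\d{7}[a-zA-Z]{2}$/ },
];

// Returns the label of the first format that matches, or null if none does.
function classify(value) {
  const hit = formats.find((f) => f.pattern.test(value));
  return hit ? hit.label : null;
}

console.log(classify("333-24-2352"));  // ssn
console.log(classify("12-1234567AA")); // two-seven-alpha
console.log(classify("12-12345"));     // null
```

Because each pattern is anchored on its own, at most one of them can match a given input, so the order of the table does not affect the result here.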
 
catonthecouchproductions (Author) Commented:
I understand now. I changed my validation method to run an if/else.

I have:

^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7}([a-zA-Z]{2})?)$ = 12-1234567AA
^\d{3}-\d{2}-\d{4}$ = 333-35-1361 (SSN)

Now for these last two:

123456-12345
12-1234567

For this one: 12-1234567 - would it be:

^(\d{3}-\d\d-\d{4}|\d{6}-\d{5}|\d\d-\d{7})$
 
Terry Woods (IT Guru) Commented:
The last pattern you specify will also match:
123-12-1234
unless you reduce it to:
^(\d{6}-\d{5}|\d\d-\d{7})$
which just matches the formats:
123456-12345
12-1234567
 
catonthecouchproductions (Author) Commented:
Oh nice so: ^(\d{6}-\d{5}|\d\d-\d{7})$

Will match?

123456-12345
12-1234567

I will give this a run.

Thank you!
 
Terry Woods (IT Guru) Commented:
Yes, exactly
 
catonthecouchproductions (Author) Commented:
Works like a charm! I am finally understanding regex a lot more after this question. Thanks all
