Solved

Regular expression in C# - capture pattern jan10 or jan08 but not janis

Posted on 2007-11-27
Last Modified: 2010-04-15
I have descriptions that contain strings like:

"Jan09 blah blah blah"

or

"blah blah feb12 blah"

Basically, the string may contain a token in the form MMMYY, where MMM is a three-letter month abbreviation and YY is a two-digit year. This token form may appear 0-2 times in the string (if it appears twice, there will be two months listed, e.g. apr07-oct07).

I need, for each string, to determine which month appears in the string, if any. I want to write a C# method that will take the string and return either null or the three-character month code.

So, if jan07 or jan14 appears, I want it to return jan. But if a similar value that is missing the year digits appears, as in "janis feb08", the test should not return jan. If two months appear, I want the method to return the first month code: "blah apr07-oct07 blah blah" should have a value of "apr".

I figure I need a sequence of regular expressions that test for each month code in sequence - and, if found, note the index location it was found at.  If more than one was found, return the one that has the smaller index location value.

So, I need a c# regular expression test that will return true for both "jan08" and "jan12" , but false for "janis" -- and some way to determine the index it was found at, so I'd get an index location value of 11 and not 0 for this string: "janis blah jan08"  (if I counted it right).

Or any other code that will get the job done.

Thanks!
Question by:MarFarMa

Expert Comment

by:ddrudik
ID: 20360777
(?<!-)(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)

"blah apr07-oct07 blah blah":

Array
(
    [0] => Array
        (
            [0] => apr
        )

)


Author Comment

by:MarFarMa
ID: 20360874
I don't get it. Is (?<!-) C# syntax? It seems more like Perl. Same with the Array construct. I don't begin to understand what it's doing, or how it's related to the first line of code.

If it is C#, then I need baby steps, because I've never seen code like this before, and I don't know how to use it. If it's not, I need help translating it into C#.

Thanks.

Accepted Solution

by:
ddrudik earned 500 total points
ID: 20360936
It's a regex pattern; you need to wrap it in C# code to use it.

Something like:
// requires: using System; using System.Text.RegularExpressions;
Regex reg = new Regex(@"(?<!-)(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)", RegexOptions.IgnoreCase);
MatchCollection matchColl = reg.Matches("blah apr07-oct07 blah blah");
foreach (Match m in matchColl)
{
    // m.Value is the matched month code, e.g. "apr"
    Console.WriteLine(m.Value);
}

Author Comment

by:MarFarMa
ID: 20361119
I just ran it in the debugger - works a treat. How is it that it only matches the first occurrence in the string?

Expert Comment

by:ddrudik
ID: 20361123
The array construct was only to show the matches received with your sample text and my regex pattern; it is not C# code.

Expert Comment

by:ddrudik
ID: 20361145
(?<!-)

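This regex construct is a negative lookbehind. It consumes no characters; it only asserts that the character immediately before the current position is not '-'. Anything following it (our date construct) fails to match if it comes right after a '-'. That is why "oct07" in "apr07-oct07" is skipped and only "apr" is returned.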
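
To see the effect of the lookbehind in isolation, here is a minimal sketch that runs the month pattern with and without (?<!-) on the sample string. The class name LookbehindDemo and the helper FindMonths are illustrative names only, not something from this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: returns every substring the pattern matches, in string order.
    public static List<string> FindMonths(string pattern, string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
        {
            found.Add(m.Value);
        }
        return found;
    }

    public static void Main()
    {
        const string months = "(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)";
        const string input = "blah apr07-oct07 blah blah";

        // With the negative lookbehind, "oct" is rejected because a '-' precedes it.
        Console.WriteLine(string.Join(",", FindMonths(@"(?<!-)" + months + @"(?=\d\d)", input)));

        // Without it, both months match.
        Console.WriteLine(string.Join(",", FindMonths(months + @"(?=\d\d)", input)));
    }
}
```

The first line printed should be "apr" and the second "apr,oct", which shows that the lookbehind filters matches by what precedes them; it does not stop the search after the first hit.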

Expert Comment

by:ddrudik
ID: 20361148
Thanks for the question and the points.

Author Comment

by:MarFarMa
ID: 20361210
Ok - just tested this:

MatchCollection matchColl = reg.Matches("janis oct07apr07 blah blah");

This returns oct and apr, but since oct is first in the array, I'm still OK if I just take the first element. Was that coincidence, or can I rely on it?

I.e., if I have multiple matches, will the first one in the array be the first in order in the string?

Expert Comment

by:ddrudik
ID: 20361323
If the source could vary, you might be best off matching on:
Regex reg = new Regex(@"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)", RegexOptions.IgnoreCase);

Then, with matchColl = reg.Matches(input) as before, just use matchColl[0].Value (after checking that matchColl.Count > 0), ignoring the remaining matches, if any. The matches will always be in the order found in the string, from start to end.
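
Since only the first month is needed, one way to package this is a small method built on Regex.Match, which returns the leftmost match. This is a sketch, not the code the asker ended up with: the names MonthFinder and FirstMonthCode are made up for illustration, and lowercasing the result is an assumption based on the "return jan" wording in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MonthFinder
{
    // Pattern from the thread (no lookbehind); built once and reused.
    public static readonly Regex MonthToken = new Regex(
        @"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)",
        RegexOptions.IgnoreCase);

    // Returns the first month code that is followed by two digits, or null if there is none.
    // Lowercasing is an assumption; drop ToLowerInvariant() to keep the original casing.
    public static string FirstMonthCode(string description)
    {
        if (description == null) return null;
        Match m = MonthToken.Match(description); // leftmost match only
        return m.Success ? m.Value.ToLowerInvariant() : null;
    }

    public static void Main()
    {
        Console.WriteLine(FirstMonthCode("blah apr07-oct07 blah blah"));  // apr
        Console.WriteLine(FirstMonthCode("janis oct07apr07 blah blah"));  // oct
        Console.WriteLine(FirstMonthCode("janis blah") ?? "(null)");      // (null)

        // Match.Index gives the position asked about in the question.
        Console.WriteLine(MonthToken.Match("janis blah jan08").Index);    // 11
    }
}
```

Note that the pattern has no word boundary, so a month code embedded at the end of a longer word and followed by two digits would also match. If that matters for your data, prefixing the pattern with \b is one option to consider.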