regular expression in c# - capture pattern jan10 or jan08 but not janis

I have descriptions that contain strings like:

"Jan09 blah blah blah"


"blah blah feb12 blah"

Basically the string may contain a token in the form MMMYY, where MMM is a three-letter month abbreviation and YY is a two-digit year. This token may appear 0-2 times in the string (if it appears twice, there will be two months listed, e.g. apr07-oct07).

I need, for each string, to determine which month appears in the string - if any.   I want to write a c# method that will take the string and return either null or the three character month code.

So, if jan07 appears, or jan14 appears, I want it to return jan. But if some other value that is missing the year digits appears, as in "janis feb08", the test should not return jan for it; that string should give feb. If two months appear, I want the method to return the first month code: "blah apr07-oct07 blah blah" should have a value of "apr".

I figure I need a sequence of regular expressions that test for each month code in sequence - and, if found, note the index location it was found at.  If more than one was found, return the one that has the smaller index location value.

So, I need a c# regular expression test that will return true for both "jan08" and "jan12" , but false for "janis" -- and some way to determine the index it was found at, so I'd get an index location value of 11 and not 0 for this string: "janis blah jan08"  (if I counted it right).

Or any other code that will get the job done.

ddrudik Commented:
It's a regex pattern, you would need to include it with c# syntax etc.

Something like:
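// Needs: using System; using System.Text.RegularExpressions;
Regex reg = new Regex(@"(?<!-)(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)", RegexOptions.IgnoreCase);
MatchCollection matchColl = reg.Matches("blah apr07-oct07 blah blah");
foreach (Match m in matchColl)
    Console.WriteLine(m.Value); // prints "apr" only; "oct" follows a '-' and is skipped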

"blah apr07-oct07 blah blah":

    [0] => Array
            [0] => apr


MarFarMa (Author) Commented:
I don't get it.  Is (?<!-) C# syntax?  It seems more like perl.  Same with the Array construct.  I don't begin to understand what it's doing, or how it's related to the first line of code.

If it is C#, then I need baby steps, because I've never seen a code like this before, and I don't know how to use it.  If it's not, I need help to translate it into C#.


MarFarMa (Author) Commented:
I just ran it in the debugger - works a treat.  How is it that it only matches the first occurrence in the string?
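ddrudik Commented:
The array construct was only to show the matches received with your sample text and my regex pattern; it is not C# code.

(?<!-) is a negative lookbehind, which .NET regular expressions support. It says "match only at a position that is not immediately preceded by '-'", and it consumes no characters itself. Anything following it (our date construct) fails to match if it comes right after a '-', which is why oct07 in apr07-oct07 is skipped and only apr is returned.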
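
To make the effect of the lookbehind concrete, here is a minimal sketch that runs the same month pattern with and without (?<!-). The class and method names (LookbehindDemo, MonthsFound) are made up for illustration; only Regex.Matches, RegexOptions.IgnoreCase and Match.Value come from .NET.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LookbehindDemo
{
    public const string Months = "(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)";

    // Pattern from the answer: month must not be preceded by '-'
    // and must be followed by two digits (checked, not consumed).
    public const string WithLookbehind = @"(?<!-)" + Months + @"(?=\d\d)";

    // Same pattern with the lookbehind removed.
    public const string WithoutLookbehind = Months + @"(?=\d\d)";

    // Returns every month code the given pattern finds, in string order.
    public static List<string> MonthsFound(string pattern, string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
        {
            found.Add(m.Value);
        }
        return found;
    }
}
```

For "blah apr07-oct07 blah blah", MonthsFound(LookbehindDemo.WithLookbehind, ...) gives just apr, while MonthsFound(LookbehindDemo.WithoutLookbehind, ...) gives apr and then oct. For "janis blah jan08" both give a single jan, because "janis" has no digits after the month letters.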
Thanks for the question and the points.
MarFarMa (Author) Commented:
Ok - just tested this:

MatchCollection matchColl = reg.Matches("janis oct07apr07 blah blah");

returns oct and apr - but since oct is first in the array, I'm still OK if I just take the first element.  Was that coincidence?  or can I rely on it?

i.e., if I have multiple matches, will the first one in the array always be the first one in the string?
ddrudik Commented:
If the source could vary, you might be best off matching on:
Regex reg = new Regex(@"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)", RegexOptions.IgnoreCase);

Then just use matchColl[0].Value (after checking that matchColl.Count is greater than 0), ignoring the remaining matches, if any.  The matches will always be in the order found in the string, from start to end.
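
Putting that together, the method asked for in the question could look something like the sketch below. The names MonthFinder, FirstMonthCode and FirstMonthIndex are invented here, and lower-casing the result is an assumption based on the question wanting "jan" back; Regex.Match, Match.Success, Match.Value and Match.Index are standard .NET members. Regex.Match returns the leftmost match, so there is no need to build the whole collection.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper class, for illustration only.
public static class MonthFinder
{
    // Month abbreviation that is followed by two digits.
    // The digits are checked by the lookahead but are not part of the match.
    private static readonly Regex MonthToken = new Regex(
        @"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)(?=\d\d)",
        RegexOptions.IgnoreCase);

    // Returns the first month code in the description, or null if there is none.
    // Assumption: the caller wants the code lower-cased ("Jan09" gives "jan").
    public static string FirstMonthCode(string description)
    {
        if (description == null) return null;
        Match m = MonthToken.Match(description); // leftmost match only
        return m.Success ? m.Value.ToLowerInvariant() : null;
    }

    // Returns the zero-based index of that first token, or -1 if there is none.
    public static int FirstMonthIndex(string description)
    {
        Match m = MonthToken.Match(description ?? string.Empty);
        return m.Success ? m.Index : -1;
    }
}
```

With this, FirstMonthCode("janis blah jan08") returns "jan" and FirstMonthIndex gives 11 for the same string, FirstMonthCode("blah apr07-oct07 blah blah") returns "apr", and FirstMonthCode("janis") returns null. Note that the lookahead only checks that two digits follow, so something like "jan2008" would also count; tighten the pattern if that matters for your data.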
Question has a verified solution.