C# regular expression

Hi, experts

I need some help using a regular expression to find {month}+{day} in a text string in C#.

Here is my regular expression:
((J(AN(.*)|UN(.*)|UL(.*)))|FEB(.*)|MAR(.*)|(A(PR(.*)|UG(.*)))|MAY|SEP(.*)|NOV(.*)|DEC(.*)|OCT(.*))\s*(\b0?[1-9]|1[0-9]|2[0-9]|3[0-1]\b)

(Case 1)
It works fine for finding a date in a string like
input:  xxxxxx Apri 21  xxxxx
output: Apri 21

(Case 2)
However, it also does this:
input:  xxxxxx Apri 21  xxxxx   May 13 xxxxxx
output: Apri 21  xxxxx   May 13

I don't want the result in (Case 2), since my goal is to write two regular expressions: one that finds only a single date (like Case 1), and another that parses the date range format (like Case 2). That way I can handle them differently once I know whether I have a single date or a date range.

Right now it looks like my pattern gives overlapping results. Could someone help me with these two regular expressions?

By the way, I use something like OCT(.*) because I want it to work with "OCT", "OCT.", "OCTOBER", and some possible typos. I am not sure whether this is the right approach. Please correct me if I am wrong.

thanks in advance

rmtogetherAsked:

Patrick MatthewsCommented:
Well, keep in mind that "J(AN(.*)" will match "JANE", "JANSEN", "JANZZZ", etc., so you may want to rethink that approach :)

Patrick
0
rmtogetherAuthor Commented:
Hi, matthewspatrick:
Thank you for the reminder.

I assume my input string will never have a non-month word followed by a date number, like "JANE 23", so anything like JANxxxxxxxx + {date number} will be a date in January.

Could you please help me with the overlap issue as well?
0
Patrick MatthewsCommented:
I'd go for something like:

(JAN(\.|UARY)?|FEB(\.|RUARY)?|MAR(\.|CH)?|APR(\.|IL)?|MAY|JUN(\.|E)?|JUL(\.|Y)?|AUG(\.|UST)?|SEP(\.|TEMBER)?|OCT(\.|OBER)?|NOV(\.|EMBER)?|DEC(\.|EMBER)?)\s+(0[1-9]|1\d|2\d|3[0-1])\b

Of course, that will do goofy things like say FEB 31 is OK, so you should couple it with a test to see if the return value is a valid date.
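As a minimal sketch of the validity test mentioned above: once the regex has captured a month and a day, the day can be checked against the real length of that month. The class name `DayCheck`, the helper `IsRealDate`, and the choice to pass the year in explicitly are all assumptions made for illustration; the thread does not say where the year comes from.

```csharp
using System;

static class DayCheck
{
    // Three-letter month prefixes, in calendar order (index + 1 = month number).
    static readonly string[] Months =
    {
        "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
        "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"
    };

    // Hypothetical helper: true when `day` exists in the month that
    // `monthText` starts with ("Feb", "FEB.", "February", ...) for `year`.
    public static bool IsRealDate(string monthText, int day, int year)
    {
        if (monthText == null) return false;

        int index = Array.FindIndex(
            Months,
            m => monthText.StartsWith(m, StringComparison.OrdinalIgnoreCase));
        if (index < 0) return false;

        return day >= 1 && day <= DateTime.DaysInMonth(year, index + 1);
    }

    static void Main()
    {
        Console.WriteLine(IsRealDate("FEB", 31, 2001));      // False
        Console.WriteLine(IsRealDate("February", 29, 2000)); // True
    }
}
```

Because only the first three letters are compared, this accepts the same loose spellings as the regex; the year still decides whether FEB 29 is allowed.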
0

DaveJellisonCommented:
You're having a simple lazy vs. greedy regular expression issue. Please see http://www.regular-expressions.info/repeat.html for more information. Put simply, I think the following will resolve your issue, though I have not tested it on my end:



((J(AN(.+?)|UN(.+?)|UL(.+?)))|FEB(.+?)|MAR(.+?)|(A(PR(.+?)|UG(.+?)))|MAY|SEP(.+?)|NOV(.+?)|DEC(.+?)|OCT(.+?))\s*(\b0?[1-9]|1[0-9]|2[0-9]|3[0-1]\b)
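To make the greedy vs. lazy difference concrete, here is a small sketch using two deliberately reduced patterns. They are illustrative assumptions, not the full patterns from this thread: they only look for APR, and they use a plain `\d{1,2}\b` for the day. `RegexOptions.IgnoreCase` is also an assumption, added so that "Apri" matches APR.

```csharp
using System;
using System.Text.RegularExpressions;

static class GreedyVsLazy
{
    public const string Input = "xxxxxx Apri 21  xxxxx   May 13 xxxxxx";

    // Greedy: (.*) grabs as much as possible, then backs off to the LAST day number.
    public const string Greedy = @"APR(.*)\s(\d{1,2})\b";

    // Lazy: (.*?) grabs as little as possible, so it stops at the FIRST day number.
    public const string Lazy = @"APR(.*?)\s(\d{1,2})\b";

    public static string FirstMatch(string pattern)
    {
        return Regex.Match(Input, pattern, RegexOptions.IgnoreCase).Value;
    }

    static void Main()
    {
        Console.WriteLine(FirstMatch(Greedy)); // Apri 21  xxxxx   May 13
        Console.WriteLine(FirstMatch(Lazy));   // Apri 21
    }
}
```

The greedy version reproduces the overlap from (Case 2), while the lazy version stops at the first date. The day part of the pattern still has to be anchored correctly for the whole number to be captured.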


0
Patrick MatthewsCommented:
Tweak to the pattern:

(JAN(\.|UARY)?|FEB(\.|RUARY)?|MAR(\.|CH)?|APR(\.|IL)?|MAY|JUN(\.|E)?|JUL(\.|Y)?|AUG(\.|UST)?|SEP(\.|TEMBER)?|OCT(\.|OBER)?|NOV(\.|EMBER)?|DEC(\.|EMBER)?)\s+(0?[1-9]|1\d|2\d|3[0-1])\b

As for date ranges, simply have your code interrogate the source string for all matches.  If there is only one match, then it is a single date.  If you have >1 match, then I guess you have a date range.
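A rough sketch of that idea, using the pattern above: ask the regex for all matches and branch on the count. The class name `DateCounter`, the method `Classify`, and the returned labels are hypothetical, and `RegexOptions.IgnoreCase` is an added assumption so that "Apr" matches as well as "APR".

```csharp
using System;
using System.Text.RegularExpressions;

static class DateCounter
{
    // Month + day pattern from the comment above, compiled once.
    static readonly Regex MonthDay = new Regex(
        @"(JAN(\.|UARY)?|FEB(\.|RUARY)?|MAR(\.|CH)?|APR(\.|IL)?|MAY|JUN(\.|E)?|" +
        @"JUL(\.|Y)?|AUG(\.|UST)?|SEP(\.|TEMBER)?|OCT(\.|OBER)?|NOV(\.|EMBER)?|" +
        @"DEC(\.|EMBER)?)\s+(0?[1-9]|1\d|2\d|3[0-1])\b",
        RegexOptions.IgnoreCase);

    // One match is treated as a single date, more than one as a date range.
    public static string Classify(string text)
    {
        int count = MonthDay.Matches(text).Count;
        if (count == 1) return "single date";
        if (count > 1) return "date range";
        return "no date";
    }

    static void Main()
    {
        Console.WriteLine(Classify("xxxxxx Apr 21  xxxxx"));               // single date
        Console.WriteLine(Classify("xxxxxx Apr 21  xxxxx   May 13 xxxxx")); // date range
    }
}
```

Note that this pattern only accepts the standard spellings, so a typo such as "Apri" is not counted.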
0
käµfm³d 👽Commented:
Why not just spell out the expected strings? Yes, there will be a lot of text in the regex, but it will be easier to read later. You could also add any misspellings you expect to receive. Also, I'm not sure why you have such a complicated regex for the numeric portion. All it will validate is that the day falls between 1 and 31; it will not guarantee that the number is valid for the given month. I would suggest taking the two values and trying to parse them with DateTime.Parse (you need to supply a valid year, of course). You would still need to figure out whether the date is supposed to fall in a leap year (in order to pass a correct year to the Parse() function).
((?:JAN|JANU|JANUA|JANUAR|JANUARY)|
(?:FEB|FEBR|FEBRU|FEBRUA|FEBRUAR| FEBRUARY)|
(?:MAR|MARC|MARCH)|
(?:APR|ARPI|APRIL)|
(?:MAY)|
(?:JUN|JUNE)|
(?:JUL|JULY)|
(?:AUG|AUGU|AUGUS|AUGUST)|
(?:SEP|SEPTE|SEPTEM|SEPTEMB|SEPTEMBE|SEPTEMBER)|
(?:OCT|OCTO|OCTOB|OCTOBE|OCTOBER)|
(?:NOV|NOVE|NOVEM|NOVEMB|NOVEMBE|NOVEMBER)|
(?:DEC|DECE|DECEM|DECEMB|DECEMBE|DECEMBER)
\.?)\s*(\d{1,2})


0
rmtogetherAuthor Commented:
Hi, matthewspatrick:

The pattern you gave me
(JAN(\.|UARY)?|FEB(\.|RUARY)?|MAR(\.|CH)?|APR(\.|IL)?|MAY|JUN(\.|E)?|JUL(\.|Y)?|AUG(\.|UST)?|SEP(\.|TEMBER)?|OCT(\.|OBER)?|NOV(\.|EMBER)?|DEC(\.|EMBER)?)\s+(0?[1-9]|1\d|2\d|3[0-1])\b

only parses the second date.
For example, xxxxxx Apri 21 xxx May 21 --> only gives me May 21
0
rmtogetherAuthor Commented:
Hi,  DaveJellison:

I checked your pattern

  ((J(AN(.+?)|UN(.+?)|UL(.+?)))|FEB(.+?)|MAR(.+?)|(A(PR(.+?)|UG(.+?)))|MAY|SEP(.+?)|NOV(.+?)|DEC(.+?)|OCT(.+?))\s*(\b0?[1-9]|1[0-9]|2[0-9]|3[0-1]\b)

It returns incomplete dates. Could you help me with this?

input:  xxxxxx Apri 21 xxx May 21 xxx
It returns
Apr 2
May 2
0
rmtogetherAuthor Commented:


Hi, kaufmed:

Could you please provide me with sample code that uses Parse() with your suggested pattern?
0
Patrick MatthewsCommented:
rmtogether,

Well, "Apri" is a non-standard abbreviation for April.

If you really want "Apri" to stand in for April, then I think kaufmed's approach is the way to go.

Patrick
0
rmtogetherAuthor Commented:
Thank you for all your help.  

In other words, I plan to write the following program:


if  Regex(single date)

 // do something

else Regex (date range)

// do something

end

So I think I want either two regular expressions, or one regular expression where I check whether it returns 1 or 2 results. Could you please help me with this? Thank you.
0
rmtogetherAuthor Commented:
Hi,  matthewspatrick:

I get it.
So your approach is to check how many matches are returned: 1 match is a single date, and 2 matches is a date range?

Could you please give me sample code to check the number of matches? Thank you.
0
käµfm³d 👽Commented:
I propose something like the following. Please note:  there was a spelling mistake in my original pattern (ARPI). I have corrected it below.
using System;
using System.Globalization;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        Match match;
        // "hay 13" does not match any month, so this input yields exactly one match.
        string test = "xxxxxx Apri 21  xxxxx   hay 13 xxxxxx";

        // Group 1 = month (full or truncated spelling), group 2 = day number.
        // Each month needs a trailing | and the optional period sits outside the
        // month group so that it applies to every month, not only December.
        Regex reg = new Regex(@"(?i)((?:JAN|JANU|JANUA|JANUAR|JANUARY)|
                                (?:FEB|FEBR|FEBRU|FEBRUA|FEBRUAR|FEBRUARY)|
                                (?:MAR|MARC|MARCH)|
                                (?:APR|APRI|APRIL)|
                                (?:MAY)|
                                (?:JUN|JUNE)|
                                (?:JUL|JULY)|
                                (?:AUG|AUGU|AUGUS|AUGUST)|
                                (?:SEP|SEPT|SEPTE|SEPTEM|SEPTEMB|SEPTEMBE|SEPTEMBER)|
                                (?:OCT|OCTO|OCTOB|OCTOBE|OCTOBER)|
                                (?:NOV|NOVE|NOVEM|NOVEMB|NOVEMBE|NOVEMBER)|
                                (?:DEC|DECE|DECEM|DECEMB|DECEMBE|DECEMBER))
                                \.?\s*(\d{1,2})", RegexOptions.IgnorePatternWhitespace);

        MatchCollection matches = reg.Matches(test);

        if (matches.Count == 1) // Single date
        {
            match = matches[0];

            if (CheckDateIsValid(match.Groups[1].Value, match.Groups[2].Value, true))
            {
                Console.WriteLine("Found single date!");
            }
        }
        else if (matches.Count == 2) // Date range
        {
            match = matches[0];

            if (CheckDateIsValid(match.Groups[1].Value, match.Groups[2].Value, true))
            {
                match = matches[1];

                if (CheckDateIsValid(match.Groups[1].Value, match.Groups[2].Value, true))
                {
                    Console.WriteLine("Found date range!");
                }
                else
                {
                    Console.WriteLine("No valid dates found!");
                }
            }
            else
            {
                Console.WriteLine("No valid dates found!");
            }
        }
        else
        {
            Console.WriteLine("No valid dates found!");
        }

        Console.ReadKey();
    }

    // Builds a string such as "Apr 21, 2000" and lets DateTime decide whether it is a real date.
    // 2000 is a leap year and 1999 is not, so isLeapYear controls whether Feb 29 is accepted.
    static bool CheckDateIsValid(string month, string day, bool isLeapYear)
    {
        DateTime d;
        string year = isLeapYear ? "2000" : "1999";
        string testDate = string.Format("{0} {1}, {2}", month.Substring(0, 3), day, year);

        // The invariant culture keeps the English month abbreviations parseable on any machine.
        return DateTime.TryParse(testDate, CultureInfo.InvariantCulture, DateTimeStyles.None, out d);
    }
}


0

käµfm³d 👽Commented:
Credit for the date-range check goes to matthewspatrick, of course  :)
0
DaveJellisonCommented:

using System.Diagnostics;
using System.Text.RegularExpressions;

namespace ConsoleTestBed
{
	public class Program
	{
		public static Regex Expression = new Regex(@"((J(AN(.*)|UN(.*)|UL(.*)))|FEB(.*)|MAR(.*)|(A(PR(.+?)|UG(.*)))|MAY|SEP(.*)|NOV(.*)|DEC(.*)|OCT(.*))([1-9][1-9])", RegexOptions.IgnoreCase);
		public static void Main()
		{
			var input = "xxxxxx Apri 21 xxx May 21 xxx";

			var match = Expression.Match(input);

			// Regex.Match never returns null, so test Success instead.
			Debug.Assert(match.Success);
			Debug.Assert(match.Value == "Apri 21");
		}
	}
}


0
käµfm³d 👽Commented:
@DaveJellison

            var input = "xxxxxx this is Janitor 21 reporting for work.";

Is this a valid date?   ;)
0
käµfm³d 👽Commented:
Just ribbing you. I realize the OP said he/she wouldn't have strings of this type.
0
DaveJellisonCommented:
Oh I know :). It all depends on the variety of input of course. I mean if this is for context processing I think your solution is more comprehensive certainly. This reminds me of the quote "A programmer had a problem. He thought 'Aha, I'll use regular expressions!' Now the programmer had two problems".
0
käµfm³d 👽Commented:
Heheh.
0
Patrick MatthewsCommented:
DaveJellison,

>>"A programmer had a problem. He thought 'Aha, I'll use regular expressions!' Now the programmer had
>>two problems"

Ha!  I wish I had that one handy when I wrote my "using RegExp in VB6/VBA" article a few months back :)

Patrick
0
käµfm³d 👽Commented:
Very amusing :)
0
rmtogetherAuthor Commented:
Thank you for all of your help.
0
käµfm³d 👽Commented:
NP. Glad to help  :)
0