Solved

C# Regular Expression help.

Posted on 2013-11-16
Last Modified: 2013-11-24
Hi,

I am trying to get my regular expression in C# to extract values from a log entry like the following:

<![LOG[==========[ ccmsetup started in process 3172 ]==========]LOG]!><time="14:32:31.717-600"{ date="07-29-2013" component="ccmsetup" context="" type="1" thread="908" file="ccmsetup.cpp:9100">


so it would gather the following:

Capture 1: ==========[ ccmsetup started in process 3172 ]==========
Capture 2: 14:32:31.717-600
Capture 3: 07-29-2013

The problem happens if I have a multi-line entry:

<![LOG[Running installation package
  Package:     C:\Windows\ccmsetup\MicrosoftPolicyPlatformSetup.msi
  Log:         C:\Windows\ccmsetup\Logs\MicrosoftPolicyPlatformSetup.msi.log
  Properties:  REBOOT=Suppress ALLUSERS=1]LOG]!><time="14:33:26.552-600" date="07-29-2013" component="ccmsetup" context="" type="1" thread="908" file="msiutil.cpp:791">


I have the following regular expression, which almost works, except when the entry contains a new line.


<!\[LOG\[(.*)]LOG]!><time="(\d{1,2}:\d{1,2}:\d{1,2}.\d{1,3}-\d{1,3})"\sdate="(\d{1,2}-\d{1,2}-\d{1,4})"


Any suggestions on how to change the expression to make it work?


Thanks,

Ward.
Question by:whorsfall

Accepted Solution

by:
Robert Schutt earned 500 total points
ID: 39654953
Try this:
Regex re = new Regex(@"<!\[LOG\[(.*)]LOG]!><time=""(\d{1,2}:\d{1,2}:\d{1,2}\.\d{1,3}-\d{1,3})""\sdate=""(\d{1,2}-\d{1,2}-\d{1,4})""", RegexOptions.Singleline);


The RegexOptions.Singleline option changes matching so that a period matches every character, including a newline, which it does not match by default.
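To see the effect in isolation, here is a minimal sketch. The class name, the two-line sample string and the short bracket pattern are made up for illustration and are not part of the original question. It also shows the inline form (?s), which switches the same behaviour on from inside the pattern. RegexOptions.Multiline is a different option: it only changes what ^ and $ match, so it would not help here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical two-line sample, not taken from a real log.
    public const string Sample = "[first line\nsecond line]";

    // Illustrative pattern: capture everything between the brackets.
    public const string Pattern = @"\[(.*)]";

    // Without Singleline, '.' stops at '\n', so the closing ']' is never
    // reached and there is no match at all.
    public static bool MatchesByDefault()
    {
        return Regex.IsMatch(Sample, Pattern);
    }

    // With Singleline, '.' also consumes the newline.
    public static string CaptureWithOption()
    {
        return Regex.Match(Sample, Pattern, RegexOptions.Singleline).Groups[1].Value;
    }

    // (?s) at the start of the pattern is the inline equivalent of the option.
    public static string CaptureWithInlineFlag()
    {
        return Regex.Match(Sample, "(?s)" + Pattern).Groups[1].Value;
    }
}
```

MatchesByDefault() returns false for this sample, while both capture methods return the full two-line text between the brackets.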

The code for the function I used to test this and provide the output you specified:
        private void doTest(string s) {
            // The same pattern written as a regular (non-verbatim) string literal:
            //Regex re = new Regex("<!\\[LOG\\[(.*)]LOG]!><time=\"(\\d{1,2}:\\d{1,2}:\\d{1,2}\\.\\d{1,3}-\\d{1,3})\"\\sdate=\"(\\d{1,2}-\\d{1,2}-\\d{1,4})\"", RegexOptions.Singleline);
            Regex re = new Regex(@"<!\[LOG\[(.*)]LOG]!><time=""(\d{1,2}:\d{1,2}:\d{1,2}\.\d{1,3}-\d{1,3})""\sdate=""(\d{1,2}-\d{1,2}-\d{1,4})""", RegexOptions.Singleline);
            MatchCollection ms = re.Matches(s);
            foreach (Match m in ms) {
                // Groups[0] is the whole match, so the captures start at index 1.
                // Restarting the counter for every match keeps the numbering
                // correct when the input contains more than one match.
                for (int gi = 1; gi < m.Groups.Count; gi++) {
                    Console.WriteLine("Capture {0}: {1}", gi, m.Groups[gi].Value);
                }
            }
        }


This is the code that calls that function:
          //string s1 = @"<![LOG[==========[ ccmsetup started in process 3172 ]==========]LOG]!><time=""14:32:31.717-600""{ date=""07-29-2013"" component=""ccmsetup"" context="""" type=""1"" thread=""908"" file=""ccmsetup.cpp:9100"">";
          //                                                                                                              ^- typo here
            string s1 = @"<![LOG[==========[ ccmsetup started in process 3172 ]==========]LOG]!><time=""14:32:31.717-600"" date=""07-29-2013"" component=""ccmsetup"" context="""" type=""1"" thread=""908"" file=""ccmsetup.cpp:9100"">";
            string s2 = @"<![LOG[Running installation package
  Package:     C:\Windows\ccmsetup\MicrosoftPolicyPlatformSetup.msi
  Log:         C:\Windows\ccmsetup\Logs\MicrosoftPolicyPlatformSetup.msi.log
  Properties:  REBOOT=Suppress ALLUSERS=1]LOG]!><time=""14:33:26.552-600"" date=""07-29-2013"" component=""ccmsetup"" context="""" type=""1"" thread=""908"" file=""msiutil.cpp:791"">";

            Console.WriteLine("Test 1");
            doTest(s1);
            Console.WriteLine("Test 2");
            doTest(s2);


Note: there's a typo in your "control" string (a stray { after the time attribute, indicated in the comment within the code above) which caused a mismatch.
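One thing to watch if you later read a whole log file into a single string: with Singleline the greedy (.*) can run across entry boundaries and return one large match from the first entry through the last one. The sketch below is a possible variation, not something tested against a real ccmsetup.log. It assumes a lazy (.*?) so that each match ends at the nearest ]LOG]!> that is followed by a valid time and date, and it uses named groups. The class name CcmLogParser, the method Parse and the group names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CcmLogParser
{
    // Same structure as the pattern above, with two changes:
    // (.*?) is lazy, and the three groups are named instead of numbered.
    private static readonly Regex EntryPattern = new Regex(
        @"<!\[LOG\[(?<message>.*?)]LOG]!><time=""(?<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{1,3}-\d{1,3})""\sdate=""(?<date>\d{1,2}-\d{1,2}-\d{1,4})""",
        RegexOptions.Singleline);

    // Returns one { message, time, date } array per log entry found in logText.
    public static List<string[]> Parse(string logText)
    {
        var entries = new List<string[]>();
        foreach (Match m in EntryPattern.Matches(logText))
        {
            entries.Add(new[]
            {
                m.Groups["message"].Value,
                m.Groups["time"].Value,
                m.Groups["date"].Value
            });
        }
        return entries;
    }
}
```

Calling CcmLogParser.Parse on a string that holds several entries should then give one array per entry, with multi-line messages kept intact in the first element.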

Here's a screen capture of the output: [image: capture of output]