
Regular Expression for PICS

Hello,

I have been trying (unsuccessfully) for a short time to write a regex to parse a RAT file, as used to load content rating information.

Forgetting the file header, the data is grouped like this:

(category (transmit-as "SS~~000") (name "Age Range")
   (min 1) (max 2) (label-only true) (integer true)
   (label
     (name "All Ages")
     (value 1))
   (label
     (name "Older Children")
     (value 2)))

 (category
  (transmit-as "og")
  (name "Other Topics - Material that might be perceived as setting a bad example for young children.")
   (label
   (name "")
   (description "Do not allow access to sites that contain images, portrayals or descriptions that might be perceived as setting a bad example for young children.")
   (value 0) )
   (label
   (name "")
   (description "Allow access to sites that contain images, portrayals or descriptions that might be perceived as setting a bad example for young children.")
   (value 1) ))

(Yes, the categories can end with either a ) )) or a ))) just to make it fun)

The above two categories are from different RAT files, but I would like a single expression (if at all possible) to handle either format.

I have a regex that matches Categories only, but I am stuck on one that will match the labels.  If this could include Capture-By-Name (?<name>) it would be good.
I would like to match:

<label> <name> <description> <value>
-- NEXT MATCH ---
<label> <name> <description> <value>
-- NEXT MATCH ---

I hope this makes sense?  I'm being lazy here - I could spend another few hours on this, but it is Zzzz time - so I'm offering 500 to the first person who supplies the required regex for me :-)

Many thanks
Si-clone
 
NipNFriar_Tuck Commented:
You might try something like:

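    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public struct ARec {
        public string CATEGORY;
        public string LABEL;
        public string VALUE;
        public string DESCRIPTION;
    }


            // Each alternative matches one token; the loop below assembles the tokens into records.
            // [^"]* accepts any quoted text, including the empty names in the second sample.
            Regex regex = new Regex(
                @"category|label[\s\r\n]+|name\s+""(?<NAME>[^""]*)"""
                + @"|description\s+""(?<DESCRIPTION>[^""]*)""|value\s+"
                + @"(?<VALUE>[^\s()]+)",
                RegexOptions.IgnoreCase
                | RegexOptions.Multiline
                | RegexOptions.ExplicitCapture
                | RegexOptions.IgnorePatternWhitespace
                | RegexOptions.Compiled
                );

            // pic holds the text of the RAT file
            MatchCollection matches = regex.Matches(pic);
            bool isCategory = false;
            string category = null;               // name of the category currently being read
            List<ARec> recs = new List<ARec>();   // grows as needed, unlike a fixed-size array
            ARec rec = new ARec();
            foreach ( Match match in matches ) {
                // Compare the whole token so that "category" inside a quoted string is not mistaken for the keyword
                if ( string.Equals(match.Value, "category", StringComparison.OrdinalIgnoreCase) ) {
                    isCategory = true;
                } else if ( match.Groups["NAME"].Success ) {
                    if ( isCategory ) {
                        // The first name after "category" is the category's own name
                        category = match.Groups["NAME"].Value;
                        isCategory = false;
                    } else {
                        rec.LABEL = match.Groups["NAME"].Value;
                    }
                } else if ( match.Groups["DESCRIPTION"].Success ) {
                    rec.DESCRIPTION = match.Groups["DESCRIPTION"].Value;
                } else if ( match.Groups["VALUE"].Success ) {
                    // A value closes the label, so store the record with its category and start a new one
                    rec.VALUE = match.Groups["VALUE"].Value;
                    rec.CATEGORY = category;
                    recs.Add(rec);
                    rec = new ARec();
                }
            }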

This will give you a collection of structures with the fields category, label, value and description... of course a class may serve your purpose better...
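If you would rather have a single expression that yields one match per label, with named captures, something along these lines might do. Treat it as a sketch, not a tested parser: the class and member names (RatLabels, RatLabel, Parse) are made up for illustration, it assumes the quoted strings never contain an embedded double quote, and it assumes each label lists name, then an optional description, then value, as in the two samples above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical holder for one (label ...) block.
public sealed class RatLabel
{
    public string Name;
    public string Description;
    public string Value;
}

public static class RatLabels
{
    // One match per (label ...) block. The description part is optional
    // because the first sample category has none. The \s*\) pieces let the
    // block close with either "))" or ") )". With ExplicitCapture the
    // unnamed parentheses group without capturing.
    private static readonly Regex LabelRegex = new Regex(
        @"\(label\s+" +
        @"\(name\s+""(?<name>[^""]*)""\s*\)\s*" +
        @"(\(description\s+""(?<description>[^""]*)""\s*\)\s*)?" +
        @"\(value\s+(?<value>[^\s()]+)\s*\)\s*\)",
        RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);

    public static List<RatLabel> Parse(string rat)
    {
        List<RatLabel> labels = new List<RatLabel>();
        foreach (Match m in LabelRegex.Matches(rat))
        {
            RatLabel label = new RatLabel();
            label.Name = m.Groups["name"].Value;
            // Groups that did not take part in the match return an empty string.
            label.Description = m.Groups["description"].Value;
            label.Value = m.Groups["value"].Value;
            labels.Add(label);
        }
        return labels;
    }
}
```

Calling RatLabels.Parse(pic) on the file text returns the labels in file order. The category name is deliberately not captured here, since a category's (name ...) is not directly preceded by (label; if you need it, match the categories with your existing regex first and run this one over each category's text.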


HTH
 
Si-clone (Author) Commented:
Thanks for your post.  This wasn't exactly what I was looking for, but it opened my eyes to other avenues.

Cheers,