Solved

Regular expressions: find text that starts with one word and ends with another, but does not contain a given word

Posted on 2003-10-27
Last Modified: 2007-12-19
Quick one for anyone good with regular expressions.

OK, say I have an XML file with this layout:

<event  time="23424" name="2325">
      <param name="5645645" data="1231">
      <param name="11231213" data="346">
      <param name="2351324" data="123">
</event>
<event  time="43747" name="2435">
      <param name="32452345" data="5674">
      <param name="2345" data="4567">
      <param name="2435" data="2345">
</event>

I want to use regular expressions to find blocks missing the </event> tag.

I can search and find the blocks from <event...> to the next <event...>, but I can't figure out how to list only the ones without </event>.

Thanks
Question by:waynegs
3 Comments
 
LVL 2

Accepted Solution

by:
ultimatemike earned 250 total points
ID: 9633516
It's kinda tricky to do with regular expressions. XML/HTML is generally recursive, and its structure doesn't lend itself well to REs. Here's a Perl script that does the job, though. It'll print out any event tags that aren't closed.

Just change "xmlfile.xml" to whatever the filename is.


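#!perl -w
use strict;

# Open the file and read it in one go. Without slurp mode (undefined $/),
# <$xml> would return only the first line of the file.
open my $xml, '<', 'xmlfile.xml' or die "Cannot open xmlfile.xml: $!";
$_ = do { local $/; <$xml> };
close $xml;

# Collapse whitespace between tags, then split in front of every <event tag
# (the lookahead keeps the tag itself, so no placeholder character is needed).
s/>\s*</></g;
my @array = split /(?=<event\b)/;

foreach my $element (@array) {
      # Report any <event ...> block that does not end with </event>.
      if ( $element =~ /^<event\b/ && $element !~ /<\/event>\s*$/ ) {
            print "FOUND: ";
            print "$element\n";
      }
}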
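
For comparison, the pure-regex route the question asks about can be sketched with a negative lookahead: match from <event up to the next <event (or the end of the text), refusing to step over a </event> on the way. Treat this as a rough sketch rather than a tested tool. It assumes Perl 5, event blocks that are never nested, and input small enough to hold in memory; the sub name find_unclosed_events and the sample data are made up for illustration.

```perl
#!perl -w
use strict;

# Hypothetical helper: returns every <event ...> block that runs into the
# next <event ...> (or the end of the text) without meeting </event> first.
sub find_unclosed_events {
      my ($text) = @_;
      my @found;
      while ( $text =~ m{
            (                                     # capture the whole block
              <event\b                            # an opening tag ...
              (?: (?! </event> | <event\b ) . )*  # ... then anything that is neither a close nor another open
            )
            (?= <event\b | \z )                   # succeed only at the next open tag or end of input
      }gsx ) {
            push @found, $1;
      }
      return @found;
}

# Made-up sample: the first block is missing its closing tag.
my $sample = <<'XML';
<event  time="23424" name="2325">
      <param name="5645645" data="1231">
<event  time="43747" name="2435">
      <param name="2345" data="4567">
</event>
XML

print "FOUND: $_\n" for find_unclosed_events($sample);
```

With this sample, only the first block (time="23424") is reported. The trade-off is the usual one: the single pattern is compact, but the split-and-check script is easier to read and to adjust.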
LVL 2

Expert Comment

by:scully00000
ID: 9636188
What language are you using? You could use a regex to find the blocks and then search through them for the ones missing the </event> tag. Also, what do you want to do with the 'blocks' once you've flagged them? That has a bearing on how the script is written.

Cheers
LVL 8

Expert Comment

by:fozylet
ID: 9636224
Try your luck at http://www.regexp.org/

A previously answered question there may fit your need.