Solved

Finding and manipulating multiple occurrences in the same line

Posted on 2003-12-01
Last Modified: 2010-03-04
Hello:

I have an XML file and do not want to use XML::DOM.
The data I need is in one line like this:
...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>....

I need to pull out the content of each <DAT>..</DAT>, possibly into an array.
So the array would hold 123A, 123B, 123C.
How do I do that using a regex, without doing splits and so on? Thanks.
Yours truly.

Question by:basilo
LVL 28

Accepted Solution

by:
FishMonger earned 125 total points
ID: 9851556
There are several ways to pull out the info; here's one method.

$str = '...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>...';

(@dat) = $str =~ /<DAT>([^<]+)<\/DAT>/g;
print "$_\n" foreach @dat;
LVL 28

Expert Comment

by:FishMonger
ID: 9851646
Since you're going to be reading in an xml file, you'd use something closer to this:

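# Replace 'xml filename' with the path to your file.
open my $xml, '<', 'xml filename' or die "couldn't open xml file: $!";

my @dat;
while (<$xml>) {
   # In scalar context, /g makes each pass find the next <DAT> on the current line.
   while (/<DAT>([^<]+)<\/DAT>/g) {
      push @dat, $1;
   }
}
close $xml;
print "$_\n" foreach @dat;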
LVL 28

Expert Comment

by:FishMonger
ID: 9851730
If the info you want is broken up across separate lines like this:

...<DAT>123A</DAT>..otherstuff ...<DAT>123B
</DAT>......<DAT>123C</DAT>...'
...<DAT>234A
</DAT>..otherstuff ...<DAT>234B</DAT>......<DAT>234C</DAT>...'
...<DAT>345A</DAT>..otherstuff ...<DAT>345B</DAT>......<DAT>
345C</DAT>...'

You can do something like this:

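use Data::Dumper;

open my $xml, '<', 'xml filename' or die "couldn't open xml file: $!";
my $dat;
{
   local $/;       # no record separator, so the next read returns the whole file
   $dat = <$xml>;
}
# [^<] already matches newlines, so the /s modifier is not needed.
my @dat = $dat =~ /<DAT>([^<]+)<\/DAT>/g;
s/\n//g foreach @dat;   # strip the embedded newlines from each value
print Dumper \@dat;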

Author Comment

by:basilo
ID: 9852015
This is good. I'm trying to understand the logic behind ([^<]+) and how it is pushed onto @dat. Thank you.
LVL 28

Expert Comment

by:FishMonger
ID: 9852132
[^<]
is a negated character class that matches any single character that is not a <.
The + tells it to match one or more of those characters, as many as possible.
The ( ) surrounding it captures the matched text into the $1 variable.
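
Here is a minimal sketch of those three pieces acting on a single match. The sample string and the variable names ($str, $first) are made up for illustration, modeled on the line in the question.

```perl
use strict;
use warnings;

# Hypothetical sample line, modeled on the one in the question.
my $str = '...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>...';

my $first;
# Without /g, the regex stops at the first match.
# [^<]+ consumes characters until it reaches the '<' that starts </DAT>,
# and the parentheses copy that run of characters into $1.
if ($str =~ /<DAT>([^<]+)<\/DAT>/) {
    $first = $1;
    print "captured: $first\n";   # prints "captured: 123A"
}
```

Because [^<] can never match a <, the capture cannot run past the closing tag into the next <DAT> element, which is what a greedy .+ would do.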
LVL 28

Expert Comment

by:FishMonger
ID: 9852173
I forgot to add:

Because the match is assigned directly to @dat, it is evaluated in list context.
The g at the end of the regex tells it to find all matches, and in list context the match returns the captured text from each one, so the first capture goes into the first element of @dat, the second into the next element, and so on.
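
To make the context difference concrete, here is a small sketch that runs the same regex both ways. The names @all and @looped are illustrative only; the string is the sample from the question.

```perl
use strict;
use warnings;

my $str = '...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>...';

# List context: the match returns the captured text from every match at once.
my @all = $str =~ /<DAT>([^<]+)<\/DAT>/g;
print join(',', @all), "\n";      # prints "123A,123B,123C"

# Scalar context: each evaluation finds the next match and sets $1,
# so the captures have to be collected by hand.
my @looped;
while ($str =~ /<DAT>([^<]+)<\/DAT>/g) {
    push @looped, $1;
}
print join(',', @looped), "\n";   # prints "123A,123B,123C"
```

Both forms give the same array. The direct assignment is shorter, while the while loop is handy when you want to do something else with each match as it is found.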