Solved

Finding and manipulating multiple occurrences in the same line

Posted on 2003-12-01
Last Modified: 2010-03-04
Hello:

I have an XML file and do not want to use XML::DOM.
The data I need is in one line like this:
...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>....

I need to pull the content of each <DAT>..</DAT> element, possibly into an array.
So the array would contain 123A, 123B, 123C.
How do I do that with a regex, without doing splits and so on? Thanks.
Yours truly.

Question by:basilo

Accepted Solution

by:FishMonger (earned 125 total points)
ID: 9851556
There are several ways to pull out the info; here's one method.

$str = '...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>...';

(@dat) = $str =~ /<DAT>([^<]+)<\/DAT>/g;
print "$_\n" foreach @dat;

Expert Comment

by:FishMonger
ID: 9851646
Since you're going to be reading in an xml file, you'd use something closer to this:

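# "xml filename" is a placeholder for the real path
open my $xml, '<', "xml filename" or die "couldn't open xml file: $!";

my @dat;
while (<$xml>) {
   # in scalar context, /g resumes after the previous match on each pass
   while (/<DAT>([^<]+)<\/DAT>/g) {
      push @dat, $1;
   }
}
close $xml;
print "$_\n" foreach @dat;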

Expert Comment

by:FishMonger
ID: 9851730
If the info you want is broken up across separate lines like this:

...<DAT>123A</DAT>..otherstuff ...<DAT>123B
</DAT>......<DAT>123C</DAT>...'
...<DAT>234A
</DAT>..otherstuff ...<DAT>234B</DAT>......<DAT>234C</DAT>...'
...<DAT>345A</DAT>..otherstuff ...<DAT>345B</DAT>......<DAT>
345C</DAT>...'

You can do something like this:

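use Data::Dumper;

open my $xml, '<', "xml filename" or die "couldn't open xml file: $!";
my $dat;
{
   local $/;        # undefine the input record separator to read the whole file at once
   $dat = <$xml>;
}
close $xml;
# [^<] also matches newlines, so no /s modifier is needed
my @dat = $dat =~ /<DAT>([^<]+)<\/DAT>/g;
s/\n//g foreach @dat;   # strip the embedded line breaks
print Dumper \@dat;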

Author Comment

by:basilo
ID: 9852015
This is good. I'm trying to understand the logic behind ([^<]+) and how it is pushed onto @dat. Thank you.

Expert Comment

by:FishMonger
ID: 9852132
[^<]
is a negated character class that matches any single character that is not a <.
The + tells it to repeat that match as many times as possible (at least once).
The ( ) surrounding it capture the matched text into the $1 variable.
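
To see why the negated class matters, here is a small sketch that compares it with a greedy .+ on a shortened sample string. The string and the variable names $negated and $greedy are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Shortened, hypothetical sample with two <DAT> elements on one line
my $str = '<DAT>123A</DAT>..x..<DAT>123B</DAT>';

# [^<]+ cannot cross a '<', so the capture ends at the first closing tag
my ($negated) = $str =~ /<DAT>([^<]+)<\/DAT>/;

# .+ is greedy and can cross tags, so the capture runs to the last </DAT>
my ($greedy) = $str =~ /<DAT>(.+)<\/DAT>/;

print "negated: $negated\n";
print "greedy:  $greedy\n";
```

The first capture holds only 123A, while the second swallows everything between the first <DAT> and the last </DAT>.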

Expert Comment

by:FishMonger
ID: 9852173
I forgot to add:

Since this is a direct assignment to an array, the match is evaluated in list context, so it returns the captured text ($1) instead of a true/false value.
The g at the end of the regex tells it to find all matches, and because it's in list context, the capture from each match is assigned to the next element of the @dat array.
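
As a rough illustration of the context difference, the sketch below runs the same pattern both ways on the sample line from the question. The array names @all and @one_by_one are only chosen for this example.

```perl
use strict;
use warnings;

my $str = '...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>...';

# List context: a single match operation returns every captured substring
my @all = $str =~ /<DAT>([^<]+)<\/DAT>/g;

# Scalar context: each evaluation finds the next match and sets $1
my @one_by_one;
while ($str =~ /<DAT>([^<]+)<\/DAT>/g) {
    push @one_by_one, $1;
}

print "@all\n";
print "@one_by_one\n";
```

Both arrays end up with the same three values; the list-context form is shorter, while the while loop is handy when something else has to happen for each match.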