Solved

finding and manipulating multiple occurences in same line

Posted on 2003-12-01
6
210 Views
Last Modified: 2010-03-04
Hello:

I have an XML file and do not want to use XML::DOM.
The data I need is in one line like this:
...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>....

I neet to pull the content of <DAT>..</DAT> possibly to an arry.
So the arry would have 123A, 123B, 123C.
How do I do that using regex without doing splits and so on.  Thanks
Yours Truely.

0
Comment
Question by:basilo
  • 5
6 Comments
 
LVL 28

Accepted Solution

by:
FishMonger earned 125 total points
ID: 9851556
There are several ways to pull out the info; here's one method.

$str = '...<DAT>123A</DAT>..otherstuff ...<DAT>123B</DAT>......<DAT>123C</DAT>...';

(@dat) = $str =~ /<DAT>([^<]+)<\/DAT>/g;
print "$_\n" foreach @dat;
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9851646
Since you're going to be reading in an xml file, you'd use something closer to this:

open XML, "xml filename" or die "couldn't open xml file $!";

while (<XML>) {
   while (/<DAT>([^<]+)<\/DAT>/g) {
      push @dat, $1;
   }
}
print "$_\n" foreach @dat;
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9851730
If the info you want is broken up into seperate lines like this:

...<DAT>123A</DAT>..otherstuff ...<DAT>123B
</DAT>......<DAT>123C</DAT>...'
...<DAT>234A
</DAT>..otherstuff ...<DAT>234B</DAT>......<DAT>234C</DAT>...'
...<DAT>345A</DAT>..otherstuff ...<DAT>345B</DAT>......<DAT>
345C</DAT>...'

You can do something like this:

open XML, "xml filename" or die "couldn't open xml file $!";
{
   local $/;
   $dat = <XML>;
}
@dat = $dat =~ /<DAT>([^<]+)<\/DAT>/gs;
foreach (@dat) {s/\n//g}
print Dumper @dat;
0
Gigs: Get Your Project Delivered by an Expert

Select from freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely and get projects done right.

 

Author Comment

by:basilo
ID: 9852015
This is good.    I'm trying to understand the logic behined ([^<]+) and how it is pushed onto @dat.  Thank you.
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9852132
[^<]
Is a negated character class that says to match any character that is not a <
The + tells it to repeat the match as mush as possible.
The (  ) surrounding it, captures the match into $1 var.
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9852173
I forgot to add;

Since this is a direct assignment, $1 is assigned to the first element of the @dat array.
The g at the end of the regex tells it find all matches and since it's in list context, each match is assigned to the next element of the array.
0

Featured Post

Announcing the Most Valuable Experts of 2016

MVEs are more concerned with the satisfaction of those they help than with the considerable points they can earn. They are the types of people you feel privileged to call colleagues. Join us in honoring this amazing group of Experts.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SIMPLE Perl Regex 1 157
Perl for loop for 2000 ms 7 98
How to search multiple patterms in a file with perl? 4 79
Call Shell Script from Perl Script 6 97
Many time we need to work with multiple files all together. If its windows system then we can use some GUI based editor to accomplish our task. But what if you are on putty or have only CLI(Command Line Interface) as an option to  edit your files. I…
A year or so back I was asked to have a play with MongoDB; within half an hour I had downloaded (http://www.mongodb.org/downloads),  installed and started the daemon, and had a console window open. After an hour or two of playing at the command …
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
Along with being a a promotional video for my three-day Annielytics Dashboard Seminor, this Micro Tutorial is an intro to Google Analytics API data.

808 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question