khanzada19

Find a string in an XML file

I need to extract values from an XML file that looks like this:
<E Name="E" RdfName="EnumA" Namespace="abc">
                  <Label Label="E"A/>
                  <EnumValue Name="S" RdfName="xyz" Value="1" Namespace="abc"> <Label Label="xyz"/>
</EnumValue>

I need to find the string E Name="E" and then extract the values of Label Label= and Value=. After extracting the values, I need to insert them into a table.
Could someone help me with this?
FishMonger

khanzada19

ASKER

I am not using Perl 5.10.0, so I can't use the examples from the link.
The XML::Simple module, and the examples on the link, will work in versions of Perl prior to 5.10.0, as long as it's not too old. What version of Perl are you using? Did you install the XML::Simple module?
I am using Perl v5.6.0.
v5.6.0 is a little old, but still within reason, so the current version of XML::Simple should work for you. However, I'd recommend upgrading to 5.8.8 or 5.10.
Because of the restriction on upgrading, I won't be able to use Perl. Does anyone know how to do this in Oracle PL/SQL?
You will not need to upgrade your perl in order to use perl with the XML::Simple module.  Have you tried to install the XML::Simple module?  Did it install successfully?  Any errors?
I can't even install the XML::Simple module. I mean, I can install it in dev but not in prod, so this option is out. I need to do it with PL/SQL.
This will get the values from your XML file.  It works on the example provided, but there could be XML files that will break it.

As for getting it into a database, what interface do you want to use?  The DBI module is the most common method.  Can you use this?
#if $str contains the above XML (or replace $str here with whatever variable contains the XML)
if($str =~ /E\s+Name="E".*?<Label\s+Label="(.*?)".*?Value="(.*?)"/s) {
	print "Label=$1\nValue=$2\n";
}
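
If DBI is available, the insert step could look roughly like the sketch below. It is only an illustration under assumptions: the table `enum_values`, its columns `label` and `value`, and the helper name `insert_label_value` are all made up here, and the connection details in the usage comment are placeholders.

```perl
use strict;
use warnings;

# Hypothetical table and column names; adjust to the real schema.
my $INSERT_SQL = 'INSERT INTO enum_values (label, value) VALUES (?, ?)';

# Hypothetical helper: inserts one row through an already-connected DBI handle.
# The ? placeholders let the driver take care of quoting.
sub insert_label_value {
    my ($dbh, $label, $value) = @_;
    my $sth = $dbh->prepare($INSERT_SQL);
    return $sth->execute($label, $value);
}

# Usage (not run here; needs DBI plus a driver such as DBD::Oracle,
# and the DSN, user and password are placeholders):
#   use DBI;
#   my $dbh = DBI->connect('dbi:Oracle:mydb', 'user', 'password', { RaiseError => 1 });
#   insert_label_value($dbh, $1, $2) if $str =~ /E\s+Name="E".*?<Label\s+Label="(.*?)".*?Value="(.*?)"/s;
#   $dbh->disconnect;
```

Keeping the helper separate from the connection code means the same regex captures can be passed straight in once a handle exists.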


I am doing the following, but it seems to be reading one line at a time instead of the whole string:

open(IN,"foo.xml")
          || die HisTools::output("ERROR: Could not open file $datapoint\n\n");
while ($str = <IN>){
print"\n a = $str\n";
if($str =~ /E\s+Name="E".*?<Label\s+Label="(.*?)".*?Value="(.*?)"/s) {
      print "Label=$1\nValue=$2\n";
}

close("foo.xml");
}
You have 2 '<Label Label='  but only 1 'Value=' within '<E Name='
Do you need to extract both Label Label= tags or only one of them?

I need both 'Label Label='
The code I gave before will get only the first label.  Also, you need to read the entire file into memory, not one line at a time.  Try this:
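open(IN,"foo.xml")
  || die HisTools::output("ERROR: Could not open file $datapoint\n\n");
local $/;          # undefine the record separator so <IN> returns the whole file
my $str = <IN>;
close(IN);         # close takes the filehandle, not the file name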
 
if($str =~ /<E\s+Name="E".*?<Label\s+Label="(.*?)".*?Value="(.*?)".*?<Label\s+Label="(.*?)"/s) {
	print "Label1=$1\nValue=$2\nLabel2=$3\n";
}


We are almost there. Just one last thing: how do I handle it if there are multiple entries?

Thanks for your help!
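
One possible way to handle multiple entries, which is not necessarily the accepted solution, is to isolate the `<E Name="E">` block first and then repeat the same pattern with the `/g` modifier in a `while` loop. The sketch below assumes every entry follows the Label / Value / Label layout from the sample; the helper name `extract_entries` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [label1, value, label2] triples
# found inside the first <E Name="E"> ... </E> block of $xml.
sub extract_entries {
    my ($xml) = @_;
    my @entries;

    # Isolate the element body so later matches cannot run past </E>.
    return @entries unless $xml =~ m{<E\s+Name="E".*?>(.*?)</E>}s;
    my $body = $1;

    # With /g in scalar context, each pass resumes after the previous match.
    while ($body =~ m{<Label\s+Label="(.*?)".*?Value="(.*?)".*?<Label\s+Label="(.*?)"}sg) {
        push @entries, [$1, $2, $3];
    }
    return @entries;
}

my $xml = <<'XML';
<E Name="E" RdfName="EnumA" Namespace="abc">
    <Label Label="E"A/>
    <EnumValue Name="S" RdfName="xyz" Value="1" Namespace="abc"> <Label Label="xyz"/></EnumValue>
    <Label Label="F"A/>
    <EnumValue Name="S" RdfName="xyz" Value="2" Namespace="abc"> <Label Label="zyx"/></EnumValue>
</E>
XML

for my $e (extract_entries($xml)) {
    print "Label1=$e->[0] Value=$e->[1] Label2=$e->[2]\n";
}
```

As with the single-match version, this is regex matching rather than real XML parsing, so input that departs from the sample layout may break it.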
ASKER CERTIFIED SOLUTION
Adam314

Instead of loading the entire file into a scalar, you could loop over it line by line using the flip-flop operator.
while(<DATA>) {
   if ( /<E Name="E"/../<\/E>/ ) {
      if (/<Label Label="(.+?)"/) {
         print "Label=$1\n";
      }
      if (/Value="(.+?)"/) {
         print "Value=$1\n";
      }
   }
}
 
__DATA__
<E Name="E" RdfName="EnumA" Namespace="abc">
                  <Label Label="E"A/>
                  <EnumValue Name="S" RdfName="xyz" Value="1" Namespace="abc"> <Label Label="xyz"/></EnumValue>
                  <Label Label="F"A/>
                  <EnumValue Name="S" RdfName="xyz" Value="2" Namespace="abc"> <Label Label="zyx"/></EnumValue>
</E>
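
To get from printed lines to rows that can be inserted, the same flip-flop loop could collect Label/Value pairs instead of printing them. This is a rough sketch, assuming each outer Label line is followed by the EnumValue line that carries its Value, as in the sample data; `collect_pairs` is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: while inside <E Name="E"> ... </E>, pairs each outer
# Label with the Value found on the EnumValue line that follows it.
sub collect_pairs {
    my @lines = @_;
    my (@pairs, $label);
    for (@lines) {
        # The range operator is false until the left pattern matches, then
        # stays true through the line where the right pattern matches.
        next unless /<E Name="E"/ .. /<\/E>/;
        if (/Value="(.+?)"/) {
            # An EnumValue line: pair its Value with the pending Label.
            push @pairs, [$label, $1];
            undef $label;
        }
        elsif (/<Label Label="(.+?)"/) {
            $label = $1;    # remember the Label until its Value shows up
        }
    }
    return @pairs;
}

my @lines = (
    '<E Name="E" RdfName="EnumA" Namespace="abc">',
    '    <Label Label="E"A/>',
    '    <EnumValue Name="S" RdfName="xyz" Value="1" Namespace="abc"> <Label Label="xyz"/></EnumValue>',
    '    <Label Label="F"A/>',
    '    <EnumValue Name="S" RdfName="xyz" Value="2" Namespace="abc"> <Label Label="zyx"/></EnumValue>',
    '</E>',
);

for my $p (collect_pairs(@lines)) {
    print "Label=$p->[0] Value=$p->[1]\n";
}
```

The inner Label on the EnumValue line is ignored here because the Value check runs first; capture it separately if it is needed too.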
