• Status: Solved

parsing xml file

here's the xml file that i have:
<SOURCE>
<CONTROL SRC='controlstation' >
<CABINET NAME='8900000' TYPE='SIP'
 P_NAME='SALES700' SERIAL_NO=''
 SLOTS='4' CS='2'>
 <CONTROL_STATION HOSTNAME='hawkeye' VERSION='1.3-0'
  DATE_TIME_GMT='2005/04/01-20:37:50' DATE_TIME_LOCAL='2005/04/01-15:37:50'/>
 <PHYSICAL_STORE>
  <TRIX ID="890" ACL="00" TRIX_IDENT="" MODEL="100" TYPE="FA"
   CODE_V="" CODE_V_NUM="" CODE_DATE=""
   POWER_DATE=""
   API_V="V1.2"
   DISK="20" >
  </TRIX>
  <TRIX ID="891" ACL="00" TRIX_IDENT="" MODEL="100" TYPE="DA"
   CODE_V="" CODE_V_NUM="" CODE_DATE=""
   POWER_DATE=""
   API_V="V1.1"
   DISK="10" >
  </TRIX>
  </PHYSICAL_STORE>
  </CABINET>
  </CONTROL>
</SOURCE>

And i got this code from ozo:

#!/usr/bin/perl -w

use strict;
use XML::Simple;
#my $file = 'files/source.xml';
my $file=shift||'files/celerra.xml';
my $xs1 = XML::Simple->new();

my $doc = $xs1->XMLin($file);

printattributes($doc);
sub printattributes{
    my $node = shift;
    foreach( keys %$node ){
        if( ref $node->{$_} eq 'HASH' ){
            printattributes($node->{$_});
        }elsif( ref $node->{$_} eq 'ARRAY' ){
            printattributes($_) for @{$node->{$_}};
        }else{
            print "$_\t$$node{$_}\n";
        }
    }
}

The problem I'm having with this code is this:

In the XML above, CABINET has a TYPE attribute, but TRIX also has a TYPE attribute, and the two mean completely different things. How can I differentiate CABINET's TYPE from TRIX's TYPE? Also, as you can see, <TRIX> is repeated a few times. How can I count how many <TRIX> elements there are?

thanks
Asked by: ericworldz
 
kandura commented:
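You can distinguish between nodes by using what you know about the structure of this particular XML file. Note that the hash keys keep the case used in the XML, so the attribute is TYPE, not type:

    my $cabinet = $doc->{CONTROL}->{CABINET};
    print "Cabinet type is ", $cabinet->{TYPE}, $/;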


But here's a version that prints tag names, counts array elements, and indents the output:


#!/usr/bin/perl -w

use strict;
use XML::Simple;
my $xs1 = XML::Simple->new();

my $xml = join '', <DATA>;
my $doc = $xs1->XMLin($xml);

printattributes($doc,0);
sub printattributes{
    my $node = shift;
    my $indent = shift;
    foreach( keys %$node ){
        if( ref $node->{$_} eq 'HASH' ){
            print "\t"x$indent, $_, $/;
            printattributes($node->{$_}, $indent+1);
        }elsif( ref $node->{$_} eq 'ARRAY' ){
            my $tag = $_;
            my $c = 0;
            for( @{$node->{$_}}) {
                print "\t"x$indent, $tag, " ", $c++, $/;
                printattributes($_, $indent+1);
            }
        }else{
            print "\t"x$indent, "$_\t$$node{$_}\n";
        }
    }
}

__DATA__
<SOURCE>
<CONTROL SRC='controlstation' >
<CABINET NAME='8900000' TYPE='SIP'
 P_NAME='SALES700' SERIAL_NO=''
 SLOTS='4' CS='2'>
 <CONTROL_STATION HOSTNAME='hawkeye' VERSION='1.3-0'
  DATE_TIME_GMT='2005/04/01-20:37:50' DATE_TIME_LOCAL='2005/04/01-15:37:50'/>
 <PHYSICAL_STORE>
  <TRIX ID="890" ACL="00" TRIX_IDENT="" MODEL="100" TYPE="FA"
   CODE_V="" CODE_V_NUM="" CODE_DATE=""
   POWER_DATE=""
   API_V="V1.2"
   DISK="20" >
  </TRIX>
  <TRIX ID="891" ACL="00" TRIX_IDENT="" MODEL="100" TYPE="DA"
   CODE_V="" CODE_V_NUM="" CODE_DATE=""
   POWER_DATE=""
   API_V="V1.1"
   DISK="10" >
  </TRIX>
  </PHYSICAL_STORE>
  </CABINET>
  </CONTROL>
</SOURCE>
 
kandura commented:
here's the output I get:

CONTROL
    SRC    controlstation
    CABINET
        NAME    8900000
        CONTROL_STATION
            DATE_TIME_GMT    2005/04/01-20:37:50
            HOSTNAME    hawkeye
            DATE_TIME_LOCAL    2005/04/01-15:37:50
            VERSION    1.3-0
        TYPE    SIP
        PHYSICAL_STORE
            TRIX 0
                ID    890
                DISK    20
                CODE_V_NUM    
                API_V    V1.2
                POWER_DATE    
                TYPE    FA
                TRIX_IDENT    
                ACL    00
                CODE_V    
                CODE_DATE    
                MODEL    100
            TRIX 1
                ID    891
                DISK    10
                CODE_V_NUM    
                API_V    V1.1
                POWER_DATE    
                TYPE    DA
                TRIX_IDENT    
                ACL    00
                CODE_V    
                CODE_DATE    
                MODEL    100
        CS    2
        SLOTS    4
        SERIAL_NO    
        P_NAME    SALES700
 
ericworldz (author) commented:
Actually, when parsing my XML, I get the following error:
Can't use string ("anon=0") as a HASH ref while "strict refs" in use

at this line:
foreach( keys %$node ){

this is the new xml:

<SOURCE>
<CONTROL SRC='controlstation' >
<CABINET NAME='8900000' TYPE='SIP'
 P_NAME='SALES700' SERIAL_NO=''
 SLOTS='4' CS='2'>
 <CONTROL_STATION HOSTNAME='hawkeye' VERSION='1.3-0'
  DATE_TIME_GMT='2005/04/01-20:37:50' DATE_TIME_LOCAL='2005/04/01-15:37:50'/>
<MOVERS>
  <MOVER NAME="krb12mover" VERSION="1.2.3" MODEL="ABC"
      <PDEVICES>
          <DEV NAME="fxp1" DESC=""><DEV_OPT>speed=auto</DEV_OPT></DEV>
          <DEV NAME="fxp2" DESC=""><DEV_OPT>speed=auto</DEV_OPT></DEV>
      </PDEVICES>
      <LDEVICES/>
    </MOVER>
  </MOVERS>
<PHYSICAL_STORE>
  <TRIX ID="890" ACL="00" TRIX_IDENT="" MODEL="100" TYPE="FA"
   CODE_V="" CODE_V_NUM="" CODE_DATE=""
   POWER_DATE=""
   API_V="V1.2"
   DISK="20" >
  </TRIX>
  <TRIX ID="891" ACL="00" TRIX_IDENT="" MODEL="100" TYPE="DA"
   CODE_V="" CODE_V_NUM="" CODE_DATE=""
   POWER_DATE=""
   API_V="V1.1"
   DISK="10" >
  </TRIX>
  </PHYSICAL_STORE>
  </CABINET>
  </CONTROL>
</SOURCE>

 
kandura commented:
ericworldz,
>   <MOVER NAME="krb12mover" VERSION="1.2.3" MODEL="ABC"

This tag is not closed properly. I guess that explains the error message.
 
ericworldz (author) commented:
Sorry, that's a typo. I don't believe that's the cause of the error.

The new XML file that I posted is nested more deeply than the previous one. I tested your printattributes code with it, and that's the error I got.

 
kandura commented:
here's my output with your broken tag:

    CONTROL
        SRC    controlstation
        CABINET
            NAME    8900000
            CONTROL_STATION
                DATE_TIME_GMT    2005/04/01-20:37:50
                HOSTNAME    hawkeye
                DATE_TIME_LOCAL    2005/04/01-15:37:50
                VERSION    1.3-0
            TYPE    SIP
            CS    2
            MOVERS    
     
              speed=auto
              speed=auto
         
         
       
       
   
     
     
     
     
     
     
     
            SLOTS    4
            SERIAL_NO    
            P_NAME    SALES700
   
   

And here is my output with that typo corrected:

CONTROL
    SRC    controlstation
    CABINET
        NAME    8900000
        CONTROL_STATION
            DATE_TIME_GMT    2005/04/01-20:37:50
            HOSTNAME    hawkeye
            DATE_TIME_LOCAL    2005/04/01-15:37:50
            VERSION    1.3-0
        TYPE    SIP
        PHYSICAL_STORE
            TRIX 0
                ID    890
                DISK    20
                CODE_V_NUM    
                API_V    V1.2
                POWER_DATE    
                TYPE    FA
                TRIX_IDENT    
                ACL    00
                CODE_V    
                CODE_DATE    
                MODEL    100
            TRIX 1
                ID    891
                DISK    10
                CODE_V_NUM    
                API_V    V1.1
                POWER_DATE    
                TYPE    DA
                TRIX_IDENT    
                ACL    00
                CODE_V    
                CODE_DATE    
                MODEL    100
        CS    2
        MOVERS
            MOVER
                NAME    krb12mover
                MODEL    ABC
                PDEVICES
                    DEV 0
                        NAME    fxp1
                        DESC    
                        DEV_OPT    speed=auto
                    DEV 1
                        NAME    fxp2
                        DESC    
                        DEV_OPT    speed=auto
                VERSION    1.2.3
                LDEVICES
        SLOTS    4
        SERIAL_NO    
        P_NAME    SALES700

No errors reported.
 
kandura commented:
Anyway, I think you'd be better off with more specific code if you want to do anything useful with this XML:

$cabinet = $doc->{CONTROL}->{CABINET};
@trix = @{ $cabinet->{PHYSICAL_STORE}->{TRIX} };

print "I have ", scalar(@trix), " trixes, and they have types:", join(', ', map { $_->{TYPE} } @trix), $/;
 
ericworldz (author) commented:
can you explain what does this error mean?

Can't use string ("anon=0") as a HASH ref while "strict refs" in use at... this line:
 foreach( keys %$node ){
 
ericworldz (author) commented:
how can i use this code:

$cabinet = $doc->{CONTROL}->{CABINET};
@trix = @{ $cabinet->{PHYSICAL_STORE}->{TRIX} };

print "I have ", scalar(@trix), " trixes, and they have types:", join(', ', map { $_->{TYPE} } @trix), $/;



 
ericworldz (author) commented:
kandura,

i found the problem, somewhere in my xml, i have this:
 <OPTION>anon=0</OPTION>
// there is an empty line here...
 <OPTION>umask=0</OPTION>
 
  how to fix that?
 
kandura commented:
ericworldz,
> can you explain what does this error mean?

> Can't use string ("anon=0") as a HASH ref while "strict refs" in use
> at... this line:
>  foreach( keys %$node ){

Quite simple: somehow, $node is a plain string containing "anon=0", and the code tries to dereference that string as a hash. Under "strict refs" that is not allowed; you can't say %{"anon=0"} and have it mean something.
But the fact that you get that error, and I don't, probably means you copied my script wrong.


> $cabinet = $doc->{CONTROL}->{CABINET};
> @trix = @{ $cabinet->{PHYSICAL_STORE}->{TRIX} };

> print "I have ", scalar(@trix), " trixes, and they have types:", join(', ', map { $_->{TYPE} } @trix), $/;

You can use it like this:


     #!/usr/bin/perl -w
   
    use strict;
    use XML::Simple;
    my $xs1 = XML::Simple->new();
   
    my $xml = join '', <DATA>;
    my $doc = $xs1->XMLin($xml);
   
    my $cabinet = $doc->{CONTROL}->{CABINET};
    my @trix = @{ $cabinet->{PHYSICAL_STORE}->{TRIX} };
   
    print "I have ", scalar(@trix), " trixes, and they have types: ", join(', ', map { $_->{TYPE} } @trix), $/;

Can you see how I get at nodes in the xml by pointing to the right element in the hash structure?
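
A caveat with this approach: with default options, XML::Simple only produces an array reference for TRIX when the element occurs more than once. With a single <TRIX>, the value is a hash reference and @{ ... } dies with "Not an ARRAY reference". XML::Simple's ForceArray option (for example ForceArray => ['TRIX']) is meant for this case. The sketch below shows a plain-Perl alternative on hand-built data standing in for the parsed document; as_list is a hypothetical helper, not part of XML::Simple.

```perl
use strict;
use warnings;

# Normalise a value that may be undef, a single hash ref, or an
# array ref of hash refs into a plain list (hypothetical helper).
sub as_list {
    my ($value) = @_;
    return ()      unless defined $value;
    return @$value if ref $value eq 'ARRAY';
    return ($value);
}

# Hand-built stand-ins for the CABINET hash: one with two TRIX
# elements (array ref) and one with a single TRIX (hash ref).
my $two_trix = { PHYSICAL_STORE => { TRIX => [ { TYPE => 'FA' }, { TYPE => 'DA' } ] } };
my $one_trix = { PHYSICAL_STORE => { TRIX => { TYPE => 'FA' } } };

for my $cabinet ( $two_trix, $one_trix ) {
    my @trix = as_list( $cabinet->{PHYSICAL_STORE}{TRIX} );
    print "I have ", scalar(@trix), " trixes, and they have types: ",
        join( ', ', map { $_->{TYPE} } @trix ), "\n";
}
```

Either way, the counting code stays the same whether the file has one TRIX or many.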
 
kandura commented:
ericworldz,
> i found the problem, somewhere in my xml, i have this:

Empty lines should not be a problem. The issue must be somewhere else around there.
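
One possible cause, assuming the real file has several text-only <OPTION> elements under the same parent, as in the snippet quoted above: XML::Simple folds repeated elements into an array, and when those elements contain only text, the array holds plain strings rather than hashes. printattributes then recurses into the string "anon=0" and fails on %$node. The sketch below runs on a hand-built structure standing in for the parsed document (EXPORT, PATH and /fs1 are made-up names, and attribute_lines is a hypothetical variant of printattributes). It checks each array element before dereferencing it.

```perl
use strict;
use warnings;

# Hand-built stand-in for what XMLin() is assumed to return when
# text-only <OPTION> elements repeat: an array of plain strings.
my $doc = {
    EXPORT => {
        PATH   => '/fs1',
        OPTION => [ 'anon=0', 'umask=0' ],
    },
};

# Return the indented lines instead of printing them directly.
# Keys are sorted so the output order is stable.
sub attribute_lines {
    my ( $node, $indent ) = @_;
    my @lines;
    foreach my $key ( sort keys %$node ) {
        my $value = $node->{$key};
        my $pad   = "\t" x $indent;
        if ( ref $value eq 'HASH' ) {
            push @lines, "$pad$key";
            push @lines, attribute_lines( $value, $indent + 1 );
        }
        elsif ( ref $value eq 'ARRAY' ) {
            my $c = 0;
            for my $item (@$value) {
                if ( ref $item eq 'HASH' ) {
                    push @lines, "$pad$key " . $c++;
                    push @lines, attribute_lines( $item, $indent + 1 );
                }
                else {
                    # Plain string: report it, do not dereference it.
                    push @lines, "$pad$key " . $c++ . "\t$item";
                }
            }
        }
        else {
            push @lines, "$pad$key\t$value";
        }
    }
    return @lines;
}

print "$_\n" for attribute_lines( $doc, 0 );
```

The only change from the earlier script that matters for the error is the ref check inside the ARRAY branch.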
 
ericworldz (author) commented:
print "I have ", scalar(@trix), " trixes, and they have types: ", join(', ', map { $_->{TYPE} } @trix), $/;

how do i assign $/ to an array variable?

my @trix_array = join(', ', map { $_->{TYPE} } @trix); ?
 
kandura commented:
The "join" creates a single string. The $/ is my preferred way of writing "\n" (it is Perl's input record separator, which defaults to a newline). To put both strings in an array, you'd say:

    @array = ( join( ... ), $/ );

In other words, you make a list of them.
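
If the goal is an array with one entry per TRIX type, rather than one joined string, the join can simply be dropped. A minimal sketch on hand-built data; trix_types is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return one TYPE value per TRIX hash (hypothetical helper).
sub trix_types {
    my @trix = @_;
    return map { $_->{TYPE} } @trix;
}

# Hand-built stand-in for the TRIX list taken from the parsed document.
my @trix = ( { TYPE => 'FA' }, { TYPE => 'DA' } );

my @types    = trix_types(@trix);      # one element per TRIX
my $joined   = join( ', ', @types );   # a single string
my @one_item = ( $joined, "\n" );      # the joined string plus a newline

print "types: $joined\n";
print scalar(@types), " separate types, ",
    scalar(@one_item), " elements in the joined list\n";
```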