Solved

extracting value of an element using XML::Parser

Posted on 2011-02-27
Last Modified: 2012-05-11

I am trying to take an input XML file and convert it to a text file, but I can only use
the XML::Parser library (not the XML::LibXML library). I have been following the online
O'Reilly tutorial. I can parse the file and print the element names, but I cannot print the
values of the elements (eventually I will save these in a file). How can I print the value
of an element?


Example:
I can print MS but cannot print 27724372855
<MS>27724372855</MS>

Attached files: stream-based1.txt, sample5.xml
Question by:Johannne1

Accepted Solution

by:
clockwatcher earned 250 total points
ID: 34993649
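The Char callback handles the text between the markup. The parser may deliver one element's text in several pieces, so the handler appends each piece to the current element's value. Here's your sample, slightly modified to print out the values.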
use strict;
use warnings;
use XML::Parser;

my @element_stack;                # remember which elements are open

# initialize the parser
my $parser = XML::Parser->new(
    Handlers => {
        Start => \&handle_start,
        End   => \&handle_end,
        Char  => \&handle_char,
    }
);

# the file to parse is the first command-line argument
my $file = shift @ARGV or die "Usage: $0 file.xml\n";
eval { $parser->parsefile( $file ); };

# report any error that stopped parsing, or announce success
if( $@ ) {
    $@ =~ s/at \/.*?$//s;               # remove module line number
    print STDERR "\nERROR in $file:\n$@\n";
} else {
    print STDERR "$file is well-formed\n";
}

# process a start-of-element event: print message about element
sub handle_start {
    my( $expat, $element, %attrs ) = @_;

    # ask the expat object about our position
    my $line = $expat->current_line;

    print "I see an $element element starting on line $line!\n";

    # remember this element and its starting position by pushing a
    # little hash onto the element stack
    push( @element_stack, { element=>$element, line=>$line });

    if( %attrs ) {
        print "It has these attributes:\n";
        while( my( $key, $value ) = each( %attrs )) {
            print "\t$key => $value\n";
        }
    }
}

# process character data: called for the text between tags, possibly
# several times for a single element
sub handle_char {
    my( $expat, $string ) = @_;

    # retrieve the current (innermost open) element
    my $element = $element_stack[-1];

    # append our string to its value
    $element->{value} .= $string if $element;
}

# process an end-of-element event
sub handle_end {
    my( $expat, $element ) = @_;

    # We'll just pop from the element stack with blind faith that
    # we'll get the correct closing element, unlike what our
    # homebrewed well-formedness did, since XML::Parser will scream
    # bloody murder if any well-formedness errors creep in.
    my $element_record = pop( @element_stack );

    # an empty element never triggers handle_char, so default to ''
    my $value = defined $element_record->{value} ? $element_record->{value} : '';

    print "I see that $element element that started on line ",
          $element_record->{line}, " is closing now.\n";
    print "Its value was: $value\n\n";
}
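
Since the end goal in the question is a text file rather than console output, here is a minimal sketch of how the same Start/Char/End pattern could collect the values and write them out. The function name extract_values, the tab-separated "name, value" layout, and the output file name values.txt are all assumptions made for illustration; they are not from the question or its attachments.

```perl
use strict;
use warnings;
use XML::Parser;

# Return one "name<TAB>value" line for every element whose own text
# is not just whitespace. (Helper name and line format are illustrative.)
sub extract_values {
    my ($xml) = @_;
    my ( @stack, @lines );

    my $parser = XML::Parser->new(
        Handlers => {
            # open a fresh text buffer for each element
            Start => sub { push @stack, '' },

            # text may arrive in several pieces, so append to the buffer
            Char  => sub { $stack[-1] .= $_[1] if @stack },

            # on close, keep the element's text unless it is blank
            End   => sub {
                my ( $expat, $name ) = @_;
                my $text = pop @stack;
                $text =~ s/^\s+|\s+$//g;
                push @lines, "$name\t$text" if length $text;
            },
        }
    );

    $parser->parse($xml);    # parse() accepts a string; parsefile() a path
    return @lines;
}

my @lines = extract_values('<REC><MS>27724372855</MS><ID>42</ID></REC>');

# values.txt is a hypothetical output file name
open my $out, '>', 'values.txt' or die "Cannot open values.txt: $!";
print {$out} "$_\n" for @lines;
close $out or die "Cannot close values.txt: $!";
```

Keeping a separate buffer per open element means a container element such as REC only ever holds the whitespace between its children, which the End handler then discards.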


Author Comment

by:Johannne1
ID: 34993703
Thanks, I am having a look at this.

Author Comment

by:Johannne1
ID: 34993720
Looks great! Thanks!

Expert Comment

by:clockwatcher
ID: 34994138
Think the questioner may have accidentally selected his own comment as the answer or something.  Pretty sure he meant to select my comment.