Inserting text strings in parsed output using XML::Twig

I wrote the script below to extract selected data points from one file into a new, smaller XML file. The extraction itself works fine, but I would like the output to include additional tags: <HEADER> and </HEADER> wrapped around the m174/m175 data, and <PRODUCT> and </PRODUCT> wrapped around each group of a00.. and b00.. data.

So I would like the output to be:

<HEADER>
<m174>From Calif</m174>
<m175>YYYYMMDD</m175>
</HEADER>

<PRODUCT>
<a001>Lamps</a001>
<b001>Green</b001>
<b002>Money Saver</b002>
</PRODUCT>

<PRODUCT>
<a001>Flashlight</a001>
<a002>Small</a002>
<b001>Yellow</b001>
<b002>Bargain</b002>
</PRODUCT>

Right now the output is:

<m174>From Calif</m174>
<m175>YYYYMMDD</m175>

<a001>Lamps</a001>
<b001>Green</b001>
<b002>Money Saver</b002>

<a001>Flashlight</a001>
<a002>Small</a002>
<b001>Yellow</b001>
<b002>Bargain</b002>
 
All the tags are supposed to be there, but there's no guarantee ... the person who will be running this script isn't a programmer, but the current structure is one that they can understand.

#!/usr/bin/perl

use strict;
use XML::Twig;
use diagnostics;
use Encode qw(encode decode);
use Time::HiRes qw(gettimeofday);
use File::Copy;


my $t0 = gettimeofday;
our @files = glob("*.xml");      
foreach my $file(@files) {

                      open FILE, '<:encoding(UTF-8)', $file or warn "Can't open $file: $!";  
                      open PARSED, '>:encoding(UTF-8)', ($file . "_PARSED.txt") or warn "Cannot open file for write: $!";  

                      
         my $t= XML::Twig->new(
                  twig_roots   => {
                                # XML tag names are case-sensitive, so the paths
                                # must match the HEADER/PRODUCT tags in the file
                                'HEADER/m174'  => \&print_m174,
                                'HEADER/m175'  => \&print_m175,

                                'PRODUCT/a001' => \&print_a001,
                                'PRODUCT/a002' => \&print_a002,
                                'PRODUCT/b001' => \&print_b001,
                                'PRODUCT/b002' => \&print_b002,
                 }
                            );
                           
        eval {$t->parsefile( $file);};

        print PARSED"\n\n";
       
        close FILE;



}            
            my $t1 = gettimeofday;
            my $elapsed = $t1 - $t0;


##  SUB ROUTINES  
         
                sub print_m174
                      { my( $t, $elt)= @_;
                            print PARSED "<m174>" . $elt->text . "<\/m174>\n";
                $t->purge;           # frees the memory
          }

                sub print_m175
                      { my( $t, $elt)= @_;
                            print PARSED "<m175>" . $elt->text . "<\/m175>\n";
                $t->purge;           # frees the memory
          }

                sub print_a001
                      { my( $t, $elt)= @_;
                            print PARSED "<a001>" . $elt->text . "<\/a001>\n";
                $t->purge;           # frees the memory
          }
         
                sub print_a002
                      { my( $t, $elt)= @_;
                            print PARSED "<a002>" . $elt->text . "<\/a002>\n";
                $t->purge;           # frees the memory
          }
         
                sub print_b001
                      { my( $t, $elt)= @_;
                            print PARSED "<b001>" . $elt->text . "<\/b001>\n";
                $t->purge;           # frees the memory
          }
         
                sub print_b002
                      { my( $t, $elt)= @_;
                            print PARSED "<b002>" . $elt->text . "<\/b002>\n";
                $t->purge;           # frees the memory
          }
         

##  END OF SUB ROUTINES  

            close PARSED;
hadrons Asked:

ozo Commented:
What was in the input file you used to get the output you have right now?
hadrons (Author) Commented:
The input file basically looks like the desired output file (see below). I simplified the example a bit, but these files are huge and contain a lot of data this person doesn't need to see, so I want to recreate the same type of file without the extra data.

The input has extra tags, but I don't need to extract them, so I don't register handlers for them:

<HEADER>
<m174>From Calif</m174>
<m175>YYYYMMDD</m174>
</HEADER>

<PRODUCT>
<a001>Lamps</a001>
<b001>Green</b001>
<b002>Money Saver</b002>
<d001>HUGE AMOUNT OF DATA</d001>
<j001>JUNK SHE DOESN'T CARE ABOUT</j001>
</PRODUCT>

<PRODUCT>
<a001>Flashlight</a001>
<a002>Small</a002>
<b001>Yellow</b001>
<b002>Bargain</b002>
<h001>SOMEONE ELSE'S CONCERN</h001>
</PRODUCT>

I can give an exact (but small) example if you wish.
ozo Commented:
# after correcting the mismatched tag at
# <m175>YYYYMMDD</m174>
# I was able to parse it
         my $t= XML::Twig->new(

                 start_tag_handlers => {
                     HEADER  => \&print_start,
                     PRODUCT => \&print_start,
                 },
                 end_tag_handlers => {
                     HEADER  => \&print_end,
                     PRODUCT => \&print_end,
                 },
                 twig_roots   => {

                                'HEADER/m174' => \&print_m174,
                             'HEADER/m175' => \&print_m175,

                                'PRODUCT/a001' => \&print_a001,
                               'PRODUCT/a002' => \&print_a002,
                             'PRODUCT/b001' => \&print_b001,
                             'PRODUCT/b002' => \&print_b002,
                             #  _default_ => \&print_,  

                 }
                            );

        eval {$t->parsefile( $file);};
        warn $@ if $@;

        print PARSED"\n\n";

        close FILE;



}
my $t1 = gettimeofday;
my $elapsed = $t1 - $t0;


##  SUB ROUTINES

# With twig_roots in use, start/end tag handlers for elements outside the
# roots receive the tag name as a plain string, not an element object.
sub print_start
{ my( $t, $tag)= @_;
  print PARSED "<$tag>\n";
}

sub print_end
{ my( $t, $tag)= @_;
  print PARSED "</$tag>\n\n";
}

# Generic handler for the commented-out _default_ entry above.
sub print_
{ my( $t, $elt)= @_;
  print PARSED "<" . $elt->name . ">" . $elt->text . "</" . $elt->name . ">\n";
  $t->purge;           # frees the memory
}
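
For anyone who wants to try the idea without the surrounding file handling, here is a minimal self-contained sketch of the same technique: start_tag_handlers and end_tag_handlers emit the wrapper tags, and twig_roots keeps only the listed children. It is an illustration under assumptions, not the exact script from this thread: the <catalog> root element, the in-memory sample string, the $out buffer and the sub names open_block, close_block and keep_element are all made up for the example, and it uses a single generic handler with $elt->sprint instead of one sub per tag.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input. The <catalog> root is an assumption: a
# well-formed XML document needs exactly one root element.
my $xml = <<'XML';
<catalog>
<HEADER>
<m174>From Calif</m174>
<m175>YYYYMMDD</m175>
</HEADER>
<PRODUCT>
<a001>Lamps</a001>
<b001>Green</b001>
<b002>Money Saver</b002>
<d001>HUGE AMOUNT OF DATA</d001>
</PRODUCT>
</catalog>
XML

my $out = '';    # collected output (the thread's script prints to PARSED instead)

# Elements to keep, written as parent/child paths.
my @keep  = qw(HEADER/m174 HEADER/m175 PRODUCT/a001 PRODUCT/a002 PRODUCT/b001 PRODUCT/b002);
my %roots = map { ($_ => \&keep_element) } @keep;

my $twig = XML::Twig->new(
    # HEADER and PRODUCT sit outside the twig roots, so these handlers
    # are called with the twig and the tag name (a plain string).
    start_tag_handlers => { HEADER => \&open_block,  PRODUCT => \&open_block  },
    end_tag_handlers   => { HEADER => \&close_block, PRODUCT => \&close_block },
    twig_roots         => \%roots,
);

eval { $twig->parse($xml); };
warn $@ if $@;    # report parse errors (e.g. a mismatched tag) instead of hiding them

print $out;

sub open_block  { my ($t, $tag) = @_; $out .= "<$tag>\n"; }
sub close_block { my ($t, $tag) = @_; $out .= "</$tag>\n\n"; }

# One generic handler for every kept element: sprint serializes the
# element including its own start and end tags.
sub keep_element {
    my ($t, $elt) = @_;
    $out .= $elt->sprint . "\n";
    $t->purge;    # frees the memory
}
```

With this layout, keeping another element only means adding its path to @keep, and a tag that is missing from a particular PRODUCT is simply skipped. To run it against real files, swap parse($xml) for parsefile($file) and print to the output filehandle instead of appending to $out.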
hadrons (Author) Commented:
I've requested that this question be closed as follows:

Accepted answer: 0 points for hadrons's comment #a39511394

for the following reason:

The script provided as the answer not only did as I wanted, but it was more flexible and ultimately a great improvement on my own; thanks
hadrons (Author) Commented:
The script not only did what was requested, but it was simpler and more flexible than my original script. Thanks for the great work.