Asked by Richard Kreidl

Parse records from a text file to populate an XML file

I have an XML file that I’d like to update from data records in a text file.

This is a small sample of the record layout of the text file (MSR.txt). The completed file would have approximately 14 records, and on some nights there might be fewer than 14, depending on the day of the month:

BT30004|8:10pm
LASTISA|9:11pm
FM00124|2:30am
SM18034|12:34am

What I mean by completed is that these records are the completion times for certain nightly batch cycles.
As cycles complete, the last program and its completion time are entered as a record entry to this file.
So, basically this file is growing throughout the night and would be cleared out in the morning.

What I need to do is to read this text file every 15 minutes and update the XML file with entries that are present in the text file.
Not all entries from the text file correspond by exact name in the XML file.


I plan on using a scheduling system to run this Perl script every 15 minutes to see if there are any new entries in the text file that need to be written to the XML file.

Record names don’t always match the corresponding XML element names. As you can see below, only a couple of them match; most of them don’t.
Also, two of the entries (CM10009 and FM18024) can have different suffixes in the text file.

Text File           XML File
BT30004             BT30004
CM10009A/B/C/P      CM10009
FM17157             FADS
FM00124             FIXA
FM18024A/B          FM24FHR
FM30160             FMDNFHR
IH30011             INSH
LASTISA             LASTISA
PY17002             MADYFHR
HR00013             PAYFHR
CN44PDBN            PDBN
FM30146             SAMS
SR80253             LASTSRM1
SM18034             LASTSRM2
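
The suffixed names could also be handled without listing every variant, by stripping the trailing letter before the lookup. This is only a minimal sketch: it assumes the suffix is always a single uppercase letter and that only CM10009 and FM18024 carry one, and the helper name xml_tag_for is made up for illustration.

```perl
use strict;
use warnings;

# Job name in MSR.txt => element name in Today.xml (from the table above).
my %map = (
    BT30004  => 'BT30004',
    CM10009  => 'CM10009',
    FM17157  => 'FADS',
    FM00124  => 'FIXA',
    FM18024  => 'FM24FHR',
    FM30160  => 'FMDNFHR',
    IH30011  => 'INSH',
    LASTISA  => 'LASTISA',
    PY17002  => 'MADYFHR',
    HR00013  => 'PAYFHR',
    CN44PDBN => 'PDBN',
    FM30146  => 'SAMS',
    SR80253  => 'LASTSRM1',
    SM18034  => 'LASTSRM2',
);

# Hypothetical helper: return the XML tag for a job name, or undef if unknown.
sub xml_tag_for {
    my ($job) = @_;
    return $map{$job} if exists $map{$job};
    # Assumption: only these two jobs carry a one-letter suffix.
    (my $base = $job) =~ s/^(CM10009|FM18024)[A-Z]$/$1/;
    return $map{$base};
}

print xml_tag_for('CM10009P'), "\n";   # CM10009
print xml_tag_for('FM18024B'), "\n";   # FM24FHR
```

Exact names are looked up first, so a name such as LASTISA that happens to end in a letter is never stripped.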


Thanks

XML file (Today.xml):
<?xml version='1.0' standalone='yes'?>
<MSR>
  <Info>
    <BT30004>8:10pm</BT30004>
    <CICS></CICS>
    <CICST></CICST>
    <CM10009></CM10009>
    <DISTM></DISTM>
    <DISTS></DISTS>
    <EBIS></EBIS>
    <ES00014></ES00014>
    <FADS></FADS>
    <FIXA>2:30am</FIXA>
    <FM24FHR></FM24FHR>
    <FMDNFHR></FMDNFHR>
    <FMI></FMI>
    <INSH></INSH>
    <LASTISA>9:11pm</LASTISA>
    <LASTSRM1></LASTSRM1>
    <LASTSRM2>12:34am</LASTSRM2>
    <LASTSRM2A></LASTSRM2A>
    <MADYFHR></MADYFHR>
    <PAYFHR></PAYFHR>
    <PDBN></PDBN>
    <PSF></PSF>
    <PSLDFHR></PSLDFHR>
    <TODAY>Wednesday, February 23, 2011</TODAY>
  </Info>
</MSR>


wilcoxon

You can't really add to an XML file but you can easily recreate it.  This script should do what you want...
#!/usr/local/bin/perl

use strict;
use warnings;
use POSIX;

my %map = (
    BT30004 => 'BT30004',
    CM10009A => 'CM10009',
    CM10009B => 'CM10009',
    CM10009C => 'CM10009',
    CM10009P => 'CM10009',
    FM17157 => 'FADS',
    FM00124 => 'FIXA',
    FM18024A => 'FM24FHR',
    FM18024B => 'FM24FHR',
    FM30160 => 'FMDNFHR',
    IH30011 => 'INSH',
    LASTISA => 'LASTISA',
    PY17002 => 'MADYFHR',
    HR00013 => 'PAYFHR',
    CN44PDBN => 'PDBN',
    FM30146 => 'SAMS',
    SR80253 => 'LASTSRM1',
    SM18034 => 'LASTSRM2',
);

my %sys;

open IN, 'MSR.txt' or die "could not open MSR.txt: $!";
while (<IN>) {
    chomp;
    my ($job, $tm) = split /\|/;
    $sys{$job} = $tm;
}
close IN;

open OUT, '>Today.xml' or die "could not write Today.xml: $!";
print OUT "<?xml version='1.0' standalone='yes'?>\n<MSR>\n  <Info>\n";
foreach my $job (sort keys %map) {
    my $tag = $map{$job};
    if (exists $sys{$job}) {
        print OUT "    <$tag>$sys{$job}</$tag>\n";
    } else {
        print OUT "    <$tag></$tag>\n";
        # this would be considered best practice but either works
        # print OUT "    <$tag/>\n";
    }
}
print OUT "    <TODAY>", strftime('%A, %B %d, %Y', localtime), "</TODAY>\n",
          "  </Info>\n</MSR>\n";
close OUT;


Richard Kreidl (ASKER)

I get the following error message:
 Undefined subroutine &main::strftime called at ./dailyopsUpdateMSRXMLFile line 56

This is line 56:
print OUT "    <TODAY>", strftime('%A, %B %d, %Y', localtime), "</TODAY>\n",

Actually, I removed that line. The date is updated by another script, so the line isn't necessary:
print OUT "    <TODAY>", strftime('%A, %B %d, %Y', localtime), "</TODAY>\n",

This is what I left:
print OUT   "  </Info>\n</MSR>\n";

The job runs but doesn't update the XML file.

wilcoxon

If strftime is giving that error, you can just change "use POSIX" to "use POSIX qw(strftime)".

What do you mean it doesn't update the XML file?  Every time the script runs, it will create the XML file based on the text file.  If the XML file isn't updating then it's likely that the data in the text file didn't change.
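
The explicit import can be checked on its own, outside the main script. A minimal sketch using the same format string as the TODAY line:

```perl
use strict;
use warnings;
use POSIX qw(strftime);   # import strftime by name

# Same format string as the script: weekday, month name, day, year.
my $today = strftime('%A, %B %d, %Y', localtime);
print "    <TODAY>$today</TODAY>\n";
```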

Richard Kreidl (ASKER)

Ok, I'm getting this error:
Use of uninitialized value in hash element at ./dailyopsUpdateMSRXMLFile line 41, <IN> line 1

Line 41 is:
$sys{$job} = $tm;
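
One plausible cause of that warning, assuming the first line of MSR.txt is empty, is that split returns nothing for an empty line and $job ends up undefined. A defensive parse that skips any line without both a job name and a time might look like the sketch below. The helper name parse_records is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "JOB|time" lines, ignoring anything malformed.
sub parse_records {
    my (@lines) = @_;
    my %sys;
    for my $line (@lines) {
        $line =~ s/\s+$//;                       # drop newline and trailing blanks
        next unless $line =~ /^([^|]+)\|(.+)$/;  # need both a job and a time
        $sys{$1} = $2;
    }
    return %sys;
}

# Sample input: an empty line, two good records, one malformed line.
my %sys = parse_records("", "FM17157|10:20am\n", "garbage", "SM18034|10:59pm\r\n");
print "$_ => $sys{$_}\n" for sort keys %sys;
```

In the real script the lines would come from the file handle, for example parse_records(<IN>) after the open.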

Ok, I got it partially working.

This is the XML file your script creates:

<?xml version="1.0" standalone="yes" ?>
<MSR>
<Info>
  <BT30004 />
  <CM10009>11:23pm</CM10009>
  <CM10009 />
  <CM10009 />
  <CM10009 />
  <PDBN />
  <FIXA>10:21am</FIXA>
  <FADS>10:20am</FADS>
  <FM24FHR />
  <FM24FHR />
  <SAMS>10:23am</SAMS>
  <FMDNFHR />
  <PAYFHR>10:23am</PAYFHR>
  <INSH />
  <LASTISA />
  <MADYFHR>10:22am</MADYFHR>
  <LASTSRM2>10:59pm</LASTSRM2>
  <LASTSRM1>10:24am</LASTSRM1>
</Info>
</MSR>

Here is the text File:

FM17157|10:20am
FM00124|10:21am
PY17002|10:22am
HR00013|10:23am
FM30146|10:23am
SR80253|10:24am
SM18034|10:59pm
CM10009A|11:23pm

First of all, you're creating a new XML file each time the script runs. I really need it to update the existing file instead, because some tags already contain data that I don't want to lose.
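
To illustrate the difference between recreating and updating, here is a rough sketch that works on the XML as plain text and fills in only the elements that are still empty, leaving everything else untouched. It is a regex substitution rather than a real XML parser, so it assumes simple elements without attributes, and it does not escape special characters such as & in the values. The helper name fill_empty_tags is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: fill <TAG></TAG> or <TAG /> only when it is empty.
sub fill_empty_tags {
    my ($xml, $values) = @_;
    for my $tag (keys %$values) {
        my $val = $values->{$tag};
        # Match either the self-closing form or an open/close pair with no content.
        $xml =~ s{<\Q$tag\E\s*/>|<\Q$tag\E>\s*</\Q$tag\E>}{<$tag>$val</$tag>};
    }
    return $xml;
}

my $xml = <<'XML';
<MSR>
  <Info>
    <CSI>report pending</CSI>
    <FADS></FADS>
    <FIXA>7:37pm</FIXA>
    <SAMS />
  </Info>
</MSR>
XML

# FADS and SAMS are empty and get filled; FIXA already has a value and is kept.
print fill_empty_tags($xml, { FADS => '10:20am', FIXA => '10:21am', SAMS => '10:23am' });
```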

Here is an example of a completed XML file:
<?xml version="1.0" standalone="yes" ?>
<MSR>
<Info>
  <AutoSys>None</AutoSys>
  <BT30004>9:01pm</BT30004>
  <CICS />
  <CICST />
  <CM10009>3:56am Saturday</CM10009>
  <CMHM />
  <COMP>9:36pm</COMP>
  <CSI>CSI outage report will be sent out by the CSI team on Next Business Day.</CSI>
  <Comment />
  <DISTM />
  <DISTS>9:15am Saturday</DISTS>
  <EBIS />
  <ES00014>8:13am Saturday</ES00014>
  <FADS>11:04pm</FADS>
  <FIXA>7:37pm</FIXA>
  <FM24FHR>1:55am Saturday</FM24FHR>
  <FMDNFHR>2:10am Saturday</FMDNFHR>
  <FMI>2:53am Saturday</FMI>
  <GDC>8:26pm</GDC>
  <ICCA>9:27pm</ICCA>
  <ICTM>7:00pm</ICTM>
  <IMZ />
  <IMZT />
  <INSH>9:00pm</INSH>
  <LASTISA>11:47pm</LASTISA>
  <LASTSRM1 />
  <LASTSRM2 />
  <LASTSRM2A />
  <MADYFHR>8:20pm</MADYFHR>
  <MVS>None</MVS>
  <NITECOMM>6:24am Saturday</NITECOMM>
  <ODSFHR>2:34am Saturday</ODSFHR>
  <PAYFHR />
  <PDBN>9:29pm</PDBN>
  <PSF>1:57am Saturday</PSF>
  <PSLDFHR>2:00am Saturday</PSLDFHR>
  <RPI />
  <SALES>3:02am Saturday</SALES>
  <SAMS>1:25am Saturday</SAMS>
  <SIVU>3:08am Saturday</SIVU>
  <SSEC>11:31pm</SSEC>
  <STMT>CSI outage on Sunday from 12:00am to hh:mm am/pm due to scheduled maintenance.</STMT>
  <TEEFHR />
  <TEPP>8:05pm</TEPP>
  <TNC />
  <TODAY>Friday, May 06, 2011</TODAY>
  <UMZ />
  <UMZT />
  <UPDATE />
  <maxAutoSys>0</maxAutoSys>
  <numAutoSys>0</numAutoSys>
  <numMVS>0</numMVS>
</Info>
</MSR>



Secondly, you'll notice that the 'TODAY' tag will be filled by another Perl script I have that runs once a day and inserts the date. So, that won't be required in your script.

I hope this better clears up the request I'm looking for.

Thanks!
 
Here is a snippet of an old EE solution that I got from OZO when trying to update an XML file. In that case it was looking for one specific XML tag and inserting the current UNIX time.

Now, I'm trying to look for multiple tags that are listed in a text file and update the XML file. In this example I'm looking for the tag "BT30004".
===========================

open INPUTXML, "+<$InputXMLFile";
flock INPUTXML,LOCK_EX;

my $self = bless {}, "main";
my $xml = "";
{local $/; $xml .= <INPUTXML>; }
my $xml_ar = new XML::Simple->XMLin($xml);

if(ref($xml_ar->{Info}{BT30004}) eq 'HASH' && scalar(keys(%{$xml_ar->{Info}{BT30004}})) == 0) {
  $xml_ar->{Info}{BT30004} = $dt;
      }
seek INPUTXML,0,0;
$_ = XMLout($xml_ar, NoAttr=>1, RootName=>'MSR',XMLDecl => 1);
s/\n/\r\n/g;
print INPUTXML;
truncate INPUTXML,tell INPUTXML;
close INPUTXML;


Maybe this will help in explaining what I'm trying to accomplish..

Thanks

wilcoxon

Yes, you can use XML::Simple to update an XML file (thanks for posting the previous code - it made it easier to rework my code for using XML::Simple).  It was just easier to overwrite the file as your original question did not include the information about the extra tags.

Where does the day name (Saturday) come from in the example complete XML file?  It is not in the input text file.

This should do what you want (but will not include the day name in the times)...
#!/usr/local/bin/perl

use strict;
use warnings;
use Fcntl qw(:flock);   # provides LOCK_EX for the flock call below

my %map = (
    BT30004 => 'BT30004',
    CM10009A => 'CM10009',
    CM10009B => 'CM10009',
    CM10009C => 'CM10009',
    CM10009P => 'CM10009',
    FM17157 => 'FADS',
    FM00124 => 'FIXA',
    FM18024A => 'FM24FHR',
    FM18024B => 'FM24FHR',
    FM30160 => 'FMDNFHR',
    IH30011 => 'INSH',
    LASTISA => 'LASTISA',
    PY17002 => 'MADYFHR',
    HR00013 => 'PAYFHR',
    CN44PDBN => 'PDBN',
    FM30146 => 'SAMS',
    SR80253 => 'LASTSRM1',
    SM18034 => 'LASTSRM2',
);

my %sys;

open IN, 'MSR.txt' or die "could not open MSR.txt: $!";
while (<IN>) {
    chomp;
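    my ($job, $tm) = split /\|/;
    # skip blank lines, lines without a time, and jobs that are not in %map
    next unless defined $job && $tm && exists $map{$job};
    $sys{$map{$job}} = $tm;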
}
close IN;

open INPUTXML, "+<Today.xml" or die "could not open Today.xml: $!";
flock INPUTXML, LOCK_EX;

my $self = bless {}, "main";
my $xml = "";
{local $/; $xml .= <INPUTXML>; }
my $xml_ar = new XML::Simple->XMLin($xml);

foreach my $tag (sort keys %sys) {
    if(ref($xml_ar->{Info}{$tag}) eq 'HASH' && scalar(keys(%{$xml_ar->{Info}{$tag}})) == 0) {
        $xml_ar->{Info}{$tag} = $dt;
    }
}

seek INPUTXML,0,0;
$_ = XMLout($xml_ar, NoAttr=>1, RootName=>'MSR',XMLDecl => 1);
s/\n/\r\n/g;
print INPUTXML;
truncate INPUTXML,tell INPUTXML;
close INPUTXML;


wilcoxon

Obviously, I forgot to include the line "use XML::Simple;" after "use warnings".

Richard Kreidl (ASKER)

I think this statement is wrong:
 $xml_ar->{Info}{$tag} = $dt;

In the old EE solution, "$dt" was the system time:

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
my $dt = "";
$dt .= $hour % 12 || 12;              # 12-hour clock: hours 0 and 12 both show as 12
$dt .= sprintf ":%02d", $min;
$dt .= ($hour >= 12) ? 'pm' : 'am';   # noon onward is pm

Whereas now we're pulling the time from the text file, so I think that line has to be changed to something else.
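
A likely shape for that change, given that %sys in the script above is keyed by XML tag name and holds the time read from MSR.txt, is to assign $sys{$tag} instead of $dt. The sketch below shows the loop on a plain hash standing in for $xml_ar->{Info}. It assumes, as the check in the posted loop implies, that XML::Simple returns an empty element as an empty hash reference. The helper name apply_times is hypothetical, and this is not necessarily the accepted solution.

```perl
use strict;
use warnings;

# Hypothetical helper: copy times from %sys into the parsed <Info> hash,
# but only where the element is still empty (an empty hash ref).
sub apply_times {
    my ($info, $sys) = @_;
    for my $tag (sort keys %$sys) {
        my $cur = $info->{$tag};
        next unless ref($cur) eq 'HASH' && !%$cur;
        $info->{$tag} = $sys->{$tag};   # value from MSR.txt, not $dt
    }
    return $info;
}

# Stand-in for $xml_ar->{Info} as XML::Simple would return it.
my $info = { FADS => {}, FIXA => '7:37pm', TODAY => 'Friday, May 06, 2011' };
apply_times($info, { FADS => '10:20am', FIXA => '10:21am' });
print "$_ = $info->{$_}\n" for sort keys %$info;
```

Elements that already hold a value, and tags that are missing from the XML altogether, are left alone.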

ASKER CERTIFIED SOLUTION
wilcoxon

This solution is only available to members of Experts Exchange.

Richard Kreidl (ASKER)

Thanks for all your help!!