Parsing

tomatocans
Here is the current code

#! /usr/local/bin/perl -w

###prompt for input####
print "Input path and file name:\n";
($a=<STDIN>);

###If cannot open provide error output####
open(IN,$a) || die "Cannot open file $a:$!";

####open file for output called test.def
open(OUT,">/home/tmclaugh/test.def") || die "Cannot open file>:!";
while(<IN>){

        @field = split(/#\n#/); #array to hold parsed query split by  
        $new_field = split(@field,/|/);
               
        print OUT ("$new_field3|$new_field4,$new_field5|$new_field4|$new_field5\n");
      }
close(IN);
close(OUT);

Following is an example of the input:

|BEGIN_RESOURCE|
TYPE|VOLUME|
NAME|ZTL0112M0
|LOCATION|272800|0775330
|LOCATION|272800|0780330
|LOCATION|272800|0781900
|LOCATION|280000|0790200
|LOCATION|300930|0784000
|LOCATION|300900|0783400
|LOCATION|300640|0780520
|LOCATION|293330|0780730
|DATA|0|240
#
#
|BEGIN_RESOURCE|
TYPE|VOLUME|
NAME|ZTL0112M1
|LOCATION|280000|0790200
|LOCATION|300930|0784000
|LOCATION|300900|0783400
|LOCATION|300640|0780520
|LOCATION|293330|0780730
|LOCATION|272800|0775330
|LOCATION|272800|0780330
|LOCATION|270000|0781900
|LOCATION|263000|0783600
|LOCATION|261453|0783713
|LOCATION|260701|0783806
|LOCATION|255800|0785800
|LOCATION|260400|0793300
|LOCATION|260000|0793430
|LOCATION|261300|0800315
|LOCATION|262300|0800900
|LOCATION|262300|0794200
|LOCATION|272830|0790700
|DATA|240|999
#
#


I am receiving an uninitialized value error when I run the script.


I am looking to parse the input file and output it like so


ZTL0112M0|272800:0775330|272800|0775330
ZTL0112M0|272800:0780330|272800|0780330
ZTL0112M0|272800:0781900|272800|0781900
ZTL0112M0|280000:0790200|280000|0790200
ZTL0112M0|300930:0784000|300930|0784000
ZTL0112M0|300900:0783400|300900|0783400
ZTL0112M0|300640:0780520|300640|0780520
ZTL0112M0|293330:0780730|293330|0780730
ZTL0112M1|280000:0790200|280000|0790200
ZTL0112M1|300930:0784000|300930|0784000

etc.


How would I eliminate the uninitialized value error and get the output to print in the proper format?

Thanks

Author

Commented:
trying to add points but database will not let me.

1. You are reading line by line. A single line never contains the delimiter ("#\n#"), because the delimiter spans a newline.
2. The line:

$new_field = split(@field,/|/);

has a few problems.
  - The split pattern should be the first argument, not the second.
  - The return value of split is a list. You are assigning to a scalar.
  - The character '|' is special. You need to escape it.
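
To illustrate the three points above, here is a minimal sketch of a corrected split call. The helper name split_fields is made up for this example and is not part of the original script; the sample line is taken from the input shown in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): split one
# pipe-delimited line into its fields.
sub split_fields {
    my ($line) = @_;
    chomp $line;
    # Pattern first, string second. The '|' is escaped so that it is
    # matched literally instead of being read as regex alternation.
    my @fields = split /\|/, $line;
    # split returns a list, so it is assigned to an array, not a scalar.
    return @fields;
}

my @fields = split_fields("|LOCATION|272800|0775330\n");
# A leading '|' produces an empty first field, so for this line the
# keyword is at index 1 and the two numbers are at indexes 2 and 3.
print join(",", map { "[$_]" } @fields), "\n";
```

Note that the indexes shift by one depending on whether a line starts with '|', so it is worth printing the fields like this before hard-coding any index.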

Here's a better way.

{
    # Read one whole record (everything up to "#\n#\n") per iteration.
    local $/ = "#\n#\n";

    while (<IN>) {
        @new_fields = split /\|/;
        print OUT "$new_fields[3]|$new_fields[4],$new_fields[5]|$new_fields[4]|$new_fields[5]\n";
    }
}
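
As a rough sketch of how record-at-a-time reading could produce the requested output, the example below reads from an in-memory copy of a shortened sample instead of a real file. The helper format_record and the regular expressions are assumptions made for this illustration; they are not from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn one whole resource record into output lines.
sub format_record {
    my ($record) = @_;
    # Skip records without a NAME line.
    my ($name) = $record =~ /NAME\|([^|\s]+)/ or return;
    my @lines;
    # Each LOCATION line carries two numbers; /g walks through all of them.
    while ($record =~ /LOCATION\|(\d+)\|(\d+)/g) {
        push @lines, "$name|$1:$2|$1|$2";
    }
    return @lines;
}

# Shortened copy of the sample input from the question.
my $sample = <<'END';
|BEGIN_RESOURCE|
TYPE|VOLUME|
NAME|ZTL0112M0
|LOCATION|272800|0775330
|LOCATION|272800|0780330
|DATA|0|240
#
#
|BEGIN_RESOURCE|
TYPE|VOLUME|
NAME|ZTL0112M1
|LOCATION|280000|0790200
|DATA|240|999
#
#
END

open(my $in, '<', \$sample) or die "Cannot open in-memory sample: $!";
{
    local $/ = "#\n#\n";    # each read now returns one whole record
    while (my $record = <$in>) {
        print "$_\n" for format_record($record);
    }
}
close($in);
```

Matching on the NAME and LOCATION keywords avoids depending on fixed field indexes, which differ between lines that do and do not start with '|'.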

Author

Commented:
Thanks prakashk.

In the following awk script

BEGIN{
        FS="|"
}
{
        if($1=="NAME"){
            VOLUME_NAME=$2
        }
        if($1=="LOCATION"){
           LOCATION_LAT[i]=$2
           LOCATION_LNG[i]=$3
             LAT_DEG[i]=substr($2,1,2)
             LAT_MIN[i]=substr($2,3,2)
             LAT_SEC[i]=substr($2,5,2)
             LNG_DEG[i]=substr($3,1,3)
             LNG_MIN[i]=substr($3,4,2)
             LNG_SEC[i]=substr($3,6,2)
             LAT[i]=LAT_DEG[i] + LAT_MIN[i] / 60. + LAT_SEC[i] / 3600.
             LNG[i]=LNG_DEG[i] + LNG_MIN[i] / 60. + LNG_SEC[i] / 3600.
             ?????????????
          }
          if($1="DATA"){
             ALT_LOW=$2
             ALT_HIGH=$3
          }
          for(i=1;i<=???????????;i++){
             print (VOLUME_NAME"|"LOCATION_LAT[i]","LOCATION_LNG[i]"|"LAT[i]"|"LNG[i]"|"ALT_LOW"|"ALT_HIGH)
      }
}

What is needed to make this for loop print correctly. I think I have to assign a number based on LOCATION_LAT[i] but I can't seem to get it right.

Thanks

Author

Commented:
Hello ozo can u help

Author

Commented:
hello

Author

Commented:
anybody there
ozo

Commented:
#!/usr/local/bin/perl -w
while( <> ){
  $name = $1 if /NAME\|(\S*)/;
  print "$name|$1:$2|$1|$2\n" if /LOCATION\|(\d+)\|(\d+)/;
}

How does the awk script figure in your problem? Are you trying to implement this in both Perl and awk? Or are you trying to port the awk script to Perl?

Anyway, what you need is a counter variable in the LOCATION handling part. Whenever you see a LOCATION line, increment the counter. Later, use the counter in the for loop.

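i = 0;    # reset the counter whenever a new resource starts (for example, on the NAME line)

if ($1 == "LOCATION") {
    i++;
    LOCATION_LAT[i]=$2
    ....
}


# run this loop once per resource, after the DATA line has set ALT_LOW and ALT_HIGH
for (j = 1; j <= i; j++) {
    print (VOLUME_NAME"|"LOCATION_LAT[j]","LOCATION_LNG[j]"|"LAT[j]"|"LNG[j]"|"ALT_LOW"|"ALT_HIGH)
}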


I am not sure about the awk syntax. It's been ages since I used it. Why use awk, when you have perl?
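
For comparison, here is a rough Perl sketch of the same collect-then-print idea, using an array instead of an explicit counter. The helper names dms_to_decimal and resource_lines are invented for this example, and the layout of the coordinates (DDMMSS for latitude, DDDMMSS for longitude) is assumed from the substr calls in the awk script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a DDMMSS or DDDMMSS string to decimal degrees.
# $deg_width is 2 for latitude and 3 for longitude (assumed layout).
sub dms_to_decimal {
    my ($dms, $deg_width) = @_;
    my $deg = substr($dms, 0, $deg_width);
    my $min = substr($dms, $deg_width, 2);
    my $sec = substr($dms, $deg_width + 2, 2);
    return $deg + $min / 60 + $sec / 3600;
}

# Hypothetical helper: collect LOCATION lines for the current resource and
# emit one output line per location once the DATA line has been seen.
sub resource_lines {
    my (@input) = @_;
    my ($name, @locations, @out);
    for my $line (@input) {
        if ($line =~ /NAME\|([^|\s]+)/) {
            $name      = $1;
            @locations = ();    # new resource: start collecting again
        }
        elsif ($line =~ /LOCATION\|(\d+)\|(\d+)/) {
            push @locations, [$1, $2];
        }
        elsif ($line =~ /DATA\|(\d+)\|(\d+)/) {
            my ($low, $high) = ($1, $2);
            for my $loc (@locations) {
                my ($lat, $lng) = @$loc;
                push @out, join "|", $name, "$lat,$lng",
                    dms_to_decimal($lat, 2), dms_to_decimal($lng, 3),
                    $low, $high;
            }
        }
    }
    return @out;
}

# Lines taken from the sample input in the question.
print "$_\n" for resource_lines(
    "NAME|ZTL0112M0",
    "|LOCATION|280000|0790200",
    "|LOCATION|300930|0784000",
    "|DATA|0|240",
);
```

Printing only when the DATA line arrives matters because the altitude values are not known until then, and clearing the array on each NAME line keeps locations from one resource out of the next.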

Author

Commented:
Solved the problem. Thanks
