isaacr25

Perl script for data manipulation

I'm just getting into the specifics of what I need to do, so for now I would just like basic guidance. I have a fixed-width text file from which I need to pull data and append it to another fixed-width file with a different schema. What is the basic Perl script structure for pulling data from certain positions in the first file and appending it at certain positions in the output file?

For instance, how would I pull the data from positions 9-15 from the source file and place this data in positions 25-31 in the output file? Thanks in advance.
ASKER CERTIFIED SOLUTION
ozo
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
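The accepted answer itself is not reproduced here. Judging from the follow-up comments in this thread, it combined substr with a printf width specifier. A minimal sketch under that assumption (the sub name place_field and the sample record are hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: copy source columns 9-15 into output columns 25-31.
sub place_field {
    my ($line) = @_;
    # substr offsets are 0-based, so column 9 is offset 8; take 7 characters.
    my $field = substr($line, 8, 7);
    # "%31.7s" right-justifies at most 7 characters in a 31-wide field,
    # so a full 7-character value lands in columns 25-31.
    return sprintf("%31.7s", $field);
}

# Sample record: columns 9-15 hold "ABCDEFG".
my $record = "12345678ABCDEFG90";
print place_field($record), "\n";
```

The result is 24 spaces followed by the 7-character field, which is why the width is 31 rather than 25.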
isaacr25

ASKER

ozo,
Where would I specify the source and destination file names?
And how did you come up with the 35.7 and _,8,7?

Excuse my ignorance. I'm not a Perl expert, so any clarification would be appreciated.
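On the file-name question: a `while (<>)` loop reads whatever files are named on the command line (or standard input) and prints to standard output, so the names are usually supplied at run time, for example `perl datascript.pl source.txt >> dest.txt` (the two file names are placeholders). If the names should live in the script instead, explicit open calls are one option. A minimal sketch; append_field is a hypothetical name, not something from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: read fixed-width records from $src and append the
# reformatted field to $dst. Both paths are supplied by the caller.
sub append_field {
    my ($src, $dst) = @_;
    open(my $in,  '<',  $src) or die "Cannot read $src: $!";
    open(my $out, '>>', $dst) or die "Cannot append to $dst: $!";
    while (my $line = <$in>) {
        chomp $line;
        # Source columns 9-15 go to output columns 25-31.
        printf {$out} "%31.7s\n", substr($line, 8, 7);
    }
    close($in);
    close($out) or die "Cannot close $dst: $!";
}

# Example call (file names are placeholders):
# append_field('source.txt', 'destination.txt');
```

Opening the destination with '>>' appends, so existing records in that file are left untouched.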
How would I include multiple manipulations?
What manipulations do you want to do?
I would need to do many more instances of similar manipulations, taking strings from certain source file positions and placing them in specific destination positions.
I also get the following error when running the script:
Substitution pattern not terminated at e:\testing\datascript.pl line 2.
You can separate a line into its parts using several methods.
One is the substr function ozo showed. Another is unpack. A third is a regular expression.

E.g., unpack:
my $str = "aaabbbbbcccccccccc";
# A3 A5 A10: three text fields of 3, 5 and 10 characters.
# ($a and $b are special to sort, so the fields get other names.)
my ($x, $y, $z) = unpack("A3A5A10", $str);
# Now
#    $x = "aaa"
#    $y = "bbbbb"
#    $z = "cccccccccc"

E.g., regular expression:
my $str = "aaabbbbbcccccccccc";
# Anchored at the start of the string; each group captures one field.
my ($x, $y, $z) = $str =~ /^(.{3})(.{5})(.{10})/;
# Now
#    $x = "aaa"
#    $y = "bbbbb"
#    $z = "cccccccccc"
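For comparison, here is the same split done with substr, the first method mentioned above. This is a small sketch with illustrative variable names; note that substr offsets count from 0:

```perl
use strict;
use warnings;

my $str = "aaabbbbbcccccccccc";
# substr(STRING, OFFSET, LENGTH): offsets are 0-based.
my $first  = substr($str, 0, 3);    # characters 1-3
my $second = substr($str, 3, 5);    # characters 4-8
my $third  = substr($str, 8, 10);   # characters 9-18
print "$first|$second|$third\n";
```

unpack tends to be more compact when a record has many fields; substr is convenient when only one or two are needed.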
I mistyped an extra " 
""%35.7s\n"
should have been
"%31.7s\n"
Concerning the multiple manipulations, here is what I'm trying to do:

while( <> ){
printf"%4.4s\n",substr($_,12,4)
printf"%33.7s\n",substr($_,126,7)
}

And so on. Obviously this is not the correct syntax since I get an error. How should I separate the commands? Thanks.
Thanks Ozo,
I have another question!

The script is placing each parsed piece on a separate line in the output file. So if I run the above code with two manipulations on a source file that has 3 lines of data, the output file has 6 lines of data. I need the parsed pieces from one source record to be written to a single record in the output file. How do I do this?
If you want both fields on the same line you might do
while (<>) {
    printf "%4.4s%29.7s\n", unpack("x12A4x110A7", $_);
}
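The column arithmetic behind that one-liner: x12 skips 12 bytes, A4 takes source columns 13-16, x110 skips ahead to offset 126, and A7 takes columns 127-133; "%4.4s%29.7s" then writes them to output columns 1-4 and 27-33. For many more fields, one option (a sketch, not something proposed in the thread) is a table of mappings applied with four-argument substr. The @map layout, the build_record name, and the 33-character record width are all assumptions for illustration:

```perl
use strict;
use warnings;

# Each entry: [source offset, length, destination offset], all 0-based.
# These values mirror the two fields used in the thread.
my @map = (
    [ 12,  4,  0 ],   # source columns 13-16   -> output columns 1-4
    [ 126, 7, 26 ],   # source columns 127-133 -> output columns 27-33
);

# Hypothetical helper: build one output record of $width characters.
sub build_record {
    my ($line, $width) = @_;
    my $out = ' ' x $width;               # start from a blank record
    for my $m (@map) {
        my ($from, $len, $to) = @$m;
        # Right-justify and truncate the field to exactly $len characters.
        my $field = sprintf("%${len}.${len}s", substr($line, $from, $len));
        substr($out, $to, $len, $field);  # overwrite that slice in place
    }
    return $out;
}

# Sample 133-character record with recognizable field contents.
my $sample = ('.' x 12) . 'ABCD' . ('.' x 110) . 'EFGHIJK';
print build_record($sample, 33), "\n";
```

Adding another field is then one more row in @map, and because every field is written into a single string, each source record yields exactly one output line.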