Solved

Perl script logic help

Posted on 2010-09-06
Last Modified: 2012-05-10
Can anyone please advise on the following?
The input file is in the following format (data is in double quotes with a pipe separator):
header1
header2
header3
blank line
header5
"abc"|"123"|"2010-09-03"|" "|"344"|"jpu"
"dfg"|"456"|"2010-09-03"|" "|"567"|""usd"
and so on...

The output file we want is as follows (columns swapped from the input file, with some dummy columns at the end, separated by pipes and with no double quotes). Below is the standard format:
col1|col0|col2(in 'yyyymmdd' format)|col4|null|88

E.g., to create the output file from the above input file:
123|abc|20100903|344|null|88
456|dfg|20100903|567|null|88

How can we create this output file from the above input file with a Perl script? The script will be passed the input file name as a parameter. Please advise.

Thanks in advance

Question by:sunilbains
LVL 10

Expert Comment

by:jeromee
ID: 33613923
try this...

Good luck!
use strict;

my $file = $ARGV[0];
open(FILE, $file) || die "Can't open $file: $!\n";	
<FILE> for 1..5;	# skip the first 5 lines

while( <FILE> ) {
   chomp;
   s/^"|"$//g;	 # Get rid of leading/trailing double quotes
   my @fields = split /"\|"/;
   $fields[2] =~ s/-//g;
   print join("|", @fields[1,0,2,4], qw(null 88))."\n";
}
close(FILE);
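
The per-line logic in the script above can also be pulled into a small subroutine, which makes it easy to try on the sample rows. This is only a sketch: the name reformat_line is made up for illustration, and it assumes every data row has at least five quoted, pipe-separated fields.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): turn one quoted,
# pipe-separated input row into the col1|col0|col2|col4|null|88 layout.
sub reformat_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\|/, $line;      # split on the pipe separator
    s/^"+|"+$//g for @fields;            # strip the surrounding double quotes
    return if @fields < 5;               # not a data row (header, blank, ...)
    $fields[2] =~ tr/-//d;               # 2010-09-03 -> 20100903
    return join '|', @fields[1, 0, 2, 4], 'null', 88;
}

# Try it on the two sample rows from the question.
print reformat_line($_), "\n" for
    '"abc"|"123"|"2010-09-03"|" "|"344"|"jpu"',
    '"dfg"|"456"|"2010-09-03"|" "|"567"|""usd"';
```

Because the quotes are stripped per field rather than per line, the stray extra quote in the second sample row does no harm.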

 
LVL 85

Expert Comment

by:ozo
ID: 33613959
perl -lne 'print "$2|$1|$3$4$5|$6|null|88" if/"(.*?)"\|"(.*?)"\|"(\d+)-(\d+)-(\d+)"\|.*?\|"(.*?)"\|/' <<ENDHERE
header1
header2
header3
blank line
header5
"abc"|"123"|"2010-09-03"|" "|"344"|"jpu"
"dfg"|"456"|"2010-09-03"|" "|"567"|""usd"
and so on...
ENDHERE
0
 

Expert Comment

by:adelara
ID: 33614214
Here it is ready to go.
Save it as something like replaceMe.pl and use it like this (works on Windows and Unix):

c:/> replaceMe.pl  fileInput.txt  fileOut.txt

# -------------- snip ---------------------------
#!/usr/bin/perl

open( FILE_IN,  "<", $ARGV[0] ) or die "Can't open $ARGV[0]: $!\n"; # open file for input
open( FILE_OUT, ">", $ARGV[1] ) or die "Can't open $ARGV[1]: $!\n"; # open the output file

while(my $line = <FILE_IN>)
{
    next if ( $. <= 5 );                  # skip the five header lines (the fourth is blank)
    next if ( $line =~ /^\s*$/ );         # skip any other blank lines
    chomp( $line );                       # remove newline so it won't go to last value

    my @col = split( /\|/, $line );       # now you have every field in an array's record
    $col[2] =~ tr/\-//d;                  # fix date (remove dashes)

    # now re-assemble the line in the order you want
    my $newLine = $col[1] . '|' . $col[0] . '|' . $col[2] . '|' . $col[4] . '|null|88';
    $newLine    =~ tr/\"//d;                   # delete whatever quotes are left

    print FILE_OUT  "$newLine\n";  # write modified line to output file

}

close( FILE_IN );
close( FILE_OUT );
# -------------- snip ---------------------------


0

 

Author Comment

by:sunilbains
ID: 33619182
Hello Jeromee,
The requirement has changed slightly. Please advise on how I can accomplish this:

The input file is in the following format-
20090915|AAAAA|T E S T  B R O K E R  T E S T  T E S T |FUND|XXXXXXX|984121103|  |XEROX CORP|  |  |DTC|9300.000|US9841211033|ISN|
20090916|AAAAB|T E S T  B R O K E R  T E S T  T E S T |FUND|XXXXXXX|984121103|  |XEROX CORP|  |  |DTC|9300.000|US9841211033|ISN|
and so on...

The output file we want is as follows (columns swapped from the input file, with a constant "CU" column added, separated by pipes). Below is the standard format:
col1|CU|col0|col5

E.g., to create the output file from the above input file:
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103

How can we create this output file from the above input file with a Perl script? The script will be passed the input file name as a parameter. Please advise.
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33619269
Hi sunilbains,
Here's a solution to your new problem

$ perl -ne'@f=split/\|/; print join("|", $f[1], "CU", @f[0,5])."\n"' input_file
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103
0
 
LVL 14

Expert Comment

by:sentner
ID: 33622216
Note that you can also do this with other Unix utilities, without even requiring Perl...

For example:

$ awk -F"|" '{print $2 "|CU|" $1 "|" $6}' broker.txt
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103

0
 
LVL 10

Expert Comment

by:jeromee
ID: 33622432
Here's an awk-like version of what I offered above:
perl -F'\|' -ane'print join ( "|", $F[1], "CU", @F[0,5] ) ."\n"' input_file
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103

Note the -a and -F to replace the @f=split
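
For readers new to these switches: -n wraps the code in a while (<>) loop, -a splits each line into @F, and -F sets the split pattern. Roughly, the one-liner above behaves like the longer script below. It is a sketch of the equivalence, not the exact code Perl generates, and the subroutine name swap_columns is made up so the logic can be exercised without an input file.

```perl
use strict;
use warnings;

# Illustrative helper (hypothetical name): what the one-liner does to one line.
sub swap_columns {
    my ($line) = @_;
    my @F = split /\|/, $line;                 # what -a with -F'\|' does per line
    return join('|', $F[1], 'CU', @F[0, 5]);   # col1|CU|col0|col5
}

# What -n does: loop over every input line (here a sample row instead of <>).
my @sample = (
    "20090915|AAAAA|T E S T  B R O K E R  T E S T  T E S T |FUND|XXXXXXX|984121103|  |XEROX CORP|  |  |DTC|9300.000|US9841211033|ISN|\n",
);
print swap_columns($_), "\n" for @sample;
```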
0
 
LVL 85

Expert Comment

by:ozo
ID: 33623691
perl -F'\|' -aple '$"="|";$F[2]="CU";$_="@F[1,2,0,5]"'
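
A note on why this works: $" is Perl's list separator, the string placed between elements when an array or slice is interpolated into a double-quoted string, and -p prints $_ after each line. The sketch below spells the same idea out on one row; the row is shortened and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Shortened sample row, split the way -a with -F'\|' would split it.
my @F = split /\|/, '20090915|AAAAA|BROKER|FUND|XXXXXXX|984121103';

my $line;
{
    local $" = '|';             # list separator used when interpolating arrays
    $F[2] = 'CU';               # overwrite an unused column with the literal
    $line = "@F[1,2,0,5]";      # the slice is joined with $": col1|CU|col0|col5
}
print "$line\n";                # prints AAAAA|CU|20090915|984121103
```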
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33624189
cryptic but nice, ozo!
0
 

Author Comment

by:sunilbains
ID: 33630653
Hello Jeromee,
The requirement has changed again. Please advise on how I can accomplish this:

The input file is in the following format-
Account Code|Account Type|Position Date|SEC ID|Quantity|Currency
AAAAAA||07-Sep-2010|00826T108|586|JPY
AAAAAA||07-Sep-2010|00826T108|586|
and so on...

The output file we want is as follows (columns swapped from the input file, with col5 defaulting to USD when it is empty, and the date converted from dd-Mon-yyyy to yyyymmdd). Below is the format we want:
col0|CU|col2(in 'yyyymmdd' format)|col5(if column 5 is empty, default to USD)

E.g., to create the output file from the above input file:
AAAAAA|CU|20100907|JPY
AAAAAA|CU|20100907|USD

How can we create this output file from the above input file with a Perl script? The script will be passed the input file name as a parameter. Please advise.
0
 

Author Comment

by:sunilbains
ID: 33632257
Can somebody help?
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33632288
Looking into it...
BTW, please provide the name of your employer to make sure that I never work there... too many changes... :-)

Give me a couple of minutes.
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33632385
Try this for size...
use strict;

my @monthToMM = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %monthToMM = map{ $monthToMM[$_] => $_+1 } 0..11;

while( <> ) {
   chomp;
   my($acct, $date, $cty) = (split '\|')[0, 2, 5];
   my($dd, $month, $ccyy) = split /-/, $date;
   $cty ||= 'USD';
   print join("|", $acct, 'CU', sprintf("%d%02d%02d", $ccyy, $monthToMM{lc($month)}, $dd), $cty)."\n";
}

 

Author Comment

by:sunilbains
ID: 33632853
Hi Jerome,
Thanks for the reply!
When I run this script, I get:
Account Code|CU|00000|Currency
P 02134|CU|20100907|USD
How can I remove the line "Account Code|CU|00000|Currency", which seems to be the header?

Thanks in advance
0
 
LVL 85

Expert Comment

by:ozo
ID: 33632923
perl -F'\|' -lane 'BEGIN{@m{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)}=("01".."12")}
print join("|", $F[0], "CU", "$3$m{$2}$1", $F[5] || "USD") if $F[2]=~ /(\d+)-(\w+)-(\d+)/' << ENDHERE
Account Code|Account Type|Position Date|SEC ID|Quantity|Currency
AAAAAA||07-Sep-2010|00826T108|586|JPY
AAAAAA||07-Sep-2010|00826T108|586|
and so on...
ENDHERE
0
 
LVL 10

Accepted Solution

by:
jeromee earned 2000 total points
ID: 33633734
use strict;

my @monthToMM = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %monthToMM = map{ $monthToMM[$_] => $_+1 } 0..11;

<>;	# read and discard the header line
while( <> ) {
   chomp;
   my($acct, $date, $cty) = (split '\|')[0, 2, 5];
   my($dd, $month, $ccyy) = split /-/, $date;
   $cty ||= 'USD';
   print join("|", $acct, 'CU', sprintf("%d%02d%02d", $ccyy, $monthToMM{lc($month)}, $dd), $cty)."\n";
}
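
One caveat with the script above: a row whose month abbreviation is not recognised still prints, with 00 as the month. A more defensive variant might validate the date first. The sketch below is an assumption about how that could look, not part of the accepted answer; to_yyyymmdd is a hypothetical helper name.

```perl
use strict;
use warnings;

my @months = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %month_to_mm = map { $months[$_] => $_ + 1 } 0 .. $#months;

# Hypothetical helper: convert dd-Mon-yyyy (e.g. 07-Sep-2010) to yyyymmdd.
# Returns undef when the input does not look like such a date.
sub to_yyyymmdd {
    my ($date) = @_;
    return unless defined $date
        && $date =~ /^(\d{1,2})-([A-Za-z]{3})-(\d{4})$/;
    my ($dd, $mon, $yyyy) = ($1, $2, $3);
    my $mm = $month_to_mm{ lc $mon } or return;   # unknown month name
    return sprintf '%04d%02d%02d', $yyyy, $mm, $dd;
}

print to_yyyymmdd('07-Sep-2010'), "\n";    # prints 20100907
```

A caller could then skip any row for which the helper returns undef, which would also drop the header line without a separate read.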
0
 

Author Closing Comment

by:sunilbains
ID: 33703307
Thanks all. It's working now.
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33703362
Good to hear.
Happy Perling!
0
