Solved

Perl script logic help

Posted on 2010-09-06
Last Modified: 2012-05-10
Can anyone please advise on the following?
The input file is in the following format (data is in double quotes with a pipe separator):
header1
header2
header3
blank line
header5
"abc"|"123"|"2010-09-03"|" "|"344"|"jpu"
"dfg"|"456"|"2010-09-03"|" "|"567"|""usd"
and so on...

The output file we want is as follows (columns swapped from the input file, with some dummy columns at the end, separated by pipes and with no double quotes). Below is the standard format:
col1|col0|col2 (in 'yyyymmdd' format)|col4|null|88

E.g., to create the output file from the above input file:
123|abc|20100903|344|null|88
456|dfg|20100903|567|null|88

How can we create this output file from the above input file with a Perl script? The script will be passed the input file name as a parameter. Please advise.

Thx in advance

Question by:sunilbains
18 Comments
 
LVL 10

Expert Comment

by:jeromee
ID: 33613923
try this...

Good luck!
use strict;

my $file = $ARGV[0];
open(FILE, $file) || die "Can't open $file: $!\n";	
<FILE> for 1..5;	# skip the first 5 lines

while( <FILE> ) {
   chomp;
   s/^"|"$//g;	 # Get rid of leading/trailing double quotes (/g so both ends are stripped)
   my @fields = split /"\|"/;
   $fields[2] =~ s/-//g;
   print join("|", @fields[1,0,2,4], qw(null 88))."\n";
}
close(FILE);

 
LVL 84

Expert Comment

by:ozo
ID: 33613959
perl -lne 'print "$2|$1|$3$4$5|$6|null|88" if/"(.*?)"\|"(.*?)"\|"(\d+)-(\d+)-(\d+)"\|.*?\|"(.*?)"\|/' <<ENDHERE
header1
header2
header3
blank line
header5
"abc"|"123"|"2010-09-03"|" "|"344"|"jpu"
"dfg"|"456"|"2010-09-03"|" "|"567"|""usd"
and so on...
ENDHERE
 

Expert Comment

by:adelara
ID: 33614214
Here it is ready to go.
Save it to something like ...  replaceMe.pl  and use it like this: (works on Windows and Unix)

c:/> replaceMe.pl  fileInput.txt  fileOut.txt

# -------------- snip ---------------------------
#!/bin/perl
use strict;
use warnings;

open( my $in,  "<", $ARGV[0] ) or die "Can't open $ARGV[0]: $!\n"; # open file for input
open( my $out, ">", $ARGV[1] ) or die "Can't open $ARGV[1]: $!\n"; # open the output file

while ( my $line = <$in> )
{
    next if ( $. <= 5 );                  # skip the five header lines
    next if ( $line =~ /^\s*$/ );         # skip blank lines
    chomp( $line );                       # remove newline so it won't end up in the last value

    my @col = split( /\|/, $line );       # every field is now an element of @col
    $col[2] =~ tr/-//d;                   # fix date (remove dashes)

    # now re-assemble the line in the order you want
    my $newLine = $col[1] . '|' . $col[0] . '|' . $col[2] . '|' . $col[4] . '|null|88';
    $newLine    =~ tr/"//d;               # delete whatever quotes are left

    print $out "$newLine\n";              # write modified line to output file
}

close( $in );
close( $out );
# -------------- snip ---------------------------


 

Author Comment

by:sunilbains
ID: 33619182
Hello Jeromee,
The requirement has changed slightly. Please advise on how I can accomplish this.

The input file is in the following format:
20090915|AAAAA|T E S T  B R O K E R  T E S T  T E S T |FUND|XXXXXXX|984121103|  |XEROX CORP|  |  |DTC|9300.000|US9841211033|ISN|
20090916|AAAAB|T E S T  B R O K E R  T E S T  T E S T |FUND|XXXXXXX|984121103|  |XEROX CORP|  |  |DTC|9300.000|US9841211033|ISN|
and so on...

The output file we want is as follows (columns swapped from the input file, with a constant dummy column added, separated by pipes). Below is the standard format:
col1|CU|col0|col5

E.g., to create the output file from the above input file:
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103

How can we create this output file from the above input file with a Perl script? The script will be passed the input file name as a parameter. Please advise.
 
LVL 10

Expert Comment

by:jeromee
ID: 33619269
Hi sunilbains,
Here's a solution to your new problem

$ perl -ne'@f=split/\|/; print join("|", $f[1], "CU", @f[0,5])."\n"' input_file
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103
 
LVL 14

Expert Comment

by:sentner
ID: 33622216
Note that you can also do this with other Unix utilities, without needing Perl at all.

For example:

$ awk -F"|" '{print $2 "|CU|" $1 "|" $6}' broker.txt
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103

 
LVL 10

Expert Comment

by:jeromee
ID: 33622432
Here's an awk-like version of what I offered above:
perl -F'\|' -ane'print join ( "|", $F[1], "CU", @F[0,5] ) ."\n"' input_file
AAAAA|CU|20090915|984121103
AAAAB|CU|20090916|984121103

Note the -a and -F to replace the @f=split
 
LVL 84

Expert Comment

by:ozo
ID: 33623691
perl -F'\|' -aple '$"="|";$F[2]="CU";$_="@F[1,2,0,5]"'
 
LVL 10

Expert Comment

by:jeromee
ID: 33624189
cryptic but nice, ozo!

Author Comment

by:sunilbains
ID: 33630653
Hello Jeromee,
The requirement has changed again. Please advise on how I can accomplish this.

The input file is in the following format:
Account Code|Account Type|Position Date|SEC ID|Quantity|Currency
AAAAAA||07-Sep-2010|00826T108|586|JPY
AAAAAA||07-Sep-2010|00826T108|586| and so on...

The output file we want is as follows (columns swapped from the input file, with col5 defaulting to USD when it is null and the date converted from dd-mon-yyyy to yyyymmdd). Below is the format we want:
col0|CU|col2 (in 'yyyymmdd' format)|col5 (if column 5 is null, default to USD)

E.g., to create the output file from the above input file:
AAAAAA|CU|20100907|JPY
AAAAAA|CU|20100907|USD

How can we create this output file from the above input file with a Perl script? The script will be passed the input file name as a parameter. Please advise.
 

Author Comment

by:sunilbains
ID: 33632257
Can somebody help?
 
LVL 10

Expert Comment

by:jeromee
ID: 33632288
Looking into it...
BTW, please provide the name of your employer to make sure that I never work there... too many changes... :-)

Give me a couple of minutes.
 
LVL 10

Expert Comment

by:jeromee
ID: 33632385
Try this for size...
use strict;

my @monthToMM = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %monthToMM = map{ $monthToMM[$_] => $_+1 } 0..11;

while( <> ) {
   chomp;
   my($acct, $date, $cty) = (split '\|')[0, 2, 5];
   my($dd, $month, $ccyy) = split /-/, $date;
   $cty ||= 'USD';
   print join("|", $acct, 'CU', sprintf("%d%02d%02d", $ccyy, $monthToMM{lc($month)}, $dd), $cty)."\n";
}

 

Author Comment

by:sunilbains
ID: 33632853
Hi Jerome,
Thanks for the reply!
When I run this script, I get:
Account Code|CU|00000|Currency
P 02134|CU|20100907|USD
How can I remove the line "Account Code|CU|00000|Currency", which seems to be the header?

Thanks in advance.
 
LVL 84

Expert Comment

by:ozo
ID: 33632923
perl -F'\|' -lane 'BEGIN{@m{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)}=('01'..'12')}
print "$F[0]|CU|$3$m{$2}$1|$F[5]" if $F[2]=~ /(\d+)-(\w+)-(\d+)/' << ENDHERE
kAccount Code|Account Type|Position Date|SEC ID|Quantity|Currency
AAAAAA||07-Sep-2010|00826T108|586|JPY
AAAAAA||07-Sep-2010|00826T108|586| and so on...
ENDHERE
 
LVL 10

Accepted Solution

by:
jeromee earned 500 total points
ID: 33633734
use strict;

my @monthToMM = qw(jan feb mar apr may jun jul aug sep oct nov dec);
my %monthToMM = map{ $monthToMM[$_] => $_+1 } 0..11;

<>;
while( <> ) {
   chomp;
   my($acct, $date, $cty) = (split '\|')[0, 2, 5];
   my($dd, $month, $ccyy) = split /-/, $date;
   $cty ||= 'USD';
   print join("|", $acct, 'CU', sprintf("%d%02d%02d", $ccyy, $monthToMM{lc($month)}, $dd), $cty)."\n";
}
 

Author Closing Comment

by:sunilbains
ID: 33703307
Thanks, all. It's working now.
 
LVL 10

Expert Comment

by:jeromee
ID: 33703362
Good to hear.
Happy Perling!