deepak_tyco

AIX script to split a file into multiple files depending on a matching pattern, without any changes to the existing data in the file?

Hi,
I have a requirement to split a file on different patterns and write the matching lines to different files.
My script does the work properly, but I am facing a strange problem: while writing to the different files, the white space at the end of each line is trimmed.
That means if the original line in the file is 339 chars, it is trimmed to 239 chars in the new file because the trailing white space is removed, which I don't want.

My script is as below:

#!/usr/bin/ksh
#Mtv extract file directory
MTVFSDIR="D:/home/dsadm"
MTVFTPDIR="$MTVFSDIR/inbound/mtv"
#Mtv extract log directory
MTVLOGSDIR="$MTVFSDIR/log"
#Mtv stag folder directory
MTVSTAGEDIR="$MTVFSDIR/stage/mtv"
MTVLOG=$MTVLOGSDIR/$1".MTVEXTRACT.LOG."$CURRDATESTAMP
MTVARCH="$MTVFSDIR/archive"

# Spliting of file start

echo "spliting start"

while read Line
do
    match_char=`echo $Line | cut -c1-4`
    echo $match_char
    if test $match_char -eq 0001
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_header.txt
    elif test $match_char -eq 0004
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_bl_header.txt
    elif test $match_char -eq 0010
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_invoice.txt
    elif test $match_char -eq 0020
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_contact_detail.txt
    elif test $match_char -eq 0030
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_cash_detail.txt
    elif test $match_char -eq 0040
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_adj_detail.txt
    elif test $match_char -eq 0050
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_message.txt
    elif test $match_char -eq 0060
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_corresp.txt
    elif test $match_char -eq 0070
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_department.txt
    elif test $match_char -eq 0080
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_broker.txt
    elif test $match_char -eq 1000
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_component.txt
    elif test $match_char -eq 1100
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_component_rider.txt
    elif test $match_char -eq 1200
    then
        echo "$Line" >> $MTVSTAGEDIR/stage_contract_summary.txt
    else
        echo "$Line" >> $MTVSTAGEDIR/stage_reject.txt
    fi

done < $MTVFTPDIR/ABE.txt

echo "Spliting of file end"



STAMP=`date +%Y-%m-%d:%H-%M-%OS`

echo "Job Completed !!!..."

#everything goes well, exit with 0



Can anybody help me out?

Thanks
deepak

nognew

Hey!
"That means if the original line in the file is 339 chars ; its trimming to 239 char in new file "
Do you mean 339 -> 338?
How do you know it is white space? It might be a Windows-style CR that has not been transferred.

Regards,
t.
deepak_tyco

ASKER

Hi,
For example:
I ran this command on the original file, ABE.txt:

head -1 ABE.txt | wc
      1       2     380

The count for the split file is as below:
head -1 stage_broker.txt | wc
      1       2      21

I have attached the original file and one of the split files as well.

thanks
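
A small sketch of the same length check, with `od -c` added so that trailing blanks (or a stray `\r`) become visible. The two sample files are hypothetical stand-ins for ABE.txt and a split output file, and `mktemp -d` is assumed to be available (it may not be on every AIX install):

```shell
#!/bin/sh
# Hypothetical sample files standing in for ABE.txt and one split output file.
tmpdir=$(mktemp -d)
printf '0080 BROKER DATA%6s\n' '' > "$tmpdir/original.txt"   # six trailing blanks
printf '0080 BROKER DATA\n'       > "$tmpdir/split.txt"      # blanks lost

# Byte count of the first line of each file, newline included.
orig_len=$(head -1 "$tmpdir/original.txt" | wc -c | tr -d ' ')
split_len=$(head -1 "$tmpdir/split.txt" | wc -c | tr -d ' ')
echo "original: $orig_len bytes, split: $split_len bytes"

# od -c prints every character, so trailing blanks and any \r show up.
head -1 "$tmpdir/original.txt" | od -c

rm -rf "$tmpdir"
```

If the difference in byte counts equals the number of trailing blanks shown by `od -c`, the loss is white space rather than a carriage return.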




ABE.txt
stage-broker.txt
The script works for me, but on Linux. Unfortunately I don't have AIX installed, but I think the default echo behavior is to output all its STRING (single-word) arguments separated by a single space. Effectively, if there is more than one space between words, echo removes all but one. To output a string with multiple spaces between words, one needs to enclose the variable in quotes. In your script, the $Line variables written to the output files are all quoted. My assumption is that AIX's echo applies the default behaviour even to quoted variables.
As you are using ksh, I'd suggest you use print instead of echo. That might solve the problem.
Try replacing every echo with print.
Kind regards,
t.
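
The assumption about echo can be checked with a short test. This is only a sketch for a POSIX-style shell, not something run on AIX; the variable names are made up for the illustration. It compares what a quoted `echo` writes with what `read` stores, with and without an empty `IFS`:

```shell
#!/bin/sh
# "abc" followed by three blanks (6 characters).
padded=$(printf 'abc%3s' '')

# Quoted expansion: echo receives the value as one argument, blanks included.
quoted_len=$(echo "$padded" | wc -c | tr -d ' ')

# Plain read: leading and trailing IFS white space is stripped before assignment.
plain_len=$(echo "$padded" | { read line; echo "$line" | wc -c | tr -d ' '; })

# IFS= read -r: no field splitting, so nothing is stripped; -r keeps backslashes literal.
kept_len=$(echo "$padded" | { IFS= read -r line; echo "$line" | wc -c | tr -d ' '; })

# Each count includes the newline that echo appends.
echo "quoted echo: $quoted_len, plain read: $plain_len, IFS= read -r: $kept_len"
```

In a POSIX shell this prints `quoted echo: 7, plain read: 4, IFS= read -r: 7`, which suggests the blanks are lost when the line is read, not when it is written.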
ASKER CERTIFIED SOLUTION
Tintin

BTW, how much data are you processing?  A shell script will be very slow.
Hi,

I tried the print statement as well, and I am getting the same result.

When I use IFS, I can read the file with the white space intact, but while writing I get the same problem.

The file has 5 million records.

thanks
kunnal
I'm surprised that changing IFS doesn't solve the problem, as it works for me, as you can see below:

$ od -c bar
0000000   a   b   c              \n   d   e   f                  \n
0000017

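$ cat test.ksh
#!/bin/ksh
rm -f bar
# An empty IFS disables field splitting, so leading and trailing blanks are kept;
# -r keeps backslashes in the input literal.
while IFS= read -r a
do
  echo "$a" >>bar
done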
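
A sketch of how the IFS idea could be applied to the split loop from the question. It uses `IFS= read -r`, a common form of the same idea, and a `case` statement in place of the `if`/`elif` chain. Only a few of the record codes are shown. The scratch directory and sample records are hypothetical, `mktemp -d` is assumed to exist, and `case` compares the prefix as a string, whereas the original `test ... -eq` compares numerically:

```shell
#!/bin/sh
# Hypothetical scratch directories standing in for $MTVFTPDIR and $MTVSTAGEDIR.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
input="$workdir/ABE.txt"
stage="$workdir/stage"
mkdir -p "$stage"

# Sample records padded with trailing blanks, like fixed-width data.
printf '0001 header%3s\n0080 broker%5s\n9999 unknown%2s\n' '' '' '' > "$input"

while IFS= read -r line
do
    # Choose the output file from the first four characters of the record.
    case "$line" in
        0001*) target=stage_header.txt ;;
        0004*) target=stage_bl_header.txt ;;
        0080*) target=stage_broker.txt ;;
        *)     target=stage_reject.txt ;;
    esac
    # printf writes the record exactly as read, followed by a newline.
    printf '%s\n' "$line" >> "$stage/$target"
done < "$input"

# The byte counts of the outputs should add up to the size of the input.
wc -c "$input" "$stage"/*.txt
```

Using `printf '%s\n'` instead of `echo` also avoids any difference in how `echo` treats backslashes between shells.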
Hi,
Yes, I have Perl on the system.

But how do I do it in Perl?

thanks
Hi,

The IFS option is working now. My problem is resolved.

Thanks a lot to all of you for your help.

Regards
kunnal
Hi,

Thanks to all of you for your help.

It is working fine now after using IFS.

Regards
kunnal
Here's a stripped-down Perl version that will be much, much quicker.


#!/usr/bin/perl
use strict;
use warnings;

my $MTVFSDIR="D:/home/dsadm";
my $MTVFTPDIR="$MTVFSDIR/inbound/mtv";
my $MTVSTAGEDIR="$MTVFSDIR/stage/mtv";

# Open the output files in append mode, as the shell script does with >>
open my $stage_header, '>>', "$MTVSTAGEDIR/stage_header.txt" or die "Can not open stage_header.txt $!\n";
open my $stage_bl_header, '>>', "$MTVSTAGEDIR/stage_bl_header.txt" or die "Can not open stage_bl_header.txt $!\n";
open my $stage_reject, '>>',  "$MTVSTAGEDIR/stage_reject.txt" or die "Can not open stage_reject.txt $!\n";

open my $abe, '<', "$MTVFTPDIR/ABE.txt" or die "Can not open ABE.txt $!\n";
print "spliting start\n";

while (<$abe>) {
  # First four characters of the record; empty if the line is shorter than that
  my ($code) = /^(....)/;
  $code = '' unless defined $code;
  print "$code\n";
  # $_ is never chomped or split, so trailing white space is written unchanged.
  # A lexical filehandle needs an explicit list: "print $fh;" would print the handle itself.
  if ($code eq '0001') {
     print {$stage_header} $_;
  }
  elsif ($code eq '0004') {
     print {$stage_bl_header} $_;
  }
  else {
     print {$stage_reject} $_;
  }
}

print "Spliting of file end\n";
print "Job Completed !!!...\n";
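
For comparison, a single-pass version can also be sketched in shell with awk instead of Perl. This is a swapped-in technique, not what was posted above, and it has not been timed against the Perl script. The scratch directory, the sample records, and the two-entry code table are assumptions for the illustration:

```shell
#!/bin/sh
# Hypothetical scratch directory and sample input standing in for ABE.txt.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf '0001 header%3s\n0004 bl header%1s\n7777 other%2s\n' '' '' '' > "$workdir/ABE.txt"

awk -v dir="$workdir" '
    BEGIN {
        # Record code to output file (only two of the codes from the question).
        out["0001"] = "stage_header.txt"
        out["0004"] = "stage_bl_header.txt"
    }
    {
        code = substr($0, 1, 4)
        name = (code in out) ? out[code] : "stage_reject.txt"
        # $0 is the record as read, so trailing blanks are written unchanged.
        print $0 >> (dir "/" name)
    }
' "$workdir/ABE.txt"

wc -c "$workdir"/*.txt
```

Like the Perl script, this reads the input once and keeps the output files open, rather than starting `cut` and reopening a file for every record.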


Hi,

Thanks

The performance is very good now.

Regards
kunnal