  • Status: Solved
  • Priority: Medium
  • Security: Public

Delete some rows and columns and create an output file using a shell script or Perl script

I have attached a file; a sample of it is shown below. It may also contain values with decimal points, such as 0.00, in addition to whole numbers. I simply want to:

1. Keep only rows 10-19 (in other words, ignore the header row and the first 8 rows of data, keep the next 8 rows of data, and ignore any rows after that). In this case the first row of data to be kept would be 08/14/2011 11 8 14 2 14 3 0 0 0 0 1 0 14 0 0 3 0

2. Keep only columns 1 (date stamp), 11 (corresponds to header S00000004), 12 (corresponds to header S00000034), 14 (corresponds to header S00000075), and 17 (corresponds to header S00000045). Ignoring all other columns, the new line would look like this:

08/14/2011 0 1 0 14

3. Add a hardcoded time stamp of 00:00:00 after the date and change the date format to use hyphens instead of slashes. The new line would look like this:

08-14-2011 00:00:00 0 1 0 14

4. Output the results to a separate file, and do this for ALL files in a directory. Let's call the input directory /path2/so2; one modified file should be written for each input file. The output files would have the same names as the input files but be written to /path1 instead of the input directory.
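
Steps 2 and 3 above can be sketched for a single row with awk. This is only an illustration under assumptions: `transform_line` is a hypothetical helper name, and fields are assumed to be separated by whitespace. Note that for the first kept row, columns 11, 12, 14 and 17 give `0 1 14 3`; the `0 1 0 14` shown in the question's sample line appears to correspond to columns 11-14 instead.

```shell
#!/bin/sh
# Sketch only: transform_line is a hypothetical helper name, not from the thread.
# Reads whitespace-separated data rows on stdin and prints, for each row:
#   date (slashes -> hyphens), the fixed time 00:00:00, then columns 11, 12, 14, 17.
transform_line() {
    awk '{
        date = $1
        gsub("/", "-", date)          # 08/14/2011 -> 08-14-2011
        print date, "00:00:00", $11, $12, $14, $17
    }'
}

# Example: the first row the question wants to keep.
echo '08/14/2011 11 8 14 2 14 3 0 0 0 0 1 0 14 0 0 3 0' | transform_line
# prints: 08-14-2011 00:00:00 0 1 14 3
```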

Sample input file:

/path2/samplefile.txt

     JDAY  YR  MO DA1 HR1 DA2 HR2 S00000037 S00000021 S00000002 S00000004 S00000034 S00000035 S00000075 S00000038 S00000044 S00000045 S00000046
08/13/2011 11 8 13 2 13 3 0 0 0 0 2 0 21 0 0 0 0
08/13/2011 11 8 13 5 13 6 0 0 0 0 6 0 0 0 0 0 0
08/13/2011 11 8 13 8 13 9 0 0 0 0 1 0 0 0 0 0 0
08/13/2011 11 8 13 11 13 12 0 0 0 0 49 0 0 0 0 0 0
08/13/2011 11 8 13 14 13 15 0 0 0 0 15 0 0 0 0 1 0
08/13/2011 11 8 13 17 13 18 0 0 0 0 11 0 0 0 0 6 0
08/13/2011 11 8 13 20 13 21 0 0 0 0 56 0 0 0 0 1 0
08/13/2011 11 8 13 23 14 0 0 0 0 0 13 0 0 0 0 9 0
08/14/2011 11 8 14 2 14 3 0 0 0 0 1 0 14 0 0 3 0
08/14/2011 11 8 14 5 14 6 0 0 0 0 10 0 14 0 0 16 0
08/14/2011 11 8 14 8 14 9 0 0 0 0 8 0 1 0 0 7 0
08/14/2011 11 8 14 11 14 12 0 0 0 0 2 0 0 0 0 0 0
08/14/2011 11 8 14 14 14 15 0 0 0 0 7 0 0 0 0 0 0
08/14/2011 11 8 14 17 14 18 0 0 0 0 30 0 0 0 0 0 0
08/14/2011 11 8 14 20 14 21 0 0 0 0 10 0 0 0 0 1 0
08/14/2011 11 8 14 23 15 0 0 0 0 0 6 0 0 0 0 23 0
08/15/2011 11 8 15 2 15 3 0 0 0 0 5 0 0 0 0 3 0
08/15/2011 11 8 15 5 15 6 0 0 0 0 13 0 0 0 0 1 0
08/15/2011 11 8 15 8 15 9 0 0 0 0 1 0 0 0 0 0 0
08/15/2011 11 8 15 10 15 11 0 0 0 0 23 0 0 0 0 0 0

The output file should look like this:

/path1/samplefile.txt

08-14-2011 00:00:00 0 1 0 14
08-14-2011 00:00:00 0 10 0 7
08-14-2011 00:00:00 0 8 0 3
08-14-2011 00:00:00 0 2 0 3
08-14-2011 00:00:00 0 7 0 3
08-14-2011 00:00:00 0 30 0 3
08-14-2011 00:00:00 0 10 0 1
08-14-2011 00:00:00 0 6 0 23

Attachment: samplefile.txt
Asked by: libertyforall2

1 Solution

wilcoxon commented:
This should do what you want.  Let me know if there are any issues...

You say lines 10-19 but also say 8 rows (10-19 would be 10 lines).

I set a bunch of vars at the top of the script you can alter to change dirs, lines to keep, cols to keep, etc.
#!/usr/local/bin/perl

use strict;
use warnings;

# change these to suit
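my $in_dir = '/path2/so2'; # input directory named in the question
my $out_dir = '/path1';    # output directory named in the question
my $min_line = 10;         # first line to keep (assumes the header is line 1)
my $max_line = 19;         # last line to keep; use 17 to keep exactly 8 rows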
my @cols = (0, 10, 11, 13, 16); # 0-offset rather than 1-offset

# get all files in $in_dir
opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

# loop over files
foreach my $fil (@files) {
    open IN, '<', "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    open OUT, '>', "$out_dir/$fil" or die "could not write $out_dir/$fil: $!";
    while (<IN>) {
        last if ($. > $max_line);
        next if ($. < $min_line);
        chomp;
        # get only the cols we want (split ' ' also skips leading whitespace)
        my @vals = (split ' ')[@cols];
        # add the hard-coded timestamp
        splice @vals, 1, 0, '00:00:00';
        print OUT join(' ', @vals), "\n";
    }
    close OUT;
    close IN;
}
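
The line-range point raised above (lines 10-19 versus 8 rows) can be checked with a small awk filter. This is a sketch, not part of the Perl script: `keep_rows` is a hypothetical helper name, and the header is assumed to be line 1 of each file, so the first kept data row is line 10.

```shell
#!/bin/sh
# Sketch only: keep_rows is a hypothetical helper, not part of the script above.
# Prints lines first..last (1-based, inclusive) of stdin and stops reading after last.
keep_rows() {
    awk -v first="$1" -v last="$2" 'NR > last { exit } NR >= first'
}

# Lines 10-19 inclusive are 19 - 10 + 1 = 10 lines;
# eight rows starting at line 10 end at line 10 + 8 - 1 = 17.
seq 1 25 | keep_rows 10 19 | awk 'END { print NR }'   # prints 10
seq 1 25 | keep_rows 10 17 | awk 'END { print NR }'   # prints 8
```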

libertyforall2 (author) commented:
Almost. It left the date stamp with slashes instead of hyphens; the date should look like 08-27-2011 instead of 08/27/2011.
 
wilcoxon commented:
Sorry.  Forgot to do that.
#!/usr/local/bin/perl

use strict;
use warnings;

# change these to suit
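my $in_dir = '/path2/so2'; # input directory named in the question
my $out_dir = '/path1';    # output directory named in the question
my $min_line = 10;         # first line to keep (assumes the header is line 1)
my $max_line = 19;         # last line to keep; use 17 to keep exactly 8 rows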
my @cols = (0, 10, 11, 13, 16); # 0-offset rather than 1-offset

# get all files in $in_dir
opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

# loop over files
foreach my $fil (@files) {
    open IN, '<', "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    open OUT, '>', "$out_dir/$fil" or die "could not write $out_dir/$fil: $!";
    while (<IN>) {
        last if ($. > $max_line);
        next if ($. < $min_line);
        chomp;
        # get only the cols we want (split ' ' also skips leading whitespace)
        my @vals = (split ' ')[@cols];
        # change / to - in timestamp
        $vals[0] =~ s{/}{-}g;
        # add the hard-coded timestamp
        splice @vals, 1, 0, '00:00:00';
        print OUT join(' ', @vals), "\n";
    }
    close OUT;
    close IN;
}
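
Since the question also allowed a shell script, here is a rough shell/awk equivalent of the Perl script above. It is a sketch under assumptions: `convert_dir` is a hypothetical function name, the `mkdir -p` is an addition the Perl version does not make, and the line range and columns simply mirror the Perl settings (lines 10-19, columns 1, 11, 12, 14, 17).

```shell
#!/bin/sh
# Sketch only: convert_dir is a hypothetical helper name, not from the thread.
# Usage: convert_dir INPUT_DIR OUTPUT_DIR
# Writes one output file per regular file in INPUT_DIR, with the same file name.
convert_dir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1        # assumption: create the output dir if missing
    for src in "$in_dir"/*; do
        [ -f "$src" ] || continue          # skip subdirectories and an unmatched glob
        awk -v first=10 -v last=19 '
            NR > last   { exit }           # nothing to keep after the last line
            NR >= first {
                gsub("/", "-", $1)         # 08/14/2011 -> 08-14-2011
                print $1, "00:00:00", $11, $12, $14, $17
            }
        ' "$src" > "$out_dir/$(basename "$src")" || return 1
    done
}
```

With the directories named in the question, the call would be `convert_dir /path2/so2 /path1`.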

libertyforall2 (author) commented:
Great!
Question has a verified solution.
