Solved

Delete some rows and columns and create output file using shell script or perl script

Posted on 2011-09-02
Last Modified: 2012-05-12
I have attached a file that looks like the sample below. The values may also appear as decimals such as 0.00 rather than only whole numbers. I simply want to:

1) Keep only rows 10-19 (in other words, ignore the header row and the first 8 rows of data, keep the next 8 rows of data, and ignore any rows after that). In this case the first row of data to be kept would be 08/14/2011 11 8 14 2 14 3 0 0 0 0 1 0 14 0 0 3 0

2) Keep only columns 1 (date stamp), 11 (corresponds to header S00000004), 12 (corresponds to header S00000034), 14 (corresponds to header S00000075), and 17 (corresponds to header S00000045). The new line, ignoring all other columns, would look like this

08/14/2011 0 1 0 14

3) Add a hardcoded time stamp of 00:00:00 after the date and change the date format to use hyphens instead of slashes. The new line would look like this

08-14-2011 00:00:00 0 1 0 14

4) Output the results to a separate file and perform that function on ALL files in a directory. Let's call the input directory /path2/so2 and output a modified file for each input file. The output files would have the same names as the input files but be written to a different path, /path1, instead of the input directory.

Sample input file:

/path2/samplefile.txt

     JDAY  YR  MO DA1 HR1 DA2 HR2 S00000037 S00000021 S00000002 S00000004 S00000034 S00000035 S00000075 S00000038 S00000044 S00000045 S00000046
08/13/2011 11 8 13 2 13 3 0 0 0 0 2 0 21 0 0 0 0
08/13/2011 11 8 13 5 13 6 0 0 0 0 6 0 0 0 0 0 0
08/13/2011 11 8 13 8 13 9 0 0 0 0 1 0 0 0 0 0 0
08/13/2011 11 8 13 11 13 12 0 0 0 0 49 0 0 0 0 0 0
08/13/2011 11 8 13 14 13 15 0 0 0 0 15 0 0 0 0 1 0
08/13/2011 11 8 13 17 13 18 0 0 0 0 11 0 0 0 0 6 0
08/13/2011 11 8 13 20 13 21 0 0 0 0 56 0 0 0 0 1 0
08/13/2011 11 8 13 23 14 0 0 0 0 0 13 0 0 0 0 9 0
08/14/2011 11 8 14 2 14 3 0 0 0 0 1 0 14 0 0 3 0
08/14/2011 11 8 14 5 14 6 0 0 0 0 10 0 14 0 0 16 0
08/14/2011 11 8 14 8 14 9 0 0 0 0 8 0 1 0 0 7 0
08/14/2011 11 8 14 11 14 12 0 0 0 0 2 0 0 0 0 0 0
08/14/2011 11 8 14 14 14 15 0 0 0 0 7 0 0 0 0 0 0
08/14/2011 11 8 14 17 14 18 0 0 0 0 30 0 0 0 0 0 0
08/14/2011 11 8 14 20 14 21 0 0 0 0 10 0 0 0 0 1 0
08/14/2011 11 8 14 23 15 0 0 0 0 0 6 0 0 0 0 23 0
08/15/2011 11 8 15 2 15 3 0 0 0 0 5 0 0 0 0 3 0
08/15/2011 11 8 15 5 15 6 0 0 0 0 13 0 0 0 0 1 0
08/15/2011 11 8 15 8 15 9 0 0 0 0 1 0 0 0 0 0 0
08/15/2011 11 8 15 10 15 11 0 0 0 0 23 0 0 0 0 0 0
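
The column positions named in step 2 can be checked against the header row before writing any filter. The following is a minimal shell sketch, assuming the file is whitespace-separated as in the sample; the helper name `list_columns` is made up for illustration and is not part of the original thread.

```shell
#!/bin/sh
# list_columns (hypothetical helper): print "position name" for every
# whitespace-separated field on the first line of the given file.
list_columns() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) print i, $i; exit }' "$1"
}

# Demo on a temporary copy of the sample header row.
tmp=$(mktemp)
printf '%s\n' '     JDAY  YR  MO DA1 HR1 DA2 HR2 S00000037 S00000021 S00000002 S00000004 S00000034 S00000035 S00000075 S00000038 S00000044 S00000045 S00000046' > "$tmp"
list_columns "$tmp"
rm -f "$tmp"
```

For the sample header, positions 11, 12, 14 and 17 are listed as S00000004, S00000034, S00000075 and S00000045.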

The output file should look like this:

/path1/samplefile.txt

08-14-2011 00:00:00 0 1 0 14
08-14-2011 00:00:00 0 10 0 7
08-14-2011 00:00:00 0 8 0 3
08-14-2011 00:00:00 0 2 0 3
08-14-2011 00:00:00 0 7 0 3
08-14-2011 00:00:00 0 30 0 3
08-14-2011 00:00:00 0 10 0 1
08-14-2011 00:00:00 0 6 0 23
Question by:libertyforall2
LVL 26

Expert Comment

by:wilcoxon
This should do what you want.  Let me know if there are any issues...

You say lines 10-19 but also say 8 lines (10-19 would be 10 lines).
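
The count is easy to verify: an inclusive range from line 10 to line 19 covers 19 - 10 + 1 = 10 lines, while eight rows starting at line 10 end at line 17. A small shell sketch of that arithmetic follows, with a made-up helper name `count_kept` and a 21-line file assumed (one header plus 20 data rows, as in the sample).

```shell
#!/bin/sh
# count_kept FIRST LAST TOTAL (hypothetical helper): count how many line
# numbers in 1..TOTAL fall inside the inclusive range FIRST..LAST.
count_kept() {
    awk -v lo="$1" -v hi="$2" -v total="$3" 'BEGIN {
        for (i = 1; i <= total; i++)
            if (i >= lo && i <= hi) n++
        print n + 0
    }'
}

count_kept 10 19 21   # range as stated in the question: prints 10
count_kept 10 17 21   # range that yields eight rows: prints 8
```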

I set a bunch of variables at the top of the script that you can alter to change the directories, the lines to keep, the columns to keep, etc.
#!/usr/local/bin/perl

use strict;
use warnings;

# change these to suit
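my $in_dir = '/path2/so2'; # input directory named in the question
my $out_dir = '/path1';    # output directory named in the question
my $min_line = 10;
my $max_line = 19;         # set to 17 if only 8 rows should be kept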
my @cols = (0, 10, 11, 13, 16); # 0-offset rather than 1-offset

# get all files in $in_dir
opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

# loop over files
foreach my $fil (@files) {
    open IN, '<', "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    open OUT, '>', "$out_dir/$fil" or die "could not write $out_dir/$fil: $!";
    while (<IN>) {
        last if ($. > $max_line);
        next if ($. < $min_line);
        chomp;
        # get only the cols we want
        my @vals = (split /\s+/)[@cols];
        # add the hard-coded timestamp
        splice @vals, 1, 0, '00:00:00';
        print OUT join(' ', @vals), "\n";
    }
    close OUT;
    close IN;
}


Author Comment

by:libertyforall2
Almost. It left the timestamp with slashes instead of hyphens; it should look like 08-27-2011 instead of 08/27/2011.
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
Sorry.  Forgot to do that.
#!/usr/local/bin/perl

use strict;
use warnings;

# change these to suit
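my $in_dir = '/path2/so2'; # input directory named in the question
my $out_dir = '/path1';    # output directory named in the question
my $min_line = 10;
my $max_line = 19;         # set to 17 if only 8 rows should be kept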
my @cols = (0, 10, 11, 13, 16); # 0-offset rather than 1-offset

# get all files in $in_dir
opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

# loop over files
foreach my $fil (@files) {
    open IN, '<', "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    open OUT, '>', "$out_dir/$fil" or die "could not write $out_dir/$fil: $!";
    while (<IN>) {
        last if ($. > $max_line);
        next if ($. < $min_line);
        chomp;
        # get only the cols we want
        my @vals = (split /\s+/)[@cols];
        # change / to - in timestamp
        $vals[0] =~ s{/}{-}g;
        # add the hard-coded timestamp
        splice @vals, 1, 0, '00:00:00';
        print OUT join(' ', @vals), "\n";
    }
    close OUT;
    close IN;
}
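
Since the question also allowed a shell script, the same steps can be sketched with awk. This is a rough equivalent offered under assumptions, not a tested replacement for the Perl above: the function names `process_file` and `process_dir` are invented for illustration, the directories are passed as arguments rather than hard-coded, and the line range defaults to 10-19 to match the Perl script.

```shell
#!/bin/sh
# process_file IN OUT [FIRST LAST] (hypothetical helper)
# Keep lines FIRST..LAST (default 10..19), print columns 1, 11, 12, 14
# and 17, rewrite the date with hyphens, and insert a fixed 00:00:00
# time after the date.
process_file() {
    awk -v lo="${3:-10}" -v hi="${4:-19}" '
        NR > hi  { exit }              # nothing more to keep
        NR >= lo {
            d = $1
            gsub("/", "-", d)          # 08/14/2011 -> 08-14-2011
            print d, "00:00:00", $11, $12, $14, $17
        }
    ' "$1" > "$2"
}

# process_dir IN_DIR OUT_DIR (hypothetical helper)
# Run process_file on every regular file in IN_DIR, writing a file of
# the same name into OUT_DIR.
process_dir() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        process_file "$f" "$2/$(basename "$f")"
    done
}
```

With the directories from the question this would be called as `process_dir /path2/so2 /path1`. On the sample input the first kept line comes out as `08-14-2011 00:00:00 0 1 14 3`, which follows the column list in step 2 (columns 14 and 17 of that row hold 14 and 3) rather than the illustrative line given in the question.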


Author Closing Comment

by:libertyforall2
Great!