Solved

Pick the row for each day with highest value in the last column then delete the rest using perl

Posted on 2011-09-22
Last Modified: 2012-05-12
I have a text file in the format below. I need a Perl script or shell script that finds the row with the highest value in the last column for each day, keeps that row, and deletes the rest. Some days may be missing. The output file will look just like the input file, only smaller, with one row per day. The input file may hold hundreds of days' worth of data. Ideally the output file would have the value in the last column rounded to the nearest hundredth (i.e. 0.00), but that is not critical.

inputfile.txt
11-21-10 00:00:00 0.033
11-21-10 00:00:00 0.0146666666666667
11-21-10 00:00:00 0.00366666666666667
11-21-10 00:00:00 0.000333333333333333
11-22-10 00:00:00 0.00466666666666667
11-22-10 00:00:00 0.031
11-22-10 00:00:00 0.0276666666666667
11-22-10 00:00:00 0.005
11-22-10 00:00:00 0.00133333333333333
11-22-10 00:00:00 0
11-22-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0.000666666666666667
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-25-10 00:00:00 0

output file.txt
11-21-10 00:00:00 0.033
11-22-10 00:00:00 0.031
11-23-10 00:00:00 0.000666666666666667
11-24-10 00:00:00 0
11-25-10 00:00:00 0
vcnow.txt
Question by:libertyforall2
3 Comments
 

Accepted Solution

by: jeromee (earned 1000 total points)
ID: 36584489
Try this for size:

perl -ane'next unless @F>=3; $k=join(" ",@F[0,1]); $s{$k}=$F[2] if !exists $s{$k} || $F[2]>$s{$k}; }{print map{"$_ $s{$_}\n"} sort keys %s' vcnow.txt > output_file

Assisted Solution

by: wilcoxon (earned 1000 total points)
ID: 36586979
This should do what you want.
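#!/usr/local/bin/perl

use strict;
use warnings;

# change these values if necessary
my $infile = 'inputfile.txt';
my $outfile = 'outputfile.txt';

# %max maps each date to [highest value seen, full text of that line]
open my $in, '<', $infile or die "could not open $infile: $!";
my %max;
while (<$in>) {
    chomp;
    my ($dt, $tm, $val) = split;
    next unless defined $val;   # skip blank or incomplete lines
    if (not exists $max{$dt} or $val > $max{$dt}[0]) {
        $max{$dt} = [$val, $_];
    }
}
close $in;

# keys sort as strings, so MM-DD-YY dates are only in date order within one year
open my $out, '>', $outfile or die "could not write $outfile: $!";
foreach my $dt (sort keys %max) {
    print {$out} $max{$dt}[1], "\n";
}
close $out or die "could not close $outfile: $!";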


Author Closing Comment

by:libertyforall2
ID: 36898443
Great!