Solved

Pick the row for each day with the highest value in the last column, then delete the rest, using Perl

Posted on 2011-09-22
Last Modified: 2012-05-12
I have a text file in the format shown below. I need a Perl script or shell script that finds the row with the highest value for each day, keeps that row, and deletes the rest. Some days may be missing. The output file should look just like the input file, only smaller, with one row per day. The input file may hold hundreds of days' worth of data. Ideally the value in the last column of the output would be rounded to the nearest hundredth (i.e. 0.00), but that is not critical.

inputfile.txt
11-21-10 00:00:00 0.033
11-21-10 00:00:00 0.0146666666666667
11-21-10 00:00:00 0.00366666666666667
11-21-10 00:00:00 0.000333333333333333
11-22-10 00:00:00 0.00466666666666667
11-22-10 00:00:00 0.031
11-22-10 00:00:00 0.0276666666666667
11-22-10 00:00:00 0.005
11-22-10 00:00:00 0.00133333333333333
11-22-10 00:00:00 0
11-22-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0.000666666666666667
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-23-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-24-10 00:00:00 0
11-25-10 00:00:00 0

output file.txt
11-21-10 00:00:00 0.033
11-22-10 00:00:00 0.031
11-23-10 00:00:00 0.000666666666666667
11-24-10 00:00:00 0
11-25-10 00:00:00 0
vcnow.txt
Question by:libertyforall2

Accepted Solution

by:
jeromee earned 250 total points
ID: 36584489
Try this for size:

perl -ane'$k=join(" ",@F[0,1]); $s{$k}=$F[2] if !exists $s{$k} || $F[2]>$s{$k}; }{print map{"$_ $s{$_}\n"} sort keys %s' vcnow.txt > output_file

Assisted Solution

by:wilcoxon
wilcoxon earned 250 total points
ID: 36586979
This should do what you want.
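#!/usr/local/bin/perl

use strict;
use warnings;

# change these values if necessary
my $infile = 'inputfile.txt';
my $outfile = 'outputfile.txt';

# remember the highest value seen for each date, plus the full line it came from
open my $in, '<', $infile or die "could not open $infile: $!";
my %max;
while (<$in>) {
    chomp;
    my ($dt, $tm, $val) = split;
    next unless defined $val;    # skip blank or incomplete lines
    if (not exists $max{$dt} or $val > $max{$dt}[0]) {
        $max{$dt} = [$val, $_];
    }
}
close $in;

# turn MM-DD-YY into YYMMDD so dates sort chronologically across a year boundary
sub date_key { join('', (split /-/, $_[0])[2, 0, 1]) }

open my $out, '>', $outfile or die "could not write $outfile: $!";
foreach my $dt (sort { date_key($a) cmp date_key($b) } keys %max) {
    print {$out} $max{$dt}[1], "\n";
}
close $out;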
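
The question also asked, as a nice-to-have, for the last column to be rounded to the nearest hundredth, which neither answer covers. The following is a minimal sketch of one way to add that, not code posted in the thread. The helper name daily_max and the sample data are made up for illustration, and the sort assumes every two-digit year falls in the same century.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes lines of the form
# "MM-DD-YY HH:MM:SS value" and returns one line per day holding that
# day's maximum, rounded to two decimals, in chronological order.
sub daily_max {
    my @lines = @_;
    my %max;    # date => [max value, time string of that row]
    for my $line (@lines) {
        my ($dt, $tm, $val) = split ' ', $line;
        next unless defined $val;    # skip blank or incomplete lines
        if (!exists $max{$dt} or $val > $max{$dt}[0]) {
            $max{$dt} = [$val, $tm];
        }
    }
    # Assumption: all years are in the same century, so rearranging
    # MM-DD-YY into YYMMDD gives a key that sorts chronologically.
    my $key = sub { my ($m, $d, $y) = split /-/, $_[0]; "$y$m$d" };
    return map  { sprintf "%s %s %.2f", $_, $max{$_}[1], $max{$_}[0] }
           sort { $key->($a) cmp $key->($b) } keys %max;
}

# Small made-up sample, including a day of all zeros and a date in the next year.
my @sample = (
    "11-21-10 00:00:00 0.033\n",
    "11-21-10 00:00:00 0.0146666666666667\n",
    "11-24-10 00:00:00 0\n",
    "01-02-11 00:00:00 0.25\n",
);
print "$_\n" for daily_max(@sample);
```

To run it against a real file, the sample array could be replaced by the lines read from the input file, for example by passing <> to daily_max and redirecting the output to a new file.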


Author Closing Comment

by:libertyforall2
ID: 36898443
Great!