  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 546

Select highest value in each column and delete all other values in a column of a file using shell or perl

I want to do one thing and then write an output file based on the results.

I have files in a directory that contain whole numbers or numbers rounded to the nearest hundredth. For each file, I want to find the highest value in each column and delete all other values, leaving a file with only one row of data.

Sample input file

/path1/samplefile.txt

08-14-2011 00:00:00 0 1 0 14
08-14-2011 00:00:00 0 10 0 7
08-14-2011 00:00:00 0 8 0 3
08-14-2011 00:00:00 0 2 0 3
08-14-2011 00:00:00 0 7 0 3
08-14-2011 00:00:00 0 30 0 3
08-14-2011 00:00:00 0 10 0 1
08-14-2011 00:00:00 0 6 0 23

Sample output file

/path2/sampleoutputfile.txt

08-14-2011 00:00:00 0 30 0 23

Each input file would contribute one row, and all of the rows would go into a single output file in chronological order.
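
For reference, the reduction described above can be sketched in Perl as follows. This is only an illustration, not code from the thread: `column_max` is a hypothetical helper name, and taking the date and time from the first data line is an assumption (the sample rows all share one timestamp).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reduce the lines of one file to a single row that
# holds the date, the time, and the maximum of each numeric column.
sub column_max {
    my @lines = @_;
    my ($date, $time, @max);
    for my $line (@lines) {
        my ($d, $t, @vals) = split ' ', $line;
        next unless defined $t && @vals;          # skip blank or short lines
        # Assumption: the first data line supplies the date and time.
        ($date, $time) = ($d, $t) unless defined $date;
        for my $i (0 .. $#vals) {
            $max[$i] = $vals[$i]
                if !defined $max[$i] || $vals[$i] > $max[$i];
        }
    }
    return defined $date ? join(' ', $date, $time, @max) : undef;
}

# The sample input from the question.
my @sample = (
    '08-14-2011 00:00:00 0 1 0 14',
    '08-14-2011 00:00:00 0 10 0 7',
    '08-14-2011 00:00:00 0 8 0 3',
    '08-14-2011 00:00:00 0 2 0 3',
    '08-14-2011 00:00:00 0 7 0 3',
    '08-14-2011 00:00:00 0 30 0 3',
    '08-14-2011 00:00:00 0 10 0 1',
    '08-14-2011 00:00:00 0 6 0 23',
);
print column_max(@sample), "\n";
```

Run against the sample rows, this prints the row shown in the sample output file. Each column is compared numerically on its own, so the maxima can come from different input lines.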
Asked by: libertyforall2
1 Solution
 
 
wilcoxon Commented:
This should do what you want in Perl:
#!/usr/local/bin/perl

use strict;
use warnings;
use List::Util qw(max);

# change these as needed
my $in_dir = '/path1';
my $out = '/path2/sampleoutputfile.txt';

opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

my %data;

foreach my $fil (@files) {
    open IN, "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    my $curr = 0;
    my $row;
    while (<IN>) {
        chomp;
        my ($dt, $ts, @vals) = split /\s+/;
        my $max = max @vals;
        if ($max > $curr) {
            $curr = $max;
            my ($mon, $day, $yr) = split /-/, $dt;
            $row = [$yr, $mon, $day, $td, $_];
        }
    }
    close IN;
    if (exists $data{$row[0]}{$row[1]}{$row[2]}{$row[3]}) {
        push @{$data{$row[0]}{$row[1]}{$row[2]}{$row[3]}}, $row[4];
    } else {
        $data{$row[0]}{$row[1]}{$row[2]}{$row[3]} = [$row[4]];
    }
}

# output each row from input files into output file in chronological order
open OUT, '>', $out or die "could not write $out: $!";
foreach my $yr (sort { $a <=> $b } keys %data) {
    foreach my $mon (sort { $a <=> $b } keys %{$data{$yr}}) {
        foreach my $day (sort { $a <=> $b } keys %{$data{$yr}{$mon}}) {
            foreach my $ts (sort keys %{$data{$yr}{$mon}{$day}}) {
                print join("\n", @{$data{$yr}{$mon}{$day}{$ts}}), "\n";
            }
        }
    }
}
close OUT;
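
The script above files each row under nested hash keys (year, month, day, time) so that it can be printed in order. A flatter alternative, as a sketch only: build one sortable key per row and sort on it. The helper names `chrono_key` and `sort_chronologically` are hypothetical, the first two rows are invented for illustration, and the approach assumes zero-padded MM-DD-YYYY dates and HH:MM:SS times as in the sample file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "MM-DD-YYYY HH:MM:SS ..." into a key such as
# "2011-08-14 00:00:00", which sorts chronologically as a plain string.
# Assumption: every field is zero-padded, as in the sample file.
sub chrono_key {
    my ($line) = @_;
    my ($date, $time) = split ' ', $line;
    my ($mon, $day, $yr) = split /-/, $date;
    return "$yr-$mon-$day $time";
}

# Sort rows by their key, computing each key only once.
sub sort_chronologically {
    my @rows = @_;
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ chrono_key($_), $_ ] } @rows;
}

# Illustrative rows: the first two are made up, the third is the sample output row.
my @rows = (
    '08-15-2011 00:00:00 0 12 0 9',
    '12-31-2010 00:00:00 0 5 0 2',
    '08-14-2011 00:00:00 0 30 0 23',
);
print "$_\n" for sort_chronologically(@rows);
```

Sorting on the raw MM-DD-YYYY text would place 08-14-2011 before 12-31-2010, which is why the year is moved to the front of the key.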

 
libertyforall2 (Author) Commented:
I got these error messages.

[rhuff@huina ~/scripts]$ perl fcstvaluesso2.pl
Can't open perl script "fcstvaluesso2.pl": No such file or directory
[rhuff@huina ~/scripts]$ perl fcsthvalueso2.pl
Global symbol "$td" requires explicit package name at fcsthvalueso2.pl line 28.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line
 
wilcoxon Commented:
Oops - make the changes below and it should work:

my @row; # line 20, replacing "my $row;"
@row = ($yr, $mon, $day, $ts, $_); # line 28, replacing the "$row = [...]" assignment and its "$td" typo
 
libertyforall2 (Author) Commented:
I'm using this script:

#!/usr/local/bin/perl

use strict;
use warnings;
use List::Util qw(max);

# change these as needed
my $in_dir = '/share/huina/rhuff/forecastfiles/so2b';
my $out = '/share/huina/rhuff/forecastfiles/so2c.txt';

opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

my %data;

foreach my $fil (@files) {
    open IN, "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    my $curr = 0;
    my @row; 
    while (<IN>) {
        chomp;
        my ($dt, $ts, @vals) = split /\s+/;
        my $max = max @vals;
        if ($max > $curr) {
            $curr = $max;
            my ($mon, $day, $yr) = split /-/, $dt;
            @row = ($yr, $mon, $day, $ts, $_); 
        }
    }
    close IN;
    if (exists $data{$row[0]}{$row[1]}{$row[2]}{$row[3]}) {
        push @{$data{$row[0]}{$row[1]}{$row[2]}{$row[3]}}, $row[4];
    } else {
        $data{$row[0]}{$row[1]}{$row[2]}{$row[3]} = [$row[4]];
    }
}

# output each row from input files into output file in chronological order
open OUT, '>', $out or die "could not write $out: $!";
foreach my $yr (sort { $a <=> $b } keys %data) {
    foreach my $mon (sort { $a <=> $b } keys %{$data{$yr}}) {
        foreach my $day (sort { $a <=> $b } keys %{$data{$yr}{$mon}}) {
            foreach my $ts (sort keys %{$data{$yr}{$mon}{$day}}) {
                print join("\n", @{$data{$yr}{$mon}{$day}{$ts}}), "\n";
            }
        }
    }
}
close OUT;


I am getting this error message and output. It produces a blank file as well.

[rhuff@huina ~/scripts]$ perl so2highest.pl
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in exists at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in exists at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in exists at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
 
libertyforall2 (Author) Commented:
I am still getting error messages, but the command-line output was sufficient to create the file by copy and paste.
 
wilcoxon Commented:
Oops - the empty file is due to another minor error (on line 45 this time) - it should be:

print OUT join("\n", @{$data{$yr}{$mon}{$day}{$ts}}), "\n"; # missing OUT

I have no idea why you're getting warnings.  At least when I run it against a directory containing just the sample file you list, I get no warnings.  Are there files in the $in_dir directory that are not in the specified format?  If so, is there a name consistency that would allow only selecting the valid files?  Do some of the files have empty lines?
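
The warnings on lines 32 to 35 indicate that @row was still empty after a file had been read. One possible cause, which is an assumption because the input files are not shown, is a file with no line whose largest value exceeds the starting value of 0 (an empty file, a file in another format, or a file of all zeros). A minimal sketch of a guard against that follows. `parse_line` and `best_line` are hypothetical helpers, and the format check assumes non-negative numbers with an optional decimal part.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: return the fields of a well-formed data line,
# or an empty list for anything else (blank lines, headers, stray text).
sub parse_line {
    my ($line) = @_;
    return unless defined $line;
    my ($date, $time, @vals) = split ' ', $line;
    return unless defined $date && $date =~ /^\d{2}-\d{2}-\d{4}$/;
    return unless defined $time && $time =~ /^\d{2}:\d{2}:\d{2}$/;
    return unless @vals;
    # Assumption: values are non-negative, e.g. "7" or "7.25".
    return if grep { !/^\d+(?:\.\d+)?$/ } @vals;
    return ($date, $time, @vals);
}

# Hypothetical helper: pick the line holding the largest value, as the
# posted script does, but return undef when no usable line was found.
sub best_line {
    my @lines = @_;
    my ($best, $curr);
    for my $line (@lines) {
        my @fields = parse_line($line) or next;   # skip unusable lines
        my $max = max @fields[2 .. $#fields];
        if (!defined $curr || $max > $curr) {     # accepts an all-zero line
            ($curr, $best) = ($max, $line);
        }
    }
    return $best;
}

# One "file" of all zeros and one with no data lines at all.
for my $file_lines ([ '08-14-2011 00:00:00 0 0 0 0' ], [ '', 'no data here' ]) {
    my $line = best_line(@$file_lines);
    print defined $line ? "keep: $line\n" : "skip: no usable data\n";
}
```

A caller can then skip any file for which `best_line` returns undef, so no undefined values are used as hash keys.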
Question has a verified solution.
