Solved

Select highest value in each column and delete all other values in a column of a file using shell or perl

Posted on 2011-09-02
Last Modified: 2012-05-12
I only want to do one thing, then write an output file based on the results.

I have files in a directory containing whole numbers or numbers rounded to the nearest hundredth. For each file, I want to find the highest value in each column and discard all other values, leaving a single row of data per file.

Sample input file:

/path1/samplefile.txt

08-14-2011 00:00:00 0 1 0 14
08-14-2011 00:00:00 0 10 0 7
08-14-2011 00:00:00 0 8 0 3
08-14-2011 00:00:00 0 2 0 3
08-14-2011 00:00:00 0 7 0 3
08-14-2011 00:00:00 0 30 0 3
08-14-2011 00:00:00 0 10 0 1
08-14-2011 00:00:00 0 6 0 23

Sample output file:

/path2/sampleoutputfile.txt

08-14-2011 00:00:00 0 30 0 23

Each input file would produce one row. All of the rows would go into a single output file in chronological order.
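A minimal shell sketch of the requirement above, assuming whitespace-separated fields with the date and time in the first two columns and numeric values in the rest. The function name `column_max` is hypothetical and does not come from any reply in this thread.

```shell
#!/bin/sh
# column_max (hypothetical name): read data lines from the given files
# (or stdin) and print one row: the date and time of the first data line,
# followed by the maximum of each remaining column.
column_max() {
    awk '
        NF < 3 { next }                    # skip blank or too-short lines
        !seen  { d = $1; t = $2; seen = 1 }  # remember first date and time
        {
            if (NF > ncols) ncols = NF
            for (i = 3; i <= NF; i++)
                # keep the original text of the largest value per column
                if (!(i in max) || $i + 0 > max[i] + 0) max[i] = $i
        }
        END {
            if (!seen) exit                # no data lines: print nothing
            line = d " " t
            for (i = 3; i <= ncols; i++) line = line " " max[i]
            print line
        }
    ' "$@"
}
```

Called as `column_max /path1/samplefile.txt` on the sample input, this prints `08-14-2011 00:00:00 0 30 0 23`, which matches the sample output row.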
Question by:libertyforall2
 
LVL 17

Expert Comment

by:Kent Dyer
ID: 36476837
 
LVL 27

Expert Comment

by:wilcoxon
ID: 36476932
This should do what you want in perl...
#!/usr/local/bin/perl

use strict;
use warnings;
use List::Util qw(max);

# change these as needed
my $in_dir = '/path1';
my $out = '/path2/sampleoutputfile.txt';

opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

my %data;

foreach my $fil (@files) {
    open IN, "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    my $curr = 0;
    my $row;
    while (<IN>) {
        chomp;
        my ($dt, $ts, @vals) = split /\s+/;
        my $max = max @vals;
        if ($max > $curr) {
            $curr = $max;
            my ($mon, $day, $yr) = split /-/, $dt;
            $row = [$yr, $mon, $day, $td, $_];
        }
    }
    close IN;
    if (exists $data{$row[0]}{$row[1]}{$row[2]}{$row[3]}) {
        push @{$data{$row[0]}{$row[1]}{$row[2]}{$row[3]}}, $row[4];
    } else {
        $data{$row[0]}{$row[1]}{$row[2]}{$row[3]} = [$row[4]];
    }
}

# output each row from input files into output file in chronological order
open OUT, '>', $out or die "could not write $out: $!";
foreach my $yr (sort { $a <=> $b } keys %data) {
    foreach my $mon (sort { $a <=> $b } keys %{$data{$yr}}) {
        foreach my $day (sort { $a <=> $b } keys %{$data{$yr}{$mon}}) {
            foreach my $ts (sort keys %{$data{$yr}{$mon}{$day}}) {
                print join("\n", @{$data{$yr}{$mon}{$day}{$ts}}), "\n";
            }
        }
    }
}
close OUT;

 

Author Comment

by:libertyforall2
ID: 36476992
I got these error messages.

[rhuff@huina ~/scripts]$ perl fcstvaluesso2.pl
Can't open perl script "fcstvaluesso2.pl": No such file or directory
[rhuff@huina ~/scripts]$ perl fcsthvalueso2.pl
Global symbol "$td" requires explicit package name at fcsthvalueso2.pl line 28.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 32.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 33.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line 35.
Global symbol "@row" requires explicit package name at fcsthvalueso2.pl line
 
LVL 27

Accepted Solution

by:wilcoxon (earned 2000 total points)
ID: 36477132
Oops - two typos: $row was declared as a scalar but used as an array, and $td should have been $ts. Make the changes below and it should work:

my @row; # line 20
@row = ($yr, $mon, $day, $ts, $_); # line 28
 

Author Comment

by:libertyforall2
ID: 36477223
I'm using this script

#!/usr/local/bin/perl

use strict;
use warnings;
use List::Util qw(max);

# change these as needed
my $in_dir = '/share/huina/rhuff/forecastfiles/so2b';
my $out = '/share/huina/rhuff/forecastfiles/so2c.txt';

opendir DIR, $in_dir or die "could not open dir $in_dir: $!";
my @files = grep { -f "$in_dir/$_" } readdir DIR;
closedir DIR;

my %data;

foreach my $fil (@files) {
    open IN, "$in_dir/$fil" or die "could not open $in_dir/$fil: $!";
    my $curr = 0;
    my @row; 
    while (<IN>) {
        chomp;
        my ($dt, $ts, @vals) = split /\s+/;
        my $max = max @vals;
        if ($max > $curr) {
            $curr = $max;
            my ($mon, $day, $yr) = split /-/, $dt;
            @row = ($yr, $mon, $day, $ts, $_); 
        }
    }
    close IN;
    if (exists $data{$row[0]}{$row[1]}{$row[2]}{$row[3]}) {
        push @{$data{$row[0]}{$row[1]}{$row[2]}{$row[3]}}, $row[4];
    } else {
        $data{$row[0]}{$row[1]}{$row[2]}{$row[3]} = [$row[4]];
    }
}

# output each row from input files into output file in chronological order
open OUT, '>', $out or die "could not write $out: $!";
foreach my $yr (sort { $a <=> $b } keys %data) {
    foreach my $mon (sort { $a <=> $b } keys %{$data{$yr}}) {
        foreach my $day (sort { $a <=> $b } keys %{$data{$yr}{$mon}}) {
            foreach my $ts (sort keys %{$data{$yr}{$mon}{$day}}) {
                print join("\n", @{$data{$yr}{$mon}{$day}{$ts}}), "\n";
            }
        }
    }
}
close OUT;



I am getting this error message and output. It produces a blank file as well.

[rhuff@huina ~/scripts]$ perl so2highest.pl
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in exists at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 35.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in exists at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 32.
Use of uninitialized value in exists at so2highest.pl line 32.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
Use of uninitialized value in hash element at so2highest.pl line 33.
 

Author Closing Comment

by:libertyforall2
ID: 36477238
I am still getting error messages, but the command-line output was sufficient to create the file by copy and paste.
 
LVL 27

Expert Comment

by:wilcoxon
ID: 36477782
Oops - the empty file is due to another minor error (on line 45 this time) - it should be:

print OUT join("\n", @{$data{$yr}{$mon}{$day}{$ts}}), "\n"; # missing OUT

I have no idea why you're getting warnings.  At least when I run it against a directory containing just the sample file you list, I get no warnings.  Are there files in the $in_dir directory that are not in the specified format?  If so, is there a name consistency that would allow only selecting the valid files?  Do some of the files have empty lines?
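The questions above (files in another format, empty lines) can be handled by filtering input lines before they are used. Two observations about the Perl script as posted, offered as a reading of the code rather than a confirmed diagnosis: it keeps the whole row that contains the single largest value instead of taking the maximum of each column, and because `$curr` starts at 0, a file that is empty or contains only zeros never fills `@row`, which is one plausible source of the "uninitialized value" warnings. The shell sketch below is an alternative under those assumptions; `summarize_dir` is a hypothetical name, and the date pattern assumes the MM-DD-YYYY layout from the sample file.

```shell
#!/bin/sh
# summarize_dir (hypothetical name): for every regular file in a directory,
# print one row of per-column maxima, then sort all rows chronologically.
summarize_dir() {
    in_dir=$1
    for f in "$in_dir"/*; do
        [ -f "$f" ] || continue            # skip subdirectories and the like
        awk '
            # ignore anything that is not "MM-DD-YYYY HH:MM:SS v1 [v2 ...]"
            $1 !~ /^[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/ || NF < 3 { next }
            !seen { d = $1; t = $2; seen = 1 }
            {
                if (NF > ncols) ncols = NF
                for (i = 3; i <= NF; i++)
                    if (!(i in max) || $i + 0 > max[i] + 0) max[i] = $i
            }
            END {
                if (!seen) exit            # empty or unusable file: no row
                line = d " " t
                for (i = 3; i <= ncols; i++) line = line " " max[i]
                print line
            }
        ' "$f"
    done |
    # sort by year (chars 7-10), month (1-2), day (4-5) of field 1, then time
    sort -t' ' -k1.7,1.10n -k1.1,1.2n -k1.4,1.5n -k2,2
}
```

A call such as `summarize_dir /path1 > /path2/sampleoutputfile.txt` would write one row per usable input file. Files with no matching lines contribute nothing, so they cannot produce empty keys or warnings.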