Solved

Perl or shell: delete rows with -99.0000, then average monthly data and create a third column

Posted on 2012-03-16
Last Modified: 2012-03-28
I have a file with 3 columns.

1. Column 2 needs to be deleted.
2. All rows with -99.0000 need to be deleted.
3. Data in the original column 3 should be converted into monthly averages. These go in column 2, since the original column 2 is deleted. The number of rows in each average will vary because of deleted rows and different numbers of days per month. Averages should be rounded to the nearest hundredth.
4. Column 1 should show the year and month only, since the day no longer applies to a monthly average.
5. A new column 3 should be created with 12-month binned averages. Since 12 data points are needed, rows 1-11 will have only two columns and the third column begins at row 12.

See attached input file.

Assume the Perl file is called irradianceavg.pl, or the shell script is called irradianceavg.sh, and that it creates an output file called irradianceavg.txt. I want to call the script with either

perl /path/irradianceavg.pl > /path/irradianceavg.txt

or

sh /path/irradianceavg.sh > /path/irradianceavg.txt

I want to use the input file found at /path/irradiance.txt
irradiance.txt
Question by:libertyforall2
9 Comments
 
LVL 31

Expert Comment

by:farzanj
ID: 37732461
What is a binned average? Would it average the latest twelve rows?
 
LVL 31

Assisted Solution

by:farzanj
farzanj earned 150 total points
ID: 37732609
Here's something to get you started. You will have to explain what needs to be done after this.

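#!/usr/bin/perl

use warnings;
use strict;

# Keep only lines that look like data rows (three whitespace-separated numeric columns).
my @dat = grep {/\d+\s+[\d.]+\s+\d/} <>;
my @vals;
foreach my $row (@dat)
{
    next if ($row =~ /-99\.0000/);    # skip missing-value rows
    $row =~ s/^\s+//;                 # strip leading whitespace before splitting
    my @col = split(/\s+/, $row);
    push (@vals,"$col[0] $col[2]");   # keep columns 1 and 3, drop column 2
}
$" = "\n";                            # list separator: one row per line when interpolated
print "@vals", "\n";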

 
LVL 10

Assisted Solution

by:pfrancois
pfrancois earned 150 total points
ID: 37732672
This shell script computes the first two columns. For column 3, let me think a little bit.

#!/bin/bash

sed '1,/^;\$ Data:/d' irradiance.txt \
| tr -s ' ' : \
| cut -d : -f 2,4 \
| grep -v :-99 \
| sed '/^\(....\)../s//\1/' \
| awk -F : '{if ($1 == prevm) { c++; s += $2 } else { if (c > 0) { printf ("%s %5.2f\n", prevm, s/c)} ; c = 1; prevm = $1; s = $2}} END {printf ("%s %5.2f\n", prevm, s/c) }'
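
For comparison, the same first two columns can be sketched in a single awk pass. This is only a sketch under assumptions inferred from the scripts in this thread, not from the attached file: header lines contain a leading ';', data rows have three whitespace-separated columns, and the first four digits of column 1 identify the year and month. The function name monthly_avg is made up for illustration.

```shell
#!/bin/bash
# Sketch: monthly means of column 3, keyed by the first four digits of column 1.
# Assumed input layout (hypothetical): "<date> <col2> <value>", headers start with ';'.
monthly_avg() {
    awk '
        /^[[:space:]]*;/ || NF < 3 { next }   # skip header lines and short lines
        $3 == -99                  { next }   # skip missing-value rows
        {
            m = substr($1, 1, 4)              # year and month only
            if (m != prev && n > 0) {         # month changed: emit the finished month
                printf "%s %.2f\n", prev, sum / n
                sum = 0; n = 0
            }
            prev = m; sum += $3; n++
        }
        END { if (n > 0) printf "%s %.2f\n", prev, sum / n }
    ' "$@"
}
# Possible call: monthly_avg /path/irradiance.txt
```

Because the month key is compared as a string and the previous month is flushed only when the key changes, the input is assumed to be sorted by date, as it is for the pipeline above.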

 
LVL 10

Expert Comment

by:pfrancois
ID: 37733156
OK, I think this shell script does the job:

#!/bin/bash

sed '1,/^;\$ Data:/d' irradiance.txt \
| tr -s ' ' : \
| cut -d : -f 2,4 \
| grep -v :-99 \
| sed '/^\(....\)../s//\1/' \
| awk -F : '{
	if ($1 == prevm) { 
		c++; s += $2 
	} else { 
		if (c > 0) {
			printf ("%s %5.2f\n", prevm, s/c)
		}; 
		c = 1; 
		prevm = $1; 
		s = $2
	}
} 
END	{printf ("%s %5.2f\n", prevm, s/c) 
}' \
| awk -F ' ' '{
	if (++j > 12) {
		j = 1
	};
	if (++i < 12) {
		last12 [j] = $2; 
		tot += $2; 
		printf ("%s %5.2f\n", $1, $2);
	} else {
		tot -= last12 [j]; 
		last12 [j] = $2;
		tot += $2;
		printf ("%s %5.2f %5.2f\n", $1, $2, tot/12);
	}
}'
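
The second awk stage above keeps the last 12 monthly values in a circular buffer and maintains a running total. A minimal sketch of that idea with a configurable window is shown here; the function name rolling_mean and its window argument are hypothetical, and the input is assumed to be two-column "month value" lines on standard input.

```shell
#!/bin/bash
# Sketch: trailing mean over the last w values of column 2 (w defaults to 12).
# Rows before the w-th row are printed with two columns only.
rolling_mean() {
    awk -v w="${1:-12}" '
        {
            slot = NR % w                # ring-buffer index, reused every w rows
            total += $2 - buf[slot]      # add the newest value, drop the one it replaces
            buf[slot] = $2
            if (NR < w) {
                printf "%s %.2f\n", $1, $2
            } else {
                printf "%s %.2f %.2f\n", $1, $2, total / w
            }
        }
    '
}
# Possible call: printf "%s\n" "a 1" "b 2" "c 3" "d 4" | rolling_mean 3
```

With a window of 3, the four sample rows in the comment would give a third column of 2.00 on row 3 and 3.00 on row 4. Note that this averages the monthly averages, so every month carries equal weight regardless of how many days it contributed.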

 
LVL 84

Expert Comment

by:ozo
ID: 37734558
#!/usr/bin/perl -an                                                                              
BEGIN{@ARGV=("/path/irradiance.txt")}
next if /;/ || $F[2] == -99;                                                                    
($m)=/(\d{4})/;
printf "%s %.2f\n",$p,$s1/$s0 and $s1=$s0=0 if $s0 && $p!=$m;
$p=$m and ++$s0, $s1+=$F[2];
}continue{
 redo if s/.+// && eof;
 
LVL 84

Expert Comment

by:ozo
ID: 37734587
Should the 12 month binned averages be the average of the last 12 monthly averages, or the average of the days in the last 12 months?

If the former:

#!/usr/bin/perl -an
BEGIN{@ARGV=("/path/irradiance.txt");$"="+"}
next if /;/ || $F[2] == -99;
($m)=/(\d{4})/;
$m[++$#m]=$s1/$s0,@m==12?(($v,$s)=((eval"@m")/12," %.2f",shift @m)):"",printf "$p %.2f$s\n",$m[-1],$v and $s1=$s0=0 if $s0 && $p!=$m;
$p=$m and ++$s0, $s1+=$F[2];
}continue{
 redo if s/.+// && eof;
 
LVL 10

Expert Comment

by:pfrancois
ID: 37734619
@libertyforall2: In case you doubt that my script does the job, I have attached the script and its output.
irradianceavg.txt
irradianceavg.sh
 
LVL 84

Accepted Solution

by:ozo
ozo earned 200 total points
ID: 37734628
If the latter:

#!/usr/bin/perl -an
BEGIN{@ARGV=("/path/irradiance.txt");$"="+"}
next if /;/ || $F[2] == -99;
($m)=/(\d{4})/;
push(@s1,$s1),push(@s0,$s0),@s1==12&&(($v,$s)=(eval"(@s1)/(@s0)"," %.2f",shift @s0,shift@s1)),(printf"$p %.2f$s\n",$s1/$s0,$v),$s1=$s0=0 if $s0 && $p!=$m;
$p=$m and ++$s0, $s1+=$F[2];
}continue{
 redo if s/.+// && eof;
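
The two interpretations give different numbers whenever months contribute different numbers of valid days. A small sketch that makes the difference visible, assuming two-column "month value" lines with one line per valid day (the function name compare_means and the sample values are hypothetical):

```shell
#!/bin/bash
# Sketch: compare the mean of monthly means with the mean over all days.
compare_means() {
    awk '
        {
            sum[$1] += $2; cnt[$1]++     # per-month totals and day counts
            all += $2; days++            # totals over every valid day
        }
        END {
            if (days == 0) exit 1        # no data: nothing to average
            for (m in sum) { mm += sum[m] / cnt[m]; months++ }
            printf "monthly-means %.2f\n", mm / months
            printf "all-days %.2f\n", all / days
        }
    '
}
# Possible call: printf "%s\n" "7811 10" "7812 20" "7812 20" "7812 20" | compare_means
```

For the sample in the comment, one day at 10 in the first month and three days at 20 in the second, the mean of monthly means is 15.00 while the mean over all days is 17.50, because the second month has three times the weight.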
 

Author Closing Comment

by:libertyforall2
ID: 37780010
Works well.