Solved

Perl or shell: delete rows with -99.0000, then average monthly data and create a third column

Posted on 2012-03-16
Last Modified: 2012-03-28
I have a file with 3 columns.

1. Column 2 needs to be deleted.
2. All rows with -99.0000 need to be deleted.
3. Data in the original column 3 should be converted into monthly averages and placed in column 2, since the original column 2 will be deleted. The number of rows in each average will vary because of deleted rows and differing days per month. Averages should be rounded to the nearest hundredth.
4. Column 1 should reflect the year and month only, since the day will no longer be included for a monthly average.
5. A new column 3 will be created with 12-month binned averages. Since 12 data points are needed, rows 1-11 will have only two columns of data and the third column begins at row 12.

See attached input file.

Assume the Perl file is called irradianceavg.pl or the shell script is called irradianceavg.sh, and that it creates an output file called irradianceavg.txt. I want to call the script with either

perl /path/irradianceavg.pl > /path/irradianceavg.txt

or

sh /path/irradianceavg.sh > /path/irradianceavg.txt

I want to use the input file found at /path/irradiance.txt
irradiance.txt
Question by:libertyforall2
 
LVL 31

Expert Comment

by:farzanj
ID: 37732461
What is a binned average? Would it average the latest twelve rows?
 
LVL 31

Assisted Solution

by:farzanj
farzanj earned 150 total points
ID: 37732609
Here's something to get you started. You will have to explain what needs to be done after this.

#!/usr/bin/perl

use warnings;
use strict;

my @dat = grep {/\d+\s+[\d.]+\s+\d/} <>;
my @vals;
foreach my $row (@dat)
{
    next if ($row =~ /-99\.0000/);
    $row =~ s/^\s+//;
    my @col = split(/\s+/, $row);
    push (@vals,"$col[0] $col[2]");
}
$" = "\n";
print "@vals", "\n";

 
LVL 10

Assisted Solution

by:pfrancois
pfrancois earned 150 total points
ID: 37732672
This shell script computes the first two columns. For column 3, let me think a little bit.

#!/bin/bash

sed '1,/^;\$ Data:/d' irradiance.txt \
| tr -s ' ' : \
| cut -d : -f 2,4 \
| grep -v :-99 \
| sed '/^\(....\)../s//\1/' \
| awk -F : '{if ($1 == prevm) { c++; s += $2 } else { if (c > 0) { printf ("%s %5.2f\n", prevm, s/c)} ; c = 1; prevm = $1; s = $2}} END { if (c > 0) printf ("%s %5.2f\n", prevm, s/c) }'
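
For anyone following along, here is a minimal sketch of what the stages before the awk step produce. The sample data is made up, and the layout is an assumption inferred from the pipeline itself: header lines ending with a ";$ Data:" marker, then rows with a leading space, a YYMMDD date, a second column, and the value. The function names `sample` and `monthly_rows` are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample in the ASSUMED layout: ';' header lines ending with a
# ';$ Data:' marker, then rows of " YYMMDD col2 value" (note the leading space).
sample() {
cat <<'EOF'
; header text
;$ Data:
 781116 1.5 1367.42
 781117 2.5 -99.0000
 781118 3.5 1367.46
 781201 4.5 1367.10
EOF
}

# Same stages as the pipeline above, stopping before the awk averaging step.
monthly_rows() {
    sed '1,/^;\$ Data:/d' |          # drop everything up to the data marker
    tr -s ' ' : |                    # squeeze runs of spaces into one ':'
    cut -d : -f 2,4 |                # keep date and value (field 1 is empty)
    grep -v :-99 |                   # drop rows flagged as missing
    sed '/^\(....\)../s//\1/'        # YYMMDD -> YYMM
}

sample | monthly_rows
```

With this sample the stages leave three "YYMM:value" rows (two for 7811, one for 7812), which is the form the awk step then averages per month.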

 
LVL 10

Expert Comment

by:pfrancois
ID: 37733156
OK, I think this shell script does the job:

#!/bin/bash

sed '1,/^;\$ Data:/d' irradiance.txt \
| tr -s ' ' : \
| cut -d : -f 2,4 \
| grep -v :-99 \
| sed '/^\(....\)../s//\1/' \
| awk -F : '{
	if ($1 == prevm) { 
		c++; s += $2 
	} else { 
		if (c > 0) {
			printf ("%s %5.2f\n", prevm, s/c)
		}; 
		c = 1; 
		prevm = $1; 
		s = $2
	}
} 
END	{ if (c > 0) printf ("%s %5.2f\n", prevm, s/c)
}' \
| awk -F ' ' '{
	if (++j > 12) {
		j = 1
	};
	if (++i < 12) {
		last12 [j] = $2; 
		tot += $2; 
		printf ("%s %5.2f\n", $1, $2);
	} else {
		tot -= last12 [j]; 
		last12 [j] = $2;
		tot += $2;
		printf ("%s %5.2f %5.2f\n", $1, $2, tot/12);
	}
}'
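
The second awk stage keeps a running total over a circular buffer of the last 12 monthly averages. A minimal sketch of the same idea with the window size as a parameter, so it can be checked by hand on a few rows (the function name `rolling_mean` and the sample rows are made up for illustration):

```shell
#!/bin/bash
# rolling_mean N: reads "label value" rows and appends the mean of the last
# N values once N rows have been seen. N=12 corresponds to the question.
rolling_mean() {
    awk -v n="$1" '{
        slot = (NR - 1) % n        # position in the circular buffer
        tot += $2 - buf[slot]      # add the new value, drop the one leaving (0 at first)
        buf[slot] = $2
        if (NR < n) printf "%s %.2f\n", $1, $2
        else        printf "%s %.2f %.2f\n", $1, $2, tot / n
    }'
}

# Hypothetical monthly averages, window of 3 instead of 12.
printf '%s\n' '7811 1.00' '7812 2.00' '7901 3.00' '7902 5.00' | rolling_mean 3
```

With a window of 3, the third column first appears on row 3 (2.00) and row 4 gives 3.33, the mean of 2, 3 and 5.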

LVL 84

Expert Comment

by:ozo
ID: 37734558
#!/usr/bin/perl -an
BEGIN{@ARGV=("/path/irradiance.txt")}
next if /;/ || $F[2] == -99;
($m)=/(\d{4})/;
printf "%s %.2f\n",$p,$s1/$s0 and $s1=$s0=0 if $s0 && $p!=$m;
$p=$m and ++$s0, $s1+=$F[2];
}continue{
 redo if s/.+// && eof;
 
LVL 84

Expert Comment

by:ozo
ID: 37734587
Should the 12 month binned averages be the average of the last 12 monthly averages, or the average of the days in the last 12 months?

If the former:

#!/usr/bin/perl -an
BEGIN{@ARGV=("/path/irradiance.txt");$"="+"}
next if /;/ || $F[2] == -99;
($m)=/(\d{4})/;
$m[++$#m]=$s1/$s0,@m==12?(($v,$s)=((eval"@m")/12," %.2f",shift @m)):"",printf "$p %.2f$s\n",$m[-1],$v and $s1=$s0=0 if $s0 && $p!=$m;
$p=$m and ++$s0, $s1+=$F[2];
}continue{
 redo if s/.+// && eof;
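
The two readings only give different numbers when months keep different numbers of days. A small sketch on made-up data, assuming rows already reduced to "YYMM value" with one row per remaining day (the function names are hypothetical):

```shell
#!/bin/bash
# Hypothetical input: month 7811 keeps 1 day, month 7812 keeps 3 days.
days() {
    printf '%s\n' '7811 10' '7812 20' '7812 20' '7812 20'
}

# Reading 1: mean of the monthly means (each month counts equally).
mean_of_monthly_means() {
    awk '{ s[$1] += $2; c[$1]++ }
         END { for (m in s) { t += s[m] / c[m]; k++ } printf "%.2f\n", t / k }'
}

# Reading 2: mean over all days in the window (each day counts equally).
mean_of_days() {
    awk '{ t += $2 } END { printf "%.2f\n", t / NR }'
}

days | mean_of_monthly_means   # 15.00
days | mean_of_days            # 17.50
```

So with many -99 rows removed from some months, the choice between the two can shift the third column noticeably.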
 
LVL 10

Expert Comment

by:pfrancois
ID: 37734619
@libertyforall2: Just in case you doubt that my script does the job, I attach the script and its output.
irradianceavg.txt
irradianceavg.sh
 
LVL 84

Accepted Solution

by:
ozo earned 200 total points
ID: 37734628
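If the latter:

#!/usr/bin/perl -an
# -n wraps the script in a while(<>) loop; -a splits each line into @F.
# $"="+" makes "@s1" interpolate as "a+b+c", which eval then sums.
BEGIN{@ARGV=("/path/irradiance.txt");$"="+"}
# Skip header lines (containing ';') and rows flagged -99.
next if /;/ || $F[2] == -99;
# Month key: the first four digits on the line.
($m)=/(\d{4})/;
# On a month change: store the finished month's sum and day count, print its average,
# and from the 12th month on also print (last 12 sums)/(last 12 day counts).
push(@s1,$s1),push(@s0,$s0),@s1==12&&(($v,$s)=(eval"(@s1)/(@s0)"," %.2f",shift @s0,shift@s1)),(printf"$p %.2f$s\n",$s1/$s0,$v),$s1=$s0=0 if $s0 && $p!=$m;
$p=$m and ++$s0, $s1+=$F[2];
# Closing the implicit loop here makes the rest its continue block:
# at end of file, blank $_ and redo once so the final month is flushed.
}continue{
 redo if s/.+// && eof;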
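
For readers who find the one-liner hard to follow, here is a rough awk sketch of the same day-weighted idea. It is not ozo's code. It assumes the rows have already been reduced to "YYMM value" (one per kept day, months contiguous), and the function name `day_weighted` and the window parameter are made up for illustration.

```shell
#!/bin/bash
# day_weighted N: reads "YYMM value" rows and prints
# "YYMM monthly_mean [window_mean]", where window_mean is the sum of all
# daily values in the last N months divided by the number of those days.
day_weighted() {
    awk -v n="$1" '
    function flush(   i, S, C) {
        sum[++k] = s; cnt[k] = c              # remember this month
        if (k < n) { printf "%s %.2f\n", prev, s / c; return }
        for (i = k - n + 1; i <= k; i++) { S += sum[i]; C += cnt[i] }
        printf "%s %.2f %.2f\n", prev, s / c, S / C
    }
    $1 != prev && c > 0 { flush(); s = c = 0 }   # month changed: emit and reset
    { prev = $1; s += $2; c++ }                  # accumulate the current day
    END { if (c > 0) flush() }                   # emit the final month
    '
}

# Hypothetical data, window of 2 months instead of 12.
printf '%s\n' '7811 10' '7812 20' '7812 20' '7812 20' '7901 30' | day_weighted 2
```

On this sample the second row gets 17.50 (70 over 4 days) and the third gets 22.50 (90 over 4 days). Use a window of 12 for the behaviour asked for in the question.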
 

Author Closing Comment

by:libertyforall2
ID: 37780010
Works well.


Question has a verified solution.
