Solved

convert a csv file to a text file average minute averages to 3 hourly averages using perl or shell

Posted on 2011-09-02
8
321 Views
Last Modified: 2012-05-12
If I have a file attached that looks the one attached. It may have EITHER minute or hourly averages AND ignoring NEGATIVE values OR values 999 and -999. I want it converted to three hourly averages in the output text file.
summitdata.csv
0
Comment
Question by:libertyforall2
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
8 Comments
 
LVL 9

Expert Comment

by:parparov
ID: 36478074
Let me clarify:
You want to convert values for every date to three average:
One between 00:00 and 07:59
One between 08:00 and 15:59
One between 16:00 and 23:59
Ignoring negative values and 999 ?

Right?
0
 

Author Comment

by:libertyforall2
ID: 36478443
I want values for 3 hour increments. 00:00 to 02:59, 03:00 to 05:59, 6:00 to 7:59 and so on until 21:00 to 23:59. If only hours are given, it will be 0 -2, 3-5, 6-8, and so on until 21 to 23. If there are less than three values because a negative value , 999, or -999 is removed, the average we be the next 3, i.e., if there is 0 1, & 3 hours because 2 is removed, then average 0, 1, & 3. If there are fewer than 3 for the final average then you can ingore that value.
0
 
LVL 9

Expert Comment

by:parparov
ID: 36478547
and what would the average hour refer to? The first one in the list? What happens if the three are not 0,1,3 but 0,9,10?
0
Problems using Powershell and Active Directory?

Managing Active Directory does not always have to be complicated.  If you are spending more time trying instead of doing, then it's time to look at something else. For nearly 20 years, AD admins around the world have used one tool for day-to-day AD management: Hyena. Discover why

 

Author Comment

by:libertyforall2
ID: 36478618
Average the first three beginning with zero or the first data point per day. If less than three, ignore that data pint. If the first three are 0, 9, & 10 then average them then start the next three with 11, 12, & 13 or whatever the next three are as long as there are at least three.
0
 

Author Comment

by:libertyforall2
ID: 36485671
Since this requires several steps, the first step is the most critical. How would I convert the file to a text file?
0
 
LVL 12

Assisted Solution

by:tel2
tel2 earned 200 total points
ID: 36486963
Hi lfa2,

How would I convert the file to a text file?
A CSV (Comma Separated Values) file is a type of text file.  If you open it with Notepad, Wordpad, etc, you should see the text.  However, your particular CSV file starts with a few chars which are not plain text, and if you don't require them, the script could just ignore them.  I expect CSV will be a good format for the Perl script to work directly from, without any need to convert it first.

Are you with me?

Regarding averages, I think the kinds of questions being asked have come up before for you recently lfa2.  I suggest you make these things clear up front in future, so experts can spend less time getting the spec's out of you, and more on giving you a solution.  In the long run you'll save yourself time, and get your answer faster.

Thanks.
tel2
0
 
LVL 9

Accepted Solution

by:
parparov earned 300 total points
ID: 36491154
Consider this code; no conversion from CSV to TXT is necessary, as tel2 correctly pointed.
#!/usr/bin/perl

use strict;
use warnings FATAL => 'all';

my $file = shift;

open(F, $file);
my @count_3 = ();
my $working_date = '';
my $working_time = '';
my $working_hour = 0;
my @avgs = ();
use Data::Dumper;
use List::Util qw(sum);

sub convert_hour($$) {
	my $hour = shift;
	my $minutes = shift;
	return 60 * $hour + $minutes;
}

while (<F>) {
	next unless /\S/;
	chomp;
	my (@data) = split(/\,/);
	
	next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;
	my $current_date = $1;
	my $current_time = "$1 $2:$3";
	my $current_hour = convert_hour($2,$3);
	next if $data[2] < 0 || $data[2] == 999;
	if ($current_date ne $working_date) {
		$working_date = $current_date;
		$working_time = $current_time;
		$working_hour = $current_hour;
		@count_3 = ($data[2]);
		next;
	}
	if ($current_hour >= $working_hour + 180 && @count_3 >= 3) {
		push(
			@avgs,
			[$data[0], $working_time, sum(@count_3)/@count_3],
		);
		@count_3 = ();
		$working_time = $current_time;
		$working_hour = $current_hour;
	}
	push(@count_3, $data[2]);
}
print join(
	"\n", map(
		join(" ",@{$_}),
	 @avgs)
), "\n";

Open in new window

0
 

Author Closing Comment

by:libertyforall2
ID: 36492842
Works great! Thanks.
0

Featured Post

Has Powershell sent you back into the Stone Age?

If managing Active Directory using Windows Powershell® is making you feel like you stepped back in time, you are not alone.  For nearly 20 years, AD admins around the world have used one tool for day-to-day AD management: Hyena. Discover why.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article will help you understand what HashTables are and how to use them in PowerShell.
Recently we ran in to an issue while running some SQL jobs where we were trying to process the cubes.  We got an error saying failure stating 'NT SERVICE\SQLSERVERAGENT does not have access to Analysis Services. So this is a way to automate that wit…
Six Sigma Control Plans
In a recent question (https://www.experts-exchange.com/questions/29004105/Run-AutoHotkey-script-directly-from-Notepad.html) here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

630 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question