Solved

convert a csv file to a text file average minute averages to 3 hourly averages using perl or shell

Posted on 2011-09-02
8
312 Views
Last Modified: 2012-05-12
If I have a file attached that looks the one attached. It may have EITHER minute or hourly averages AND ignoring NEGATIVE values OR values 999 and -999. I want it converted to three hourly averages in the output text file.
summitdata.csv
0
Comment
Question by:libertyforall2
  • 4
  • 3
8 Comments
 
LVL 9

Expert Comment

by:parparov
ID: 36478074
Let me clarify:
You want to convert values for every date to three average:
One between 00:00 and 07:59
One between 08:00 and 15:59
One between 16:00 and 23:59
Ignoring negative values and 999 ?

Right?
0
 

Author Comment

by:libertyforall2
ID: 36478443
I want values for 3 hour increments. 00:00 to 02:59, 03:00 to 05:59, 6:00 to 7:59 and so on until 21:00 to 23:59. If only hours are given, it will be 0 -2, 3-5, 6-8, and so on until 21 to 23. If there are less than three values because a negative value , 999, or -999 is removed, the average we be the next 3, i.e., if there is 0 1, & 3 hours because 2 is removed, then average 0, 1, & 3. If there are fewer than 3 for the final average then you can ingore that value.
0
 
LVL 9

Expert Comment

by:parparov
ID: 36478547
and what would the average hour refer to? The first one in the list? What happens if the three are not 0,1,3 but 0,9,10?
0
 

Author Comment

by:libertyforall2
ID: 36478618
Average the first three beginning with zero or the first data point per day. If less than three, ignore that data pint. If the first three are 0, 9, & 10 then average them then start the next three with 11, 12, & 13 or whatever the next three are as long as there are at least three.
0
6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

 

Author Comment

by:libertyforall2
ID: 36485671
Since this requires several steps, the first step is the most critical. How would I convert the file to a text file?
0
 
LVL 11

Assisted Solution

by:tel2
tel2 earned 200 total points
ID: 36486963
Hi lfa2,

> How would I convert the file to a text file?
A CSV (Comma Separated Values) file is a type of text file.  If you open it with Notepad, Wordpad, etc, you should see the text.  However, your particular CSV file starts with a few chars which are not plain text, and if you don't require them, the script could just ignore them.  I expect CSV will be a good format for the Perl script to work directly from, without any need to convert it first.

Are you with me?

Regarding averages, I think the kinds of questions being asked have come up before for you recently lfa2.  I suggest you make these things clear up front in future, so experts can spend less time getting the spec's out of you, and more on giving you a solution.  In the long run you'll save yourself time, and get your answer faster.

Thanks.
tel2
0
 
LVL 9

Accepted Solution

by:
parparov earned 300 total points
ID: 36491154
Consider this code; no conversion from CSV to TXT is necessary, as tel2 correctly pointed.
#!/usr/bin/perl

use strict;
use warnings FATAL => 'all';

my $file = shift;

open(F, $file);
my @count_3 = ();
my $working_date = '';
my $working_time = '';
my $working_hour = 0;
my @avgs = ();
use Data::Dumper;
use List::Util qw(sum);

sub convert_hour($$) {
	my $hour = shift;
	my $minutes = shift;
	return 60 * $hour + $minutes;
}

while (<F>) {
	next unless /\S/;
	chomp;
	my (@data) = split(/\,/);
	
	next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;
	my $current_date = $1;
	my $current_time = "$1 $2:$3";
	my $current_hour = convert_hour($2,$3);
	next if $data[2] < 0 || $data[2] == 999;
	if ($current_date ne $working_date) {
		$working_date = $current_date;
		$working_time = $current_time;
		$working_hour = $current_hour;
		@count_3 = ($data[2]);
		next;
	}
	if ($current_hour >= $working_hour + 180 && @count_3 >= 3) {
		push(
			@avgs,
			[$data[0], $working_time, sum(@count_3)/@count_3],
		);
		@count_3 = ();
		$working_time = $current_time;
		$working_hour = $current_hour;
	}
	push(@count_3, $data[2]);
}
print join(
	"\n", map(
		join(" ",@{$_}),
	 @avgs)
), "\n";

Open in new window

0
 

Author Closing Comment

by:libertyforall2
ID: 36492842
Works great! Thanks.
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Utilizing an array to gracefully append to a list of EmailAddresses
In this previous article (https://oddytee.wordpress.com/2016/05/05/provision-new-office-365-user-and-mailbox-from-exchange-hybrid-via-powershell/), we made basic license assignments to users in O365. When I say basic, the method is the simplest way …
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now