Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
Solved

convert a csv file to a text file average minute averages to 3 hourly averages using perl or shell

Posted on 2011-09-02
Medium Priority
323 Views
If I have a file attached that looks the one attached. It may have EITHER minute or hourly averages AND ignoring NEGATIVE values OR values 999 and -999. I want it converted to three hourly averages in the output text file.
summitdata.csv
0
Question by:libertyforall2
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• Learn & ask questions
• 4
• 3

LVL 9

Expert Comment

ID: 36478074
Let me clarify:
You want to convert values for every date to three average:
One between 00:00 and 07:59
One between 08:00 and 15:59
One between 16:00 and 23:59
Ignoring negative values and 999 ?

Right?
0

Author Comment

ID: 36478443
I want values for 3 hour increments. 00:00 to 02:59, 03:00 to 05:59, 6:00 to 7:59 and so on until 21:00 to 23:59. If only hours are given, it will be 0 -2, 3-5, 6-8, and so on until 21 to 23. If there are less than three values because a negative value , 999, or -999 is removed, the average we be the next 3, i.e., if there is 0 1, & 3 hours because 2 is removed, then average 0, 1, & 3. If there are fewer than 3 for the final average then you can ingore that value.
0

LVL 9

Expert Comment

ID: 36478547
and what would the average hour refer to? The first one in the list? What happens if the three are not 0,1,3 but 0,9,10?
0

Author Comment

ID: 36478618
Average the first three beginning with zero or the first data point per day. If less than three, ignore that data pint. If the first three are 0, 9, & 10 then average them then start the next three with 11, 12, & 13 or whatever the next three are as long as there are at least three.
0

Author Comment

ID: 36485671
Since this requires several steps, the first step is the most critical. How would I convert the file to a text file?
0

LVL 12

Assisted Solution

tel2 earned 800 total points
ID: 36486963
Hi lfa2,

How would I convert the file to a text file?
A CSV (Comma Separated Values) file is a type of text file.  If you open it with Notepad, Wordpad, etc, you should see the text.  However, your particular CSV file starts with a few chars which are not plain text, and if you don't require them, the script could just ignore them.  I expect CSV will be a good format for the Perl script to work directly from, without any need to convert it first.

Are you with me?

Regarding averages, I think the kinds of questions being asked have come up before for you recently lfa2.  I suggest you make these things clear up front in future, so experts can spend less time getting the spec's out of you, and more on giving you a solution.  In the long run you'll save yourself time, and get your answer faster.

Thanks.
tel2
0

LVL 9

Accepted Solution

parparov earned 1200 total points
ID: 36491154
Consider this code; no conversion from CSV to TXT is necessary, as tel2 correctly pointed.
``````#!/usr/bin/perl

use strict;
use warnings FATAL => 'all';

my \$file = shift;

open(F, \$file);
my @count_3 = ();
my \$working_date = '';
my \$working_time = '';
my \$working_hour = 0;
my @avgs = ();
use Data::Dumper;
use List::Util qw(sum);

sub convert_hour(\$\$) {
my \$hour = shift;
my \$minutes = shift;
return 60 * \$hour + \$minutes;
}

while (<F>) {
next unless /\S/;
chomp;
my (@data) = split(/\,/);

next unless \$data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;
my \$current_date = \$1;
my \$current_time = "\$1 \$2:\$3";
my \$current_hour = convert_hour(\$2,\$3);
next if \$data[2] < 0 || \$data[2] == 999;
if (\$current_date ne \$working_date) {
\$working_date = \$current_date;
\$working_time = \$current_time;
\$working_hour = \$current_hour;
@count_3 = (\$data[2]);
next;
}
if (\$current_hour >= \$working_hour + 180 && @count_3 >= 3) {
push(
@avgs,
[\$data[0], \$working_time, sum(@count_3)/@count_3],
);
@count_3 = ();
\$working_time = \$current_time;
\$working_hour = \$current_hour;
}
push(@count_3, \$data[2]);
}
print join(
"\n", map(
join(" ",@{\$_}),
@avgs)
), "\n";
``````
0

Author Closing Comment

ID: 36492842
Works great! Thanks.
0

Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Utilizing an array to gracefully append to a list of EmailAddresses
Windows 10 came with  a lot of built in applications, Some organisations leave them there, some will control them using GPO's. This Article is useful for those who do not want to have any applications in their image (example:me).
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us morâ€¦
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: Tâ€¦
Suggested Courses
Course of the Month6 days, 2 hours left to enroll

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.