Solved

Using a Mac to run a Perl script on an input file and write the output to a text file

Posted on 2011-09-10
Last Modified: 2012-05-12
I am using the code below. I entered the command below and it generated an output file, but the file was blank. I usually SFTP/SSH to a Linux cluster, but the server is down, so I am trying to parse the data on my Mac. What am I doing wrong? I have a 2009 MacBook Pro.

royhuff:Desktop royhuff$ perl csv.pl vc.csv >out.txt
royhuff:Desktop royhuff$

#!/usr/bin/perl

use strict;
use warnings FATAL => 'all';

my $file = shift;

open(F, $file);
my @count_3 = ();
my $working_date = '';
my $working_time = '';
my $working_hour = 0;
my @avgs = ();
use Data::Dumper;
use List::Util qw(sum);

sub convert_hour($$) {
	my $hour = shift;
	my $minutes = shift;
	return 60 * $hour + $minutes;
}

while (<F>) {
	next unless /\S/;
	chomp;
	my (@data) = split(/\,/);
	
	next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;
	my $current_date = $1;
	my $current_time = "$1 $2:$3";
	my $current_hour = convert_hour($2,$3);
	next if $data[2] < 0 || $data[2] == 999;
	if ($current_date ne $working_date) {
		$working_date = $current_date;
		$working_time = $current_time;
		$working_hour = $current_hour;
		@count_3 = ($data[2]);
		next;
	}
	if ($current_hour >= $working_hour + 180 && @count_3 >= 3) {
		push(
			@avgs,
			[$data[0], $working_time, sum(@count_3)/@count_3/ 1000],
		);
		@count_3 = ();
		$working_time = $current_time;
		$working_hour = $current_hour;
	}
	push(@count_3, $data[2]);
}
print join(
	"\n", map(
		join(" ",@{$_}),
	 @avgs)
), "\n";

vc.csv
Question by:libertyforall2
13 Comments
 
LVL 10

Accepted Solution

by:
jeromee earned 100 total points
ID: 36517846
Add this line at line 27:

    next unless $data[1];

as in:

    my (@data) = split(/\,/);
   next unless $data[1];

    next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;
 

Author Comment

by:libertyforall2
ID: 36517897
This is written 26 through 28. How exactly should I change it?

my (@data) = split(/\,/);
	
	next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;

 

Author Comment

by:libertyforall2
ID: 36517914
I modified lines 26-28 to look like the code below and I still got blank output. The script worked on Linux. I think it's just my command conventions, perhaps.

my (@data) = split(/\,/);
	next unless $data[1];
	next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;

 
LVL 9

Assisted Solution

by:lisfolks
lisfolks earned 200 total points
ID: 36517970
First, your data file is written to a Windows format - it contains CR/LF characters that the Mac doesn't recognize. You need to convert the csv file like so:

royhuff:Desktop royhuff$ tr '\r' '\n' < vc.csv > vc_unix.csv

Once you have done that, then jeromee's suggestion will work!
 
LVL 28

Expert Comment

by:FishMonger
ID: 36518008
There are a number of things you're doing wrong, but if the script works on a linux system and not on the mac, then you should look at the differences in environment.  Specifically, the file locations and permissions.

I suspect that the open call is failing, but since you're ignoring its return value, you have no way of knowing whether it failed or succeeded.

I know that I've mentioned this to you before, but you really should be using a lexical variable for the filehandle and the 3-arg form of open. Using single-character names like F for the filehandle is bad coding.

Change your open call to this:
open my $csv_fh, '<', $file or die "could not open '$file' $!";

You will also need to change the initialization of the while loop to use the lexical var instead of the bareword.
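
For reference, here is a minimal sketch of how the 3-arg open, the lexical filehandle, and the while loop could fit together. The helper name read_lines is hypothetical and not part of the original script; it only illustrates the pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): open a file with the
# three-argument form of open and a lexical filehandle, then return its lines.
sub read_lines {
    my ($file) = @_;
    open my $csv_fh, '<', $file
        or die "could not open '$file': $!";
    my @lines;
    while ( my $line = <$csv_fh> ) {    # lexical handle replaces the bareword F
        push @lines, $line;
    }
    close $csv_fh;
    return @lines;
}

# Usage: perl read_lines.pl vc.csv
if (@ARGV) {
    my @lines = read_lines( $ARGV[0] );
    print scalar(@lines), " line(s) read from '$ARGV[0]'\n";
}
```

If the file cannot be opened, the script now dies with the system error message instead of silently producing empty output.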
 
LVL 28

Expert Comment

by:FishMonger
ID: 36518018
The change that jeromee suggested is superfluous and won't do anything useful.  The regex that is applied to $data[1] to extract the datestamp makes that suggestion unnecessary.

 
LVL 9

Assisted Solution

by:parparov
parparov earned 100 total points
ID: 36518068
Since I wrote the original script, I'll step in too:

1) in line 8, change 'open(F, $file);' to 'open(F, $file) or die "Can't open file $file: $!";'
2) replace line 28 with:
next unless $data[1] && $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;
3) Tell us the results of the run after you made these changes.
 
LVL 28

Expert Comment

by:FishMonger
ID: 36518074
After running a test, it would appear that the difference in line termination is the main reason why you're not getting the same results. That can easily be handled inside the script rather than by converting the file externally.

Simply replace the chomp line with this regex.
s/\r\n//;
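
A slightly more tolerant variant, offered only as a sketch: the substitution above removes nothing when a line ends in a bare "\n", so the hypothetical helper strip_eol below (not part of the original script) removes whichever of LF, CRLF, or a bare CR ends the line.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line terminator, whether it is
# LF (Unix), CRLF (Windows), or a bare CR.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/(?:\r\n|\n|\r)\z//;
    return $line;
}

# Each of these prints [a,b]
for my $raw ( "a,b\n", "a,b\r\n", "a,b\r", "a,b" ) {
    printf "[%s]\n", strip_eol($raw);
}
```

Note that this only cleans up a line after it has been read. With the default input record separator ("\n"), a file that uses bare CRs throughout is still read as one long line, so it would need a different approach.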

 
LVL 28

Expert Comment

by:FishMonger
ID: 36518090
Here is the output I get after making the latest change that I suggested.  Is this what you want?  If not, please explain what needs to be changed.

out.txt
 
LVL 9

Expert Comment

by:lisfolks
ID: 36519135
Okay, so I have a Mac, running OS X Snow Leopard, and I ran libertyforall2's code on my machine, using his data file. I can tell you for a fact that his code works perfectly once a) the vc.csv file is formatted to use \n's instead of \r's (using the command line I gave in my post), and b) the one line change that jeromee suggested is put in place.

Fishmonger's suggestion for replacing the chomp line may work to fix the file's line ending compatibility issues, also. However, it appears that normally you run this on a Windows machine. Therefore, in the case of this one-time situation of running it on a Mac, you may want to go the way of converting the data rather than adding a change to the code to work with that data.

The change to the code regarding the $data[1] variable will ensure on either environment that if there is no data present for it to look at, the code will continue on to the next item instead of crashing on an "uninitialized variable".
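
To illustrate why that guard matters, here is a sketch under the same "use warnings FATAL => 'all'" setting as the original script. After chomp, a header line such as "Part II - Data,,,,," splits into a single field, because split drops trailing empty fields, so $data[1] is undefined and matching against it would be fatal. The helper name parse_timestamp is hypothetical.

```perl
use strict;
use warnings FATAL => 'all';

# Hypothetical helper: return (date, hour, minute) when the second CSV
# field holds a timestamp such as "10/27/10 3:00", or an empty list otherwise.
sub parse_timestamp {
    my ($line) = @_;
    my @data = split /,/, $line;
    return unless defined $data[1];    # no second field: skip instead of dying
    return unless $data[1] =~ m|(\d+/\d+/\d+) (\d+):(\d\d)|;
    return ( $1, $2, $3 );
}

for my $line ( 'HAVO-VC,10/27/10 3:00,0,,,', 'Part II - Data,,,,,' ) {
    my @parts = parse_timestamp($line);
    my $msg = @parts ? "date=$parts[0] hour=$parts[1] min=$parts[2]" : "skipped";
    print "$msg\n";
}
```

The first line prints "date=10/27/10 hour=3 min=00" and the second prints "skipped".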
 
LVL 9

Assisted Solution

by:lisfolks
lisfolks earned 200 total points
ID: 36519154
Oh, and parparov's suggestion for the $data[1] line change is more elegant than using two lines, I think, though it is just another way to do the same thing jeromee suggested:

jeromee's:

next unless $data[1];
next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;

parparov's:
next unless $data[1] && $data[1] =~ m|(\d+/\d+/\d+) (\d+)\:(\d\d)|;

Either way, without changing the Windows line endings in the file on the Mac or in your code, the script won't find any data to work with.
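
If you would rather handle the line endings in the code, one possible approach (a sketch, not something tested against the original vc.csv) is to read the whole file at once and split it on any of the three common terminators. The helper name split_records is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split a whole file's contents into records,
# treating CRLF, bare LF, and bare CR all as line terminators.
# Blank records are dropped, like the "next unless /\S/" check in the script.
sub split_records {
    my ($content) = @_;
    my @records = grep { /\S/ } split /\r\n|\n|\r/, $content;
    return @records;
}

# Usage: perl records.pl vc.csv
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "could not open '$ARGV[0]': $!";
    my $content = do { local $/; <$fh> };    # slurp the entire file
    close $fh;
    print "$_\n" for split_records($content);
}
```

This trades memory for simplicity, since the whole file is held in memory at once. That should be fine for a small CSV like this one.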
 
LVL 28

Assisted Solution

by:FishMonger
FishMonger earned 100 total points
ID: 36519206
There really isn't a need to do the external conversion and the next unless $data[1].

You can simply change:
next unless /\S/;

to:
next unless /,/;

I don't have access to a Mac at the moment, but here's a shortened version that doesn't even care about the line endings.
#!/usr/bin/perl

use strict;
use warnings;

while ( <DATA> ) {
    next unless /,/;
    my (@data) = split(/,/);
    next unless $data[1] =~ m|(\d+/\d+/\d+) (\d+):(\d\d)|;
    print;
}


__DATA__
Users of this data file should acknowledge the National Park Service.,,,,,
NOTE: There are two parts to this file. Part I contains information on the sites selected.,,,,,
Part II contains the actual data and begins on line 9.,,,,,

Part I - Site Information,,,,,
ABBR,AIRS_SITE_CODE,LONG_DEC,LAT_DEC,ELEV_mMSL,SITE_NAME
HAVO-VC,15-001-0005,155.2578,19.4308,1215,Hawaii Volcanoes National Park - Visitor Center

Part II - Data,,,,,
ABBR,DATE_TIME,SO2_PPB,,,
HAVO-VC,10/27/10 0:00,0,,,
HAVO-VC,10/27/10 1:00,0,,,
HAVO-VC,10/27/10 2:00,0,,,
HAVO-VC,10/27/10 3:00,0,,,
HAVO-VC,10/27/10 4:00,0,,,
HAVO-VC,10/27/10 5:00,0,,,
HAVO-VC,10/27/10 6:00,0
HAVO-VC,10/27/10 7:00,0
HAVO-VC,10/27/10 8:00,0
HAVO-VC,10/27/10 9:00,0

On my system, this doesn't produce the warning and outputs:
D:\perl\EE>test.pl
HAVO-VC,10/27/10 0:00,0,,,
HAVO-VC,10/27/10 1:00,0,,,
HAVO-VC,10/27/10 2:00,0,,,
HAVO-VC,10/27/10 3:00,0,,,
HAVO-VC,10/27/10 4:00,0,,,
HAVO-VC,10/27/10 5:00,0,,,
HAVO-VC,10/27/10 6:00,0
HAVO-VC,10/27/10 7:00,0
HAVO-VC,10/27/10 8:00,0
HAVO-VC,10/27/10 9:00,0
 

Author Closing Comment

by:libertyforall2
ID: 36520310
I used another cluster and was able to resolve the issue without modifying the script.