
Read CSV files from remote servers and combine into one

akatsuki27 asked (last modified 2012-05-11):
Hey experts,

I have a bunch of hosts, each collecting home directory info for a specific user and saving it to a CSV file stored locally on that host.

I'm writing a Perl script to collect the remote data from each file and append it to one large CSV file. This seems a little complicated to me because I'm trying to add headers into the file to distinguish which host the data is coming from.

The remote file looks like this:

***** today's date *********
***** heaviest dirs/files ***

100MB, /path/to/dir/
70MB, /path/to/dir/file1
10MB, /path/to/dir/file2
1.5MB, /path/to/dir/dir2
500K, /path/to/dir/dir2/otherfile
etc....

To explain what I tried to do:
 - open local output file
 - loop through hostname list and ssh into hosts
 - read remote csv (ignoring first 3 lines)
 - print header into local output file to distinguish origin of data
 - somehow read the csv and output to local file ???

This is the part where I'm sort of stuck. Actually, I'm pretty sure the code is all wrong, because while testing it, it hung. I think it has something to do with the shell it's opening on the remote host...

This has been racking my brain for a week now. If I could give more than 500 points I would...

open (FILE2, "> /home/user/du_collection.csv") or die "Cannot create the csv file";
my @output;

for (my $i = 0; $i <= $#list; $i++) {
    my @ssh = `ssh $user\@$list[$i]`;
    open (FILE1, "//home/foobar/du_out.csv") or die "Cannot open the csv file";
    print FILE2 "************ $list[$i] ************";
    <FILE1>;<FILE1>;<FILE1>;     # ignore first 3 lines of csv file
    while (<FILE1>){
        foreach my $line (<FILE1>){

            [ commands to read csv and print to FILE2 ]

        }
        close FILE1;
    }
}

close FILE2;



Any and all help is appreciated. Thanks!

Jeromee commented:
Hi akatsuki27,
How do you get the data from the remote hosts?
I do see your call to ssh but I fail to see which command you are running on the remote host to get the data you need.
Do you run "du"?
FishMonger (Certified Expert) commented:
[locked solution: FishMonger's suggestion to use the File::Remote module, referenced in the comments below]

akatsuki27 (Author) commented:
Jeromee,

I have a script on the remote box that runs du and stores the output in a csv file. The script I'm working on now should connect to each remote host and gather the disk usage data and store it in 1 file.

FishMonger,

Is that module available for RHEL? Yum doesn't find it. Is there another module that will allow me to handle remote files?

akatsuki27 (Author) commented:
Also, FishMonger, can you explain why the ssh is not working?

Jeromee commented:
Hi akatsuki27,
Is ftp/sftp an option to download the files?

akatsuki27 (Author) commented:
Unfortunately no. I'm limited to ssh or scp; ftp is not set up on any of the internal hosts.

Jeromee commented:
scp will do too, or you could run ssh to "cat" the file.
Check out http://search.cpan.org/~ivan/Net-SCP-0.08/SCP.pm for scp
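For the ssh route, the key is to hand ssh a command to run instead of letting it start an interactive shell, which is most likely why your current loop hangs. A rough, untested sketch, reusing the $user, @list, FILE2 and remote path from your script:

# Untested sketch: run "cat" on each remote host and capture its output locally.
# $user, @list, FILE2 and the remote path are taken from the script in the question.
for my $host (@list) {
    my @csv = `ssh $user\@$host cat /home/foobar/du_out.csv`;
    splice(@csv, 0, 3);                      # drop the 3 header lines
    print FILE2 "************ $host ************\n", @csv;
}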

akatsuki27 (Author) commented:
I think reading the file once and processing it will be my best choice, since the home directories I'm checking get really huge and I would rather not copy those files over from 50+ hosts.

Do you know of the module FishMonger mentioned? I don't find it using yum. Is there another module that does similar things?
FishMonger (Certified Expert) commented:
This is the command I normally use to install modules.

perl -MCPAN -e 'install File::Remote'

Jeromee commented:
If you don't want to download all files, then you will need to install and run a script on all 50+ hosts.

akatsuki27 (Author) commented:
Jeromee,

I do have a script on all 50+ hosts running du. I still need a way to collect that data into 1 csv for general use.

Jeromee commented:
Hi akatsuki27,
So you do need to download the 50+ files once they have been generated on each individual machine/host.
I would recommend that you use scp (or a Perl wrapper around it, as suggested earlier: http://search.cpan.org/~ivan/Net-SCP-0.08/SCP.pm).
Use FishMonger's excellent pointers on how to download any Perl module you want.
Or just use plain scp.
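For what it's worth, the Net::SCP route boils down to something like this. Untested, and just following the module's synopsis; $user, @list and the remote path are the ones from your script:

use Net::SCP;

# Untested sketch: copy each host's du_out.csv locally, then parse the copy.
for my $host (@list) {
    my $scp = Net::SCP->new($host, $user);
    $scp->get('/home/foobar/du_out.csv', "du_out.$host.csv")
        or die "scp from $host failed: $scp->{errstr}";
    # then open du_out.$host.csv and append it to the big file as before
}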

Good luck.
FishMonger (Certified Expert) commented:
The File::Remote module that I suggested uses scp under the hood, which simplifies the process because you don't need to manually scp the file.

The example I gave will need a couple minor adjustments.  One being that you'll need to change:
use File::Remote;


to:
use File::Remote qw(:replace);

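With that tag in place, reading one of the remote files looks much like the module's synopsis. Roughly (untested; $host stands for one entry of your @list, and the path and FILE2 are the ones from your script):

use File::Remote qw(:replace);

# With :replace, the built-in open() also understands rcp-style "host:/path" names,
# so the remote file can be read as if it were local.
open(my $remote_fh, "$host:/home/foobar/du_out.csv") or die "Cannot open remote csv: $!";
print FILE2 $_ while <$remote_fh>;
close($remote_fh);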

akatsuki27 (Author) commented:
FishMonger,

Is this line there so it ignores the first 3 lines of the csv file on the remote host?

<$remote_fh> for (1..3);


FishMonger (Certified Expert) commented:
Yes, that's correct.
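Putting the pieces from this thread together, the collection script ends up looking roughly like the sketch below. It is untested; the hostnames are placeholders, the paths are the ones from the original post, and it assumes File::Remote's replaced open() accepts lexical filehandles, as in the snippet above.

#!/usr/bin/perl
use strict;
use warnings;
use File::Remote qw(:replace);    # replaces open/close with remote-aware versions

# Placeholder host list: fill in the real 50+ hostnames.
# Assumes the account running the script can ssh to each host without a password.
my @list = qw(host1 host2 host3);

open(my $out_fh, ">/home/user/du_collection.csv")
    or die "Cannot create the csv file: $!";

for my $host (@list) {
    # rcp-style "host:/path" name; File::Remote fetches the file behind the scenes
    open(my $remote_fh, "$host:/home/foobar/du_out.csv")
        or do { warn "Cannot open du_out.csv on $host: $!"; next };

    print $out_fh "************ $host ************\n";

    <$remote_fh> for (1..3);      # ignore first 3 lines of csv file

    print $out_fh $_ while <$remote_fh>;

    close($remote_fh);
}

close($out_fh);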