• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 451

Perl script to generate an OS X home directory usage report.

I am looking for a Perl script that will run independently of the base OS, that will measure OS X Server home directory use (du), and that will report on the file extension types within each user's directory.  I have all server names and login info.  Sudo is set up as well.  I will need to generate a simple report (CSV, etc.).
Asked by: roaks
1 Solution
 
mjcoyneCommented:
To get file extensions, given a list of directories you want to search, you could do something like:

#!/usr/bin/perl -w
use strict;
use File::Basename;

my @dirs = ('/user1/dir1/', '/user2/dir2/', '/user3/dir3/');

open(my $fh, '>', 'extension_list.txt') or die "Can't open extension_list.txt: $!";

foreach my $dir (@dirs) {
    print $fh "Extension list for directory $dir:\n";
    chdir($dir) or die "Can't chdir to $dir: $!";
    while (defined(my $file = <*.*>)) {             # glob every file name containing a dot
        my ($name, $path, $ext) = fileparse($file, qr/\..*/);
        print $fh "\t$ext\n";
    }
}

close $fh;

To run du, you can use Perl's backticks and capture the output in a variable.  Then you can process that output however you like and print it to a log as well.  Something like:

my $du_output = `du`;

should work.  Of course, you can add command-line switches to the command, or run it for each user's directory by passing it the directory name, for example:

my $du_output = `du -h --max-depth=1 $dir`;

(Note that the BSD du shipped with OS X doesn't support GNU's --max-depth option; use -d 1 there instead.)
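Building on that, here is a minimal sketch of turning the du output into the kind of CSV report the question asks for; the /Users glob and the du_report.csv filename are assumptions you would adjust for your servers:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical home-directory layout -- adjust the glob for your servers.
my @home_dirs = glob('/Users/*');

open(my $csv, '>', 'du_report.csv') or die "Can't write du_report.csv: $!";
print $csv "User,Disk Used\n";

foreach my $dir (@home_dirs) {
    my $du = `du -sh $dir`;              # e.g. "1.2G   /Users/jon"
    my ($size) = $du =~ /^(\S+)/;        # first field is the size
    my ($user) = $dir =~ m{([^/]+)/?$};  # last path component is the user name
    print $csv "$user,$size\n";
}

close $csv;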
 
roaksAuthor Commented:
I probably have more than 1000 home directories - what is the quickest method to get this data?  Also, my report needs to look like:

User, Disk Used, extension types (docs, spreadsheets, mp3's, etc).
 
Adam314Commented:
Do you want the disk usage broken out by extension type, or just a list of all the extension types each user has?

eg:(list all extension types)
User,   Disk Used,   Extension Types
Jon,   1.2GB,   doc xls txt csv
Mark, 980MB, doc xls ppv


eg: (break out by extension type)
User, Disk Used,  Extension Types
Jon,  340MB, doc
Jon, 280MB, xls
Jon, 315MB, txt
Jon, 265MB, csv
Mark, 500MB, doc
Mark, 360MB, xls
Mark, 120MB, ppv
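Whichever format you pick, here is a minimal sketch of emitting it from a nested hash of per-user, per-extension byte counts; the %usage sample data and the MB rounding are made-up placeholders:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data: bytes used per user per extension.
my %usage = (
    Jon  => { doc => 340e6, xls => 280e6, txt => 315e6, csv => 265e6 },
    Mark => { doc => 500e6, xls => 360e6, ppv => 120e6 },
);

for my $user (sort keys %usage) {
    my $exts  = $usage{$user};
    my $total = 0;
    $total += $_ for values %$exts;

    # Format 1: one line per user, listing all extension types.
    printf "%s, %.0fMB, %s\n", $user, $total / 1e6, join ' ', sort keys %$exts;

    # Format 2: one line per user per extension (uncomment to use instead).
    # printf "%s, %.0fMB, %s\n", $user, $exts->{$_} / 1e6, $_ for sort keys %$exts;
}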
 
ps15Commented:
for my $dir (</home/*/>) {
    my %seen;
    my @extensions = grep { !$seen{$_}++ }       # unique extensions
                     map  { /\.(.{3})$/; $1 }    # the 3 characters after the final dot
                     grep { /\..{3}$/ }          # only paths ending in .xxx
                     split /\n/, `find $dir`;
    my ($disk) = map { /^(.*?)\s/; $1 } `du -sh $dir`;   # first field of du output
    my ($name) = map { m{/([^/]+)/$}; $1 } $dir;         # last path component = user
    print "$name $disk @extensions\n";
}
 
ps15Commented:
Sorry, let me revise that a bit; the previous version might have picked up hidden directories such as .ssh as the extension "ssh".

for my $dir (</home/*/>) {
    my %seen;
    my @extensions = grep { !$seen{$_}++ }       # unique extensions
                     map  { /\.(.{3})$/; $1 }    # the 3 characters after the final dot
                     grep { !-d $_ }             # skip directories such as .ssh
                     grep { /\..{3}$/ }          # only paths ending in .xxx
                     split /\n/, `find $dir`;
    my ($disk) = map { /^(.*?)\s/; $1 } `du -sh $dir`;   # first field of du output
    my ($name) = map { m{/([^/]+)/$}; $1 } $dir;         # last path component = user
    print "$name,$disk,@extensions\n";
}

# Also feel free to use other output formats:

# All extensions separated by commas
# print "$name,$disk,".join(",", @extensions)."\n";

# output "broken up by extensions" as seen a few comments up
# for my $ext (@extensions) { print "$name,$disk,$ext\n";}
 
roaksAuthor Commented:
This would be very cool and completely comprehensive:
eg:(list all extension types)
User,   Disk Used,   Extension Types
Jon,   1.2GB,  ( doc 340mb,  xls 320mb,  txt 3mb,  csv 1mb)


I can pare down or up from here....
 
ps15Commented:
for my $dir (</home/cmaf/>) {
    my ($disk) = map { /^(.*?)\s/; $1 } `du -sh $dir`;
    my ($name) = map { m{/([^/]+)/$}; $1 } $dir;
    my @files  = split /\n/, `find $dir`;
    my %seen;
    # For each unique 3-char extension, run du -ch on all matching files and
    # keep the grand total, producing "ext size" pairs.
    my $ext_data = join ", ",
        map { my $ext = $_;
              my $list = "'" . join("' '", grep { /\.\Q$ext\E$/ } @files) . "'";
              `du -ch $list | tail -n 1` =~ /^(.*?)\s/; "$ext $1" }
        grep { !$seen{$_}++ }
        map  { /\.(.{3})$/; $1 }
        grep { !-d $_ } grep { /\..{3}$/ } @files;
    print "$name, $disk, ( $ext_data )\n";
}
 
ps15Commented:
er, make that /home/*/ in the first line
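For 1000+ home directories, an alternative sketch that avoids running du once per extension is to walk each home directory a single time with File::Find and accumulate byte counts per extension; the /home root and the MB formatting are assumptions to adjust:

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical home-directory root -- adjust for your servers (e.g. /Users).
for my $dir (glob('/home/*/')) {
    my ($user) = $dir =~ m{([^/]+)/?$};
    my $total  = 0;
    my %by_ext;

    # One pass over the tree: add each file's size to its extension bucket.
    find(sub {
        return unless -f $_;
        $total += -s _;
        my ($ext) = $File::Find::name =~ /\.([^.\/]+)$/ or return;
        $by_ext{lc $ext} += -s _;
    }, $dir);

    my $detail = join ', ',
        map  { sprintf '%s %.1fMB', $_, $by_ext{$_} / 1024**2 }
        sort { $by_ext{$b} <=> $by_ext{$a} } keys %by_ext;

    printf "%s, %.1fMB, ( %s )\n", $user, $total / 1024**2, $detail;
}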
 
roaksAuthor Commented:
Last input from me: there are a total of six servers, and each has the same path to the home directories, other than hd1, hd2, etc.  It would seem to me that my options are:
1 - set up a daily cron on each server, reporting/appending to a single file on a "master" server, or
2 - run the single script from the master server.

#1 seems better from a system-overhead standpoint.
#2 seems simpler.
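For what it's worth, a minimal sketch of option 1 as a per-server cron job; the report script name, the master hostname, and the paths are assumptions, and it presumes passwordless ssh keys so cron can run unattended:

#!/usr/bin/perl
# Hypothetical cron wrapper, e.g. in each server's crontab:
#   30 2 * * * /usr/local/bin/push_usage_report.pl
use strict;
use warnings;
use Sys::Hostname;

my $host   = hostname();
my $report = "/tmp/usage_$host.csv";

# Run the local report script (assumed name) and save its CSV output.
system("/usr/local/bin/home_usage_report.pl > $report") == 0
    or die "report script failed: $?";

# Copy the result to the master server (assumed hostname and path).
system('scp', $report, 'master:/var/reports/') == 0
    or die "scp failed: $?";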
 
Adam314Commented:
From the master server, are the other drives mounted?  Could they be?
If not, can you telnet to those machines?
 
roaksAuthor Commented:
no telnet - will need to use ssh, scp, etc.
 
Adam314Commented:
Is that a no to the other drives being mounted also?

It would be a bit more work to have it all on one server (if the other drives aren't mounted), but easier for maintenance.  Perl has modules to help with SSH though.  I've never used the SSH module, but I've used the Telnet module plenty of times.
 
roaksAuthor Commented:
No mounted drives - all local
 
Adam314Commented:
All local?  I thought there were 6 servers you wanted to check.  Am I missing something?
 
roaksAuthor Commented:
No, I just have to scan these servers.  They each have locally attached drives; nothing is mounted over the network.
 
Adam314Commented:
So, is this correct?
You have 6 servers, and each has its own drive.  No drives from any one server are mounted on another server.  You need the report for each of these 6 servers.


If that is correct, and you want to only run the script from the 'master' server, then there are 2 ways:
1) Mount the drives from the other servers on the master server.  You can then access them using the local filesystem.  Process the directories this way, and save the results to a file.
2) Use ssh to connect to each of the servers, and run commands that way.  Collect the results, and save them to a file.

If you don't like either of those methods, then you need to run the script on each of the servers.  If you want all of the results in a single place (eg: on the master server), you'll still need a way to get the results from each server to the master server (mount drives, ssh, ftp).
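If the drives were mounted (option 1), the earlier per-directory loops would only need their globs changed to point at the mount points; a short sketch, assuming hypothetical mounts under /Volumes:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical mount points -- adjust to wherever the six servers are mounted.
for my $dir (glob('/Volumes/hd[1-6]/home/*/')) {
    my ($user) = $dir =~ m{([^/]+)/?$};
    my ($size) = `du -sh $dir` =~ /^(\S+)/;
    print "$user,$size\n";
}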
 
roaksAuthor Commented:
Number 2.
 
Adam314Commented:
I've never used the SSH module, but I've used the Telnet module, and I believe they work similarly.
http://search.cpan.org/~dbrobins/Net-SSH-Perl-1.30/lib/Net/SSH/Perl.pm

#Include module
use Net::SSH::Perl;

#Usage (keyed by home directory, value is the size du reports)
my %Usage;

#Then you want to do something like this (I can't test this though...):
for my $Server (qw(Server1 Server2 Server3)){
    #Create object and connect to the server
    my $ssh = Net::SSH::Perl->new($Server);

    #Login ($user and $pass are the credentials you will be using)
    $ssh->login($user, $pass);

    #Get usage by user
    my ($stdout, $stderr, $exit) = $ssh->cmd("du -sh /home/*");
    my @Lines = split(/\n/,$stdout);

    foreach my $Line (@Lines) {
        my @f = split(/\s+/,$Line);   #du output: size, then path
        $Usage{$f[1]} = $f[0];
    }
}

foreach my $User (keys %Usage){
    print "$User    $Usage{$User}\n";
}
 
roaksAuthor Commented:
So - in order to log into each server and get the data I need, I'll have to sudo.  Can you do a quick cut and paste that will show me how the actual script should be laid out?  I will be giving you extra points for all of your help....    :)
 
Adam314Commented:
I'm not familiar with sudo, but if it is simply another command that is needed, it won't be hard to use.  What is below is the code you will want... put it in a file and execute it.

#Include modules
use strict;
use warnings;
use Net::SSH::Perl;

#Variable to store usage by user: [disk used, file types]
my %Usage;

#Login credentials.  Change these to the username and password you will be using
my ($user, $pass) = ('username', 'password');

#Then you want to do something like this (I can't test this though...):
#Loop through each server.  Change Server1, Server2,... to your actual servers
for my $Server (qw(Server1 Server2 Server3)){
    #Create object, and connect to the server
    my $ssh = Net::SSH::Perl->new($Server);

    #Login
    $ssh->login($user, $pass);

    #Privileged commands below are prefixed with sudo.  This assumes the account
    #has passwordless (NOPASSWD) sudo rights, so the non-interactive session
    #isn't blocked by a password prompt.

    #Get usage by user, using the du command
    #(adjust /home to your home-directory root, e.g. /Users on OS X)
    my ($stdout, $stderr, $exit) = $ssh->cmd("sudo du -sh /home/*");
    my @Lines = split(/\n/,$stdout);

    #Save usage to %Usage variable
    foreach my $Line (@Lines) {
        my @f = split(/\s+/,$Line);              #du output: size, then path
        my ($User) = $f[1] =~ m|([^/]+)/?$|;     #user name = last path component
        $Usage{$User} = [$f[0], ""];
    }

    #Get file types
    ($stdout, $stderr, $exit) = $ssh->cmd("sudo find /home");
    @Lines = GetTypeByUser($stdout);

    #Save filetypes to %Usage variable
    foreach my $Line (@Lines) {
        my @f = split(/\s+/,$Line,2);
        ${$Usage{$f[0]}}[1] = $f[1];
    }
}

#Print heading
print "User     Size Used     File Types\n";

#Print results
foreach my $User (keys %Usage){
    print "$User    $Usage{$User}[0]    $Usage{$User}[1]\n";
}

sub GetTypeByUser {
    my $Data  = shift;
    my @Lines = split(/\n/,$Data);
    my %FileTypes;
    my @ReturnData;

    foreach my $Line (@Lines){
        #Skip lines that don't look like /home/<user>/...name.<ext>
        next unless $Line =~ m|/home/([^/]+)/.*\.([^./]+)$|;
        my $User = $1;
        my $Ext  = $2;
        ${$FileTypes{$User}}{$Ext} = 1;
    }
    foreach my $User (keys %FileTypes){
        push @ReturnData, "$User " . join(",", keys(%{$FileTypes{$User}}));
    }
    return @ReturnData;
}
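Since the question asks for CSV output, the final report loop above could also write comma-separated lines to a file; a small self-contained sketch, where the %Usage sample data and the report.csv filename are placeholders matching the layout built above (size in element 0, extension list in element 1):

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical %Usage in the layout built above: [disk used, extension list].
my %Usage = (
    jon  => ['1.2G', 'doc,xls,txt,csv'],
    mark => ['980M', 'doc,xls,ppv'],
);

open(my $csv, '>', 'report.csv') or die "Can't write report.csv: $!";
print $csv "User,Disk Used,File Types\n";
foreach my $User (sort keys %Usage) {
    # Quote the extension list, since it contains commas.
    print $csv "$User,$Usage{$User}[0],\"$Usage{$User}[1]\"\n";
}
close $csv;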
