Solved

treesize

Posted on 2000-05-03
Last Modified: 2010-03-05
I'm looking for a Perl (Win32) script that reports tree sizes, accepts parameters, and produces output similar to the examples below.

Examples:
  1. treesize.pl "c:\My Documents"
     500 kbytes c:\My Documents
  2. treesize.pl "c:\My Documents\*"
     30 kbytes c:\My Documents\utils
     470 kbytes c:\My Documents\zips

Question by:sleroux
LVL 5

Expert Comment

by:pitonyak
ID: 2775995

I wrote a program for you. When run as:

perl -w TreeSize.pl \devsrc\perl\*.pl d:\devsrc\perl

it first reports, directory by directory, the sizes of the files that match *.pl, and then the sizes of all files.
Here is the output on my computer.

This DIR TOT DIRS Directory
======== ======== ====================================================
     0B       0B  \devsrc\perl\deps\blurfl\quux\*.pl
     0B       0B  \devsrc\perl\deps\blurfl\*.pl
    64KB     64KB \devsrc\perl\deps\*.pl
     1KB      1KB \devsrc\perl\ftp\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\archive\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\bak\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\Debug\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\err\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\GenLib\use\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\GenLib\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\in\archive\bak\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\in\archive\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\in\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\log\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\OELib\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\out\*.pl
     0B       0B  \devsrc\perl\gen_file\tmp\*.pl
    10KB     10KB \devsrc\perl\gen_file\*.pl
   749B     749B  \devsrc\perl\learn\*.pl
     4KB      4KB \devsrc\perl\mike\tmp\*.pl
    37KB     42KB \devsrc\perl\mike\*.pl
    87KB     87KB \devsrc\perl\pchmod\*.pl
    10KB     10KB \devsrc\perl\test\*.pl
    36KB    254KB \devsrc\perl\*.pl

This DIR TOT DIRS Directory
======== ======== ====================================================
     0B       0B  d:\devsrc\perl\deps\blurfl\quux\*
     0B       0B  d:\devsrc\perl\deps\blurfl\*
    69KB     69KB d:\devsrc\perl\deps\*
     2KB      2KB d:\devsrc\perl\ftp\*
    43KB     43KB d:\devsrc\perl\gen_file\tmp\archive\*
    54KB     54KB d:\devsrc\perl\gen_file\tmp\bak\*
   130KB    130KB d:\devsrc\perl\gen_file\tmp\Debug\*
    12KB     12KB d:\devsrc\perl\gen_file\tmp\err\*
    29KB     29KB d:\devsrc\perl\gen_file\tmp\GenLib\use\*
   731KB    760KB d:\devsrc\perl\gen_file\tmp\GenLib\*
     4KB      4KB d:\devsrc\perl\gen_file\tmp\in\archive\bak\*
    55KB     60KB d:\devsrc\perl\gen_file\tmp\in\archive\*
     0B      60KB d:\devsrc\perl\gen_file\tmp\in\*
    86KB     86KB d:\devsrc\perl\gen_file\tmp\log\*
   261KB    261KB d:\devsrc\perl\gen_file\tmp\OELib\*
     9KB      9KB d:\devsrc\perl\gen_file\tmp\out\*
   826KB      2MB d:\devsrc\perl\gen_file\tmp\*
    30KB      2MB d:\devsrc\perl\gen_file\*
     1KB      1KB d:\devsrc\perl\learn\*
    72KB     72KB d:\devsrc\perl\mike\tmp\*
    45KB    117KB d:\devsrc\perl\mike\*
   124KB    124KB d:\devsrc\perl\pchmod\*
    29KB     29KB d:\devsrc\perl\test\*
    81KB      2MB d:\devsrc\perl\*


Notice that although I have only 254KB of Perl files, I have 2MB of files in total.
The first column of numbers covers the files in the current directory only; the second
column covers ALL the files in that directory and in its subdirectories.


# By Andrew D. Pitonyak
# pitonyak@bigfoot.com

use strict;

my @end_txt = ('B ', 'KB', 'MB', 'GB', 'TB');

sub right_fmt($$)
{
    my ($len, $str) = @_;
    my $slop = $len - length($str);
    if ($slop > 0) {
        $str = " "x$slop."$str";
    }
    return $str;
}


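# Print each size right-aligned, scaled down to the largest unit
# (B, KB, MB, GB, TB) that leaves a value of at least 1.
sub print_size
{
    foreach (@_)
    {
        my $number = $_;
        my $idx = 0;
        # >= so that exactly 1024 bytes prints as 1KB rather than 1024B
        while (($number >= 1024) && ($idx < @end_txt-1))
        {
            $number >>= 10;    # integer divide by 1024; the fraction is dropped
            ++$idx;
        }
        print right_fmt(6, $number);
        print "$end_txt[$idx] ";
    }
}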

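# Forward declaration so the recursive call below can be checked against
# the prototype (avoids a "called too early" warning under -w).
sub dumb_treesize($$);

# Print one line per directory, deepest first, and return the total size
# in bytes of the matching files in $dir_name and everything below it.
sub dumb_treesize($$)
{
    my ($dir_name, $wild_card) = @_;
    my $file_sizes = 0;
    my $dir_sizes  = 0;
    # Files in this directory that match the wild card
    foreach (grep(-f $_, glob("$dir_name$wild_card")))
    {
        $file_sizes += (stat($_))[7];
    }
    # Recurse into every subdirectory, whatever the wild card is
    foreach (grep(-d $_, glob("${dir_name}*")))
    {
        $dir_sizes += dumb_treesize("$_\\", $wild_card);
    }
    my $total_size = $file_sizes + $dir_sizes;
    print_size($file_sizes);
    print_size($total_size);
    print "$dir_name$wild_card\n";
    return $total_size;
}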

sub smart_treesize_dir($)
{
    my $dir_name = shift;
    my $wild_card = '*';
    #
    # Strip off any trailing wild card characters
    #
    if ((not -d $dir_name) && ($dir_name !~ /\\$/) && ($dir_name =~ /^(.+?\\)([^\\]+$)/))
    {
        $dir_name = $1;
        $wild_card = $2 if defined($2);
    }
    if (not -d $dir_name)
    {
        print "Directory $dir_name does not exist\n";
    }
    else
    {
        $dir_name .= '\\' if $dir_name !~ /\\$/;
        print "This DIR TOT DIRS Directory\n";
        print "======== ======== ====================================================\n";
        dumb_treesize($dir_name, $wild_card);
    }
}


foreach (@ARGV) {
    smart_treesize_dir($_);
    print "\n";
}


 

Author Comment

by:sleroux
ID: 2776951
Adjusted points from 200 to 250
 

Author Comment

by:sleroux
ID: 2776952
I think there's a bug in your script.
The sum of all subdirs in d:\devsrc\perl\gen_file\tmp\* does not equal 2MB; it comes to 1414KB, or about 1.38MB.

826KB 2MB d:\devsrc\perl\gen_file\tmp\*
Real total: 1414KB

Also, once you have fixed the bug, could you modify the script slightly to accept parameters:
-k for KB
-m for MB
-g for GB
and the default (no parameters) would be bytes.

Thanks.
 
LVL 5

Accepted Solution

by:
pitonyak earned 250 total points
ID: 2779208

I think I understand what you want. By default, sizes are now shown in bytes only.
You can change this with the following options:

-b bytes
-k kilobytes
-m megabytes
-g gigabytes
-t "best fit": scale each value, up to terabytes, so that the number printed is below 1024

Each option applies to the directories that follow it on the command line.

An example run follows. The first listing uses the default, bytes.
The second is entirely in MB; note that anything smaller than 1MB is listed as 0MB.
The last listing uses -t, which mixes units so that each number printed is less than 1024.
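
If the 0MB entries are a nuisance, one possible variation is to keep the fraction instead of shifting it away. The following is only a minimal sketch, not part of the script below: format_size, its three arguments, and @units are hypothetical names chosen for illustration, and it uses floating-point division where the script uses an integer shift.

```perl
use strict;
use warnings;

# Unit suffixes, indexed by the number of divisions by 1024 applied.
my @units = ('B', 'KB', 'MB', 'GB', 'TB');

# format_size($bytes, $unit_idx, $best_fit) is a hypothetical helper:
#   $unit_idx  0..4 selects B, KB, MB, GB or TB as the largest unit
#   $best_fit  true means "stop scaling once the value is below 1024"
sub format_size {
    my ($bytes, $unit_idx, $best_fit) = @_;
    my $value = $bytes;
    my $idx   = 0;
    while ($idx < $unit_idx) {
        last if $best_fit && $value < 1024;
        $value /= 1024;    # floating-point division keeps the fraction
        ++$idx;
    }
    # Whole numbers for bytes, two decimals for the scaled units.
    return $idx == 0
        ? sprintf('%d%s',   $value, $units[$idx])
        : sprintf('%.2f%s', $value, $units[$idx]);
}

print format_size(846119, 2, 0), "\n";    # fixed unit: MB
print format_size(846119, 4, 1), "\n";    # best fit, up to TB
```

With the 846119 bytes that tmp holds in the run below, the fixed MB form gives 0.81MB rather than 0MB, and the best-fit form gives 826.29KB.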

A final comment: you said there is a bug.
I think it is instead a misunderstanding of what the output means.
The second column is not the space used by the subdirectories alone; it is
the total for the subdirectories plus the current directory.
For tmp, that is 826KB of its own files plus roughly 1414KB in subdirectories,
about 2240KB, which is printed as 2MB.
If you would rather have the first column show the files in the current dir
and the second column show only the subdirectories, find
the following two lines:

#print_size($dir_sizes);
print_size($total_size);

And change them to
print_size($dir_sizes);
#print_size($total_size);

and you will be finished. Here is a current run.

perl -w TreeSize.pl \devsrc\perl\gen_file\ -m \devsrc\perl\gen_file -t \devsrc\perl\gen_file\


  This DIR     With Subs   Directory
============ ============ ====================================================
     44632B       44632B  \devsrc\perl\gen_file\tmp\archive\*
     55687B       55687B  \devsrc\perl\gen_file\tmp\bak\*
    133206B      133206B  \devsrc\perl\gen_file\tmp\Debug\*
     12319B       12319B  \devsrc\perl\gen_file\tmp\err\*
     30339B       30339B  \devsrc\perl\gen_file\tmp\GenLib\use\*
    748815B      779154B  \devsrc\perl\gen_file\tmp\GenLib\*
      4964B        4964B  \devsrc\perl\gen_file\tmp\in\archive\bak\*
     56951B       61915B  \devsrc\perl\gen_file\tmp\in\archive\*
         0B       61915B  \devsrc\perl\gen_file\tmp\in\*
     88454B       88454B  \devsrc\perl\gen_file\tmp\log\*
    267450B      267450B  \devsrc\perl\gen_file\tmp\OELib\*
      9511B        9511B  \devsrc\perl\gen_file\tmp\out\*
    846119B     2298447B  \devsrc\perl\gen_file\tmp\*
     31034B     2329481B  \devsrc\perl\gen_file\*

  This DIR     With Subs   Directory
============ ============ ====================================================
         0MB          0MB \devsrc\perl\gen_file\tmp\archive\*
         0MB          0MB \devsrc\perl\gen_file\tmp\bak\*
         0MB          0MB \devsrc\perl\gen_file\tmp\Debug\*
         0MB          0MB \devsrc\perl\gen_file\tmp\err\*
         0MB          0MB \devsrc\perl\gen_file\tmp\GenLib\use\*
         0MB          0MB \devsrc\perl\gen_file\tmp\GenLib\*
         0MB          0MB \devsrc\perl\gen_file\tmp\in\archive\bak\*
         0MB          0MB \devsrc\perl\gen_file\tmp\in\archive\*
         0MB          0MB \devsrc\perl\gen_file\tmp\in\*
         0MB          0MB \devsrc\perl\gen_file\tmp\log\*
         0MB          0MB \devsrc\perl\gen_file\tmp\OELib\*
         0MB          0MB \devsrc\perl\gen_file\tmp\out\*
         0MB          2MB \devsrc\perl\gen_file\tmp\*
         0MB          2MB \devsrc\perl\gen_file\*

  This DIR     With Subs   Directory
============ ============ ====================================================
        43KB         43KB \devsrc\perl\gen_file\tmp\archive\*
        54KB         54KB \devsrc\perl\gen_file\tmp\bak\*
       130KB        130KB \devsrc\perl\gen_file\tmp\Debug\*
        12KB         12KB \devsrc\perl\gen_file\tmp\err\*
        29KB         29KB \devsrc\perl\gen_file\tmp\GenLib\use\*
       731KB        760KB \devsrc\perl\gen_file\tmp\GenLib\*
         4KB          4KB \devsrc\perl\gen_file\tmp\in\archive\bak\*
        55KB         60KB \devsrc\perl\gen_file\tmp\in\archive\*
         0B          60KB \devsrc\perl\gen_file\tmp\in\*
        86KB         86KB \devsrc\perl\gen_file\tmp\log\*
       261KB        261KB \devsrc\perl\gen_file\tmp\OELib\*
         9KB          9KB \devsrc\perl\gen_file\tmp\out\*
       826KB          2MB \devsrc\perl\gen_file\tmp\*
        30KB          2MB \devsrc\perl\gen_file\*
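
The byte figures in the first listing above make the "With Subs" column easy to check by hand. As a small worked example (the variable names are made up for illustration; the numbers are copied from that listing), adding the nine directories directly under tmp to tmp's own files reproduces the 2298447 bytes reported for tmp:

```perl
use strict;
use warnings;

# "With Subs" byte totals of the nine directories directly under tmp\,
# copied from the byte-mode listing above.
my %with_subs = (
    archive => 44632,
    bak     => 55687,
    Debug   => 133206,
    err     => 12319,
    GenLib  => 779154,
    in      => 61915,
    log     => 88454,
    OELib   => 267450,
    out     => 9511,
);
my $this_dir = 846119;    # files directly inside tmp\

my $subdirs = 0;
$subdirs += $_ for values %with_subs;
my $total = $this_dir + $subdirs;

# The script scales with an integer right shift, so fractions are dropped.
printf "subdirectories only: %d bytes (%dKB)\n", $subdirs, $subdirs >> 10;
printf "with this directory: %d bytes (%dKB, %dMB)\n",
    $total, $total >> 10, $total >> 20;
```

The subdirectories alone come to 1452328 bytes (1418KB), close to the figure computed in the question from the rounded KB values. Adding tmp's own 846119 bytes gives 2298447 bytes, or 2244KB, which the integer shift reports as 2MB.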


# By Andrew D. Pitonyak
# pitonyak@bigfoot.com

use strict;

my $last_parm = 0;
my %parm_ending = (
    'b' => 0,
    'k' => 1,
    'm' => 2,
    'g' => 3,
    't' => 4,
);

my @end_txt = ('B ', 'KB', 'MB', 'GB', 'TB');

sub right_fmt($$)
{
    my ($len, $str) = @_;
    my $slop = $len - length($str);
    if ($slop > 0) {
        $str = " "x$slop."$str";
    }
    return $str;
}


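# Print each size right-aligned in the unit selected by $last_parm.
# With -b, -k, -m or -g the unit is fixed; with -t (index 4) scaling
# stops as soon as the value drops below 1024 ("best fit").
sub print_size
{
    foreach (@_)
    {
        my $number = $_;
        my $idx = 0;
        while (($number >= 1024) && ($idx < $last_parm))
        {
            $number >>= 10;    # integer divide by 1024; the fraction is dropped
            ++$idx;
        }
        # Fixed unit requested but the value is smaller than one unit:
        # report it as 0 in that unit.
        if ($last_parm < 4 && $idx < $last_parm)
        {
            $idx = $last_parm;
            $number = 0;
        }
        print right_fmt(10, $number);
        print "$end_txt[$idx] ";
    }
}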

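# Forward declaration so the recursive call below can be checked against
# the prototype (avoids a "called too early" warning under -w).
sub dumb_treesize($$);

# Print one line per directory, deepest first, and return the total size
# in bytes of the matching files in $dir_name and everything below it.
sub dumb_treesize($$)
{
    my ($dir_name, $wild_card) = @_;
    my $file_sizes = 0;
    my $dir_sizes  = 0;
    # Files in this directory that match the wild card
    foreach (grep(-f $_, glob("$dir_name$wild_card")))
    {
        $file_sizes += (stat($_))[7];
    }
    # Recurse into every subdirectory, whatever the wild card is
    foreach (grep(-d $_, glob("${dir_name}*")))
    {
        $dir_sizes += dumb_treesize("$_\\", $wild_card);
    }
    my $total_size = $file_sizes + $dir_sizes;
    print_size($file_sizes);
    #
    # If you want to print the size of the directories rather
    # than the total size then uncomment the next print_size
    # and comment out the one that follows it.
    #
    #print_size($dir_sizes);
    print_size($total_size);
    print "$dir_name$wild_card\n";
    return $total_size;
}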

sub smart_treesize_dir($)
{
    my $dir_name = shift;
    my $wild_card = '*';
    #
    # Strip off any trailing wild card characters
    #
    if ((not -d $dir_name) && ($dir_name !~ /\\$/) && ($dir_name =~ /^(.+?\\)([^\\]+$)/))
    {
        $dir_name = $1;
        $wild_card = $2 if defined($2);
    }
    if (not -d $dir_name)
    {
        print "Directory $dir_name does not exist\n";
    }
    else
    {
        $dir_name .= '\\' if $dir_name !~ /\\$/;
        print "  This DIR     With Subs   Directory\n";
        print "============ ============ ====================================================\n";
        dumb_treesize($dir_name, $wild_card);
    }
}


foreach (@ARGV) {
    if (/^-(.)/)
    {
        $last_parm = $parm_ending{$1} if defined($parm_ending{$1});
    }
    else
    {
        smart_treesize_dir($_);
        print "\n";
    }
}
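
One thing to watch with the question's own example: in Perls where the built-in glob is implemented by File::Glob, the pattern is split on whitespace, so a path such as c:\My Documents may be read as two separate patterns. A possible way around this is to walk the tree with opendir and readdir instead of glob. The following is only a sketch under that assumption, not the accepted script: tree_size and wildcard_to_regex are hypothetical names, only * and ? are treated as wild cards, and nothing is printed per directory.

```perl
use strict;
use warnings;
use File::Spec;

# Turn a simple wild card such as "*.pl" into an anchored regex.
# Only * and ? are special; matching is case-insensitive, as file
# names are on Win32.
sub wildcard_to_regex {
    my ($wild_card) = @_;
    my $re = join '', map {
        $_ eq '*' ? '.*' : $_ eq '?' ? '.' : quotemeta($_)
    } split //, $wild_card;
    return qr/\A$re\z/i;
}

# Hypothetical stand-in for dumb_treesize: returns
# ($file_sizes, $total_size) in bytes for $dir without printing.
sub tree_size {
    my ($dir, $regex) = @_;
    my ($file_sizes, $dir_sizes) = (0, 0);

    # A directory that cannot be opened is counted as empty.
    opendir(my $dh, $dir) or return (0, 0);
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    foreach my $entry (@entries) {
        my $path = File::Spec->catfile($dir, $entry);
        if (-d $path) {
            # Recurse into every subdirectory, whatever the wild card is.
            my (undef, $sub_total) = tree_size($path, $regex);
            $dir_sizes += $sub_total;
        }
        elsif (-f $path && $entry =~ $regex) {
            $file_sizes += -s $path;
        }
    }
    return ($file_sizes, $file_sizes + $dir_sizes);
}

# Example call with the path from the question; a missing
# directory simply yields two zeros.
my ($here, $with_subs) = tree_size('c:\My Documents', wildcard_to_regex('*'));
print "$here bytes here, $with_subs bytes with subdirectories\n";
```

Because the directory name is passed to opendir as a plain string, spaces in it need no quoting. The two return values correspond to the "This DIR" and "With Subs" columns, so the printing and unit handling from the script above could be layered on top.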
