How do I get a count of the types of files in each subdirectory?

I want to count the number and type of files in a bunch of subdirectories. I have code to count all the files in a single directory but not the types. I also have code to count the types of files but it's an aggregate from all the subdirectories. How can I accomplish both?

For example:
mydirectory/folderA
    400 JPG
      2 TXT
    400 XMP
mydirectory/folderB
    400 CR2
      1 DOC
      2 PDF

#!/bin/bash
# Counts the files under the specified DIR, first by type and then by subdirectory

read -r -p "Which directory to count? " DIR

# This lists the number of files by type (extension), ignoring case
echo "Number of files in all directories by type"
find "$DIR" -type f | rev | cut -d . -f1 | rev | sort -f | uniq -ic | sort -rn

# This lists the number of files in each subdirectory (but not separated by type).
# The sort keeps identical directory names adjacent so uniq -c counts them once.
echo "Number of files in each subdirectory (but not by type)"
find "$DIR" -type f -execdir pwd \; | sort | uniq -c

afrocelt Asked:
 
ozo Commented:
find "$DIR" -type f | perl -lne '!/\.DS_Store/&&m#(.*)/.*?(\.[^.]*)?$#&&$c{$1}{$2}++;END{for(sort keys %c){ print;$d=$c{$_}; printf"%5d %s\n",$d->{$_},$_ for sort {$d->{$b}<=>$d->{$a}||$a cmp $b} keys %{$d}}}'
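As an alternative to filtering inside Perl, find itself can leave those files out. This is only a sketch under assumptions: the throwaway directory and the file names below are made up for illustration. Note one difference: the Perl test /\.DS_Store/ matches that text anywhere in the path, while -name compares the file's base name exactly.

```shell
# Build a small throwaway tree (hypothetical names, for illustration only).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/folderA"
touch "$demo_dir/folderA/one.JPG" "$demo_dir/folderA/two.JPG" "$demo_dir/folderA/.DS_Store"

# "! -name .DS_Store" drops files whose base name is exactly .DS_Store
# before anything is written to the pipe.
kept=$(find "$demo_dir" -type f ! -name .DS_Store | wc -l)
echo "files kept: $kept"

# Clean up the throwaway tree.
rm -rf "$demo_dir"
```

The same find expression can feed the perl command in place of the plain find "$DIR" -type f.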
 
ozo Commented:
find "$DIR" -type f | perl -lne 'm#(.*)/.*?(\.[^.]*)?$#;$c{$1}{$2}++;END{for$d(sort keys %c){ print $d; printf"%5d %s\n",$c{$d}{$_},$_ for sort {$c{$d}{$b}<=>$c{$d}{$a}||$a cmp $b} keys %{$c{$d}}}}'
 
afrocelt (Author) Commented:
Thank you. This works just like I need it to. However, is there a way to not count the .DS_Store files? Can I also ask for some sort of explanation of this code? I want to understand how it works.
 
afrocelt (Author) Commented:
While this worked, I was hoping to learn more about the code. It's pretty obscure.
 
ozo Commented:
find "$DIR" -type f  # find searches the directory tree rooted at each given starting point by evaluating the given expression
                                  #  -type c
                                  #  True if the file is of type c:
                                  #  f       regular file
|               # A series of simple commands joined by  `|'  characters  forms  a pipeline.  The output of each command in a pipeline is connected to the input of the next.
perl -lne #    -l[octnum]
                #        enables automatic line-ending processing.  It has two separate
                #        effects.  First, it automatically chomps $/ (the input record
                #        separator) when used with -n or -p.  Second, it assigns "$\" (the
                #        output record separator) to have the value of octnum; if octnum
                #        is omitted, as here, "$\" gets the current value of $/ (a newline by default)
                #    -n   causes Perl to assume the following loop around your program,
                #      which makes it iterate over filename arguments somewhat like sed
                #      -n or awk (with no filename arguments, as here, it reads standard input):
                #      while (<>) {
                #          ...             # your program goes here
                #      }
                #    -e commandline
                #      may be used to enter one line of program.

!                   # not
/\.DS_Store/   # matches as follows:
                        #   \.                        '.'
                        #  DS_Store                 'DS_Store'
&&                  #    C-style Logical And
                        #      Binary "&&" performs a short-circuit logical AND operation.  That is,
                        #      if the left operand is false, the right operand is not even evaluated.
m#(.*)/.*?(\.[^.]*)?$#    # matches as follows:
                                        #  (                        group and capture to \1:
                                        #  .*                       any character except \n (0 or more times  (matching the most amount possible))
                                        #  )                        end of \1
                                        #   /                        '/'
                                        #    .*?                      any character except \n (0 or more times  (matching the least amount possible))
                                        #    (                        group and capture to \2 (optional (matching the most amount possible)):
                                        #    \.                       '.'
                                        #    [^.]*                    any character except: '.' (0 or more times (matching the most amount possible))
                                        #  )?                       end of \2  
                                        #  $                        before an optional \n, and the end of the string
&&                  #    C-style Logical And
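$c{$1}{$2}++ # $c{$1}      the element of hash %c keyed by $1 (the directory); it holds a reference to an inner hash, created automatically on first use
                        # $c{$1}{$2}  the element of that inner hash keyed by $2 (the extension)
                        # ++   Auto-increment,  "++" and "--" work as in C.
END{      #   An "END" code block is executed as late as possible, that is, after perl has finished running the program (here, after the last input line has been read).
for(        #          for (LIST) BLOCK
               #  The "for(each)" loop is an iterator: it executes the block once for each item in the LIST (with $_ aliased to each item in turn).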
sort      # sort LIST
             # In list context, this sorts the LIST and returns the sorted list value
             #  If SUBNAME or BLOCK is omitted, "sort"s in standard string
             #  comparison order.
keys %c  #  keys HASH
                # Called in list context, returns a list consisting of all the keys of the named hash,
){
print; #      Prints a string or a list of strings
           # If FILEHANDLE is omitted, prints to the last selected output handle.  If LIST is omitted, prints $_ to the currently selected output handle.
$d=$c{$_};  # assign to $d the value stored in %c under the key $_ (a reference to this directory's hash of extension counts)
printf              #  printf FORMAT, LIST
                         # Equivalent to "print FILEHANDLE sprintf(FORMAT, LIST)", except that "$\" (the output record separator) is not appended.
"%5d %s\n",  # %d    a signed integer, in decimal
                        # %s    a string
                         # Between the "%" and the format letter, you may specify several additional attributes controlling the interpretation of the format.
                        # (minimum) width  Arguments are usually formatted to be only as wide as required to display the given value.  You can override the width by putting a number here,
$d->{$_},     #  "->" is an infix dereference operator; $d->{$_} looks up the key $_ in the hash that $d refers to
$_
 for # Statement Modifiers
       # Any simple statement may optionally be followed by a SINGLE modifier,  just before the terminating semicolon (or block ending).
       # for LIST
       #   The "for(each)" modifier is an iterator: it executes the statement once for each item in the LIST (with $_ aliased to each item in turn).
sort   #  sort BLOCK LIST
          # If SUBNAME is specified, it gives the name of a subroutine that returns an integer less than, equal to, or greater than 0, depending on how the elements of the list are  to be ordered.
          # In place of a SUBNAME, you can provide a BLOCK as an anonymous, in-line sort subroutine.
 {$d->{$b}<=>$d->{$a}||$a cmp $b}  
                     #  Binary "<=>" returns -1, 0, or 1 depending on whether the left argument is numerically less than, equal to, or greater than the right argument.
                     #  Binary "||" performs a short-circuit logical OR operation.  That is, if the left operand is true, the right operand is not even evaluated.
                     #  Binary "cmp" returns -1, 0, or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument.
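keys      #  keys HASH
%{$d}}}   #  %{$d} dereferences $d to get the inner hash; the two closing braces end the for block and the END block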
Question has a verified solution.
