*nix quickie - du command - how to compare *nix/Windows directory sizes

Posted on 2007-09-28
Last Modified: 2008-01-09
I uploaded a set of files to my Linux server[1] via FTP, from my Win2K machine.

If I look at the file properties of the parent directory (call it 'Files') under Windows, I get the following info:

  Size: 12,147,595 bytes
  Contains: 3,099 files, 277 folders.

After running the upload (which was unattended) I wanted to check that everything was present and correct, so I ran the following command from the remote 'Files' directory to see the total number of files:

  $ find . -type f | wc -l

This confirmed that the file count was correct (3,099).  I then ran the following command (again, from the 'Files' directory) to get the total size of the uploaded files:

  $ du -b | tail -n1

This size is different from the size reported by Windows.  I expect part of this is because directory entries count as 1024 bytes each, but even after removing 1024 * 277 folders (283,648 bytes) I get a total of 12,183,324 - still more than the 12,147,595 reported by Windows.
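The 1024-byte-per-directory figure can be checked directly; a quick sketch (assuming an ext2/ext3-style filesystem, where a small directory occupies one 1024-byte block):

```shell
# Run from inside the 'Files' directory. ls -ld lists each directory
# entry itself rather than its contents; the size column shows how many
# bytes the directory file occupies (1024, 2048, ... as it fills up).
ls -ld . */
```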

So, two questions:

1) Why is the total file size reported by these two methods different?
2) What is the correct *nix command to get the actual file size of the files only (i.e. the size reported by Windows)?


- Mark

[1] Slackware 11.0.0, Linux version
Question by:FartingUncle
    LVL 15

    Expert Comment


    1) If you use Windows Explorer and right-click a file, you will see two sizes: 'size' and 'size on disk'.  The 'dir' command at the Windows command line gives you the size of the file - but I am not sure how this works out when you run 'dir' on a directory.
    2) ?????

    There are some FTP clients that compare directories.  I think WinSCP (GPL) also has a compare, but just for files in a directory - and I think WinSCP is limited to comparing the time stamps on the files.

    rsync (GPL) can also compare and synchronise directory trees, although it works over SSH or its own protocol rather than FTP.  From the find(1) man page, in case its -size test is useful:

        -size n[bckw]
            File uses n units of space. The units are 512-byte blocks by default or if `b' follows n, bytes if `c' follows n, kilobytes if `k' follows n, or 2-byte words if `w' follows n. The size does not count indirect blocks, but it does count blocks in sparse files that are not actually allocated.

    If you are going to repeat this in an automated way, I would suggest zipping the ~13 MB, transferring the single archive, then unzipping on the server.

    Sorry I could not directly answer your two questions - I thought you might find something useful here; I did genuinely search for an answer.


    I think mc (Midnight Commander) may also have a directory compare function.
    LVL 19

    Expert Comment

    I would try to see if there is a size difference on one or more files.

    You need to generate a listing on your Windows box of all files with their sizes, and then run find to get all your files and sizes, like this:

        find . -type f | xargs ls -l | awk '{print $5,$8}'

    Then run a diff on the two listings to see whether some file was not transferred correctly (maybe because it was open and still in use), or just to see whether the two methods differ in how they report the space used by these files.
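    A concrete way to build the Linux-side half of such a diff (a sketch assuming GNU find; windows_sizes.txt is a hypothetical listing you would generate on the Windows side in the same "path size" format):

```shell
# List each file's relative path and byte size, sorted so the output
# is stable enough to diff against an equivalent Windows-side listing.
find . -type f -printf '%P %s\n' | sort > linux_sizes.txt
# diff linux_sizes.txt windows_sizes.txt    # mismatched files show up here
```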
    LVL 1

    Author Comment

    Hi - thanks for the replies.


    Yes, Windows reports two sizes, and the size I have listed here is the logical file size (not the size on disk, which is about 53MB).

    I am not sure what you are getting at with the 'find -size' excerpt.

    I have re-uploaded the files as a zip file and unzipped onto the server.  I still get a difference, but this time the uploaded size is 12,469,131 bytes, which gives a difference of 321,536.  This is exactly 1024 * 314, so the figure is a lot less random and there is probably a 'neat' answer to q1 - but even allowing 1024 bytes per directory, there are still 37 * 1024 bytes unaccounted for.


    I think it's unlikely that there is a file difference in my newly uploaded set of files, and it would be quite a bit of work to generate matching outputs from both OSes, but thanks for the idea.

    In general, I am surprised that there isn't a simple *nix command that can give the logical size of all files in a sub-directory and which can match the results given by Windows.  I'm sure this kind of check is something that is commonly required?
    LVL 19

    Expert Comment

    Well, du is the command you need.
    If there is a difference, then:
    a) there was a problem in the transfer;
    b) it is counting the space used by directories; or
    c) there is a problem on the Windows side.

    I'm pretty sure du has enough options to get the file size you are looking for.
    LVL 1

    Author Comment

    Well... it's not (a), as I did the second transfer as a zipped file, which wouldn't have unzipped if there had been transfer errors.  It's not wholly explained by (b) - directories are listed as taking 1024 bytes, and there are 277 directories; that still leaves about 38Kb unaccounted for.

    ...and I very much doubt (c), as Google doesn't throw anything up, and I severely doubt that an error like this would not have been spotted and publicised by now!

    As far as I can see, *nix is clearly counting something in this total that Windows is not.  My original questions can therefore be rephrased as follows:

    1) What, apart from the logical file size, does the du command count?  1024 bytes per directory entry is one of the things, but is not the whole story.
    2) How can I get an output from du (or some other command) that omits these extra items?
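    For what it's worth, the two totals can be compared side by side (a sketch assuming GNU du and find, which Slackware ships):

```shell
# du's total: regular files plus the directory entries themselves.
du -sb .
# Files only: sum the byte sizes of every regular file.
find . -type f -printf '%s\n' | awk '{s += $1} END {print s+0}'
```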
    LVL 19

    Expert Comment

    When a directory contains more files, the directory entry itself takes up more bytes.  Did you check that?
    LVL 4

    Expert Comment

    Would the current/parent entries (. and ..) account for any of the extra bytes du reports for each folder and subfolder?
    LVL 1

    Author Comment

    Redimido: sorry, I'm not sure what you mean by that.  Can you expand please?

    Avatech: A single directory entry is 1024 bytes and there are 277 sub-directories (so 278 including the root).  The total discrepancy is 314 * 1024 bytes, so I don't think this is the whole story.
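    The 314-block figure can itself be cross-checked (a sketch using GNU find, run from the 'Files' directory):

```shell
# Sum the sizes of all directory entries (including '.') and report
# the total in bytes and in 1024-byte blocks.
find . -type d -printf '%s\n' | awk '{s += $1} END {print s, s/1024}'
```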

    It appears this question isn't as straightforward as it first appeared, which is a surprise, to be honest!  I thought it would be one of those things that experienced *nix people would know straight away.  Obviously it's not quite so simple, so I'm upping the points.
    LVL 19

    Accepted Solution

    What I'm saying is that if a directory has many entries, it will take more than 1024 bytes - 2048, 4096 bytes, etc.  That could be the difference.  You can see it with:

        find . -type d | xargs ls -l
    LVL 1

    Author Comment

    The required command was

    find . -type d | xargs ls -l -d

    ...the -d switch ensuring that the directory entries themselves were listed, rather than their contents.

    However, you are right!  There are directory entries of higher multiples of 1024... and they account for all the extra space.  So that's q1 answered - the extra space is purely from the directory entries after all.

    Now the second part - is there a command that can give me the disk usage of just the files, not including the directory entries themselves?
    LVL 19

    Assisted Solution

    If du does not have this option - and I don't think it should, since the directories are part of the total - I would suggest a small script:

        #!/bin/sh
        # If you do not specify a directory it will use the current one.
        DIR=.
        [ $# -ne 0 ] && DIR=$1
        printf "Size: "
        cd "$DIR"
        ls -lR | awk '{sum = sum + $5} END {print sum}'
    LVL 1

    Author Comment

    Redimido: That script doesn't work, I'm afraid.  It gives an even higher number than "du -b"!  I guess this is because "ls -lR" lists the directory entries alongside the files, so their sizes are still included - the opposite of what I need!

    However, the "awk" part of your script was enough for me to figure out a command line that works:

      find . -type f | xargs ls -l | awk '{sum = sum + $5} END {print sum}'

    It uses 'find' to get all the files, passes them to 'ls' to get the details, and then 'awk' sums the sizes.  I've added this as an alias in my startup script, so now I can just type 'filesize' to get the result I am after.
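    For reference, one way such an alias might be written in a startup script like ~/.bashrc (the exact line isn't shown here; note the escaped $5 inside the double quotes):

```shell
# Sums the logical sizes of all regular files under the current directory.
# Caveat: filenames containing spaces would need find -print0 / xargs -0.
alias filesize="find . -type f | xargs ls -l | awk '{sum += \$5} END {print sum}'"
```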

    Thanks for your help.  For solving q1 and for providing enough info to figure out q2 I will be awarding you the points.


    - Mark
    LVL 1

    Author Comment

    OK - thanks.  If possible, it might be worth removing these last few posts, as they clutter the answer for any future readers.
