Compare two text files on a single column

Posted on 2014-07-14
Last Modified: 2014-07-28
I've saved the output of two separate runs of 'ls -l' to two text files, file1 and file2.  The listing in file2 is a superset of file1.  I'd like to compare the contents of the two files based on column 7 only (file size) to roughly find the lines in file2 which are not in file1.  I've tried several awk one-liners, but they fail so miserably that it doesn't make sense to post them here.
Thank you,
Question by:97WideGlide

    Expert Comment

    Is filesize the only important value?  To compare entire lines, and only list the ones in file2 but not file1, use:

        comm -13 file1 file2
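
    One thing to watch: comm expects both inputs to be sorted. A rough sketch of the idea, using made-up sample files (the demo_* names and their contents are hypothetical, not from the question):

```shell
# Hypothetical sample data, deliberately unsorted.
printf '%s\n' 'pear' 'apple' > demo_a.txt
printf '%s\n' 'pear' 'cherry' 'apple' > demo_b.txt

# comm needs sorted input, so sort each file into a separate copy first.
sort demo_a.txt > demo_a.sorted
sort demo_b.txt > demo_b.sorted

# -1 hides lines found only in the first file, -3 hides lines common to both,
# which leaves the lines that appear only in the second file.
only_in_b=$(comm -13 demo_a.sorted demo_b.sorted)
printf '%s\n' "$only_in_b"
```

    With this sample data, the only line unique to the second file is "cherry".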

    Author Comment

    Thanks for your response, but my goal is to roughly find duplicate files based on file size, so I would need to compare only column 7 of each file.

    Accepted Solution

    Field 7?  In my "ls -l" output, the size is field 5 - field 7 is the day of the month!
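
    If you're not sure which field holds the size on your system, a quick check is to number the fields of one listing line. This is only a sketch: the sample line below is made up, and real "ls -l" output varies between systems and options.

```shell
# Hypothetical "ls -l" line; substitute a line from your own listing.
line='-rw-r--r-- 1 user group 1234 Jul 14 10:00 notes.txt'

# Print every whitespace-separated field together with its field number.
fields=$(printf '%s\n' "$line" | awk '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$fields"
```

    For this sample line, field 5 is the size (1234) and field 7 is the day of the month (14).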

    Anyway, the following awk script reads in the contents of file1 and stores all of the sizes in an array (larr).  It then reads in file2 line by line, and if the size of the file in file2 doesn't match the size of any file in file1, the entire line is printed out.  The getline result is tested with "> 0" so that the loop ends cleanly if file1 can't be read.
    awk 'BEGIN{j=0; while ((getline l1 < "file1") > 0) {split(l1,l1s); larr[j++] = l1s[5]}}
    {f=0; for (l in larr){if ($5 == larr[l]) f=1}; if (f == 0) print}' file2


    Please change all 5s to 7s if your "ls" is different.  Also change the "file1" and file2 to match your file names ("file1" must be in double quotes like here, file2 doesn't have to be).
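
    For what it's worth, the same comparison is often written with the two-file awk idiom, which uses the sizes as array keys instead of looping over the array. A minimal sketch, assuming the size is in field 5; the sample1.txt and sample2.txt files and their rows are made up for illustration:

```shell
# Hypothetical sample listings; the rows are invented for this example.
cat > sample1.txt <<'EOF'
-rw-r--r-- 1 user group 1024 Jul 14 10:00 a.txt
-rw-r--r-- 1 user group 2048 Jul 14 10:01 b.txt
EOF
cat > sample2.txt <<'EOF'
-rw-r--r-- 1 user group 1024 Jul 14 10:00 a.txt
-rw-r--r-- 1 user group 2048 Jul 14 10:01 b.txt
-rw-r--r-- 1 user group 4096 Jul 14 10:02 c.txt
-rw-r--r-- 1 user group 2048 Jul 14 10:03 d.txt
EOF

# While reading the first file (NR == FNR), remember each size as an array key.
# For the second file, print only the lines whose size was never seen.
result=$(awk 'NR == FNR { seen[$5]; next } !($5 in seen)' sample1.txt sample2.txt)
printf '%s\n' "$result"
```

    With this sample data only the c.txt line is printed, because d.txt shares its size (2048) with b.txt. Dropping the "!" lists the lines in the second file whose size does appear in the first, which is closer to a list of duplicate candidates. Note that the NR == FNR test misbehaves if the first file is empty.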

    Author Closing Comment

    Thank you!
