Solved

Unix diff command to find the minus of two text files

Posted on 2014-03-17
Last Modified: 2014-03-21
I want to print the lines of file 1a that are not present in file 1b. Both files contain one similarly patterned value per line.
Example:
$ cat 1a
1234
3456
4567
$ cat 1b
1234
5566
9999
3456
$ grep -vf 1a 1b
5566
9999


$ grep -vf 1b 1a shows those present in 1a but not in 1b. It works perfectly, but when the files are big (1000K+ records each), the above diff command hangs. Could you suggest a workaround or an alternative solution that might work? Thank you.
Question by:toooki
5 Comments
 
LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 39935760
You could use "comm". The drawback here is that both files must be sorted.

sort 1b > 1b.sort

sort 1a | comm -2 -3 - 1b.sort

will show the records present in file 1a but not in 1b(.sort)
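
For anyone who wants to try this on the sample data from the question, here is a small sketch. The scratch directory and the `only_in_1a` variable are just illustration names, and `LC_ALL=C` is my own addition (not part of the original commands) so that `sort` and `comm` agree on a plain byte-wise ordering:

```shell
# Scratch directory so the sample files do not clobber anything (illustrative).
workdir=$(mktemp -d)

# Sample data copied from the question.
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# Sort both files with the same collation that comm will use.
LC_ALL=C sort "$workdir/1a" > "$workdir/1a.sort"
LC_ALL=C sort "$workdir/1b" > "$workdir/1b.sort"

# -2 suppresses lines found only in 1b.sort, -3 suppresses lines common to both,
# so only the lines unique to 1a.sort remain.
only_in_1a=$(LC_ALL=C comm -2 -3 "$workdir/1a.sort" "$workdir/1b.sort")
printf '%s\n' "$only_in_1a"   # prints 4567
```

Swapping `-2 -3` for `-1 -3` should give the opposite direction (lines only in 1b).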

I don't think that your grep command actually "hangs". Probably it just takes quite a long time to complete.
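
If you would rather stay with grep, one likely reason for the long run time is that `-f` treats every line of the pattern file as a regular expression and matches it anywhere in the line. A hedged sketch using fixed-string, whole-line matching is below; it is often much faster on large pattern files, though I have not timed it on your data. The scratch directory and variable name are illustrative only:

```shell
# Scratch copies of the sample files from the question (illustrative paths).
workdir=$(mktemp -d)
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# -F: treat each line of 1b as a fixed string, not a regular expression
# -x: a pattern must match the whole line, not just a substring
# -v: keep the lines of 1a that match none of the patterns
# -f: read the patterns from the file 1b
only_in_1a=$(grep -vxFf "$workdir/1b" "$workdir/1a")
printf '%s\n' "$only_in_1a"   # prints 4567
```

The `-x` flag also avoids a correctness trap: without it, a pattern such as 123 would knock out the line 1234 as well.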
 
LVL 84

Expert Comment

by:ozo
ID: 39935771
Your example seems to show a grep command, not a diff command.
If we can use other commands, this should work:
 perl -lne '@ARGV?$s{$_}++:$s{$_}||print' 1a 1b
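
As far as I can tell, the one-liner above remembers every line of the first file and then prints the lines of the second file that were not seen, so with the arguments `1a 1b` it should print the lines that are only in 1b (5566 and 9999 for the sample data). To get the lines only in 1a, the file order would be swapped. The same idea can be sketched in awk; the scratch directory and the `seen` array name are my own choices, and this assumes the first file fits in memory and is not empty:

```shell
# Scratch copies of the sample files from the question (illustrative paths).
workdir=$(mktemp -d)
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# While reading the first file (NR == FNR), record each line in a hash.
# For the second file, print only the lines that were never recorded.
only_in_1a=$(awk 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' \
    "$workdir/1b" "$workdir/1a")
printf '%s\n' "$only_in_1a"   # prints 4567
```

Unlike comm, this needs no sorting and keeps the original line order of 1a.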
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 39935789
If you want to use "diff" then this one:

diff 1b 1a | grep "^>"

will show the lines of 1a which are not in 1b, preceded by ">".

This one

diff 1a 1b |grep "^<"

will do the same, but the lines in question will be preceded by "<".

This will remove the prefix:

diff 1a 1b |awk '/^</ {print $2}'
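
Two caveats worth keeping in mind with the diff approach: diff compares lines by position, so on unsorted files it can report a line that exists in both files but in a different place, and `print $2` keeps only the first whitespace-separated word of each line. A minimal sketch that sorts first and strips the prefix with sed instead (scratch paths and the variable name are illustrative, and `LC_ALL=C` is my own assumption for a consistent sort order):

```shell
# Scratch copies of the sample files from the question (illustrative paths).
workdir=$(mktemp -d)
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# Sort both inputs so that diff lines them up by content, not by position.
LC_ALL=C sort "$workdir/1a" > "$workdir/1a.sort"
LC_ALL=C sort "$workdir/1b" > "$workdir/1b.sort"

# Keep only the "< " lines (present in the first file only) and strip the
# two-character prefix, leaving the rest of the line intact.
only_in_1a=$(diff "$workdir/1a.sort" "$workdir/1b.sort" | sed -n 's/^< //p')
printf '%s\n' "$only_in_1a"   # prints 4567
```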
 
LVL 37

Expert Comment

by:Gerwin Jansen
ID: 39938069
Off topic comment deleted.

Gerwin Jansen
EE Topic Advisor
 

Author Comment

by:toooki
ID: 39946613
Many thanks to all.
sort 1b > 1b.sort
sort 1a | comm -2 -3 - 1b.sort

The above worked for me!
