Solved

Unix diff command to find the minus of two text files

Posted on 2014-03-17
Last Modified: 2014-03-21
I want to print the lines of file 1a that are not present in file 1b. In both files, each line holds a single value of a similar pattern.
Example:
$ cat 1a
1234
3456
4567
$ cat 1b
1234
5566
9999
3456
$ grep -vf 1a 1b
5566
9999


$ grep -vf 1b 1a shows the lines present in 1a but not in 1b, and it works perfectly. But when the files are big (1000K+ records each), the above diff command hangs. Could you suggest a workaround or an alternative solution that might work? Thank you.
Question by:toooki
Accepted Solution

by:woolmilkporc
ID: 39935760
You could use "comm". The drawback here is that both files must be sorted.

sort 1b > 1b.sort

sort 1a | comm -2 -3 - 1b.sort

will show the records present in file 1a but not in 1b(.sort)

I don't think that your grep command actually "hangs". Probably it just takes quite a long time to complete.
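If sorting is not an option, a smaller change is to tell grep that the patterns are literal whole lines rather than regular expressions. The sketch below recreates the sample files from the question in a scratch directory (the names workdir and only_in_1a are illustrative, not from the thread) and uses the standard -F and -x options. It has not been timed on files with 1000K+ records, so treat any speedup as an expectation, not a measurement.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
workdir=$(mktemp -d)
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# -F: treat each pattern as a fixed string, not a regular expression
# -x: a pattern only matches if it equals the whole line
# -v: print the lines that match none of the patterns
# -f: read the patterns, one per line, from the given file
only_in_1a=$(grep -Fxvf "$workdir/1b" "$workdir/1a")
echo "$only_in_1a"

rm -rf "$workdir"
```

Besides speed, -x avoids substring matches: without it, a pattern such as 345 in 1b would also suppress the line 3456 in 1a.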
Expert Comment

by:ozo
ID: 39935771
Your example seems to show a grep command, not a diff command.
If we can use other commands, this should work:
 perl -lne '@ARGV?$s{$_}++:$s{$_}||print' 1a 1b
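As written, with the files given in the order 1a 1b, the Perl one-liner remembers every line of 1a in a hash and then prints the lines of 1b that were never seen, which matches the sample output in the question. Swapping the file order gives the lines of 1a that are not in 1b. The same hash-lookup idea can be sketched in awk; the scratch directory and the variable names below are illustrative only.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# NR == FNR holds only while awk reads the first file (1b):
# remember each of its lines, then skip to the next record.
# For the second file (1a), print the lines that were never seen in 1b.
# Caveat: if the first file is empty, NR == FNR also holds for the
# second file and nothing is printed.
only_in_1a=$(awk 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' \
    "$workdir/1b" "$workdir/1a")
echo "$only_in_1a"

rm -rf "$workdir"
```

Neither file has to be sorted and the output keeps the order of 1a, but every distinct line of the first file is held in memory.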
Expert Comment

by:woolmilkporc
ID: 39935789
If you want to use "diff" then this one:

diff 1b 1a | grep "^>"

will show the lines of 1a which are not in 1b, preceded by ">".

This one

diff 1a 1b |grep "^<"

will do the same, but the lines in question will be preceded by "<".

This will remove the prefix:

diff 1a 1b |awk '/^</ {print $2}'
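One caveat with the diff variants: diff compares lines in sequence, so a line that exists in both files but at a different position can still be reported as a difference. A minimal sketch that sorts both inputs first is shown below; the temporary file names are illustrative. It strips the prefix with sed instead of printing $2, so lines containing spaces are kept whole.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 1234 3456 4567 > "$workdir/1a"
printf '%s\n' 1234 5566 9999 3456 > "$workdir/1b"

# Sort both inputs so that common lines line up for diff.
sort "$workdir/1a" > "$workdir/1a.sort"
sort "$workdir/1b" > "$workdir/1b.sort"

# Lines starting with "< " exist only in the first file given to diff.
# sed -n with the p flag prints just those lines, minus the two-character prefix.
only_in_1a=$(diff "$workdir/1a.sort" "$workdir/1b.sort" | sed -n 's/^< //p')
echo "$only_in_1a"

rm -rf "$workdir"
```

Once both files are sorted anyway, the comm approach from the accepted solution does the same job without any prefix to strip.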

Author Comment

by:toooki
ID: 39946613
Many thanks to all.
sort 1b > 1b.sort
sort 1a | comm -2 -3 - 1b.sort

The above worked for me!