Comparing UNIX files

I have two huge text files, each containing around 40 million unique lines of text.

Is there any way of comparing the two files and returning only those lines that appear in one file but not in the other?
stummj Asked:
omarfarid Commented:
You can use commands such as diff, cmp, or comm.

Please check the man pages for these commands on your system.
 
stummj (Author) Commented:
I considered diff, but I think it does a line-by-line comparison, and line 1 in file 1 may match (say) line 50,000 in file 2. So diff is out. I'll check the others, though. Thanks.
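To illustrate the concern about ordering, here is a small sketch assuming a POSIX shell. The file names a.txt and b.txt are hypothetical stand-ins: both hold the same three lines in a different order, and diff still reports them as different.

```shell
#!/bin/sh
# a.txt and b.txt are hypothetical stand-ins for the two real files.
# They contain the same lines, but in a different order.
printf '%s\n' one two three > a.txt
printf '%s\n' three one two > b.txt

# diff exits with status 0 when the files are identical and 1 when they differ.
# Because it compares lines in sequence, reordered lines count as differences.
if diff a.txt b.txt > /dev/null; then
    diff_result="same"
else
    diff_result="different"
fi

echo "diff says: $diff_result"
```

So for files whose matching lines sit at unrelated positions, diff output would mostly reflect the ordering rather than which lines are missing.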
 
stummj (Author) Commented:
comm looks like it will work, but I need to sort the files first.
I'm guessing that using "sort" on a 40-million-line file may take a long time. Is there a better way to sort?
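A minimal sketch of the sort-then-comm approach discussed here, assuming a POSIX shell. The names file1.txt, file2.txt and the output files are hypothetical, and the three-line files stand in for the real 40-million-line ones. Setting LC_ALL=C is an assumption of this sketch: it makes sort and comm agree on plain byte order, and it often speeds up sorting too.

```shell
#!/bin/sh
# Hypothetical file names; tiny stand-ins for the two large files.
printf '%s\n' cherry apple banana > file1.txt
printf '%s\n' banana date apple > file2.txt

# Use byte-order collation so sort and comm agree on the ordering.
LC_ALL=C
export LC_ALL

# comm requires both inputs to be sorted.
sort file1.txt > file1.sorted
sort file2.txt > file2.sorted

# comm prints three columns: lines only in the first file, lines only in
# the second file, and lines in both. -2, -3 and -1 suppress columns.
comm -23 file1.sorted file2.sorted > only_in_file1.txt   # only in file 1
comm -13 file1.sorted file2.sorted > only_in_file2.txt   # only in file 2

echo "Only in file1:"; cat only_in_file1.txt
echo "Only in file2:"; cat only_in_file2.txt
```

Running comm -3 on the two sorted files would print both sets at once, with the second set indented by a tab. Implementations such as GNU sort spill to temporary files when the input does not fit in memory, so sorting a 40-million-line file is usually feasible, though how long it takes depends on the machine.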
 
omarfarid Commented:
I think you need to give it a try :)
 
stummj (Author) Commented:
LOL, I reckon you are right. Thanks, omarfarid. This is a steep learning curve for me. I would usually do this in Oracle, not with UNIX text processing!
Question has a verified solution.
