Solved

what is the same (UNdiff)

Posted on 1997-11-11
Last Modified: 2013-12-06
In Unix, to find out what is different between two files we use "diff <file1> <file2> > <output file>" (optionally piped through grep). But I have the opposite problem: I want to find out what is the same in two files (there must be nothing the same). They are huge files, so diff will give me whole pages of text that will not help me. What I need is a command or script that shows what is the same between two files. Can somebody help me with this, please?
Question by:doron123
19 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 2007685
sort file1 > file1.sort
sort file2 | comm -12 - file1.sort

If there is also nothing repeated within each file,
you could also check
sort file1 file2 | sort -uc
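
A minimal sketch of the comm approach above, wrapped as a shell function. The name common_lines and the use of mktemp for the intermediate file are assumptions for illustration, not part of the original answer.

```shell
# common_lines FILE1 FILE2
# Print the lines that occur in both files.
# The function name and the temp-file handling are illustrative assumptions.
common_lines() {
    tmp=$(mktemp) || return 1
    sort "$1" > "$tmp"
    # comm -12 suppresses columns 1 and 2, leaving only lines common to both inputs.
    sort "$2" | comm -12 - "$tmp"
    rm -f "$tmp"
}
```

Called as `common_lines file1 file2`, it should print nothing when the files share no whole line.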
 

Author Comment

by:doron123
ID: 2007686
Thank you. I'll explain further: I need to know whether any names within the files are the same (there must be NONE!!). comm only works on whole lines; is there anything like comm that can spot names?
 
LVL 84

Expert Comment

by:ozo
ID: 2007687
If names are separated by spaces
tr -s ' ' '\n' < file1 | sort > names.sort
etc.
Or if you can explain how to recognise a name, a simple Perl command could do it.
 

Author Comment

by:doron123
ID: 2007688
I have two files to compare, but the tr command you wrote takes only one. How can I do that? Could you please re-explain the command?
Thanks
 
LVL 84

Expert Comment

by:ozo
ID: 2007689
tr -s ' ' '\n' < file1 | sort > names1.sort
tr -s ' ' '\n' < file2 | sort | comm -12 - names1.sort
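
The two commands above could be combined into one function, roughly like this. The name common_names and the mktemp temp file are assumptions; names are taken to be separated by single or repeated spaces, as in the answer.

```shell
# common_names FILE1 FILE2
# Print every space-separated name that appears in both files.
# Function name and temp-file handling are illustrative assumptions.
common_names() {
    tmp=$(mktemp) || return 1
    # Put one name per line, then sort so that comm can compare the lists.
    tr -s ' ' '\n' < "$1" | sort > "$tmp"
    tr -s ' ' '\n' < "$2" | sort | comm -12 - "$tmp"
    rm -f "$tmp"
}
```

Usage would be `common_names file1 file2`; an empty result means no name is shared.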
 
LVL 51

Accepted Solution

by:
ahoffmann earned 20 total points
ID: 2007690
cat file1 file2 | tr -s ' ' '\012' | sort | uniq -d
 
LVL 84

Expert Comment

by:ozo
ID: 2007691
You can use that if you don't mind also catching repeats within a single file (as I said).
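
If repeats inside one file should not count, one possible variation is to reduce each file to its unique names before combining them. This is only a sketch; the function name shared_names is an assumption.

```shell
# shared_names FILE1 FILE2
# Like cat | tr | sort | uniq -d, but each file is reduced to its unique
# names first, so a name repeated inside only one file is not reported.
shared_names() {
    {
        tr -s ' ' '\n' < "$1" | sort -u
        tr -s ' ' '\n' < "$2" | sort -u
    } | sort | uniq -d
}
```

With a first file containing "a a b" and a second containing "b c", this reports only b, whereas the plain uniq -d pipeline would also report a.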
 
LVL 3

Expert Comment

by:braveheart
ID: 2007692
I like sdiff which prints the files side by side, indicating
which lines are the same, different or which have been added
to one or the other file. For example, if one file contains:
x
a
b
c
d
and the other file contains:
y
a
d
c
sdiff gives the following output:
x    |    y
a         a
b    <
c    <
d         d
     >    c

| means that the line has changed
  means no change
< means added to the first file w.r.t the second file
> means added to the second file w.r.t. the first.


 

Author Comment

by:doron123
ID: 2007693
Thank you, but in this case I'll see only the diffs that are position-oriented!
My problem is that I work with ASICs, and I get huge files in which the names are not always at the same location. Most of the time I get instance names in the middle of a line that are identical (but the rest of the line is not identical).
Could you help me do that,
other than by sorting first and then "undiffing"?

thanks ahead :
===============
Doron Amedey

 
LVL 84

Expert Comment

by:ozo
ID: 2007694
What is a name?  How do you know when a name is in the middle of a line?
If names are delimited by spaces, the tr | sort solution should work.
 

Author Comment

by:doron123
ID: 2007695
The syntax is <name> .<module>/<instance> <names>
and if I must track down instances, then that is where the problem is.
Do you have any solution for this problem?

 
LVL 3

Expert Comment

by:braveheart
ID: 2007696
Oops, the formatting went awry. The output should be
   x   |   y
   a       a
   b   <
   c   <
   d       d
       >   c



 
LVL 3

Expert Comment

by:braveheart
ID: 2007697
Are you saying that your file contains lots of different records
on the same line? If so, you must first separate your records.
Perhaps an awk/nawk/gawk solution would be more appropriate
where you can specify your own field separators. Beware that if
you use large files you should use either nawk or gawk because
awk has poor garbage collection on some systems.
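
As a rough illustration of specifying your own field separators, here is a sketch that pulls out just the instance part, assuming the line layout quoted earlier in the thread (<name> .<module>/<instance> <names>) and no leading spaces on a line. The function name instances is hypothetical.

```shell
# instances FILE
# Print the <instance> part of each line, assuming the layout
#   <name> .<module>/<instance> <names>
# Splitting on runs of spaces or slashes makes <instance> the third field.
instances() {
    awk -F'[ /]+' 'NF >= 3 { print $3 }' "$1"
}
```

The output of `instances file1 | sort` and `instances file2 | sort` could then be fed to comm -12 in the same way as the tr examples.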
 
LVL 51

Expert Comment

by:ahoffmann
ID: 2007698
I understand your problem as follows:
   you have the diff output of 2 files and you want to remove
   the diffs of lines which are identical in <instance>
   means:
       <  name1 .module/instance name1
       --
       <  name2 .module/instance name1
   should be removed
If this is what you want, a perl filter could do it. Give ozo a chance ;-)
 

Author Comment

by:doron123
ID: 2007699
This is what I need, but three things:
1.) the module is not the same in most cases!!
2.) the name1 is not always the same!!
3.) one of those 3 elements is the same in some lines

And I don't want to delete them; I need to change one of the identical names in the netlist to another name.
So all I need to do first is find out which lines in the first file have a value in common with line x (important!!) of file two.

Help me and I'll reward you!!!
Thanks
 

Author Comment

by:doron123
ID: 2007700
Where can I find those awk/nawk/gawk programs?

 

LVL 84

Expert Comment

by:ozo
ID: 2007702
Given the syntax <name> .<module>/<instance> <names>
this should extract the names

awk 'BEGIN { OFS = "\n" } { $2 = ""; print $0 }' <file | grep . | sort
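
To see which lines of the first file share a value with which line of the second, something along these lines might work. It is only a sketch: the function name shared_tokens is hypothetical, every element is assumed to be delimited by spaces or a slash, and both files are assumed to be non-empty.

```shell
# shared_tokens FILE1 FILE2
# For every token on a line of FILE1 that also occurs in FILE2, print:
#   FILE1 line number, FILE2 line number of the first occurrence, token
# Tokens are split on runs of spaces or slashes (an assumption about the format).
shared_tokens() {
    awk -F'[ /]+' '
        # First pass (FILE2): remember the first line on which each token appears.
        NR == FNR {
            for (i = 1; i <= NF; i++)
                if ($i != "" && !($i in seen)) seen[$i] = FNR
            next
        }
        # Second pass (FILE1): report tokens already seen in FILE2.
        {
            for (i = 1; i <= NF; i++)
                if ($i != "" && ($i in seen)) print FNR, seen[$i], $i
        }
    ' "$2" "$1"
}
```

Because no sorting is involved, the line numbers refer to the original files, which should make it easier to find the name that has to be changed in the netlist.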
 
LVL 51

Expert Comment

by:ahoffmann
ID: 2007703
> where can i find those awk/nawk/gawk programs?

awk is usually part of any UNIX: /usr/bin/awk
Some UNIX systems have nawk as well or instead (e.g. HP-UX): /usr/bin/nawk
gawk is GNU's version of awk: ftp://ftp.gnu.org/pub/gnu
  (most UNIX systems deliver it too, e.g. IRIX, HP-UX)
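
A quick way to check which of these are installed on a given machine, as a small sketch (the function name find_awks is made up; `command -v` is the standard shell lookup):

```shell
# find_awks
# Report where awk, nawk and gawk live on the PATH, if they are installed.
find_awks() {
    for prog in awk nawk gawk; do
        if path=$(command -v "$prog"); then
            echo "$prog: $path"
        else
            echo "$prog: not found"
        fi
    done
}
```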

Question has a verified solution.
