Solved

How to compare two binary files?

Posted on 2010-08-16
Last Modified: 2013-12-26
A library is compiled with the Wind River Diab 4.2b compiler, resulting in the file old.a.
The identical library source is compiled with the Wind River Diab 5.8.0.0 compiler, resulting in new.a.

I used the SlickEdit DIFFzilla utility to compare the old.a and new.a binary files.
The files are not actually compared: the first part of old.a is shown against an Imaginary Buffer, then the first part of new.a is shown against an Imaginary Buffer, then the next part of old.a, then the next part of new.a, and so on until the end of both files is reached.

My guess is that the files are not really different; the compilers' output formats somehow differ, which prevents the files from being compared.

What other options do I have to compare these binary files?

Other tools I'm using are Clearcase, Codewright, Unix, Linux.  
0
Question by:naseeam
15 Comments
 
LVL 40

Expert Comment

by:evilrix
ID: 33449266
What is it, exactly, you are trying to achieve?
0
 
LVL 1

Author Comment

by:naseeam
ID: 33450233
I'm trying to compare two binary files using SlickEdit DIFFzilla utility.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 33450240
That I understand, but to what end? What is your ultimate goal here?
0
 
LVL 1

Author Comment

by:naseeam
ID: 33455550
I compile the library source files with one version of the compiler to get the old.a library file.
Then I compile the exact same source files with a newer version of the compiler to get the new.a library file.

My goal is to find out whether the two .a library files are identical. If they are, then I don't need to test the library (built with the new version of the compiler) on the target board.
0
 
LVL 32

Expert Comment

by:phoffric
ID: 33455801
In some environments, two identical compilations of a library or executable still produce different binaries because of embedded date/time stamps. By experimenting, though, we could determine which binaries were built from identical source code.

For example, I just built two executables, a.exe and b.exe, a few seconds apart. I ran diff on od dumps of them, as seen below, and the results show only minor changes. For our configuration management builds, we dropped the dumps into Codewright and got better (easier to read) results than what I am showing below using the freeware WinMerge.
$ od -c a.exe > a.txt
$ od -c b.exe > b.txt
$ diff a.txt b.txt
9c9
< 0000200   P   E  \0  \0   L 001  \t  \0 361 252   j   L  \0   &  \0  \0
---
> 0000200   P   E  \0  \0   L 001  \t  \0 343 252   j   L  \0   &  \0  \0
14c14
< 0000320  \0 240  \0  \0  \0 004  \0  \0   3 317  \0  \0 003  \0  \0 200
---
> 0000320  \0 240  \0  \0  \0 004  \0  \0   % 317  \0  \0 003  \0  \0 200
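
The same check can be sketched in Python without intermediate dump files. In the dump above, the two mismatched bytes sit at octal offsets 0210 and 0330, a few bytes after the "PE" signature, which would be consistent with header fields such as a build timestamp and checksum rather than code. The function names below are hypothetical, and the sketch assumes both files fit in memory.

```python
from itertools import zip_longest

def differing_offsets(a: bytes, b: bytes, limit: int = 20):
    """Hypothetical helper: list up to `limit` (offset, byte_a, byte_b) mismatches.

    A byte is reported as None where one input ends before the other.
    """
    diffs = []
    for offset, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            diffs.append((offset, x, y))
            if len(diffs) >= limit:
                break
    return diffs

def compare_files(path_a, path_b, limit=20):
    """Read both files fully (assumes they fit in memory) and report mismatches."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return differing_offsets(fa.read(), fb.read(), limit)
```

Calling compare_files("a.exe", "b.exe") returns a short list of offsets that can then be checked against the file format to decide whether they are only timestamps or paths.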

[Attached screenshot: binary-diff.PNG]
0
 
LVL 40

Assisted Solution

by:evilrix
evilrix earned 40 total points
ID: 33455893
You can't just binary diff the libraries, because every time you compile them there is a chance that variable content (such as timestamps or paths, as phoffric alludes to) will be included. You need to do something a bit smarter than this.

This might get you going...

To find the dependencies of a dynamic library, just use the ldd command.
http://unixhelp.ed.ac.uk/CGI/man-cgi?ldd+1

You can find out what symbols each library exports using the nm command.
http://unixhelp.ed.ac.uk/CGI/man-cgi?nm
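
As a rough sketch of how the nm output of the two libraries could be compared: the Python below assumes the default "address type name" text layout that GNU and BSD nm print, captured beforehand to text (for example with nm old.a > old.txt). It ignores addresses and compares only symbol type and name. The function names are hypothetical, and an nm that understands the target's object format is assumed.

```python
def parse_nm(text: str):
    """Hypothetical helper: collect (type, name) pairs from `nm` text output."""
    symbols = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:        # "address type name"
            kind, name = parts[1], parts[2]
        elif len(parts) == 2:      # undefined symbols carry no address
            kind, name = parts
        else:
            continue               # blank lines and "member.o:" headers
        if len(kind) == 1:         # symbol types are single letters
            symbols.add((kind, name))
    return symbols

def symbol_changes(old_text: str, new_text: str):
    """Return (symbols only in old, symbols only in new), each sorted."""
    old, new = parse_nm(old_text), parse_nm(new_text)
    return sorted(old - new), sorted(new - old)
```

Matching symbol tables only show that both libraries export and import the same names; they say nothing about whether the generated code is the same.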

0
 
LVL 40

Expert Comment

by:evilrix
ID: 33455911
Oh, and if they are static libraries you can just unarchive them since a static library is nothing more than an archive file with an index table. You can use the ar command to do this.

http://unixhelp.ed.ac.uk/CGI/man-cgi?ar

Once the contents are extracted, you can compare each individual element.
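
A minimal sketch of that member-by-member comparison, done in Python instead of extracting with ar. It assumes the libraries use the common ar layout (an "!<arch>" magic line followed by 60-byte member headers), which the thread does not confirm for Diab output. The function names are hypothetical. Header timestamps and the symbol index are skipped, so only member contents are compared.

```python
import hashlib

AR_MAGIC = b"!<arch>\n"
# Index and name-table members that are not object files (GNU and BSD variants).
SPECIAL = ("/", "//", "/SYM64/")

def ar_members(data: bytes):
    """Hypothetical helper: yield (name, payload) for each archive member."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {pos}")
        name = header[0:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        yield name, data[pos + 60:pos + 60 + size]
        pos += 60 + size + (size % 2)   # members start on even offsets

def member_digests(data: bytes):
    """Map member name -> SHA-256 of its contents, skipping index members."""
    return {
        name: hashlib.sha256(payload).hexdigest()
        for name, payload in ar_members(data)
        if name not in SPECIAL and not name.startswith("__.SYMDEF")
    }

def changed_members(old: bytes, new: bytes):
    """Names of members that are missing on one side or differ in content."""
    a, b = member_digests(old), member_digests(new)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

An empty result from changed_members means every object file is byte-identical. A non-empty result still needs a closer look, since the object files themselves may embed timestamps or paths.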
0
 
LVL 32

Expert Comment

by:phoffric
ID: 33455952
>> (such as timestamps or paths)
True, sometimes Configuration Management was mapped to a different drive.
But that is all we found in the binary differences: timestamps and/or paths.

And it was easy to conclude that, if the comparison was otherwise clean, the remaining 1 MB was built from the same source code.

When comparing libraries, there were many more differences because each object in the library could have a different timestamp. But the process was sound, and absolutely necessary to guarantee that what was being sent to the customer was identical to what had been thoroughly tested in the lab.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 33455977
>> such as timestamps or paths

In the case of a static Unix library, you can add to this the random-access index table for symbol names.

http://unixhelp.ed.ac.uk/CGI/man-cgi?ranlib
0
 
LVL 32

Expert Comment

by:phoffric
ID: 33456008
Each environment was different. That is why we had to experiment to understand exactly what worked for that environment.
0
 
LVL 1

Author Comment

by:naseeam
ID: 33465774
I'll have to wait for my Unix Account before I can try out above solutions.
0
 
LVL 40

Expert Comment

by:evilrix
ID: 33465852
>> I'll have to wait for my Unix Account before I can try out above solutions.

This might help you make progress before then.

"Cygwin is a Linux-like environment for Windows."
http://www.cygwin.com/
0
 
LVL 32

Expert Comment

by:phoffric
ID: 33468954
If you do use Cygwin (I use it), then do not get its X Server:
http://x.cygwin.com/

Instead, get Xming (and don't waste the time I did trying to use the Cygwin X server; I had to get EE help to learn this):
http://sourceforge.net/projects/xming/
0
 
LVL 5

Accepted Solution

by:shajithchandran
shajithchandran earned 460 total points
ID: 33473382
What I would probably do is extract just the text, data, and maybe the loader sections from the libraries and compare them. If they are the same, then the libraries are the same.

After all, during execution, all that matters is the instructions (text section), the initialized data (data section), and how the loader will resolve symbols (loader section). If those are the same, then I believe we can safely conclude that the libraries are the same.

I use dump -s on my Unix machine to extract them.
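
Where dump -s is not available, the section comparison could be sketched in Python. This assumes the object files inside the libraries are ELF files, which the thread does not state, and it uses the section names .text and .data as assumptions. ELF has no section called "loader", so that part of the comparison is not covered. The function names are hypothetical.

```python
import struct

SHT_NOBITS = 8  # sections such as .bss occupy no bytes in the file

def elf_sections(obj: bytes):
    """Hypothetical helper: map section name -> raw bytes for one ELF object."""
    if obj[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = obj[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if obj[5] == 1 else ">"   # EI_DATA: 1 = little-endian, 2 = big-endian
    if is64:
        (shoff,) = struct.unpack_from(end + "Q", obj, 0x28)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", obj, 0x3A)
        fmt = end + "IIQQQQ"
    else:
        (shoff,) = struct.unpack_from(end + "I", obj, 0x20)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", obj, 0x2E)
        fmt = end + "IIIIII"

    def header(i):
        # Fields read: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        name, typ, _, _, off, size = struct.unpack_from(fmt, obj, shoff + i * shentsize)
        return name, typ, off, size

    # The section-name string table is itself a section.
    _, _, str_off, str_size = header(shstrndx)
    strtab = obj[str_off:str_off + str_size]

    sections = {}
    for i in range(shnum):
        name_off, typ, off, size = header(i)
        name = strtab[name_off:strtab.index(b"\0", name_off)].decode("ascii", "replace")
        sections[name] = b"" if typ == SHT_NOBITS else obj[off:off + size]
    return sections

def same_sections(obj_a: bytes, obj_b: bytes, names=(".text", ".data")):
    """True when every named section has identical bytes in both objects."""
    a, b = elf_sections(obj_a), elf_sections(obj_b)
    return all(a.get(n) == b.get(n) for n in names)
```

Each argument to same_sections is the raw content of one object file taken from the library. A section that is absent from both objects counts as equal, so it is worth checking that the expected names actually appear in the result of elf_sections.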
0
 
LVL 1

Author Closing Comment

by:naseeam
ID: 33476005
Excellent solution.  Brilliant Expert!
0
