Solved

Inspecting object code

Posted on 2001-06-24
Last Modified: 2010-04-02
Hi,
I am doing some investigation into program similarity, and I wonder if someone can help.

The two snippets of code below both do essentially the same thing, but one uses a 'for loop', and the other a 'while loop'.


//First Prog
int i;
for (i=0;i<10;i++)
   {
   printf ("%d\n",i);
   }


//Second Prog
i=0;
while (i<10)
   {
   printf ("%d\n",i);
   i++;
   }



What interests me is this: after each program has been compiled to an .obj (presumably before linking takes place), will the .obj files be different, and what are the differences likely to be?

Would it be possible to write a program to parse the .obj files and detect the differences (or in my case the similarities)?

Thanks for any help
Question by:AntBon
10 Comments
 
LVL 7

Expert Comment

by:KangaRoo
ID: 6222580
1) Run the program in the debugger and look at the disassembly
2) Take a close look at the command line tools that come with your compiler. There is likely a command that displays the generated object code
3) Make the compiler generate assembly code, usually with a command line option like -s (Borland) or -S (GCC)
 
LVL 22

Accepted Solution

by:
nietod earned 50 total points
ID: 6222582
>>  will the .obj's be different ?
These are so similar that there is a good chance that the assembly code generated by the compiler (the actual instructions that perform the tasks you wrote in your C++ code) will be the same.  If the assembly code is the same, then the object code will be almost identical.  (It might not be 100% identical because the object code might contain portions that include compile-time info, file names, etc.)  However, there is no guarantee that the assembly will be the same, although it is reasonably likely.  In more complex cases there is a greater chance that the two pieces of assembly code produced would be different.  Also, turning on optimizations will tend to make the two more similar, and turning them off will tend to make them more different.

>> what are the difference's likely to be.
No one can say.  In fact, I don't think a difference is that likely in this case.

>> Would it be possible to write a program parse the .obj's, and detect
>> the difference's (or in my case the similarities) ?
It would be possible, yes, but a tremendous amount of work.
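To give a feel for the simplest possible starting point, here is a rough sketch that scores how similar two byte sequences are, such as the raw contents of two .obj files. It is only an assumption about how one might begin: it does not parse the object file format at all, so things like embedded file names or compile-time info would count as differences. The names lcs_length and similarity are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Length of the longest common subsequence of two byte sequences,
// computed with the classic dynamic-programming recurrence.
// Only two rows of the table are kept at any time.
std::size_t lcs_length(const std::vector<unsigned char>& a,
                       const std::vector<unsigned char>& b)
{
    std::vector<std::size_t> prev(b.size() + 1, 0);
    std::vector<std::size_t> curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            if (a[i - 1] == b[j - 1])
                curr[j] = prev[j - 1] + 1;
            else
                curr[j] = std::max(prev[j], curr[j - 1]);
        }
        std::swap(prev, curr);   // the row just filled becomes "prev"
    }
    return prev[b.size()];
}

// Similarity score in [0, 1]: 2 * LCS / (|a| + |b|).
// Two empty inputs are treated as identical.
double similarity(const std::vector<unsigned char>& a,
                  const std::vector<unsigned char>& b)
{
    if (a.empty() && b.empty())
        return 1.0;
    return 2.0 * static_cast<double>(lcs_length(a, b))
               / static_cast<double>(a.size() + b.size());
}
```

The table costs time proportional to the product of the two lengths, so this is only practical for small inputs. A serious tool would have to understand the object file layout and compare just the code sections, which is where the tremendous amount of work comes in.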

A better solution is to write, compile, and link the two programs and then run them under a debugger that supports disassembly.  Then look at the assembly code that the compiler produces.
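One way to set this up is to put both loops in a single source file as two separate functions, so the two pieces of generated code sit side by side in one listing. This is just a sketch: the function names are invented, and returning i is my addition so that a caller has something to check. It is not part of the original snippets.

```cpp
#include <cstdio>

// First Prog from the question, wrapped in a function.
int count_with_for()
{
    int i;
    for (i = 0; i < 10; i++)
    {
        std::printf("%d\n", i);
    }
    return i;   // added for checking only; always 10 here
}

// Second Prog from the question, wrapped in a function.
int count_with_while()
{
    int i = 0;
    while (i < 10)
    {
        std::printf("%d\n", i);
        i++;
    }
    return i;   // added for checking only; always 10 here
}
```

Since the file has no main, it can be compiled to an object file or an assembly listing but not linked into a program on its own. With optimizations turned on, a compiler might also inline or merge the two functions, so what you see can differ from what you get when each loop is compiled in its own program.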

 
LVL 32

Expert Comment

by:jhance
ID: 6222585
It's very compiler dependent but in general, yes, the code will be different.

Just because you've carefully chosen the initial values and loop counters to behave identically here doesn't make these two constructs identical for all cases.  So the compiler usually doesn't "see" code the way you and I do.  It's really "stupid" and has a really hard time comprehending intent.  We can clearly see that the two code blocks above are going to do the same thing, but that's because we have a higher-level understanding of what the programmer is doing.
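One concrete case where the two constructs are not interchangeable is a loop body that uses continue. The example below is mine, not taken from the question, and the function names are made up. In a for loop, continue jumps to the i++ step. In a while loop with i++ at the end of the body, continue would skip the increment, so the while version has to be restructured.

```cpp
// Sum the odd numbers below limit using a for loop.
int sum_odds_for(int limit)
{
    int sum = 0;
    for (int i = 0; i < limit; i++)
    {
        if (i % 2 == 0)
            continue;   // control goes to i++, so the loop still advances
        sum += i;
    }
    return sum;
}

// While-loop version. Leaving i++ at the end of the body, as in the
// question's second snippet, would loop forever here because continue
// would skip it. The increment therefore happens before the test.
int sum_odds_while(int limit)
{
    int sum = 0;
    int i = 0;
    while (i < limit)
    {
        int current = i;
        i++;
        if (current % 2 == 0)
            continue;   // safe: i has already been incremented
        sum += current;
    }
    return sum;
}
```

Both functions return the same result, but only because the while version was rewritten by hand. A mechanical translation from one loop form to the other would not have preserved the behavior.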


 
LVL 22

Expert Comment

by:nietod
ID: 6222598
For example, in MS VC in a debug compile I got the following results:


For loop.
00401048   mov         dword ptr [ebp-4],0
0040104F   jmp         main+2Ah (0040105a)
00401051   mov         eax,dword ptr [ebp-4]
00401054   add         eax,1
00401057   mov         dword ptr [ebp-4],eax
0040105A   cmp         dword ptr [ebp-4],0Ah
0040105E   jge         main+43h (00401073)
00401060   mov         ecx,dword ptr [ebp-4]
00401063   push        ecx
00401064   push        offset string "%d\n" (0042e01c)
00401069   call        printf (00408170)
0040106E   add         esp,8
00401071   jmp         main+21h (00401051)


While loop
00401048   mov         dword ptr [ebp-4],0
0040104F   cmp         dword ptr [ebp-4],0Ah
00401053   jge         main+41h (00401071)
00401055   mov         eax,dword ptr [ebp-4]
00401058   push        eax
00401059   push        offset string "%d\n" (0042e01c)
0040105E   call        printf (00408170)
00401063   add         esp,8
00401066   mov         ecx,dword ptr [ebp-4]
00401069   add         ecx,1
0040106C   mov         dword ptr [ebp-4],ecx
0040106F   jmp         main+1Fh (0040104f)


In this case the code is slightly different and the while loop code is slightly superior--very slightly (the for loop version needs one extra jmp on entry to skip over the increment).  But as I said, this sort of difference will depend on many factors, like the exact compiler used, whether or not you are optimizing, and the effects of other code in the vicinity of the code in question.




 
LVL 22

Expert Comment

by:nietod
ID: 6222608
When I try this in a release (not debug) version with optimizations, the two algorithms produce exactly the same code, and that code is significantly better than the code above.  But once again, this is not guaranteed.

00401002   xor         esi,esi
00401004   push        esi
00401005   push        40C0A0h
0040100A   call        00403A91
0040100F   add         esp,8
00401012   inc         esi
00401013   cmp         esi,0Ah
00401016   jl          00401004
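Read back into C++, the optimized listing above looks roughly like a do-while loop: the counter lives in a register (esi), and the test has moved to the bottom, presumably because the compiler can see that i < 10 is true on entry. The sketch below is my own reading of the listing, not compiler output. The function name is invented, the return value is added only so the result can be checked, and I am assuming that 40C0A0h is the address of the "%d\n" string and that the call target is printf.

```cpp
#include <cstdio>

int count_rotated()
{
    int i = 0;                    // xor  esi,esi
    do
    {
        std::printf("%d\n", i);   // push esi / push 40C0A0h / call / add esp,8
        i++;                      // inc  esi
    } while (i < 10);             // cmp  esi,0Ah / jl 00401004
    return i;                     // added for checking only
}
```

This form has a single conditional jump per iteration, compared with the compare, conditional jump, and unconditional jump per iteration in the debug listings.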
 
LVL 30

Expert Comment

by:Axter
ID: 6222671
Hi AntBon:
Feel free to click the [Reject Answer] button near the answer poster's response, even if it seems like a good answer.
Doing so will increase your chance of obtaining additional input from other experts.  Later, you can click the [Select Comment as Answer] button on any response.
 

Expert Comment

by:ComTech
ID: 6222700
I will alert PandorMod to look at this, the primary Moderator for this Topic Area.

ComTech
 
LVL 7

Expert Comment

by:KangaRoo
ID: 6225386
There is little to look at. nietod's post came two minutes after mine; he could never have seen my comment and written his in that time.

Meanwhile AntBon can look at our posts. Together we can provide help on the large majority of C++ compilers (besides, they all come with similar tools and options).
 
LVL 22

Expert Comment

by:nietod
ID: 6225465
Actually, I don't think this question is directed to a specific compiler or even this specific code, but more generally to what can happen with different algorithms that produce the same results.  At least I assume so since this case is too specific to be of much value.
 

Author Comment

by:AntBon
ID: 6447510
Thanks v much and sorry for the delay