Inspecting object code

I am doing some investigation into program similarity, and I wonder if someone can help.

The two snippets of code below both do essentially the same thing, but one uses a 'for loop', and the other a 'while loop'.

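//First Prog
int i;
for (i=0;i<10;i++)
   printf ("%d\n",i);

//Second Prog
int i = 0;          // must be initialised, or the loop test reads garbage
while (i<10) {
   printf ("%d\n",i);
   i++;             // without this the loop never terminates
}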

The thing that interests me is this: after each program has been compiled to an .obj (presumably before linkage takes place), will the .obj files be different? And what are the differences likely to be?

Would it be possible to write a program to parse the .obj files and detect the differences (or, in my case, the similarities)?

Thanks for any help
>>  will the .obj's be different ?
These are so similar that there is a good chance that the assembly code generated by the compiler (the actual instructions that perform the tasks you wrote in your C++ code) will be the same. If the assembly code is the same, then the object code will be almost identical. (It might not be 100% identical because the object code might contain portions that include compile-time info, file names, etc.) However, there is no guarantee that the assembly will be the same, although it is reasonably likely. In more complex cases there is a greater chance that the two assembly listings would differ. Also, turning on optimizations will tend to make the two more similar, and turning them off will tend to make them more different.

>> what are the difference's likely to be.
No one can say in general. In fact, I don't think a difference is that likely for this case.

>> Would it be possible to write a program parse the .obj's, and detect
>> the difference's (or in my case the similarities) ?
It would be possible, yes, but a tremendous amount of work.
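
To give a feel for why it is a lot of work, here is a rough sketch of the most naive approach: comparing two files byte by byte. The helper name count_byte_differences is made up for illustration, and this is not a real .obj parser. It knows nothing about the object file format, so embedded file names, timestamps, or a single shifted address would all show up as differences.

```c
#include <stdio.h>

/* Hypothetical helper: counts the byte positions at which two files differ.
 * Positions past the end of the shorter file count as differences too.
 * Returns -1 if either file cannot be opened. */
long count_byte_differences(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    long diffs = 0;
    int ca, cb;

    if (a == NULL || b == NULL) {
        if (a != NULL) fclose(a);
        if (b != NULL) fclose(b);
        return -1;
    }

    for (;;) {
        ca = fgetc(a);
        cb = fgetc(b);
        if (ca == EOF && cb == EOF)
            break;              /* both files exhausted */
        if (ca != cb)
            diffs++;            /* differing byte, or one file is longer */
    }

    fclose(a);
    fclose(b);
    return diffs;
}
```

You would call it as count_byte_differences("for.obj", "while.obj"), where both file names are placeholders. A result of 0 means the files are byte-identical; anything else only tells you that they differ somewhere, not whether the generated instructions differ. Telling those apart means understanding the object format and disassembling the code sections, which is where the real effort goes.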

A better solution is to write, compile and link the two programs and then run them under a debugger that supports disassembly. Then look at the assembly code that the compiler produces.

1) Run the program in the debugger and look at the disassembly
2) Take a close look at the command line tools that come with your compiler. There is likely a command that displays the generated object code
3) Make the compiler generate assembly code, usually with a command line option like -s (Borland) or -S (GCC)
It's very compiler dependent but in general, yes, the code will be different.

Just because you've carefully chosen the initial values and loop counters to behave identically here doesn't make these two constructs identical for all cases. The compiler usually doesn't "see" code the way you and I do. It's really "stupid" and has a really hard time comprehending intent. We can clearly see that the two code blocks above are going to do the same thing, but that's because we have a higher level understanding of what the programmer is doing.
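
One case where the two constructs really do behave differently is continue. The sketch below uses made-up function names (count_odd_for, count_odd_while) purely for illustration. In a for loop, continue still runs the i++ step; in a while loop it jumps straight back to the test, so the increment has to be done by hand first.

```c
#include <stdio.h>

/* for loop: `continue` jumps to the i++ step, so the loop still advances. */
int count_odd_for(int limit)
{
    int i;
    int count = 0;
    for (i = 0; i < limit; i++) {
        if (i % 2 == 0)
            continue;
        count++;
    }
    return count;
}

/* while loop: `continue` jumps straight to the condition, so the
 * increment must happen before it. */
int count_odd_while(int limit)
{
    int i = 0;
    int count = 0;
    while (i < limit) {
        if (i % 2 == 0) {
            i++;        /* without this line the loop would never end */
            continue;
        }
        count++;
        i++;
    }
    return count;
}
```

Both functions return the same count, but only because the while version was adjusted by hand. A straight textual conversion of the for loop into a while loop would hang on the first even number.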

For example, with MS VC in a debug build I got the following results:

For loop.
00401048   mov         dword ptr [ebp-4],0
0040104F   jmp         main+2Ah (0040105a)
00401051   mov         eax,dword ptr [ebp-4]
00401054   add         eax,1
00401057   mov         dword ptr [ebp-4],eax
0040105A   cmp         dword ptr [ebp-4],0Ah
0040105E   jge         main+43h (00401073)
00401060   mov         ecx,dword ptr [ebp-4]
00401063   push        ecx
00401064   push        offset string "%d\n" (0042e01c)
00401069   call        printf (00408170)
0040106E   add         esp,8
00401071   jmp         main+21h (00401051)

While loop
00401048   mov         dword ptr [ebp-4],0
0040104F   cmp         dword ptr [ebp-4],0Ah
00401053   jge         main+41h (00401071)
00401055   mov         eax,dword ptr [ebp-4]
00401058   push        eax
00401059   push        offset string "%d\n" (0042e01c)
0040105E   call        printf (00408170)
00401063   add         esp,8
00401066   mov         ecx,dword ptr [ebp-4]
00401069   add         ecx,1
0040106C   mov         dword ptr [ebp-4],ecx
0040106F   jmp         main+1Fh (0040104f)

In this case the code is slightly different and the while loop code is slightly superior (it avoids the initial jmp over the increment), though only very slightly. But as I said, this sort of difference will depend on many factors, like the exact compiler used, whether or not you are optimizing, and the effects of other code in the vicinity of the code in question.

When I try this in a release (not debug) build with optimizations, the two loops produce exactly the same code, and that code is significantly better than the code above. But once again, this is not guaranteed.

00401002   xor         esi,esi
00401004   push        esi
00401005   push        40C0A0h
0040100A   call        00403A91
0040100F   add         esp,8
00401012   inc         esi
00401013   cmp         esi,0Ah
00401016   jl          00401004
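
Read back into C, the optimized listing above has roughly the shape of a do-while loop: the test sits at the bottom, and the counter lives in a register (esi) instead of on the stack. The following is only my reading of that listing, with a made-up function name, not something the compiler emitted.

```c
#include <stdio.h>

/* Hypothetical helper mirroring the shape of the optimized listing.
 * Prints 0..9 and returns the final counter value. */
int print_bottom_tested(void)
{
    int i = 0;                  /* xor esi,esi */
    do {
        printf("%d\n", i);      /* push esi / push 40C0A0h / call / add esp,8 */
        i++;                    /* inc esi */
    } while (i < 10);           /* cmp esi,0Ah / jl 00401004 */
    return i;
}
```

The compiler can drop the test before the first iteration because it knows at compile time that 0 < 10, which is why both the for and the while version can collapse to this same form.
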
Hi AntBon:
Feel free to click the [Reject Answer] button near the answer poster's response, even if it seems like a good answer.
Doing so will increase your chance of obtaining additional input from other experts.  Later, you can click the [Select Comment as Answer] button on any response.
I will alert PandorMod to look at this, the primary Moderator for this Topic Area.

There is little to look at: nietod's post came two minutes after mine, so he could never have seen my comment and written his in that time.

Meanwhile AntBon can look at our posts; together we can provide help on the large majority of C++ compilers (besides, they all come with similar tools and options).
Actually, I don't think this question is directed to a specific compiler or even this specific code, but more generally to what can happen with different algorithms that produce the same results.  At least I assume so since this case is too specific to be of much value.
AntBon (author) commented:
Thanks v much and sorry for the delay