Reverse Engineering Assembly Code

Based on the attached code, I need to find the value of the compile-time constant CNT, as well as the declaration for a_struct.

So far, I think CNT = 45 since bp->right is located at *bp+184, and a_struct is the element in b_struct right before that. There is also bp->left before a_struct which leaves 180 bytes for a_struct.

Also, I know that a_struct has only 2 fields, idx and x.  

C Code
typedef struct { 
  int left; 
  a_struct a[CNT]; 
  int right;
} b_struct;

void test(int i, b_struct *bp) {
  int n = bp->left + bp->right; 
  a_struct *ap = &bp->a[i]; 
  ap->x[ap->idx] = n;
}


Assembly (AT&T Format)
00000000 <test>:
push	%ebp 
mov	%esp,%ebp 
mov	0x8(%ebp),%eax 
mov	0xc(%ebp),%ecx 
lea	(%eax,%eax,4),%eax 
add	0x4(%ecx,%eax,4),%eax 
mov	0xb8(%ecx),%edx 
add	(%ecx),%edx 
mov	%edx,0x8(%ecx,%eax,4) 
pop	%ebp 
ret

robo_doc Asked:

Kent Olsen (Data Warehouse Architect / DBA) Commented:
Hi Robo,

This may be more difficult than you realize.

If CNT really is a constant (#define CNT xxx) it's not stored anywhere in the object (executable).  You'll need to trace through the assembly code to find a place where it's used and determine the value from the assembly statements.

The struct may be even more difficult, as a struct is simply a way for the compiler to locate (reference) items by address+offset in a consistent manner.  Since that's what most assembly instructions look like, you'll need to go through the assembly code and inspect every location where you believe that the struct is used and rebuild the structure.

That's quite a daunting task, even for long-time C and assembly coders.

Kent
robo_doc (Author) Commented:
This is a homework assignment for an introductory course in computer systems. I also have the machine code. I'm attaching the full text of the assignment.

You are charged with maintaining a large C program and you come across the following
code:

typedef struct { 
  int left; 
  a_struct a[CNT]; 
  int right;
} b_struct;

void test(int i, b_struct *bp) {
  int n = bp->left + bp->right; 
  a_struct *ap = &bp->a[i]; 
  ap->x[ap->idx] = n;
}

The declarations of the compile-time constant CNT and the structure a_struct are in a file for which you do not have the necessary access privileges. Fortunately, you have a copy of the .o version of the code, which you are able to disassemble with the objdump program, yielding the disassembly shown below.

Using your reverse engineering skills, deduce the following. Explain your answers. (Hint: Begin by annotating the disassembled code line by line as we’ve seen for other assembly programs, and keep referring back to the C code.)

A. The value of CNT. 
B. A complete declaration of structure a_struct. Assume that the only fields in this structure are idx and x.

Disassembled Code for test function
00000000 <test>: 
0:	55                       push	%ebp 
1:	89  e5                   mov	%esp,%ebp 
3:	8b  45 08                mov	0x8(%ebp),%eax 
6:	8b  4d 0c                mov	0xc(%ebp),%ecx 
9:	8d  04 80                lea	(%eax,%eax,4),%eax 
c:	03  44 81 04             add	0x4(%ecx,%eax,4),%eax 
10:	8b  91 b8 00 00 00       mov	0xb8(%ecx),%edx 
16:	03  11                   add	(%ecx),%edx 
18:	89  54 81 08             mov	%edx,0x8(%ecx,%eax,4) 
1c:	5d                       pop	%ebp 
1d:	c3                       ret

Kent Olsen (Data Warehouse Architect / DBA) Commented:
Ah.  That's different.  :)

At line 11 of the C program, you add the integers left and right.  These elements are immediately before and immediately after an array of a_struct objects.

If you'll find the place in the assembly code where you load/add these items, you can deduce some things.

 -  The address of left is addr(b_struct), and the address of right is addr(b_struct) + sizeof (left) + sizeof (a_struct) * CNT.

You've already identified this.

Next up is determining these values.  You've identified right as *bp+184.  How did you get there?

robo_doc (Author) Commented:
Line 8 of the disassembled code. I interpreted this as loading the value of *bp + 184  into register %edx. At line 9, there is an add instruction which adds the content of memory address indicated by register %ecx to the value of register %edx. This is where I think the bp->left and bp->right are being added.

0xb8 is hex for 184.
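
The offset arithmetic so far can be written down as a minimal C sketch. The names RIGHT_OFFSET, INT_SIZE and array_bytes are made up for illustration, and the 4-byte int is an assumption about the 32-bit x86 target rather than something read from the object file.

```c
/* Displacement taken from: mov 0xb8(%ecx),%edx */
#define RIGHT_OFFSET 0xb8u

/* ASSUMPTION: int is 4 bytes on the 32-bit x86 target discussed here. */
#define INT_SIZE 4u

/* Bytes occupied by a[CNT], which sits between left and right:
 * everything before right, minus the leading int left. */
static unsigned array_bytes(void) {
    return RIGHT_OFFSET - INT_SIZE;
}
```

This only fixes the product sizeof(a_struct) * CNT; it says nothing yet about how that product splits into the two factors.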
Superdave Commented:
I think you're on the right track with the 184 offset, but the 45 doesn't seem right.  You're allowing 4 bytes for the integer "left", so the array of structures takes 180 bytes.  That much I agree with.  But 45 would mean each structure element is 4 bytes.  We know a structure contains two fields, one of which is a 4-byte integer, and the other is some kind of array, so each structure is more than 4 bytes, but something that divides into 180.  You'll have to figure out the subscripting, which means figuring out what the lea instruction is actually doing.
robo_doc (Author) Commented:
I already figured out that a_struct is 180 bytes large, and element idx of a_struct is a 4 byte integer to the number of elements in the x array of a_struct. Based on this, I know that the declaration for a_struct is

 
typedef struct {
  int *idx;
  int x[];
} a_struct;


Am I right with this declaration?

If this is correct, then CNT would be 45, since the 4 byte idx element leaves 176 bytes left in a_struct, and assuming each element in the array is 4 bytes, that would leave 44 elements. 44+1 = 45.
Kent Olsen (Data Warehouse Architect / DBA) Commented:
Sorry robo,

I'm having connectivity issues.

I agree with Dave.  And we're almost there.

  ap->x[ap->idx] = n;

a_struct has only two fields (idx and x).  idx is an integer since it's used as the subscript into x, and x holds integers since we're storing the integer n there.  Now we need to determine the size of the second object in the structure to determine the size of a_struct.
Superdave Commented:
There are two different arrays involved--don't get them mixed up.  You have an array of a_structs within b_struct, then you have an array of ints x within a_struct.  When you determine the size of the array (based on figuring out how it's doing the subscripting; that is, figuring out what number it's using the lea's to multiply by), then you can figure the subscript of int x[], which you need to specify as part of the structure declaration.  Then also based on the structure size, you can divide it into 180 to know the subscript of the array of a_structs.
robo_doc (Author) Commented:
Line 6 for the disassembled code:

lea (%eax, %eax, 4), %eax

Since argument i is stored in register %eax, I see this instruction as %eax = M[i + 4i], where M[ ] is the address in memory, but I do not know where to take it from there. I annotated the assembly code, I just cannot follow it and locate the things I need to answer.
0
SuperdaveCommented:
lea doesn't reference the memory, it just loads an "address" (which doesn't really have to be an address even, it's just that it uses the memory addressing mode to calculate it).  So what the instruction actually does is %eax = i + 4i.
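
As a minimal sketch of that point, the effective-address arithmetic can be modelled in C. The helper names lea_scaled and lea_at_address_9 are hypothetical; the model covers only the disp(base,index,scale) calculation and deliberately reads no memory.

```c
#include <stdint.h>

/* Hypothetical helper: the arithmetic behind disp(base,index,scale).
 * lea stores this value in the destination register; nothing is loaded. */
static uint32_t lea_scaled(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}

/* lea (%eax,%eax,4),%eax with %eax == i  =>  %eax = i + 4*i = 5*i */
static uint32_t lea_at_address_9(uint32_t i) {
    return lea_scaled(i, i, 4, 0);
}
```

A mov or add with the same operand syntax would use this computed value as an address and access memory there, which is the difference from lea.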
robo_doc (Author) Commented:
So the next line of code translates to %eax = 5i + M[4+ *bp + (5i * 4)]

The next time %eax is mentioned in the code is line 10. I don't know where to go with line 13 of the disassembled code.
Kent Olsen (Data Warehouse Architect / DBA) Commented:
Hi Robo,

Your original estimation that CNT = 45 seems like a necessary calculation, you just need to take it another step.  If right sits 184 bytes from the start of b_struct and left takes the first 4, then a[CNT] occupies 180.  You've deduced that all of the items in a_struct and b_struct are integers.  Given that you're working on a 32-bit architecture, then there are 45 integers in the a array as a whole, spread across the CNT a_struct elements.  (The first item in a_struct is idx; you've shown it as a pointer to an integer, but I believe that it is really an integer.  Either way, the item is 32 bits on a 32-bit architecture.)

Let's apply a little bit of Algebra.

CNT is the number of a_struct objects in b_struct.
Assume that ARR is the number of integer objects in array x within a_struct.
ARR must be > 0.

Then, CNT = 45 / (ARR + 1)

Because we must be dealing with integers, 45 must be evenly divisible by (ARR + 1).

So then, CNT must be 3, 5, 9, or 15.
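
The algebra can be checked with a short enumeration. This is only a sketch: the helper name candidate_pairs is made up, and it assumes a_struct is nothing but 4-byte integers with no padding. It also reports the degenerate pair CNT = 1, ARR = 44, which satisfies the same equation.

```c
#include <stddef.h>

#define TOTAL_INTS 45  /* 180 bytes / 4 bytes per int, as worked out above */

/* Hypothetical helper: writes every (CNT, ARR) pair that satisfies
 * CNT * (ARR + 1) == TOTAL_INTS with ARR > 0, in order of increasing ARR.
 * Returns the number of pairs written (at most max_pairs). */
static size_t candidate_pairs(int *cnt_out, int *arr_out, size_t max_pairs) {
    size_t n = 0;
    for (int arr = 1; arr + 1 <= TOTAL_INTS && n < max_pairs; arr++) {
        if (TOTAL_INTS % (arr + 1) == 0) {
            cnt_out[n] = TOTAL_INTS / (arr + 1);
            arr_out[n] = arr;
            n++;
        }
    }
    return n;
}
```

The equation alone cannot pick one pair; the scale factors in the disassembly are what narrow it down.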

Infinity08 Commented:
It's probably more convenient to obtain the size of a_struct by looking at the line 9 from the disassembly.
Knowing this size makes it trivial to calculate CNT.
Superdave Commented:
I think Infinity means address 9 which is line 6.  Actually you need to look at lines 6 and 7 together because eax is multiplied and multiplied some more.  That lea stuff is a compiler optimization, by the way.  The straightforward way to do it is just to multiply by the structure size, but the lea's are faster or shorter than the multiply instruction.
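
Reading lines 6, 7 and 10 together, the index is scaled by 5 and then by 4, so each a_struct step is 20 bytes. The sketch below compares the byte offsets the assembly computes with those of one candidate declaration. The values CNT = 9 and int x[4] are an assumption that happens to fit the scaling (with 4-byte int and no padding), not something stated in the thread, and the helper names are made up.

```c
#include <stddef.h>

/* ASSUMPTION: one declaration consistent with the disassembly. */
#define CNT 9
typedef struct {
    int idx;
    int x[4];
} a_struct;

typedef struct {
    int left;
    a_struct a[CNT];
    int right;
} b_struct;

/* Offset read by: add 0x4(%ecx,%eax,4),%eax  with %eax == 5*i */
static size_t asm_idx_offset(size_t i) {
    return 4 + (5 * i) * 4;
}

/* Offset written by: mov %edx,0x8(%ecx,%eax,4)  with %eax == 5*i + idx */
static size_t asm_store_offset(size_t i, size_t idx) {
    return 8 + (5 * i + idx) * 4;
}

/* Offsets the C compiler assigns under the assumed declaration. */
static size_t c_idx_offset(size_t i) {
    return offsetof(b_struct, a) + i * sizeof(a_struct)
         + offsetof(a_struct, idx);
}

static size_t c_store_offset(size_t i, size_t idx) {
    return offsetof(b_struct, a) + i * sizeof(a_struct)
         + offsetof(a_struct, x) + idx * sizeof(int);
}
```

If the two sets of offsets agree for every valid i and idx, and offsetof(b_struct, right) comes out as 0xb8, the candidate matches all three memory references in the listing.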
Infinity08 Commented:
>> I think Infinity means address 9 which is line 6.

Yes ;)


>> Actually you need to look at lines 6 and 7

The 7th line adds some context, yes. But, given the information robo_doc has already figured out, the 6th line has all the information needed for what I was referring to ;)


>> The straightforward way to do it is just to multiply by the structure size, but the lea's are faster or shorter than the multiply instruction.

Technically, the lea as it's used here is a way of calculating the size that is closer to reality, than simply multiplying would be ... But that's probably more by chance than by intent heh.
That's getting a bit off topic though heh.


Sorry for the intrusion ... You're already in good hands, robo_doc, so there's actually no need for me to get involved too.
robo_doc (Author) Commented:
No worries. Thank you for all the help.
robo_doc (Author) Commented:
Solution was easy to follow, but it only answered part of my question.