Destructive

Determining the address of a variable/array using GDB?

Hi. This is my first time using the site, but I promise to grade generously and swiftly if your answer can steer me in the right direction.  I am working on a homework assignment that has been asked about here before:
https://www.experts-exchange.com/questions/22967305/buffer-overflow-bomb.html
It's the exact same homework, and that question includes the entire assembly code of the executable if it helps you get a better idea, but I don't think you need to look at it to understand my problem.

I'm at the point where I have to locate the address of a particular array called buf, which holds 12 chars (11 plus the terminating null character).  It is located in the function getbuf().  I've provided the C code and the assembly code of getbuf(), along with my GDB session from my attempt to find the address.  The program is very simple:  you start it, enter an input, and the input is stored in buf.  The end goal is to cause a buffer overflow by entering more than 12 chars so the program does more interesting things.  The problem I have is figuring out the address of the array buf (the base address, so &buf[0]) so I can put it in my input and make the program "ret" (return) to it.  I am overwriting the entries in the array with machine instructions, so the goal is basically to provide the base address of the array as part of my input so the program returns there.

The instructions say that we should use gdb to locate the address of the array.  We should break to getbuf and somehow find the address of buf.

I have searched for many hours, but every buffer overflow tutorial on figuring out the address of buf assumes you have a symbol table available and just need to type x &buf in gdb.  But if I do that in gdb, I get the error "No symbol table loaded".  So I need to find the address some other way...   but how?

If I look at the assembly code of getbuf, I can deduce the following:
The instruction lea -0xc(%ebp),%eax computes the address %ebp - 12 (-0xc), which is where the 12-byte array buf starts, and puts that address in register %eax (the stack space itself was reserved by the earlier sub $0x18,%esp). So I thought if I just look at the value of %eax, which in the gdb session below is 0xbf8e345c, I have found the base address of the array.   %ebp is  0xbf8e3468, according to the gdb session, and if we subtract 12 from that, we get 0xbf8e345c in %eax, as desired, right?

But when I try this (try to ret to 0xbf8e345c), the program complains 0xbf8e345c is an invalid instruction address.  Worse yet, every time I run the program, this address (0xbf8e345c) changes, except for the three least significant hex digits (45c), which seem to stay fixed.  So if I were to run gdb again and look at %eax, it would be 0x?????45c.  Also, %ebp (frame pointer) and %esp (stack pointer) change every time I run gdb and do the exact same thing (same input, but info registers shows different values each time).  I have no idea what I am doing, because if they keep changing to some random address, how can I calculate this address accurately?

so this is where I'm at
C:
int getbuf()
{
    char buf[12];
    /* Read line of text and store in buf */
    Gets(buf); 
    return 1;
}
 
Assembly:
08048f40 <getbuf>:
 8048f40:	55			push   %ebp
 8048f41:	89 e5			mov    %esp,%ebp
 8048f43:	83 ec 18		sub    $0x18,%esp
 8048f46:	8d 45 f4		lea    -0xc(%ebp),%eax
 8048f49:	89 04 24		mov    %eax,(%esp)
 8048f4c:	e8 7f fe ff ff		call   8048dd0 <Gets>
 8048f51:	b8 01 00 00 00		mov    $0x1,%eax
 8048f56:	c9			leave
 8048f57:	c3			ret
 8048f58:	90			nop
 8048f59:	8d b4 26 00 00 00 00	lea    0x0(%esi),%esi
 
GDB:
(gdb) break getbuf // Set up breakpoint to getbuf function after I type in my input
Breakpoint 1, 0x08048f46 in getbuf ()
 
 
(gdb) run < input.txt // Run the program with my input in a file input.txt
Starting program: bufbomb
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
 
 
(gdb) info frame // See what stack looks like?
Stack level 0, frame at 0xbf8e3470:
 eip = 0x8048f46 in getbuf; saved eip 0x8048f7e
 called by frame at 0xbf8e3490
 Arglist at 0xbf8e3468, args: 
 Locals at 0xbf8e3468, Previous frame's sp is 0xbf8e3470
 Saved registers:
  ebp at 0xbf8e3468, eip at 0xbf8e346c
 
 
(gdb) info registers // See what registers look like right now.
eax            0x3      3
ecx            0x0      0
edx            0x6d30b0 7155888
ebx            0x0      0
esp            0xbf8e3450       0xbf8e3450
ebp            0xbf8e3468       0xbf8e3468
esi            0x3      3
edi            0x8744018        141836312
eip            0x8048f46        0x8048f46 <getbuf+6>
eflags         0x286    [ PF SF IF ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51
 
 
(gdb) stepi // Move one more instruction.
0x08048f49 in getbuf ()
 
 
(gdb) info frame // See current stack information.
Stack level 0, frame at 0xbf8e3470:
 eip = 0x8048f49 in getbuf; saved eip 0x8048f7e
 called by frame at 0xbf8e3490
 Arglist at 0xbf8e3468, args: 
 Locals at 0xbf8e3468, Previous frame's sp is 0xbf8e3470
 Saved registers:
  ebp at 0xbf8e3468, eip at 0xbf8e346c
 
 
(gdb) info registers // See register information.  Notice eax should now have address of buf array.
eax            0xbf8e345c       -1081199524
ecx            0x0      0
edx            0x6d30b0 7155888
ebx            0x0      0
esp            0xbf8e3450       0xbf8e3450
ebp            0xbf8e3468       0xbf8e3468
esi            0x3      3
edi            0x8744018        141836312
eip            0x8048f49        0x8048f49 <getbuf+9>
eflags         0x386    [ PF SF TF IF ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51


Infinity08

First of all, allow me to refer to the answers I gave in the other question (the one you posted the link for). It's a good reference in case you're unsure of how to approach this assignment.


>> %ebp is  0xbf8e3468, according to the gdb session, and if we subtract 12 from that, we get 0xbf8e345c in %eax, as desired, right?

And that's indeed the address of the buf.


>> But when I try this (try to ret to 0xbf8e345c), the program complains 0xbf8e345c is an invalid instruction address.

Why do you want to do that? The point of a buffer overflow is to give more input data than the buffer can hold. That extra data will then overwrite other memory used by different parts of the code.
In this case, the buffer is on the stack, so you'll be overwriting whatever comes after the buffer ... Find out what that is, and how you can "exploit" the buffer overflow by overwriting that value with data of your choice.
Destructive

ASKER

Hi Infinity08, thank you so much for your response... I have read your responses in the linked question many times, and they helped me get to the point where the other asker ended up stuck (which is where I am right now).  So I've finished the first two levels he was working on, thanks to you and him, but I'm stuck at this part now.

It helped me to know that you confirmed this is the address of buf.  But basically, this is the syntax I am entering for my input to the program itself to maybe give you a better picture of what my train of thought is:

01 02 03 04 05 06 07 08 09 10 11 12  90 90 90 90 ** ** ** **

So as my input, I enter the format above.  The first 12 bytes correspond to the contents of buf I believe.

Then I need to add a "padding" of four bytes (four 90s, or four no-ops; it doesn't matter what I enter, just four bytes of whatever).  After these 16 bytes, if I enter a valid memory address where those asterisks are located (bytes 17 through 20), the program will actually "ret" (jump) to the address I specify in those four bytes.  This is what I had to do in the first two levels: ret to various functions in the program by supplying their address in those bytes.  I am overwriting the return address normally stored in those ** bytes with a different address, as long as I enter the address in little endian of course.   So in theory, I thought if I did:

01 02 03 04 05 06 07 08 09 10 11 90 90 90 90 90 5c 34 8e bf
^                                                                                |
|<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | From here, ret to where my machine instructions are (01 02 03...)

Where the first 16 bytes are actually machine instructions that I want to execute.
And the last 4 bytes are the address of buf in little endian:  0xbf8e345c

I thought it should "ret" me to 01 02 03...

So I thought if I put the address of buf where those asterisks are, the program should "ret" (jump) to the address of the buffer and execute the machine instructions I am entering (01 02 03...), since those instructions begin at buf[0]?  I mean, if I am not supposed to enter buf's address in the bytes where I am overwriting the return address, how can I get to my machine instructions?  I need to fill bytes 17-20 with an address, and I thought it should be the address of buf. Is this not true?

...and I think I figured out that if I do not enter an input for the program, while  in gdb (just run the program and break to getbuf), the memory addresses of %ebp and %eax do not seem to change every time I run gdb.   it is still 0xbf8e345c like in my gdb session i posted here.  so that's good.

i will try some more options and see what i can do...maybe my machine instructions are the problem, not the address of buffer...

..also thanks for answering my other question about movl
Sorry, for this part in my previous post:


01 02 03 04 05 06 07 08 09 10 11 90 90 90 90 90 5c 34 8e bf
^                                                                                |
|<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | From here, ret to where my machine instructions are (01 02 03...)

I accidentally wrote 01 02 ... 11 90 90 90 90 90 (five 90s).
I meant what I wrote the first time (above that part):  01 02 ... 11 12 90 90 90 90 (four 90s).
OK I re-read what you said again:

"In this case, the buffer is on the stack, so you'll be overwriting whatever comes after the buffer ... Find out what that is, and how you can "exploit" the buffer overflow by overwriting that value with data of your choice."

 I have to admit I don't know what comes after the buffer (which is the first 12 bytes of my input).  I mean, I know I have to add four more bytes of padding before I get to the part where I can modify the return address in getbuf (see lin.  What's occupying those bytes I don't really know.  I guess finding that out should be my first objective before I do anything else?  I know that after those four bytes, I am overwriting the return address in bytes 17-20.
Assembly:
08048f40 <getbuf>:
 8048f40:       55                      push   %ebp
 8048f41:       89 e5                   mov    %esp,%ebp
 8048f43:       83 ec 18                sub    $0x18,%esp
 8048f46:       8d 45 f4                lea    -0xc(%ebp),%eax // Where I am overwriting the first twelve bytes
 8048f49:       89 04 24                mov    %eax,(%esp)
 8048f4c:       e8 7f fe ff ff          call   8048dd0 <Gets>
 8048f51:       b8 01 00 00 00          mov    $0x1,%eax
 8048f56:       c9                      leave
 8048f57:       c3                      ret // Where I am overwriting the return address in bytes 17-20
 8048f58:       90                      nop
 8048f59:       8d b4 26 00 00 00 00    lea    0x0(%esi),%esi

I will try thinking on it some more..thx for your speedy reply
ASKER CERTIFIED SOLUTION
Infinity08

This solution is only available to members.
oh my goodness!  I GOT IT!!  thank you infinity08..

I have no idea what was my problem...but i was getting the wrong address when i was doing gdb for my buffer.  so %eax was giving me different values.. what i did was, just start all over again.  I re-wrote the machine code, got the hex values i needed all over again.  i guess i had some bug unrelated to this when i was running gdb.

but if not for you confirming for me that i was looking in the right place, for the address of buffer i might have strayed for days.. thank you so much!  you are truly an expert on assembly code...

i hope you don't mind, but i will probably ask another question soon about the next level(this is the last level)  tomorrow after i have had some time to think it over.  we will need to do something very similar to this...

Thank you for your diagram on the stack.  i will study it some more.  I will tell you I am not very comfortable with the stack so i appreciate your diagram greatly.   i will need to study it for the next level..because we must be careful when we overwrite the contents after buffer and restore them back to the stack or else it wont work this time.

about the first two levels...basically I did this:
Level 0:  Goal is to get to the address of function smoke.

01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 ** ** ** **
Where bytes 1 through 16 are arbitrary bytes and can be anything (basically padding)
Where the ** refers to the address of function smoke in little endian.
This will ret to smoke as desired.

Level 1:  Goal is to get to the address of function fizz, but also change the parameter in fizz to your cookie
function fizz(int value)
So change value to cookie.  (By default, value is some number that is not your cookie.  I forgot which, but it doesn't matter because we just need to overwrite it with the value of our cookie)

01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 ** ** ** ** xx xx xx xx !! !! !! !!
Where bytes 1 through 16 are arbitrary bytes and can be anything (basically padding)
Where the ** bytes refer to the address of function fizz in little endian (same placement as the address of function smoke previously)
Where the xx bytes refer to four bytes of additional padding (no idea why we need them, just need them)
Where the !! bytes refer to your cookie in little endian (this will replace "value" with your cookie)

thanks again infinity, i really appreciate your extremely timely answers.
Thank you.  You are an expert on assembly.
oh, and of course if i ask another question, it will be asked separately, not in this question.  Thanks again!
Great job :) I'm glad I could be of assistance.

And I look forward to your next question ;) I love these kind of exercises heh.