Solved

Understanding Hex Dump of helloworld.c

Posted on 2005-03-07
Last Modified: 2012-05-05
Hi,

I am trying to understand how to read hex dumps. For simplicity I wanted to start with hello.c. The executable created has 12904 bytes or ~807 lines of 16 bytes. I want to understand the different sections of the hex dump and if there is anything in general you can assume. Please be thorough.

source:

#include <stdio.h>
int main()
{
  printf("Hello World\n");
  return (0);
}

gcc hello.c -o hello

$ gcc --version
gcc (GCC) 3.3.3 (cygwin special)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

thanks! matt
Question by:unityxx311
12 Comments
 
LVL 22

Assisted Solution

by:grg99
grg99 earned 150 total points
ID: 13479843
12K of code is a bit much to comment on in this little white box.

There's probably some startup code that initializes things, then calls main().  main() calls printf(), which writes the string to stdout.  Then main() returns to the startup code, which does some cleaning up and exits.

You're probably better off starting out by looking at the code the compiler generates for hello.c.

Use gcc's -S option to get the compiler's assembly output, then peruse it until it makes some sense.

Or you can use a debugger to look at or step through the code.  Very interesting.  


 

Author Comment

by:unityxx311
ID: 13479947
Yeah I was a little puzzled at the amount of hex produced from just hello.c. I figured most of it was assembly code...
 

Author Comment

by:unityxx311
ID: 13479959
using gcc -S

$ cat hello
        .file   "hello.c"
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section .rdata,"dr"
LC0:
        .ascii "Hello World\12\0"
        .text
.globl _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        andl    $-16, %esp
        movl    $0, %eax
        movl    %eax, -4(%ebp)
        movl    -4(%ebp), %eax
        call    __alloca
        call    ___main
        movl    $LC0, (%esp)
        call    _printf
        movl    $0, %eax
        leave
        ret
        .def    _printf;        .scl    2;      .type   32;     .endef
LVL 22

Expert Comment

by:NovaDenizen
ID: 13480124
grg99's right, in that you are probably better off just starting with the .s files.

gcc -S hello.c
generates this output in hello.s:

        .file   "hello.c"
        .section        .rodata
.LC0:
        .string "Hello World\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        andl    $-16, %esp
        movl    $0, %eax
        subl    %eax, %esp
        movl    $.LC0, (%esp)
        call    printf
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.3.5  (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)"

With my comments:
        .file   "hello.c"
        .section        .rodata        # means 'read-only data'.
.LC0:
        .string "Hello World\n"     # your string constant, stored at label .LC0
        .text                              # it says 'text', but this really stands for 'code'.
.globl main                            # this tells the assembler to publish the symbol 'main' as a global in the symbol table of the .o, so other programs (namely the linker) will be able to find it.  Note that .LC0 is not published.
        .type   main, @function   # this is type information associated with the label, to let the outside world know main is a function and not a variable or anything like that.
main:                                   # this is the actual label for main().
        pushl   %ebp                 # %ebp is the 'frame pointer', and %esp is the 'stack pointer'.  These two lines save the
        movl    %esp, %ebp       # calling routine's frame pointer and set %ebp to the current frame.
        subl    $8, %esp            # reserves 8 bytes on the stack.  The slot at the top of the stack is used below for printf's argument.
        andl    $-16, %esp         # clears the low 4 bits of %esp, aligning the top of the stack to a 16-byte boundary (gcc's preferred stack alignment on x86).
        movl    $0, %eax           # zeroes %eax
        subl    %eax, %esp       # subtracts 0 from %esp, so the stack pointer does not change.  Nothing reads the flags afterwards; this looks like a leftover of unoptimized code generation.
        movl    $.LC0, (%esp)   # puts the address of the "Hello World\n" string on the stack as the first parameter to printf
        call    printf                  # calls printf
        movl    $0, %eax          # zeroes %eax again.  Function return values go into %eax, and printf clobbered it, so this sets up main's 'return (0)'.
        leave                          # reverses those first two %ebp and %esp instructions, to get the frame pointer back to how the calling routine expects it
        ret                              # returns to calling routine
        .size   main, .-main      # specifies the size of the 'main' entry.  '.' is the current output location, and '.-main' is # of bytes from 'main:' to here.
        .section        .note.GNU-stack,"",@progbits     # marks this object as not needing an executable stack
        .ident  "GCC: (GNU) 3.3.5  (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)"   # identifying information for compiler

 
LVL 8

Expert Comment

by:ssnkumar
ID: 13503280
So, it would typically contain the memory layout of your code.

-ssnkumar
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 13507076
Many Unix-like operating systems (Linux, Solaris, and HP-UX, for example) use ELF binaries for object code.

This site has a couple of informative links on it:
http://www.answers.com/topic/executable-and-linkable-format
 
LVL 22

Expert Comment

by:NovaDenizen
ID: 13507095
Actually, the Wikipedia article is probably a little better:
http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
 

Author Comment

by:unityxx311
ID: 13535747
So essentially, to hex edit any executable file you need to know the format?
 
LVL 22

Accepted Solution

by:
NovaDenizen earned 900 total points
ID: 13540849
In general, to make modifications to any kind of file you need to understand the format.

 
LVL 22

Expert Comment

by:NovaDenizen
ID: 13540858
What are you trying to do?
 

Author Comment

by:unityxx311
ID: 13557269
I was trying to determine if there was a standard format for an executable, like the ELF format that someone pointed out. For example, could I look at lines 30-40 and say this is where it does x? I was looking at the hex for just the simple case of hello.c, which had hundreds of lines, and I figured that most of it was overhead, which might be in some standard format for every exe created with gcc.

matt
 
LVL 8

Assisted Solution

by:ssnkumar
ssnkumar earned 450 total points
ID: 13571945
>  I was trying to determine if there was a standard format of an executable
Then this link might be of help to you:
http://www.wotsit.org/search.asp?s=binary

-ssnkumar