Assembly newb with questions on my code

7 comments, last by Goran Milovanovic 11 years, 7 months ago
Hey everyone,

So this is my first time writing assembly code and I'm trying to understand everything I'm doing so far but I need a bit of help. First off, here's what my code looks like:




BITS 32

;ssize_t write(int fd, const void *buf, size_t n);
xor eax, eax ; Make eax zero for our null terminator
xor esp, esp ; Clear the stack (Is this a bad idea?)
push eax ; Push the null terminator to the stack
push 0x7273752F ; /usr
push 0x6E69622F ; /bin
push 0x6465672F ; /ged
push 0x7469 ; it
mov ebx, 1 ; Use stdout
mov eax, 4 ; Move 4 into eax for write call
mov edx, 15 ; length of 15
mov ecx, esp ; push the string into ecx
int 0x80 ; Do the system call

; void _exit(int status);
mov eax, 1 ;Exit system call
mov ebx, 0 ;Status is clean
int 0x80 ;Do the system call


The comments are what I think each line does. I'm basically trying to push "/usr/bin/gedit" to the stack that way I can then move esp into the buffer, ecx. Yes I realize I'm not actually executing gedit, this is just a test for me to see if I can print the path out. I'm having trouble doing this though because when I use hexdump -C on this, my string has an 'h' in between each push I have there. So it would look like this:
"h/usrh/binh/gedit"

Any idea why this is happening, and do I have my logic correct for what each line is doing?

Thanks
h just stands for hex. push 40 and push 40h would have different meanings.
I'm not sure why hexdump prints it like this. If you want to be sure, try printing the stack to a file.
Why use xor to set a register to zero? I mean, why not simply mov eax, 0?

+---------------------------------------------------------------------+
| Game Dev video tutorials -> http://www.youtube.com/goranmilovano |
+---------------------------------------------------------------------+
`xor esp, esp' seems like a horrible idea. Just leave it alone.

I haven't programmed using Linux system calls in assembly, but I doubt very much that the name of the file is expected to be on the stack. You probably need to put just a pointer there.

You can probably write a trivial C program and use gdb and figure out how write() is actually implemented.
1. Modifying ESP directly: Don't zero out ESP! ESP is a *pointer* to stack memory. You want to leave it wherever it was when your process started (the loader will allocate your stack memory for you before your code starts). You'll get one of a variety of hardware exceptions (#GP, #SS, or #PF) if you try to push anything after setting ESP to zero.

The usual things you can do with ESP are:
a. Save and restore it by moving it to/from EBP ("setting up a stack frame")
b. Subtract and add from it (reserving local variables or cleaning up pushed arguments after a cdecl call)
c. Push/pop with it (implied use of ESP).
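To make (a) through (c) concrete, here is a minimal sketch of a routine that touches ESP in only those three ways. The label `example` and the 8 bytes of locals are made up for illustration; nothing here comes from the original code.

```nasm
BITS 32

example:                ; hypothetical routine, for illustration only
    push ebp            ; save the caller's frame pointer
    mov  ebp, esp       ; (a) remember where ESP was on entry
    sub  esp, 8         ; (b) reserve 8 bytes for local variables
    push eax            ; (c) push: ESP decreases by 4, then EAX is stored at [ESP]
    pop  eax            ; (c) pop: the value at [ESP] is loaded, then ESP increases by 4
    mov  esp, ebp       ; (a) restore ESP, which discards the locals
    pop  ebp            ; restore the caller's frame pointer
    ret
```

Note that ESP always ends up pointing at valid stack memory; it is only ever moved relative to where the loader put it.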

2. You can push a string literal onto the stack like you're trying to do, BUT you need to do it in reverse, since each time you push, the stack address *decreases*.
a. Push your string 4 bytes at a time from the end towards the beginning, including zeros for null termination. If your string is not a multiple of 4 bytes, you can use zeros to pad the first pushed value (last portion of the string). Otherwise, your method of pushing a zeroed eax will work.
b. You're already handling the little endian layout of the pushed constants properly, so that's fine.
c. Generally you have to push things in 'reverse' order even with C-style function call parameters (you push the rightmost argument first and work your way left in the argument list). Since it appears the function you're calling uses registers only, this won't matter in this case.

3. Alvaro's worry about the string being in stack memory should be OK, since it looks like your function call is passing the char* correctly in ECX. I haven't used Linux for assembly programming, but this looks a lot like how system calls are made in old-school DOS programming: all of the actual function parameters are passed via registers. If you wanted to do the same thing but call a C library function instead, you would likely need to pass your parameters on the stack, including pushing a pointer to your string onto the stack itself (no calling convention I'm aware of lets you pass strings or arrays without a pointer).
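For comparison, here is a rough sketch of the C library route using the cdecl convention. It assumes a 32-bit libc is installed and that the object file is linked through gcc; the label `msg`, the file name, and the build line are my own assumptions, not something from the original post.

```nasm
; Assumed build: nasm -f elf32 demo.asm -o demo.o && gcc -m32 -no-pie demo.o -o demo
global main
extern write                    ; libc's write(), resolved at link time

section .data
msg: db "/usr/bin/gedit", 10    ; 14 characters plus a newline = 15 bytes

section .text
main:
    push 15                     ; size_t n (rightmost argument is pushed first)
    push msg                    ; const void *buf: a POINTER to the string
    push 1                      ; int fd = stdout
    call write
    add  esp, 12                ; cdecl: the caller removes the three 4-byte arguments
    xor  eax, eax               ; return 0 from main
    ret
```

The string itself lives in the data section here; only its address goes on the stack.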

4. The "h" characters that hexdump shows between your strings are the opcode byte for the "PUSH imm32" instruction (0x68). 0x68 is the lowercase 'h' in ASCII.
a. Each instruction has several different versions which may have a different opcode byte (or multiple bytes!) for each version.
b. PUSH is one of the most common instructions, so Intel reserved a lot of single-byte opcodes for it.
c. "imm32" means the instruction uses an "immediate" operand - a value stored after the opcode, in the code stream itself. imm32 means it's a 32-bit value stored in the code stream.
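Here are the pushes from the original post with the bytes they assemble to in 32-bit mode written next to them. This is only an annotated listing to show where each 'h' comes from:

```nasm
BITS 32

push eax            ; 50               one-byte "PUSH r32" form (0x50 is 'P' in ASCII)
push 0x7273752F     ; 68 2F 75 73 72   opcode 'h', then imm32 "/usr"
push 0x6E69622F     ; 68 2F 62 69 6E   opcode 'h', then imm32 "/bin"
push 0x6465672F     ; 68 2F 67 65 64   opcode 'h', then imm32 "/ged"
push 0x7469         ; 68 69 74 00 00   opcode 'h', then imm32 "it" plus two zero bytes
```

The text hexdump shows is the code stream itself, which is why the opcode bytes appear in between the pieces of the string.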

Why use xor to set a register to zero? I mean, why not simply mov eax, 0?

Self-xor is historically (and still is, usually) faster than moving zero into a register; most compilers do it, and it's more or less an idiom nowadays.
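Size is one easy way to see the difference. A small annotated sketch of the two encodings in 32-bit mode:

```nasm
BITS 32

xor eax, eax        ; 31 C0            2 bytes, and the encoding contains no zero bytes
mov eax, 0          ; B8 00 00 00 00   5 bytes, four of them zero
```

One thing to keep in mind is that xor modifies the flags while mov leaves them alone, so the two are not interchangeable in the middle of a flag-dependent sequence.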

And yes, by setting ESP to zero without saving it, you just destroyed the stack pointer and your stack is gone. In assembly there are some registers you simply should not mess with unless you know what you are doing or really need the extra scratch space. If possible, make use of as few registers as possible (the "nice" ones like EAX, ECX, etc...) and use memory for the rest, and optimize later on.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Thanks for the replies!

Nypyren, am I pushing the string in the correct order right now or do I have it backwards? The push opcode generates the 'h' character; will this affect my string output or will it print out as: "/usr/bin/gedit"?

Lastly, I just want to make sure my comments are correct for why I'm doing these instructions.

You guys have been really helpful!
Here's what will happen in memory during these commands:


Stack: ??...

push eax ; Push the null terminator to the stack

Stack: 00 00 00 00 ??...

push 0x7273752F ; /usr

Stack: 2F 75 73 72 00 00 00 00 ??...

push 0x6E69622F ; /bin

Stack: 2F 62 69 6E 2F 75 73 72 00 00 00 00 ??...

push 0x6465672F ; /ged

Stack: 2F 67 65 64 2F 62 69 6E 2F 75 73 72 00 00 00 00 ??...

push 0x7469 ; it

Stack: 69 74 00 00 2F 67 65 64 2F 62 69 6E 2F 75 73 72 00 00 00 00 ??...

What you actually have in memory on the stack in ASCII representation: "it(null)(null)/ged/bin/usr(null)(null)(null)(null)"


Notice that each time a push occurs, the new data is added to the LEFT side. This is how the stack works on x86 processors.

What you want to do is reverse the order of your pushes, then you should get the string you want pushed properly. Also, you won't need to push eax anymore since the "it" portion of your push includes two free null terminators.
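Sketched out, the reversed pushes would look something like this (same constants as before, only the order changes):

```nasm
BITS 32

push 0x00007469     ; "it" followed by two zero bytes (the terminator)
push 0x6465672F     ; "/ged"
push 0x6E69622F     ; "/bin"
push 0x7273752F     ; "/usr"

; Stack: 2F 75 73 72 2F 62 69 6E 2F 67 65 64 69 74 00 00 ??...
; ESP now points at the '/' of "/usr", the first byte of the string.
```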


The push opcode generates the 'h' character; will this affect my string output or will it print out as: "/usr/bin/gedit"?


The opcodes won't affect your output string. The program itself is stored in a different area of memory than the stack. Your push instructions are essentially copying the values from the memory representing your code to the memory representing the stack, and adjusting the value of ESP.



Lastly, I just want to make sure my comments are correct for why I'm doing these instructions.


Your comments for each individual instruction are generally correct except for two:
"xor esp, esp" - this is NOT clearing the stack. This is the same as setting a pointer to NULL in C. Except this is the special 'stack pointer' register, and setting it to NULL will screw up ALL stack-related operations.

"mov ecx, esp" - it's not "pushing". It's just setting the value of ecx to be equal to esp. Assembly doesn't differentiate between numbers and pointer types, but due to the meaning of the code up to that point, ECX will represent a pointer to the start of your string.
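Putting the pieces together, a corrected version might look like the sketch below. It assumes you build an ELF executable instead of a flat binary; the `_start` entry point, the file name, and the build line are my additions, and I haven't changed anything else about the approach.

```nasm
; Assumed build: nasm -f elf32 gedit.asm -o gedit.o && ld -m elf_i386 gedit.o -o gedit
global _start

section .text
_start:
    ; ssize_t write(int fd, const void *buf, size_t n);
    push 0x00007469     ; "it" plus two zero bytes; pushed first so it ends up last in memory
    push 0x6465672F     ; "/ged"
    push 0x6E69622F     ; "/bin"
    push 0x7273752F     ; "/usr"
    mov  eax, 4         ; system call number for write
    mov  ebx, 1         ; fd = stdout
    mov  ecx, esp       ; buf = address of the string (ESP is copied, nothing is pushed)
    mov  edx, 14        ; n = 14, the length of "/usr/bin/gedit" without the terminator
    int  0x80           ; do the system call

    ; void _exit(int status);
    mov  eax, 1         ; system call number for exit
    xor  ebx, ebx       ; status = 0
    int  0x80           ; do the system call
```

ESP is never assigned directly; it only moves as a side effect of the four pushes.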

Self-xor is historically (and still is, usually) faster than moving zero into a register; most compilers do it, and it's more or less an idiom nowadays.


Ah, I see. Thanks.
