Flambergeman

Teach me assembly


Now that I have fixed my problem with printing things on screen and am beginning to understand assembly in general, I'll post my next (all) doubts in this thread, so things will be more centralized. I will use the GNU assembler (GAS) and C++ here. If there is any problem with this, tell me, moderator! OK, let's go: I have a doubt about this function that I just wrote:
int System::out::print( int p ){
	asm(
//	"sub %esp, 4\n\t"
	"mov %eax, 4\n\t"
	"mov %ebx, 1\n\t"
	"lea %ecx, [%ebp+8]\n\t"
	"mov %edx, 1\n\t"
	"int 0x80\n\t"
//	"add %esp, 4"
	   );
	return 0;
}

This function prints an int on screen. I need to settle two things: (1) Are the commented-out lines above really necessary? The function works the same with and without them. I need more explanation about the stack pointer and its relationship to memory allocation. (2) How do I get the size of a number? Sure, here the number is always 4 bytes long, but if I want to write a second function that receives a class type, I do not know how to calculate the size of that object. Thanks in advance!!

Did you check out PC Assembly Language as Icefox suggested in the other thread?

One trick to learning assembly language is to write up some simple routines in C or C++ and then have the compiler generate the assembler for you.

"sub %esp, 4\n\t" - this makes space on the stack for what in C would be called a local variable

"add %esp, 4" - this cleans it up

I think you've got the operand order reversed. And also, when you use an immediate value in GAS, you need to prefix it appropriately. Iirc, with a $ sign.

"subl $4,%esp\n"
"addl $4,%esp\n"

And in GAS, instructions are sometimes suffixed to indicate the operand size: b for byte, w for word, and l for long (or double word). Here's a reference that might help: AT&T Syntax versus Intel Syntax

You always need to keep the stack balanced between function calls. For every push you must pop, for every sub, a corresponding add. This ensures the function is able to return to the proper address.

To get the size of a structure or a class you have to determine it yourself as you write the code.


Popping after a function is only required if the target function is cdecl. If it's stdcall, you don't have to pop. COM functions and most Win32 library functions are stdcall. LIBC functions are usually cdecl. varargs functions are ALWAYS cdecl (as far as I've ever seen).

A helpful pattern to speed up popping after function calls:

(I'm using VC inline assembly syntax here which is nearly identical to Intel syntax)

push d
push c
push b
push a
call Fn1
add esp, 0x10 // change this depending on your processor


Compilers will further optimize by delaying those "add esp, 0x10" such that they are placed immediately before "joins" in the control flow graph (and where the different predecessors have varying stack offsets), but if you're handwriting assembly I doubt you'll want to bother keeping track of where those should go.


(Function 1 : one argument: "Param0")
(Block 1 : Stack Offset 0)
push esi
push edi
xor eax, eax
mov ax, [esp+0xC] (Param0 = Frame+4 - Stack Offset -8 = +0xC)
test ax, ax
jne L1
(Branch Exit : Stack Offset -8, successors: Blocks 2 and 5)

(Block 2 : Stack Offset -8)
push eax
call Fn2

(Block 3 : Stack Offset -0xC)
push eax // ax returned by first function, otherwise wouldn't be necessary
call Fn3

(Block 4 : Stack Offset -0x10)
(Stack Offset Constraint for Successor == -8, adjusting stack)
add esp, 8

(Block 5 : Stack Offset -8)
L1:

pop edi
pop esi
ret
(Function Exit: Stack Offset 0)


Kind of a pain to keep track of when it gets more complicated.

Some macro assemblers let you define structure data types and should have some kind of "sizeof" equivalent. There are no actual instructions that deal with structure data types like this - the macro assembler or compiler determines the constant size of the struct at compile/assemble time and just sticks the number in the output code.
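
On the C++ side of the original question, the same idea can be sketched with sizeof, which the compiler resolves to a constant before any assembly is emitted. This is only an illustration: `Point`, `kPointSize`, and `byte_count` are hypothetical names, not anything from the thread's code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical class standing in for whatever a second print() might receive.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

// sizeof is evaluated by the compiler; no instruction computes it at run time.
constexpr std::size_t kPointSize = sizeof(Point);

// Generic helper: the byte count of any object, still a compile-time constant.
template <typename T>
constexpr std::size_t byte_count(const T&) {
    return sizeof(T);
}

// Two 4-byte fields with no padding on mainstream x86 ABIs.
static_assert(kPointSize == 8, "Point is expected to occupy 8 bytes");
```

A value like `kPointSize` is what would end up in edx as the length for the write system call, in place of the hard-coded 1.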

[Edited by - Nypyren on April 18, 2007 11:48:21 PM]

Quote:
Original post by Nypyren
Popping after a function is only required if the target function is cdecl. If it's stdcall, you don't have to pop. COM functions and most Win32 library functions are stdcall. LIBC functions are usually cdecl. varargs functions are ALWAYS cdecl (as far as I've ever seen).


Why would the calling convention matter? Taking up stack space is taking up stack space, and you only have X amount.

Quote:
A helpful pattern to speed up popping after function calls:

(I'm using VC inline assembly syntax here which is nearly identical to Intel syntax)

push d
push c
push b
push a
call Fn1
add esp, 0x10 // change this depending on your processor


It would be helpful to explain where the 0x10 comes from and how it would need to change ;) Also, this assumes caller-cleanup of the stack, which AFAIK isn't exactly universal.
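
As far as I can tell, the 0x10 is plain arithmetic: four pushes of 4 bytes each on a 32-bit target. A minimal sketch, assuming every argument occupies exactly one stack slot; `caller_cleanup_bytes` and `slot_size` are names made up for this illustration.

```cpp
#include <cstddef>

// Bytes a cdecl caller removes from the stack after the call returns.
// Assumption: every argument occupies exactly one stack slot, which holds
// for 32-bit ints and pointers on x86-32 (slot_size = 4).
constexpr std::size_t caller_cleanup_bytes(std::size_t arg_count,
                                           std::size_t slot_size = 4) {
    return arg_count * slot_size;
}

// push d / push c / push b / push a  ->  4 slots * 4 bytes = 16 = 0x10
static_assert(caller_cleanup_bytes(4) == 0x10, "matches add esp, 0x10");
```

With two arguments the adjustment would be 8, and with a different slot size the constant scales accordingly. Under stdcall the callee removes the same number of bytes itself, so the caller's add disappears.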

[Edited by - Zahlman on April 19, 2007 9:54:32 PM]

Another thing that should be mentioned is stack alignment. Consider a stack alignment of 16 bytes. Now a function foo() calls a function bar(), so the return address is pushed onto the stack. The result is that in bar() the stack is no longer aligned, so bar() has to adjust the stack pointer.
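
The size of that adjustment can be worked out with a little arithmetic. A rough sketch, assuming a 32-bit stack pointer and a power-of-two alignment; the address 0x1000 and the helper names `align_down` and `realign_adjustment` are hypothetical.

```cpp
#include <cstdint>

// Round a stack pointer value down to a multiple of `alignment`.
// `alignment` must be a power of two; the stack grows toward lower addresses.
constexpr std::uint32_t align_down(std::uint32_t esp, std::uint32_t alignment) {
    return esp & ~(alignment - 1);
}

// How many bytes bar() would subtract from esp to restore the alignment.
constexpr std::uint32_t realign_adjustment(std::uint32_t esp,
                                           std::uint32_t alignment) {
    return esp - align_down(esp, alignment);
}

// foo() runs with esp = 0x1000 (16-byte aligned); `call bar` pushes a 4-byte
// return address, so bar() starts at 0x0FFC and is 12 bytes off.
static_assert(realign_adjustment(0x1000 - 4, 16) == 12, "sub esp, 12 realigns");
```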

Quote:
Original post by LessBread
Did you check out PC Assembly Language as Icefox suggested in the other thread?

I have now read it more carefully and I think I understand. You push values onto the stack to pass parameters to functions, and use sub esp to make room for local variables. OK, I just need to practice now :)

Quote:
Original post by LessBread
One trick to learning assembly language is to write up some simple routines in C or C++ and then have the compiler generate the assembler for you.

Yeah, I'm doing this.

Quote:
Original post by LessBread
"sub %esp, 4\n\t" - this makes space on the stack for what in C would be called a local variable

"add %esp, 4" - this cleans it up

I think you've got the operand order reversed. And also, when you use an immediate value in GAS, you need to prefix it appropriately. Iirc, with a $ sign.

"subl $4,%esp\n"
"addl $4,%esp\n"

And in GAS, instructions are sometimes suffixed to indicate the operand size: b for byte, w for word, and l for long (or double word). Here's a reference that might help: AT&T Syntax versus Intel Syntax

I'm using Intel syntax.

Quote:
Original post by LessBread
You always need to keep the stack balanced between function calls. For every push you must pop, for every sub, a corresponding add. This ensures the function is able to return to the proper address.

I already know. It's just difficult to read the code and notice that a pop, or an add to esp, is missing.

Quote:
Original post by LessBread
To get the size of a structure or a class you have to determine it yourself as you write the code.

I'll read more info on arrays tomorrow.


Nypyren
I'll always use the stdcall convention for my functions. It's faster, and I won't need printf()-like parameters. The number of parameters will always be fixed.

And thinking about how to pass parameters to functions, I decided that I will always use the push op, because my functions can have more than 4 parameters, so using registers to load some of them does not solve the whole problem. So I won't use any registers to pass parameters; I'll use the stack only. Do you agree?
I want things that work really fast here.

nmi
Noted :)

Quote:
Original post by Zahlman
Why would the calling convention matter?

Quote:
Also, this assumes caller-cleanup of the stack, which AFAIK isn't exactly universal.


You answered your own question. Though, the stack stuff in the OP's code doesn't really have anything to do with calling conventions, so that's probably why you said the first part. Only the second part is where calling convention matters.




To OP, I don't think that function is doing what you want. What it does is print one byte at the address of the value of the integer p. If you wanted to print an integer, you would first have to convert it to a string, then print the string.

The parameter that goes into ecx for the system call is the starting address of a string of bytes that will be interpreted as characters. It will not print the number in ecx.
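
The convert-then-print step can be sketched in C++ roughly like this. It is a minimal illustration, assuming a 32-bit int and a POSIX system; `format_decimal` and `print_int` are hypothetical names, and write() stands in for the raw int 0x80 call.

```cpp
#include <cstddef>
#include <unistd.h>   // write(), POSIX

// Writes the decimal text of `value` into `out` (at least 12 bytes for a
// 32-bit int) and returns the number of characters produced.
// No terminating '\0' is added.
std::size_t format_decimal(int value, char* out) {
    char tmp[12];
    std::size_t n = 0;
    // Work on the unsigned magnitude so INT_MIN does not overflow on negation.
    unsigned int magnitude = value < 0 ? 0u - static_cast<unsigned int>(value)
                                       : static_cast<unsigned int>(value);
    // Digits come out least significant first.
    do {
        tmp[n++] = static_cast<char>('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);

    std::size_t len = 0;
    if (value < 0) out[len++] = '-';
    while (n > 0) out[len++] = tmp[--n];   // reverse into the output buffer
    return len;
}

// Hypothetical stand-in for the thread's print(int): format, then write to stdout.
ssize_t print_int(int value) {
    char buf[12];
    std::size_t len = format_decimal(value, buf);
    return write(1, buf, len);   // fd 1 = stdout
}
```

Here the buffer address and the returned length play the roles of ecx and edx in the system call.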

[Edited by - nicksterdomus on April 19, 2007 11:25:54 PM]

Quote:
Original post by Flambergeman
I'll always use the stdcall convention for my functions. It's faster, and I won't need printf()-like parameters. The number of parameters will always be fixed.


That will definitely make life easier! :) Just make sure if you use other people's code that you know whether it's cdecl or stdcall.

Also, I believe lea is the wrong instruction to use in the line
"lea %ecx, [%ebp+8]\n\t"

If I am converting this instruction to AT&T syntax correctly, then this instruction would add the contents of ebp and 8, then put that value into ecx. What you want to do is add the contents of ebp and 8, which forms an address, then load the value stored at that address into ecx. You can use the mov instruction to do that.
"mov %ecx, [%ebp+8]"

Quote:
Original post by nicksterdomus
To OP, I don't think that function is doing what you want. What it does is print one byte at the address of the value of the integer p. If you wanted to print an integer, you would first have to convert it to a string, then print the string.

I know. I'm passing the number 65 as the parameter, so the console prints an 'A'. This int is just for testing for now. I will change the parameters of the out::print() methods to accept classes later.
Quote:
Original post by nicksterdomus
The parameter that goes into ecx for the system call is the starting address of a string of bytes that will be interpreted as characters. It will not print the number in ecx.

I know that too. This is why I use 'lea' and not 'mov'.
Quote:
Original post by nicksterdomus
Also, I believe lea is the wrong instruction to use in the line

"lea %ecx, [%ebp+8]\n\t"

No, lea is correct here. mov doesn't work. I tried it in my previous post.
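
The lea/mov distinction in this exchange can be sketched in C++ terms: the system call wants a pointer to the bytes, so the address of p (lea) is passed, not the value of p (mov). This is only an analogy under the assumption of a little-endian x86 machine and a POSIX system; `first_byte_in_memory` and `print_first_byte` are hypothetical names.

```cpp
#include <unistd.h>   // write(), POSIX

// What `lea ecx, [ebp+8]` hands to the system call: the address of p.
// Reading one byte there yields the lowest byte of p on little-endian x86.
char first_byte_in_memory(const int& p) {
    return *reinterpret_cast<const char*>(&p);
}

// Same effect as the thread's asm: write 1 byte, starting at p's address,
// to stdout. Passing p itself (the `mov` variant) would treat the value 65
// as an address, which is not a valid buffer.
ssize_t print_first_byte(int p) {
    return write(1, &p, 1);   // fd 1 = stdout, length 1
}
```

With p = 65 the byte at that address is 0x41, which the console shows as 'A', matching the behavior described above.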
