AT&T Assembly

9 comments, last by Dave Hunt 18 years, 11 months ago
Hey Guys! I've got a couple of questions about gcc (or rather the compiler that devCpp uses).
1 : Why does it use AT&T? (Can someone tell me?) >:(
2 : I have this code ->

int a = 5;
a += 5;

In Intel assembly (the one I like :), as used in Visual C++) it's -> (let's assume that eax is pushed before & popped afterwards)

int a = 5;

__asm
{
mov eax, dword ptr [a]
add eax, 5
mov dword ptr [a], eax
}

This is how I implemented it in AT&T (obviously it doesn't work; in DevCpp) ->

int a = 5;

asm("movl -4(%ebp), %eax");
asm("addl $5, %eax");
asm("movl %eax, -4(%ebp)");

eax gets the value of 5. When I add 5, eax becomes 0. a doesn't change at all. Can anyone explain? Thanks a lot!
"Take delight in the Lord and He will give you your heart's desires" - Psalm 37:4
GCC uses Gas, the GNU Assembler, to assemble inline assembly code. Gas has historically always used AT&T syntax. I think that's because it's easier to parse and makes it look more like other (legacy) assembly languages. AT&T syntax has been standard in the UNIX world for a very long time.

So the real problem is that, with all due respect, GCC is a dinosaur.
No, no, no... gcc is not a dinosaur; its inline asm is much more powerful than VC's. It's just that the syntax is a huge pain. It's more powerful because you can tell gcc exactly what is getting changed, and it can inline and optimize your asm with the surrounding code.

The way you tell it what to do is with "constraints" and "clobbers"... It's fairly complicated, so here's my 60-second explanation. You have 4 colon-separated fields: the asm statement : output operands : input operands : clobbered memory/registers.

Your example would look something like:
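int a = 5;
// One asm statement: register contents are not guaranteed to survive between separate asm statements.
// Register names need a doubled % (%%eax) once operands are used.
//   statements             outputs   inputs   clobbered registers/memory
asm("movl %1,    %%eax\n\t"
    "addl $5,    %%eax\n\t"
    "movl %%eax, %0"      : "=m"(a) : "r"(a) : "%eax");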


Of course, I've never done x86 assembly with gcc... only PS2 assembly, which names its registers differently. So I can't promise this will work. But you should note that the operands are specified in the asm in reverse order (source first, destination last). The =m is because you're writing to that memory. "r" means toss it in a register before executing the expression...

Here is a little IBM tutorial. Don't let people knock gcc asm too much. I'll agree that the syntax is awful, especially the reversed operands, but it really is the best thing going.
I should also point out that if you use "r"(operand) constraints, you don't need to do the loading yourself... so:

int a = 5;
asm("addl $5,   %0"      : "=a"(a) : "0"(a));


should be all you need. Note that GCC will automatically allocate registers for you, so generally you don't want to name them yourself (but other times you have to or want to). This means that gcc won't have to push/pop eax before your asm block if it was already using it... it can find some other register to hold a.
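
For what it's worth, here is a minimal sketch of the "let gcc pick the register" idea. It assumes an x86 target and a GCC-compatible compiler; the function name add_five is made up for illustration, and the #else branch is only a plain C fallback so the snippet still builds elsewhere. The "+r" constraint marks a as both read and written, in whatever register the compiler chooses.

```c
/* Hypothetical helper: adds 5 to a, leaving the register choice to the compiler. */
static int add_five(int a)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r": a is a read-write operand held in any general-purpose register.
       No register is named, so nothing has to be listed as clobbered. */
    __asm__("addl $5, %0" : "+r"(a));
#else
    a += 5; /* portable fallback for non-x86 targets */
#endif
    return a;
}
```

Calling add_five(5) should give back 10, the same as the a += 5 from the original question.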

This stuff can be pretty hard... and frustrating if you get your constraints wrong. But it's well worth learning in cases where you need it.
Quote:Original post by C0D1F1ED
So the real problem is that, with all due respect, GCC is a dinosaur.


It's more like a crocodile, having survived virtually unchanged for eons. It has adapted over time, but it's still basically a crocodile.
I've also read that the main justification for AT&T syntax is that it's more similar to the assembly syntax used for non-Intel processors.

I find it annoying too. Using the same syntax as the people that designed the processor and wrote the authoritative reference on it seems like a very good thing to do. I don't see any particular reason why all the gcc gobbledygook couldn't be put on Intel syntax.
-Mike
If you want to, and if you have a reasonably recent version of gas/gcc (don't know exactly which versions are relevant), then the following may apply:

From gas's info:

`as' now supports assembly using Intel assembler syntax.
`.intel_syntax' selects Intel mode, and `.att_syntax' switches back to
the usual AT&T mode for compatibility with the output of `gcc'. Either
of these directives may have an optional argument, `prefix', or
`noprefix' specifying whether registers require a `%' prefix.

so you can write inline assembly like this:

.intel_syntax noprefix
mov eax, 5
add eax, 3
.att_syntax prefix

instead of like this:

movl $5, %eax
addl $3, %eax

(of course, all wrapped in __asm__ strings etc.)
note that you can use multiple asm statements per line.
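
A rough sketch of what that wrapping could look like, assuming an x86 target and gcc with gas as the assembler. The function name intel_sum is made up, and the #else branch is just a plain C fallback for other targets. It sticks to a fixed register (eax, bound to the output through the "=a" constraint) so that no %0-style operand has to be substituted into the Intel-syntax text, which I believe is where mixing the two syntaxes tends to get awkward.

```c
/* Hypothetical helper: computes 5 + 3 using Intel syntax inside a GCC asm statement. */
static int intel_sum(void)
{
    int result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__(".intel_syntax noprefix\n\t" /* switch gas to Intel mode */
            "mov eax, 5\n\t"
            "add eax, 3\n\t"
            ".att_syntax prefix"         /* switch back for the compiler's own output */
            : "=a"(result));             /* "=a": the result is left in eax */
#else
    result = 5 + 3; /* portable fallback for non-x86 targets */
#endif
    return result;
}
```

Forgetting the final .att_syntax line is the usual way to break this, since the rest of the compiler's output is still AT&T.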
This is how I write my code:

asm("
	# conditionally move the function address into register $8
	movz $8, %0, %2
	movn $8, %1, %2
	# tell VU0 which register context to use
	ctc2 $8, $vi27
	# wait for ctc latency
	vnop
	vnop
	# call constraint microcode on register context
	vcallmsr $vi27
	"
	: :
	"r"((int)RunConstraint_ctx0 >> 3),
	"r"((int)RunConstraint_ctx1 >> 3),
	"r"(constraint_ctx)
	:
	"$8");


The %0, %1, %2 just refer to the respective values in the operand list.
Is it really necessary to specify whether I want to read or write to the memory/register? Doesn't the destination/source operand "speak" for itself?
Know what I mean?
"Take delight in the Lord and He will give you your heart's desires" - Psalm 37:4
well... yes.

If you're writing to memory, think of it this way: gcc might have already loaded that memory into a register. If you overwrite the memory, gcc's register has a stale value and any operations it does with that will be wrong, and it might even write that register back to memory later, clobbering what you wrote. Intel-style inline asm (as in VC) just assumes this is what you do, so it'll reload all its registers anyway. GCC asm assumes that if you do anything like that, you'll tell it, and it'll optimize accordingly and reload minimally.
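
As a small sketch of "telling it" (x86 and a GCC-compatible compiler assumed; the name store_via_asm is made up, and the #else branch is a plain C fallback): the asm below writes through a pointer, which gcc cannot see from the operands alone, so the "memory" clobber tells it that any value it has cached in a register may be stale afterwards.

```c
/* Hypothetical helper: stores value through p from inside an asm statement. */
static int store_via_asm(int *p, int value)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* No outputs are listed, so the write to *p is invisible to the compiler.
       The "memory" clobber makes it re-read memory after the asm;
       volatile keeps the statement from being removed as dead code. */
    __asm__ volatile("movl %1, (%0)"
                     :
                     : "r"(p), "r"(value)
                     : "memory");
#else
    *p = value; /* portable fallback for non-x86 targets */
#endif
    return *p; /* read back after the asm */
}
```

Without the clobber the compiler would be free to keep using an old copy of *p. When the exact location is known, an "=m" or "+m" operand is the narrower way to say the same thing.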

If you're asking why there are two separate lists for inputs and outputs... well, I don't know. It might have something to do with GCC targeting many different processors and so having to adapt to many different assembly language definitions. x86 is fairly unique (and painful) in that one of the input operands is frequently required to be an output operand as well, while registers are few.

But that's a really crappy explanation, and I'm sure someone knows better.

This topic is closed to new replies.
