Visual Studio inline assembly problems...
Hi there,
I have the following problem. I'm implementing my math library and I've decided to implement the matrix functions using the SSE instruction set.
Everything works correctly as long as I don't use the Visual Studio "Optimize" option in the release build.
Here is the problem, summarized:
Matrix4 Matrix4::operator * ( const Matrix4& m2 )
{
Matrix4 Ret;
__asm
{
....
....
Do the multiplication using sse instructions...
....
put the result in "Ret"
}
return Ret;
}
If I don't "Optimize", the asm code correctly stores the results in the C++ variable "Ret", and the correct result is returned.
IF I USE THE OPTIMIZE option of Visual Studio, the asm code updates one memory region, but the function returns a different memory region!!!
For example in the asm code using the instruction:
lea eax, Ret
eax has a value of 0x1234.
After the asm block, if I set a breakpoint, the C++ "Ret" variable has the correct values stored in it (the ones that the asm block set).
BUT the returned value (the matrix returned from the function) has incorrect values...
It's impossible that at the end of the function Ret is correct but outside the function the Returned matrix is incorrect...
Thanks a lot
Dave
Ah!!! I forgot an important thing that makes me suspect a bug in Visual Studio...
If, before the _asm block, I create a Matrix4 pointer "pRet" to the local variable "Ret" and in the _asm block I update the variable via that pointer, then the function magically returns the correct result... Very, very strange.
There must be a logical explanation, no?
I don't know the correct answer to your question. If I had to guess, I would say that the compiler doesn't know that the Ret variable is used in the assembly and optimizes it out. What you are seeing as the "correct" result in memory is probably corrupting some other data. Maybe that corruption has no side effects because it hits unused stack memory.
But may I suggest using intrinsics instead of assembly?
The compiler is a LOT MORE friendly with SSE intrinsics than it will ever be with inline assembly. Intrinsics also make it a lot easier for the compiler to use the resources at its disposal (i.e. which registers are free before and after the function) and to make inlining as easy as possible.
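To make the suggestion concrete, here is a minimal sketch of a 4x4 multiply written with SSE intrinsics. The Matrix4 layout (row-major, 16 floats, 16-byte aligned) and the free function name multiply are assumptions for illustration, not the original poster's class. On an older MSVC you would probably need __declspec(align(16)) instead of alignas(16).

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128 and the _mm_* functions

// Hypothetical stand-in for the poster's Matrix4: row-major, 16-byte aligned
// so that the aligned load/store intrinsics below are valid.
struct alignas(16) Matrix4 {
    float m[4][4];
};

// Computes a * b. Row i of the result is the sum of the rows of b, each
// scaled by the matching element of row i of a.
inline Matrix4 multiply(const Matrix4& a, const Matrix4& b) {
    Matrix4 ret;

    // Load the four rows of b once; they are reused for every result row.
    const __m128 b0 = _mm_load_ps(b.m[0]);
    const __m128 b1 = _mm_load_ps(b.m[1]);
    const __m128 b2 = _mm_load_ps(b.m[2]);
    const __m128 b3 = _mm_load_ps(b.m[3]);

    for (int i = 0; i < 4; ++i) {
        // Broadcast each element of row i of a and accumulate the scaled rows.
        __m128 r = _mm_mul_ps(_mm_set1_ps(a.m[i][0]), b0);
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a.m[i][1]), b1));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a.m[i][2]), b2));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a.m[i][3]), b3));
        _mm_store_ps(ret.m[i], r);
    }
    return ret;
}
```

Because ret is only ever touched through ordinary C++ expressions, the optimizer knows exactly where it lives and is free to inline the function or build the result directly in the caller's storage.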
Good luck
Quote:Original post by dawix
It's impossible that at the end of the function Ret is correct but outside the function the Returned matrix is incorrect...
99% says there is a bug in the code.
1% says the compiler is broken.
Returning the matrix is not such a trivial operation. It involves a memcpy, possibly invoking a copy constructor or assignment operator.
Quote:IF I USE THE OPTIMIZE option of Visual Studio in the asm code is updated a memory region, but the function returns a different memory region!!!
Alignment? Aliasing? Calling convention?
Quote:There must be a logical explanation no?
Yea, it has to do with lines 89-91 and the statement in line 114.
Seriously - what do you expect? The best anyone can answer is:
- There is a bug in your code
- There is a bug in compiler
Inline assembly is generally a bad idea because it can mess with the optimizer. You should either write the entire routine in assembly or use compiler intrinsics.
In this case, since it's a class, you should probably go with compiler intrinsics, because if you want the class methods inlined you would have to write the entire class in assembly (I don't think MSVC can inline functions from object files that have been generated separately by an assembler).
Quote:Alignment? Aliasing? Calling convention?
RVO ?
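For anyone wondering what RVO would have to do with it: when a function returns a class by value, an optimizing compiler may construct the named local directly in the storage the caller provided for the result, so "Ret" is no longer an ordinary stack slot of the function. The sketch below only demonstrates that effect in plain C++; whether it is what confuses the inline asm here is a guess. Matrix4, make_matrix and returned_in_place are hypothetical names, and the elision is permitted but not guaranteed, so the outcome depends on compiler and flags.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for the poster's Matrix4.
struct Matrix4 {
    float m[16];
};

// Address the named local had inside make_matrix(), kept as an integer so it
// can still be compared after the local's lifetime has ended.
static std::uintptr_t g_local_addr = 0;

Matrix4 make_matrix() {
    Matrix4 ret = {};
    ret.m[0] = 1.0f;
    g_local_addr = reinterpret_cast<std::uintptr_t>(&ret);
    return ret;  // candidate for named return value optimization
}

// Returns true when the local was built directly in the caller's object,
// false when it lived elsewhere and was copied out on return.
bool returned_in_place() {
    Matrix4 result = make_matrix();
    const bool same =
        (g_local_addr == reinterpret_cast<std::uintptr_t>(&result));
    std::printf("local %s the returned object\n",
                same ? "shares storage with" : "was copied into");
    return same;
}
```

Comparing the output of an unoptimized and an optimized build shows whether the compiler moved the local into the caller's storage.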
dawix> I'd say, like the others, that you really should prefer intrinsics to pure inline assembly. Or, if you don't care about inlining and you're not happy with Visual Studio's code generation, you have the option to write your whole function in assembly, in an external .asm file with a custom build step.
(Unfortunately, bad asm code generation from intrinsics happens pretty often, even in VC2010, but chances are that most of the time either it won't make enough of a difference to be worthwhile, or the flexibility intrinsics give the compiler will outweigh the crappy asm.)
[Edited by - momotte on May 8, 2010 1:32:31 AM]
Calling convention: _cdecl; alignment: I don't know.
At this point, after doing some tests, I'm starting to think that I don't know the assembly syntax, because what happens is not possible at all.
Can someone give me an explanation for this:
Matrix4 Matrix4::operator * ( const Matrix4& m2 )
{
Matrix4 Ret;
Matrix4* pRet = &Ret;
_asm
{
lea esi, Ret // should be the address of Ret; in this call it is 0x31F3E8
mov esi, pRet // should be the same address as above, but it is 0x31F5E0
}
Now the top of the top: at the end of the asm block I try to do this:
Matrix4 temp = Ret;
Single-stepping in the disassembly, the function clearly passes the address 0x31F5E0 to the copy constructor.
This means that if in the asm block I update using the address obtained via LEA esi, Ret, things go wrong; if I update using the address obtained via MOV esi, pRet, things are right.
NOW the MAGIC QUESTION: these two addresses should be the same, no?