For 32-bit Windows
To make full use of available SSE/SSE2 functionality, passing __m128 values (which implements XMVECTOR on that platform) to an inline routine, you need to do the following:
- Use the __fastcall calling conventions to pass the first three __m128 values (XMVECTOR instances) as arguments to a function in a SSE/SSE2 register.
- Pass on the stack any remaining __m128 values passed as arguments to a function.
The following are example declarations that illustrate this convention:
XMMATRIX XMMatrixLookAtLH(FXMVECTOR EyePosition, FXMVECTOR FocusPosition, FXMVECTOR UpDirection);
XMMATRIX XMMatrixTransformation2D(FXMVECTOR ScalingOrigin, FLOAT ScalingOrientation, FXMVECTOR Scaling, FXMVECTOR RotationOrigin, FLOAT Rotation, CXMVECTOR Translation);
To support these calling conventions, the FXMVECTOR and CXMVECTOR aliases are defined as follows:
For 32-bit Windows
typedef const XMVECTOR FXMVECTOR;
typedef const XMVECTOR& CXMVECTOR;
[/quote]
Question: Why aren't the functions declared with the __fastcall convention, e.g. XMMATRIX __fastcall XMMatrixLookAtLH(...)? Is the explicit annotation unnecessary because this happens automatically, or do they assume that the global compiler settings make __fastcall the default? And if I write my own function similar to one in the XNA Math library, how do I make sure that the first three arguments are passed in registers?
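To make the question concrete, here is a rough sketch of the kind of function I have in mind. MyXMVECTOR, MyFXMVECTOR, MyCXMVECTOR, MY_CALLCONV and MyVectorMulAdd2 are hypothetical names I made up so the snippet compiles without xnamath.h; the typedefs only mirror the 32-bit ones quoted above. Spelling out __fastcall here is my assumption about what the docs intend. Whether it is actually needed is exactly what I'm unsure about.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_add_ps, _mm_mul_ps

// Hypothetical stand-ins that mirror the quoted 32-bit typedefs.
typedef __m128 MyXMVECTOR;
typedef const MyXMVECTOR  MyFXMVECTOR;  // first three vector parameters: by value
typedef const MyXMVECTOR& MyCXMVECTOR;  // remaining vector parameters: by reference

// Assumption: request __fastcall explicitly, but only on 32-bit MSVC.
// On other compilers/targets the macro expands to nothing.
#if defined(_MSC_VER) && defined(_M_IX86)
#define MY_CALLCONV __fastcall
#else
#define MY_CALLCONV
#endif

// Hypothetical helper: computes a * b + c + d component-wise.
inline MyXMVECTOR MY_CALLCONV MyVectorMulAdd2(MyFXMVECTOR a, MyFXMVECTOR b,
                                              MyFXMVECTOR c, MyCXMVECTOR d)
{
    return _mm_add_ps(_mm_add_ps(_mm_mul_ps(a, b), c), d);
}
```

The first three vectors use the by-value alias and the fourth uses the by-reference alias, following the pattern of the XMMatrixTransformation2D declaration quoted above.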
Thanks,
-Dirk