Hi - I am having some difficulty understanding exactly what the documentation is telling me on this. I am using Visual C++ 2010 Express, compiling for both 32-bit and 64-bit targets (Windows 7). I am using the XNA Math library (xnamath.h) and want to write some of my own functions for more complex collision detection (going further than xnacollision.h), where I pass XMVECTOR parameters. As I understand it, to achieve reasonable optimization, I should:
1) inline the functions
** This is because the standard calling conventions are not good at passing arguments in SIMD registers, so the call overhead is best avoided where possible? **
2) specify __fastcall in the function declaration.
3) pass FXMVECTOR for the first three XMVECTOR arguments and CXMVECTOR for any after that.
** 2 & 3 give the best chance of the values being passed in SIMD registers if the compiler ignores the inline request? **
** For x64 the docs seem to imply there is no possibility of passing XMVECTOR arguments in SIMD registers in a function call, so I just have to hope the function is inlined. Is that right? **
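To make the question concrete, here is a minimal sketch of the kind of declaration I have in mind. So that it compiles without xnamath.h, it uses the raw SSE type `__m128`. `Vec`, `FVec`, `CVec`, `VEC_CALL`, `PointInExpandedBox` and `SelfCheck` are hypothetical names of my own. The `FVec`/`CVec` aliases reflect my reading of how xnamath.h defines FXMVECTOR and CXMVECTOR, which may be wrong.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (__m128)
#include <cassert>

typedef __m128 Vec;     // hypothetical stand-in for XMVECTOR

// Assumption: FXMVECTOR is pass-by-value on 32-bit x86 and
// pass-by-reference elsewhere; CXMVECTOR is always a const reference
// on Windows.
#if defined(_M_IX86) || defined(__i386__)
typedef const Vec  FVec;   // stand-in for FXMVECTOR (by value on x86)
#else
typedef const Vec& FVec;   // stand-in for FXMVECTOR (by reference on x64)
#endif
typedef const Vec& CVec;   // stand-in for CXMVECTOR

// __fastcall only means something to the 32-bit MSVC compiler;
// x64 has a single calling convention and ignores the keyword.
#if defined(_MSC_VER) && defined(_M_IX86)
#define VEC_CALL __fastcall
#else
#define VEC_CALL
#endif

// Hypothetical collision helper with four vector parameters:
// first three use FVec, the fourth uses CVec.
// Returns true if point lies inside the box grown by margin on x, y, z.
inline bool VEC_CALL PointInExpandedBox(FVec point, FVec boxMin,
                                        FVec boxMax, CVec margin)
{
    const Vec lo = _mm_sub_ps(boxMin, margin);
    const Vec hi = _mm_add_ps(boxMax, margin);
    const Vec inside = _mm_and_ps(_mm_cmpge_ps(point, lo),
                                  _mm_cmple_ps(point, hi));
    // Bits 0..2 of the mask are the x, y, z lanes; w is ignored.
    return (_mm_movemask_ps(inside) & 0x7) == 0x7;
}

// Small usage check. Note _mm_set_ps takes lanes in w, z, y, x order.
inline void SelfCheck()
{
    const Vec boxMin = _mm_set_ps(0.0f, 0.0f, 0.0f, 0.0f);
    const Vec boxMax = _mm_set_ps(0.0f, 1.0f, 1.0f, 1.0f);
    const Vec point  = _mm_set_ps(0.0f, 0.5f, 0.5f, 1.05f);
    assert(PointInExpandedBox(point, boxMin, boxMax, _mm_set1_ps(0.1f)));
    assert(!PointInExpandedBox(point, boxMin, boxMax, _mm_setzero_ps()));
}
```

In my real code the aliases would simply be XMVECTOR, FXMVECTOR and CXMVECTOR from xnamath.h. My question is whether this declaration pattern is what the docs intend.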
Thanks for any comments/explanations/insight.