Quat

General XNA Math Usage


My main question is whether the XMLoad/XMStore functions are essentially free, given that they compile down to intrinsics.

The reason I ask is that I am using the XMFLOAT3/XMFLOAT4 variants as member data, then using the Load/Store functions whenever I need to do calculations, per the documentation's advice.

However, this gets annoying very quickly. If I have a camera class and want to perform some operation, I need to load, do the calculation, then store the result back. A client of the camera class might want the camera's right vector, so it gets the XMFLOAT3 version and then has to load/store all over again.

I'm just worried that all the load/store overhead offsets the gain. It also makes me wonder whether I should extend XMFLOAT3/XMFLOAT4 for lightweight vector operations, and reserve XMVECTOR for heavy-duty calculations done in a loop.

First off, I am not a CPU SIMD master, so feel free to take anything I say with a grain of salt.

XNAMath is really a pretty light wrapper around CPU SIMD instruction sets (SSE/SSE2, etc.), and the usage of the API reflects that: it has you do explicit loads and stores when preparing to do SIMD math operations, because that's what you have to do when working with SSE. If you look at the code for those functions, you can see exactly what each one does. For instance:


XMFINLINE XMVECTOR XMLoadFloat3
(
    CONST XMFLOAT3* pSource
)
{
#if defined(_XM_NO_INTRINSICS_)
    XMVECTOR V;
    XMASSERT(pSource);

    ((UINT *)(&V.vector4_f32[0]))[0] = ((const UINT *)(&pSource->x))[0];
    ((UINT *)(&V.vector4_f32[1]))[0] = ((const UINT *)(&pSource->y))[0];
    ((UINT *)(&V.vector4_f32[2]))[0] = ((const UINT *)(&pSource->z))[0];
    return V;
#elif defined(_XM_SSE_INTRINSICS_)
    XMASSERT(pSource);

#ifdef _XM_ISVS2005_
    // This reads one float past the memory that should be ignored.
    // Need to continue to do this for VS 2005 due to compiler issue but prefer new method
    // to avoid triggering issues with memory debug tools (like AV)
    return _mm_loadu_ps( &pSource->x );
#else
    __m128 x = _mm_load_ss( &pSource->x );
    __m128 y = _mm_load_ss( &pSource->y );
    __m128 z = _mm_load_ss( &pSource->z );
    __m128 xy = _mm_unpacklo_ps( x, y );
    return _mm_movelh_ps( xy, z );
#endif // !_XM_ISVS2005_
#elif defined(XM_NO_MISALIGNED_VECTOR_ACCESS)
#endif // _XM_VMX128_INTRINSICS_
}


You can see it's just wrapping SSE functions for loading data into SIMD registers. Making this an explicit part of the API makes it annoying to work with for small bits of ad-hoc math code, but I would imagine it also allows you to write more efficient code for the cases where SIMD can actually be useful, like vectorizing lots of math over long arrays of data. It would be nice if there were non-SSE implementations of the math functions that you could use for more general cases, or perhaps some wrappers to make the loading/storing less painful where you don't care as much about performance.
