QQemka

DirectXMath - storing transform matrices


Hello. I recently moved to DirectXMath and I have a problem with properly storing an object's location, rotation, scale, and so on. I want to compile for x86.

I read that XMMATRIX has to be 16-byte aligned, so I can't use it in dynamically allocated objects.

At some point I will have to send the matrices to the pipeline. I don't want to store my data as raw float variables (x, y, z, scalex, etc.) and rebuild the XMMATRIX on the stack every frame; I want to keep it stored as a matrix all the time so that operations on it are easier. What should I do on x86? Is XMFLOAT4X4 the answer?

Then again, if I store it as an XMFLOAT4X4 and want to rotate it, I have to convert it to an XMMATRIX, do the operation, and store the result back in the XMFLOAT4X4 class member. I'm very confused, but that seems like the most proper way.

Thanks in advance

Edited by QQemka


"I read that XMMATRIX has to be 16-byte aligned, so I can't use it in dynamically allocated objects."

 

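Instead of using the new operator, you can allocate properly aligned memory for the object with _aligned_malloc and then construct the object in that memory with placement new. An object created this way also has to be destroyed manually: call its destructor explicitly and release the memory with _aligned_free.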

You also need to make sure that any XMMATRIX contained in the class is correctly aligned (you can probably use __declspec(align(16)) to enforce the alignment).
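A minimal sketch of the allocate-then-construct approach described above. To keep it buildable outside MSVC, it uses the standard C++17 aligned form of operator new as a stand-in for _aligned_malloc / _aligned_free. The Transform struct and the helper names are hypothetical; Transform stands in for a class that holds an XMMATRIX member.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical stand-in for a class that contains an XMMATRIX member.
// alignas(16) plays the role of __declspec(align(16)).
struct alignas(16) Transform {
    float m[16];
    Transform() : m{} { m[0] = m[5] = m[10] = m[15] = 1.0f; }  // identity
};

// Allocate 16-byte-aligned raw storage, then construct the object in place.
// (Stand-in for _aligned_malloc followed by placement new.)
Transform* create_transform() {
    void* raw = ::operator new(sizeof(Transform), std::align_val_t{16});
    return new (raw) Transform();
}

// Manual destruction: run the destructor, then free with the matching
// aligned operator delete (stand-in for _aligned_free).
void destroy_transform(Transform* t) {
    if (t == nullptr) return;
    t->~Transform();
    ::operator delete(t, std::align_val_t{16});
}

// Helper to check that a pointer is 16-byte aligned.
bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Every create_transform() call has to be paired with destroy_transform(); a plain delete on the returned pointer would not match the allocation.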


This is Microsoft's documentation on the library:

 

https://msdn.microsoft.com/en-us/library/ee415571.aspx?f=255&MSPPError=-2147217396

 

So to use the library "correctly", you will want to store XMMATRIX and XMVECTOR wherever you have control over alignment. When you do not have control over alignment, you will use XMFLOAT4 or XMFLOAT4X4 etc.

 

There is also nothing wrong with always storing XMFLOAT4 and XMFLOAT4X4 and converting to XMVECTOR or XMMATRIX when operations need to be done. There is just some extra load/store overhead that you would avoid by storing XMVECTOR or XMMATRIX where you can. Some libraries that you pass these structures to give you no control over alignment; there you will need to pass an XMFLOAT4 or XMFLOAT4X4, for example.

 

From those docs above:

 

"Note  Some STL templates modify the provided type's alignment. For example, make_shared<> adds some internal tracking information which may or may not respect the alignment of the provided user type, resulting in unaligned data members. In this case, you need to use unaligned types instead of aligned types. If you derive from existing classes, including many Windows Runtime objects, you can also modify the alignment of a class or structure."

 

Here the aligned types are XMMATRIX and XMVECTOR, and the unaligned types are XMFLOAT4 and XMFLOAT4X4 (or even float[4]).
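To make the aligned / unaligned distinction concrete, here is a small sketch that compares alignment requirements at compile time. The types are hypothetical stand-ins (Float4 for XMFLOAT4, Vec4 for XMVECTOR) so that it builds without DirectXMath, and the size checks assume 4-byte floats.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins, not the real DirectXMath types.
struct Float4 { float x, y, z, w; };      // like XMFLOAT4: plain floats
struct alignas(16) Vec4 { float v[4]; };  // like XMVECTOR: 16-byte aligned

// A class with an aligned member inherits the 16-byte requirement,
// so every allocation of it must also be 16-byte aligned.
struct Particle {
    char tag;
    Vec4 velocity;
};

// A class with only unaligned members needs no more than float alignment.
struct PackedParticle {
    char tag;
    Float4 velocity;
};

static_assert(alignof(Float4) == alignof(float), "Float4 is only float-aligned");
static_assert(alignof(Vec4) == 16, "Vec4 is 16-byte aligned");
static_assert(alignof(Particle) == 16, "the requirement propagates to the owner");
static_assert(offsetof(Particle, velocity) == 16, "padding is added before the member");
static_assert(alignof(PackedParticle) == alignof(float), "no extra requirement");
```

The compiler pads Particle correctly on its own; what it cannot do is guarantee that an allocator it does not control hands back a 16-byte-aligned address, which is the case the documentation note warns about.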

Edited by iedoc


If your intent is to store a vector or matrix only in order to pass it to a shader, then using XMVECTOR / XMMATRIX brings no practical benefit over XMFLOAT4 / XMFLOAT4X4. XMVECTOR / XMMATRIX should be used when you perform computationally expensive math and need SIMD acceleration. Unaligned data types lead to unaligned load / store operations between RAM and the SIMD registers, which can be much slower than their aligned counterparts, e.g. by a factor of four. As a rule of thumb: use XMFLOAT4 / XMFLOAT4X4 for storing data, use XMVECTOR / XMMATRIX for math, and avoid converting intermediate results back to unaligned data types, so that you pay for unaligned loads / stores as rarely as possible.
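The rule of thumb above can be sketched as "load once, do all the math on the aligned type, store once". The code below is only an illustration with hypothetical stand-in types and functions so that it builds without DirectXMath; in real code the corresponding calls would be XMLoadFloat4x4, XMMatrixRotationZ, XMMatrixMultiply and XMStoreFloat4x4. The Entity class is likewise an assumption, not something from the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>

// Hypothetical stand-ins, not the real DirectXMath types.
struct Float4x4 { float m[4][4]; };            // storage type, like XMFLOAT4X4
struct alignas(16) Matrix { float r[4][4]; };  // math type, like XMMATRIX

// Counterparts of XMLoadFloat4x4 / XMStoreFloat4x4.
Matrix load(const Float4x4& src) {
    Matrix out;
    std::memcpy(out.r, src.m, sizeof(out.r));
    return out;
}
void store(Float4x4& dst, const Matrix& src) {
    std::memcpy(dst.m, src.r, sizeof(dst.m));
}

Matrix identity() {
    Matrix out{};
    for (int i = 0; i < 4; ++i) out.r[i][i] = 1.0f;
    return out;
}

// Rotation about the Z axis, row-vector convention.
Matrix rotation_z(float angle) {
    Matrix out = identity();
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    out.r[0][0] = c;  out.r[0][1] = s;
    out.r[1][0] = -s; out.r[1][1] = c;
    return out;
}

Matrix multiply(const Matrix& a, const Matrix& b) {
    Matrix out{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                out.r[i][j] += a.r[i][k] * b.r[k][j];
    return out;
}

// The class member is the unaligned storage type, so Entity can live
// anywhere: on the heap, in a std::vector, behind make_shared.
class Entity {
public:
    Entity() { store(world_, identity()); }

    void rotate_z(float angle) {
        Matrix m = load(world_);               // one load
        m = multiply(m, rotation_z(angle));    // all math on the aligned type
        store(world_, m);                      // one store
    }

    const Float4x4& world() const { return world_; }

private:
    Float4x4 world_;
};
```

If several operations are applied in a row (scale, rotate, translate), they would all go between the single load and the single store rather than each doing its own round trip.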

Edited by vanka78bg


The simple option is to override global operator new and call _aligned_malloc (or similar) from it. If you do that you can make every heap allocation 16-byte aligned, with only a few lines of code in one place.

Doing that also means you can do things like putting aligned types in a std::vector.

Note that there are several variants of operator new and delete that you'll need to replace; you want to change the non-throwing variants as well.

 

Also note that if you compile as x64 instead of x86 then you don't need to do any of that. The standard heap allocation alignment on 64-bit Windows is 16 bytes.
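A rough sketch of the global replacement described above, covering the throwing and non-throwing forms of operator new together with the matching deletes. The helper names alloc16 and free16 are hypothetical. On MSVC it calls _aligned_malloc / _aligned_free; elsewhere it assumes std::aligned_alloc from C++17 is available. For brevity it does not call the installed new-handler before throwing, which a complete replacement would do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#ifdef _MSC_VER
#include <malloc.h>
#endif

namespace {
constexpr std::size_t kAlign = 16;

// Hypothetical helper: returns 16-byte-aligned memory or nullptr.
void* alloc16(std::size_t size) noexcept {
    if (size == 0) size = 1;  // operator new must return a unique pointer
#ifdef _MSC_VER
    return _aligned_malloc(size, kAlign);
#else
    // std::aligned_alloc wants the size to be a multiple of the alignment.
    const std::size_t rounded = (size + kAlign - 1) / kAlign * kAlign;
    return std::aligned_alloc(kAlign, rounded);
#endif
}

// Hypothetical helper: frees memory obtained from alloc16.
void free16(void* p) noexcept {
#ifdef _MSC_VER
    _aligned_free(p);
#else
    std::free(p);
#endif
}
}  // namespace

// Throwing variants.
void* operator new(std::size_t n) {
    if (void* p = alloc16(n)) return p;
    throw std::bad_alloc();
}
void* operator new[](std::size_t n) {
    if (void* p = alloc16(n)) return p;
    throw std::bad_alloc();
}

// Non-throwing variants.
void* operator new(std::size_t n, const std::nothrow_t&) noexcept { return alloc16(n); }
void* operator new[](std::size_t n, const std::nothrow_t&) noexcept { return alloc16(n); }

// Matching deletes, including the sized forms.
void operator delete(void* p) noexcept { free16(p); }
void operator delete[](void* p) noexcept { free16(p); }
void operator delete(void* p, std::size_t) noexcept { free16(p); }
void operator delete[](void* p, std::size_t) noexcept { free16(p); }
void operator delete(void* p, const std::nothrow_t&) noexcept { free16(p); }
void operator delete[](void* p, const std::nothrow_t&) noexcept { free16(p); }
```

These definitions belong in exactly one translation unit of the program. Once they are linked in, every ordinary new expression, including the allocations made by std::vector's default allocator, goes through alloc16.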


So XMMATRIX uses special registers. Is it fine to return an XMMATRIX by value from my function, or is it just as inefficient as returning any other struct by value?


It is safe to return XMMATRIX and XMVECTOR from functions. It certainly won't be any slower than returning a struct by value, since the returned values are stored directly in the xmm registers. Returning a struct by value isn't always slow either; see https://en.wikipedia.org/wiki/Return_value_optimization. Basically, the caller passes the function a pointer to the place on the caller's stack where the returned value should go, and the function writes directly to that address, which removes the need for a temporary.
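The return value optimization mentioned above can be observed by counting copy-constructor calls. This is only a sketch: Matrix4 and make_scale are hypothetical names, and the function returns a temporary directly, a case in which C++17 guarantees that no copy is made. Returning a named local variable is usually optimized the same way, but that is left to the compiler.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical 4x4 matrix that counts how often it is copied.
struct Matrix4 {
    float m[16];
    static int copies;

    // Builds a uniform scale matrix with the given diagonal value.
    explicit Matrix4(float diag) : m{} {
        m[0] = m[5] = m[10] = diag;
        m[15] = 1.0f;
    }
    Matrix4(const Matrix4& other) {
        std::memcpy(m, other.m, sizeof(m));
        ++copies;
    }
};
int Matrix4::copies = 0;

// The result is constructed directly in the caller's storage,
// so the copy constructor never runs for this return.
Matrix4 make_scale(float s) {
    return Matrix4(s);
}
```

After `Matrix4 a = make_scale(2.0f);` the counter is still zero, while an explicit `Matrix4 b = a;` increments it to one.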

Edited by iedoc


I think what I just wrote here might have been wrong. I know that when I write a function that returns an XMMATRIX, for example, the rows are stored in the xmm0 through xmm3 registers, but I'm building for x64, and the compiler might be using the vector calling convention implicitly there; I'm not sure. I had thought that non-scalar types were returned in the xmm registers by default, but is that not always so? x86 might not do it unless the fastcall or vectorcall calling convention is specified (you can use XM_CALLCONV, which expands to vectorcall if it is supported and to fastcall otherwise).

 

I thought the vector calling convention only specified how the arguments are passed into the function

 

I found this, but it's only for x64:

https://msdn.microsoft.com/en-us/library/7572ztz4.aspx

 

x86 might be different, but I couldn't find anything on it.

Edited by iedoc
