YixunLiu

Use mul(vector, matrix) or mul(matrix, vector) in HLSL?

Hi,

In one DirectX example, I found that it transposes the world transform (of type XMMATRIX), stores the result in an XMFLOAT4X4, and then sends it to the GPU using context->UpdateSubresource. On the GPU side, the vertex position is transformed to world space with mul(vector, matrix), where the vector is the vertex position and the matrix is the world transform matrix.

I think there is another way to do it: skip the transpose on the CPU side and use mul(matrix, vector) in HLSL.

It looks like this should be more efficient, because we do not need the transpose on the CPU side.

Am I right?

 

Thanks.

 

YL

This is all a matter of convention: row-major versus column-major. You are right that mul(vec, mat) == mul(transpose(mat), vec). The difference is in how the shader does the math: either with dot products, or with a mul followed by mads.
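As a quick check of that identity, here is a minimal C++ sketch. The `Vec4`/`Mat4` aliases and the helpers `mulVecMat`, `mulMatVec` and `transpose` are names made up for illustration (they are not DirectXMath or HLSL APIs), and the matrix is assumed to be indexed as m[row][col].

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed as m[row][col]

// Row vector times matrix, like mul(vec, mat): r[j] = sum_i v[i] * m[i][j]
Vec4 mulVecMat(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            r[j] += v[i] * m[i][j];
    return r;
}

// Matrix times column vector, like mul(mat, vec): r[i] = sum_j m[i][j] * v[j]
Vec4 mulMatVec(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Flip the matrix across its main diagonal.
Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            t[j][i] = m[i][j];
    return t;
}
```

For any `m` and `v`, `mulVecMat(v, m)` and `mulMatVec(transpose(m), v)` should produce the same four components, so swapping the argument order and transposing the matrix cancel each other out.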

It used to be better to go with the dot products, hence the transpose. With today's GPUs, which are scalar rather than SIMD, it matters less.

But my advice: you have thousands of optimisations to do before that transpose becomes a problem, unless you run into a very degenerate case :)

Write your math out on paper. This is a purely mathematical convention issue.
If the basis vectors within your matrices are written vertically, you use mul(matrix, vector), or if they're written horizontally, you use mul(vector, matrix).
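To make the two mathematical conventions concrete, here is a small C++ sketch that builds the same translation both ways. All names (`translationRows`, `translationColumns`, `mulVecMat`, `mulMatVec`) are hypothetical helpers for illustration, not part of DirectXMath or HLSL, and the matrix is assumed to be indexed as m[row][col].

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // indexed as m[row][col]

// Basis vectors written horizontally: the translation sits in the last row.
// Intended for the row-vector form, i.e. mul(vector, matrix).
Mat4 translationRows(float tx, float ty, float tz) {
    return {{ {1.0f, 0.0f, 0.0f, 0.0f},
              {0.0f, 1.0f, 0.0f, 0.0f},
              {0.0f, 0.0f, 1.0f, 0.0f},
              {tx,   ty,   tz,   1.0f} }};
}

// Basis vectors written vertically: the translation sits in the last column.
// Intended for the column-vector form, i.e. mul(matrix, vector).
Mat4 translationColumns(float tx, float ty, float tz) {
    return {{ {1.0f, 0.0f, 0.0f, tx},
              {0.0f, 1.0f, 0.0f, ty},
              {0.0f, 0.0f, 1.0f, tz},
              {0.0f, 0.0f, 0.0f, 1.0f} }};
}

// Row vector times matrix: r[j] = sum_i v[i] * m[i][j]
Vec4 mulVecMat(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            r[j] += v[i] * m[i][j];
    return r;
}

// Matrix times column vector: r[i] = sum_j m[i][j] * v[j]
Vec4 mulMatVec(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}
```

Pairing each matrix with its matching multiply moves a point by (tx, ty, tz) in both cases. Mixing them, for example `mulMatVec(translationRows(...), p)`, leaves x, y and z untouched and corrupts w instead, which is the kind of mistake that writing the math out on paper catches.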

However, there's also a computer-specific convention issue of how 2D arrays are stored in memory - row by row, or column by column.
If I remember correctly, the XM* classes use the row-by-row storage convention, while HLSL (by default) uses the column-by-column storage convention. Also, the XM* classes use horizontal basis vectors while HLSL works with horizontal or vertical ones.
If nothing is done to correct for this mismatch in storage conventions, it has the implicit effect of diagonally flipping (transposing) the matrix, which is equivalent to switching to the other set of mathematical conventions. That is, an XM* matrix has horizontal basis vectors, but if you pass it directly to HLSL, it ends up with vertical basis vectors. The transpose function can 'fix' this.
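The implicit transpose can be seen without any graphics API at all. The sketch below takes the same 16 floats and reads them under both storage conventions; `Flat16`, `atRowMajor` and `atColumnMajor` are hypothetical names used only for this illustration.

```cpp
#include <array>

// 16 floats as they would sit in memory, e.g. in a constant buffer.
using Flat16 = std::array<float, 16>;

// Element (row, col) when the floats are interpreted row by row.
float atRowMajor(const Flat16& f, int row, int col) {
    return f[row * 4 + col];
}

// Element (row, col) when the same floats are interpreted column by column.
float atColumnMajor(const Flat16& f, int row, int col) {
    return f[col * 4 + row];
}
```

For every (row, col), `atColumnMajor(f, row, col)` equals `atRowMajor(f, col, row)`: the bytes never change, but the reader that assumes the other storage order sees the transposed matrix.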

So, your example with the transpose function is using it to correct for the storage order mismatch.
And your example without the transpose function is instead using opposite mathematical conventions in C++ and HLSL as a horrible way of tolerating the storage mismatch.

IMHO both are bad. Instead you should pick one storage convention and one mathematical convention and use the same choices everywhere.
