Nairou

Pros and cons of matrix order?

Nairou    430
Okay, so I know D3D matrices use row order and OGL matrices use column order. D3D's row order always made sense to me since it puts the translation cells in a row, but mostly it's what I'm accustomed to. I just figured it was a matter of preference which method you used, and that there was no difference ultimately.

But now I've run into two cases where this doesn't appear to be true. When dealing with HLSL shaders, whenever you pass a matrix as constants to the shader, the matrix has to be transposed. Likewise, when performing matrix math with quaternions, I've noticed several examples saying the matrix has to be transposed if it's not in OGL column order.

So... if the math needs it in column order, and the shaders need it in column order, why does D3D use row order? Why set it up to require a matrix transpose all the time?

It's possible I'm misunderstanding something, but I'd like to figure this out, particularly since I'm using my own matrix/quaternion library and have the option of changing the order it uses. Right now it uses row order like D3D, but only out of familiarity.

What I'm looking for is an explanation of why D3D appears to do this very basic stuff differently from what the math and hardware seem to require (or how it is really not doing it differently, and why it appears that way), and whether there are compelling reasons to change to column order. Thoughts?

Defrag    175
I don't understand the discrepancy between D3D's and HLSL's row/column-major order either. I've read that column-major is meant to be more efficient; it just strikes me as odd that two linked systems use different ordering.

hogwash    139
You can avoid the transpose by swapping the multiplication order in the shader.

E.g.

Row-major:
oPosition = mul(iPosition, cWorldViewProjection);

Column-major:
oPosition = mul(cWorldViewProjection, iPosition);
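
A minimal sketch of why the swap works, in Python rather than HLSL so it can be run on its own: multiplying a row vector by M gives the same components as multiplying the transpose of M by a column vector. The helper names (transpose, mul_row_vec, mul_col_vec) and the sample matrix are made up for illustration and are not part of any D3D or HLSL API.

```python
# Hypothetical helpers for illustration only; not a D3D/HLSL API.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def mul_row_vec(v, m):
    """Row vector times matrix: result[j] = sum_i v[i] * m[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def mul_col_vec(m, v):
    """Matrix times column vector: result[i] = sum_j m[i][j] * v[j]."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

# Translation by (10, 20, 30) written D3D-style: translation in the bottom row.
world = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [10, 20, 30, 1],
]
position = [1, 2, 3, 1]

# Like mul(iPosition, M) with the matrix as written above.
row_result = mul_row_vec(position, world)

# Like mul(M, iPosition) when the shader sees the transposed matrix.
col_result = mul_col_vec(transpose(world), position)

print(row_result, col_result)
```

Both calls produce the same translated position, which is why swapping the argument order of mul() can stand in for an explicit transpose.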

You can also use a compiler directive with HLSL to specify the matrix packing order.

E.g.
#pragma pack_matrix(row_major)

or

#pragma pack_matrix(column_major)
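
A rough sketch of why a packing mismatch shows up as a transpose, assuming a tightly packed 4x4 matrix of 16 floats: if the application writes the matrix row by row and the shader reads the same floats column by column, the shader ends up with the transposed matrix. The function names are hypothetical and only model the two storage orders; they do not reproduce how any real runtime uploads constants.

```python
# Hypothetical model of the two storage orders; not a real upload path.

def flatten_row_major(m):
    """Write a 4x4 matrix to a flat list row by row."""
    return [m[r][c] for r in range(4) for c in range(4)]

def unflatten_column_major(flat):
    """Read a flat list as column-major: element (r, c) lives at flat[c * 4 + r]."""
    return [[flat[c * 4 + r] for c in range(4)] for r in range(4)]

# D3D-style translation matrix: translation in the bottom row.
world = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [10, 20, 30, 1],
]

flat = flatten_row_major(world)
seen_by_shader = unflatten_column_major(flat)

# The translation now sits in the last column instead of the bottom row.
for row in seen_by_shader:
    print(row)
```

Seen this way, the pragma only changes how the same floats are interpreted, so matching it to the application's layout removes the need for a transpose on upload.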

The reason column-major is more efficient is twofold. First, the generated shader ASM for column-major uses fewer instruction slots: it only requires 3 DPs, compared with a MUL and 3 MADs for row-major. Second, column-major uses fewer constant registers to perform the operation: none, compared with 3 for row-major.

Cheers,
Tom.

Nairou    430
Thank you, hogwash, for those tips. Though if column order is more efficient, and D3D is the only one that uses row order, I guess my question is "Why?". What advantage does row order give to make D3D use it over what OGL and the hardware use? Is there some disadvantage to column order as well? Basically, is there any real reason not to change my matrix library to use column order instead?

I wish I had a reference which explained this in detail, the differences between the two methods and why...

AndyTX    806
Quote:
Original post by Nairou
Though if column order is more efficient, and D3D is the only one that uses row order, I guess my question is "Why?".

One or the other is not "more efficient" overall - it depends on the operation. Indeed, in almost all cases you can simply "rotate" your operations to accommodate one ordering or the other. Multiplying matrices on the other side is just one example. Matrix-vector multiplication becomes either a series of dots or a series of mul/mads depending on your matrix ordering - both are plenty efficient.

Note that the efficiency differences are only for non-square (4x3 or 3x4) matrices, and they are minor. Also note that they are related both to your choice of element ordering *and* to your choice of multiplication side. Just align the two properly:

Ex. left projection

float4 v;
float4x3 M;

float3 p = mul(v, M);

=> column major
p.x = DOT(v, M[col1]) // dot(4, 4)
p.y = DOT(v, M[col2]) // dot(4, 4)
p.z = DOT(v, M[col3]) // dot(4, 4)

=> row major
p = MUL(v.x, M[row1]) // mul(1, 3)
p = MAD(v.y, M[row2], p) // mad(1, 3, 3)
p = MAD(v.z, M[row3], p) // mad(1, 3, 3)
p = MAD(v.w, M[row4], p) // mad(1, 3, 3)


Ex. right projection

float4 v;
float3x4 M;

float3 p = mul(M, v);

=> column major
p = MUL(v.x, M[col1]) // mul(1, 3)
p = MAD(v.y, M[col2], p) // mad(1, 3, 3)
p = MAD(v.z, M[col3], p) // mad(1, 3, 3)
p = MAD(v.w, M[col4], p) // mad(1, 3, 3)

=> row major
p.x = DOT(v, M[row1]) // dot(4, 4)
p.y = DOT(v, M[row2]) // dot(4, 4)
p.z = DOT(v, M[row3]) // dot(4, 4)
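
A minimal Python sketch of the two instruction sequences for the left-projection case (float4 v, float4x3 M), to show that they compute the same result and differ only in how the work is grouped. The function names and the sample values are invented for illustration; real shader compilers emit hardware instructions, not loops like these.

```python
# Hypothetical illustration of the DOT form versus the MUL/MAD form.

def transform_dot(v, m):
    """One 4-component dot product per output component (column at a time)."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(3)]

def transform_mad(v, m):
    """Scale each 3-component row by one component of v and accumulate."""
    p = [v[0] * m[0][j] for j in range(3)]             # MUL
    for i in range(1, 4):
        p = [v[i] * m[i][j] + p[j] for j in range(3)]  # MAD
    return p

# 4x3 matrix: scale by (2, 3, 4), then translate by (5, 6, 7).
m = [
    [2, 0, 0],
    [0, 3, 0],
    [0, 0, 4],
    [5, 6, 7],
]
v = [1, 1, 1, 1]

print(transform_dot(v, m))
print(transform_mad(v, m))
```

The DOT form issues three 4-wide operations and the MUL/MAD form issues four 3-wide ones, which is the small difference described above for non-square matrices.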


A few notes:

- neither column nor row major is "intrinsically better" here... it depends on your left/right projection choice.

- with 4x4 matrices, they are pretty much the same. It's only with 4x3 or 3x4 that it starts to matter a bit (and even then, it's minor).

- On the G80 it doesn't really matter either way. The only reason why one is better on other cards is because if you're not using 4-vector operations, you're wasting some processing power. The G80 is (thankfully, finally) a scalar processor so you don't need to worry about such nonsense anymore.

In any case, the speed difference is going to be an absolute non-issue on modern hardware until you're multiplying tens or hundreds of matrices in a single shader (perhaps for skinning). Then simply make sure that you're using a 4x3 or 3x4 matrix instead of a 4x4 one.

Quote:
Original post by Nairou
What advantage does row order give to make D3D use it over what OGL and the hardware use?

The convention of right projecting allows for naturally composing transformations with "*=", skipping a temporary and a copy on the host. It's a very minor point, but sometimes useful.

Quote:
Original post by Nairou
Basically, is there any real reason not to change my matrix library to use column order instead?

Don't bother changing things unless you're inconsistent! All you want to avoid is converting where possible. Just stick with a single convention and you'll be fine.

Honestly, the only *real* case where I'd say to choose one over the other is if you're using SSE instructions on the CPU, in which case you want the "MUL/MAD" case rather than the "DOT" one (which isn't supported directly). On GPUs both are efficient, so either is fine.

[Edited by - AndyTX on December 12, 2006 10:16:17 AM]

Nairou    430
Very interesting! Just the sort of explanation I was hoping for, I'll be reading that through a few times to let it sink in. Thank you!

ajas95    767
You should note that for CPUs, column-major is better if you choose to do matrix multiplies in SSE. And in this case, there's no easy way to simply "flip" the math like in shaders.

Nairou    430
Quote:
Original post by ajas95
You should note that for CPUs, column-major is better if you choose to do matrix multiplies in SSE. And in this case, there's no easy way to simply "flip" the math like in shaders.


Hmm. I've read that the D3DX library is SSE optimized. If that's true, and if column-major is more SSE friendly, then why would it use row-major ordering?
