(shader) Matrix Packing in DirectX9.0c

3 comments, last by SnakeHunta 19 years, 5 months ago
OK, my question is: what is the difference between row-major packing and column-major packing when setting vertex shader constants? I know that row-major looks like this:

11  12  13  14
21  22  23  24
31  32  33  34
41  42  43  44
and column-major is:

11  21  31  41
12  22  32  42
13  23  33  43
14  24  34  44
I know that a D3DXMATRIX is row-major and that it needs to be transposed before calling SetVertexShaderConstantF(0, &mat, 4). But then I look at the D3D documentation and see that column-major packing is the default, and that this only affects the way the constant registers receive the data in the matrices. As I understand it, when column-major packing is enabled, c0 gets column one, c1 gets column two, and so on. But since we transposed the matrix, this is equivalent to taking all the rows from a row-major matrix, which is not what we want. So am I right to believe that setting the vertex shader constants puts the rows of the matrix in each constant? If I am, then why the heck does D3D support switching between row-major and column-major? [Edited by - SnakeHunta on October 25, 2004 8:58:24 PM]
Hmmm... does anyone know? If not, I guess I'll have to go find someone from Microsoft. This is really bothering me, even though it doesn't really matter as long as you do things correctly.

[edit]

So am I correct in saying that SetVertexShaderConstantF() takes the rows of the matrix and places them in successive shader constants?

OK, here is my understanding of everything, since I have had a long time to think about this with no help whatsoever except this one reply :)

SetVertexShaderConstantF() takes the rows of the matrix you give it and places each row in consecutive shader constants, hence the need for transposing. Because the default shader setting is column-major matrix packing, the shader assumes that the columns are stored in the shader constants, which is what we have after transposition. However, if row-major matrix packing is set, there is no need to transpose the matrix before giving it to the shader, and the shader then has to do the calculation a little differently than it would under the default column-major packing.

Am I right or am I wrong?
When transforming a vertex by a 4x4 matrix, you interpret your vector as a 1x4 matrix. That is, you have 1x4 * 4x4 = 1x4.
During this multiplication, you dot the vector with each column of the matrix. The dot product of the vector with the first column gives you the 'x' component of the output. The dot with the second column gives you the 'y' component, and so on.

As you can see, this transformation can be done in 4 dot products in shaders *only* when columns are stored as constants. That's why matrices should be stored in column-major ordering in shaders.
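To make the "dot the vector with each column" point concrete, here is a minimal CPU-side sketch in plain C++. It does not use D3D at all; the names Vec4, Mat4, dot4, column and transformRowVector are made up for illustration, and the matrix is indexed m[row][col] the way a D3DXMATRIX is laid out.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;   // m[row][col], row-major in memory

// Dot product of two 4-component vectors (what a single dp4 computes).
float dot4(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Gather column c of the matrix into one 4-component vector.
Vec4 column(const Mat4& m, int c) {
    return { m[0][c], m[1][c], m[2][c], m[3][c] };
}

// Row vector times matrix (1x4 * 4x4 = 1x4):
// output component i is the dot product of v with column i.
Vec4 transformRowVector(const Vec4& v, const Mat4& m) {
    return { dot4(v, column(m, 0)), dot4(v, column(m, 1)),
             dot4(v, column(m, 2)), dot4(v, column(m, 3)) };
}
```

With a translation matrix whose offsets (10, 20, 30) sit in the fourth row, the point (1, 2, 3, 1) comes out as (11, 22, 33, 1). If each column already lives in its own register, the column() gather disappears and only the four dot products remain.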

That is why when you're using SetVertexShaderConstantF, you have to transpose your matrix first. SetVertexShaderConstantF is dumb, it doesn't know you're setting a matrix, so it won't transpose it for you. So you do that yourself to get each matrix column in a constant register.
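A rough simulation of that upload, again in plain C++ with no D3D calls. Everything here is a stand-in I made up for illustration: Registers plays the role of c0..c3, uploadRaw imitates the "dumb" copy of 4 floats per register, and shaderTransform imitates a shader that does one dot product per register. It is a sketch of the idea under those assumptions, not the real runtime.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<float, 16>;     // row-major: element (r, c) is at index r * 4 + c
using Registers = std::array<Vec4, 4>;  // hypothetical stand-in for c0..c3

// Swap rows and columns.
Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            t[c * 4 + r] = m[r * 4 + c];
    return t;
}

// Raw copy of count * 4 floats into consecutive registers.
// It has no idea the data is a matrix, so it never transposes anything.
void uploadRaw(Registers& regs, const float* data, int count) {
    for (int i = 0; i < count; ++i)
        for (int k = 0; k < 4; ++k)
            regs[i][k] = data[i * 4 + k];
}

float dp4(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// What a shader compiled for column-major packing effectively does:
// one dot product per register, assuming each register holds a column.
Vec4 shaderTransform(const Vec4& v, const Registers& c) {
    return { dp4(v, c[0]), dp4(v, c[1]), dp4(v, c[2]), dp4(v, c[3]) };
}
```

Uploading transpose(m) puts column i of the original m into register i, so shaderTransform returns v * m. Uploading m untransposed would put rows in the registers, and the same four dot products would compute the transform by the transposed matrix instead.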

Note, however, that SetMatrix isn't dumb. It transposes your matrices for you.

Quote:Original post by SnakeHunta
Ok, my question is what is the difference between row-major packing and column major packing when setting vertex shader constants? Now I know that row-major is like so:

Err...Seems like I answered another question with my previous post. Anyway:

The difference, as the docs say, is in the way the effect compiler interprets the matrix data when it is read from the constant table into the registers.

If you tell it it's column-major packing, it'll assume each constant contains a column, and thus transformations can easily be done as 4 dot products.

If you tell it it's row-major packing, it'll assume so, and would generate code for transformations based on that. The documentation cites an example:

    float4x3 World;

    float4 main(float4 pos : POSITION) : POSITION
    {
        float4 val;
        val.xyz = mul(pos, World);
        val.w = 0;
        return val;
    }


With column-major packing, you get this asm:
    vs_2_0
    def c3, 0, 0, 0, 0
    dcl_position v0
    m4x3 oPos.xyz, v0, c0
    mov oPos.w, c3.x
    // approximately four instruction slots used


IIRC, m4x3 is a macro that's equivalent to 3 dot products, but its use clarifies the purpose of the code (to do a transformation).

With row-major packing:
    vs_2_0
    def c4, 0, 0, 0, 0
    dcl_position v0
    mul r0.xyz, v0.x, c0
    mad r2.xyz, v0.y, c1, r0
    mad r4.xyz, v0.z, c2, r2
    mad oPos.xyz, v0.w, c3, r4
    mov oPos.w, c4.x
    // approximately five instruction slots used

As you can see, the compiler had to do the transformation in a different way (not 3 dot products), which took more instruction slots.
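The two listings compute the same thing in a different order, which can be checked on the CPU. This is a plain C++ sketch with made-up names (Mat4x3, transformByColumns, transformByRows), assuming a float4x3 indexed m[row][col]; it mirrors the instruction sequences above rather than reproducing what the compiler does internally.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;
using Vec4 = std::array<float, 4>;
using Mat4x3 = std::array<Vec3, 4>;   // m[row][col]: 4 rows, 3 columns

// Column-major packing: three registers, each holding one 4-component column.
// One dot product per output component, like the m4x3 version.
Vec3 transformByColumns(const Vec4& v, const Mat4x3& m) {
    Vec3 out{};
    for (int c = 0; c < 3; ++c)
        out[c] = v[0] * m[0][c] + v[1] * m[1][c] + v[2] * m[2][c] + v[3] * m[3][c];
    return out;
}

// Row-major packing: four registers, each holding one 3-component row.
// One mul, then three mads that each add a row scaled by one input component.
Vec3 transformByRows(const Vec4& v, const Mat4x3& m) {
    Vec3 acc{};
    for (int k = 0; k < 3; ++k)
        acc[k] = v[0] * m[0][k];                 // mul r0.xyz, v0.x, c0
    for (int r = 1; r < 4; ++r)
        for (int k = 0; k < 3; ++k)
            acc[k] = v[r] * m[r][k] + acc[k];    // mad ..., v0.<r>, c<r>, previous
    return acc;
}
```

Both functions return the same vector for the same inputs. The row version needs one operation per matrix row (four), while the column version needs one per output component (three), which matches the extra instruction slot in the row-major listing.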

Hmmm, yes, I get it now, LoL. I guess I should've done the math to see what that example was talking about in the first place, since I saw it in the SDK. Oh well, my understanding of it has been enhanced tenfold, so it is good that I took the time to figure it out and ask. Thank you very much, Coder, for your replies :) It is much appreciated.
