Six222 Posted October 23, 2013 (edited)
I have a question about matrix multiplication order in HLSL. For example, in C++ I do the following:
XMMATRIX matFinal = matWorld * matView * matProj;
This works correctly: I just upload the final matrix to the GPU and do
position = mul(matWorld, position)
But when I transfer each matrix individually and try to do the multiplication in the shader, it doesn't work. Example:
float4x4 matFinal = mul(mul(matWorld, matView), matProj);
position = mul(matFinal, position);
If anyone could explain or point me in the right direction, that would be great. Thanks.
Lactose Posted October 23, 2013 (edited)
To me, that looks like you need to change your multiplication nesting. That said, I could be horribly wrong, so if it doesn't work, part 1 of the solution is probably to ignore me =)
float4x4 matFinal = mul(matWorld, mul(matView, matProj));
Six222 Posted October 23, 2013
Quoting Lactose: "To me, that looks like you need to change your multiplication nesting."
That didn't work :(
Zaoshi Kaba Posted October 23, 2013
Try this:
float4x4 matFinal = mul(matProj, mul(matView, matWorld));
But it's probably better to transform the vertex three times:
position = mul(matWorld, position);
position = mul(matView, position);
position = mul(matProj, position);
satanir Posted October 23, 2013
Your ordering looks wrong. By default HLSL uses column-major matrices, while in C they are row-major. The code should be mul(proj, mul(view, world)).
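A minimal C++ sketch of why the reversed order works, assuming the row-major matrices reach the shader unchanged and are read there as column-major (which amounts to a transpose of each one). The `Mat4`/`Vec4` types, the helper names and the sample matrices are illustrative stand-ins, not DirectXMath or HLSL API:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // indexed as m[row][col]

// Plain matrix product a * b.
Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat4 transpose(const Mat4& m) {
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[j][i] = m[i][j];
    return r;
}

// Row vector on the left: v * m.
Vec4 mul_row(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (std::size_t j = 0; j < 4; ++j)
        for (std::size_t k = 0; k < 4; ++k)
            r[j] += v[k] * m[k][j];
    return r;
}

// Column vector on the right: m * v.
Vec4 mul_col(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}

// True when the CPU-side order and the reversed shader-side order agree.
bool reversed_order_matches() {
    // Hypothetical row-vector-style matrices with small integer entries.
    const Mat4 world = {{ {1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {1, 2, 3, 1} }};
    const Mat4 view  = {{ {0, 1, 0, 0}, {-1, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, -5, 1} }};
    const Mat4 proj  = {{ {2, 0, 0, 0}, {0, 3, 0, 0}, {0, 0, 1, 1}, {0, 0, -1, 0} }};
    const Vec4 v = {1, 2, 3, 1};

    // CPU convention: v * (world * view * proj).
    const Vec4 cpu = mul_row(v, mul(mul(world, view), proj));

    // Shader side: every matrix arrives transposed, the vector goes on the right.
    const Mat4 wT = transpose(world), vT = transpose(view), pT = transpose(proj);
    const Vec4 reversed = mul_col(mul(pT, mul(vT, wT)), v);   // mul(proj, mul(view, world))
    const Vec4 original = mul_col(mul(mul(wT, vT), pT), v);   // mul(mul(world, view), proj)

    assert(original != cpu);  // the order from the first post gives a different result
    return reversed == cpu;
}
```

The identity behind it is (W V P)^T = P^T V^T W^T: transposing every factor also reverses the order of the product.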
Six222 Posted October 23, 2013
Quoting satanir: "Your ordering looks wrong. By default HLSL uses column major matrices, while in C they are row major. The code should be mul(proj, mul(view, world))."
Ah that worked, thanks! Is there any reason why the DirectX math library uses row-major and then decides to switch to column-major in HLSL?
satanir Posted October 23, 2013
Quoting Six222: "Ah that worked, thanks! Is there any reason why the DirectX math library uses row major and then decides to switch in HLSL to column?"
I don't know, probably legacy reasons. You can put '#pragma pack_matrix(row_major)' at the start of your shader. Then you can use the same matrix multiplication order as in C, but you also need to swap the order of the position and the matrix, i.e. mul(position, matFinal).
Dynamo_Maestro Posted October 23, 2013 (edited)
You can change it in your shader flags before compiling the shader effect file. See http://msdn.microsoft.com/en-us/library/windows/desktop/gg615083(v=vs.85).aspx
cdoubleplusgood Posted October 24, 2013 (edited)
Quoting Six222: "Ah that worked, thanks! Is there any reason why the DirectX math library uses row major and then decides to switch in HLSL to column?"

Because DX math is for C / C++, and these languages, like most others, use row-major as the "natural" layout: the elements of a row are consecutive in memory, assuming that the 1st index of a float[i][j] is the row and the 2nd is the column.

The HLSL compiler, however, assumes by default that one register (4 floats) contains one column (column-major). Assuming row vectors, the order of multiplication is this:
v * M
Here the compiler can create very efficient code, because the multiplication is just 4 dp4 (dot product) instructions. However, the C code packed the matrix "wrong": setting the shader constants puts a row into each register, not a column, and so the calculation yields nonsense.

You have 3 options:
1. Change the order of multiplication, as already mentioned. This makes the compiler assume a column vector:
M * v
So you actually "cheat" by making an implicit matrix transpose.
2. Use the already mentioned compiler option so the compiler assumes that a register contains a row, not a column. The problem is that the compiler must create less efficient code here (4 vector * scalar multiplications and additions); this needs 3 more instructions.
3. Transpose the matrix before setting it as a shader constant. Now it is actually column-major, the multiplication order is the same as in your C++ code, and the GPU code is optimal.
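A rough C++ sketch of option 3, assuming a toy model in which the shader constants are four registers of four floats each and v * M is evaluated as four dot products against those registers. The `Regs` type and the function names are hypothetical and only simulate the idea on the CPU:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Regs = std::array<Vec4, 4>;  // four constant registers, four floats each

// Stand-in for a single dp4 instruction.
float dp4(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Shader-side v * M when each register is assumed to hold one COLUMN of M.
Vec4 mul_assuming_column_registers(const Vec4& v, const Regs& regs) {
    return { dp4(v, regs[0]), dp4(v, regs[1]), dp4(v, regs[2]), dp4(v, regs[3]) };
}

// CPU reference: v * M with M stored as rows, m[row][col].
Vec4 mul_row_vector(const Vec4& v, const Regs& m) {
    Vec4 r{};
    for (std::size_t j = 0; j < 4; ++j)
        for (std::size_t k = 0; k < 4; ++k)
            r[j] += v[k] * m[k][j];
    return r;
}

Regs transpose(const Regs& m) {
    Regs r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[j][i] = m[i][j];
    return r;
}

// True when transposing before the upload reproduces the CPU result.
bool transpose_before_upload_matches() {
    // Row-vector translation by (10, 20, 30): the translation sits in the 4th row.
    const Regs m = {{ {1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {10, 20, 30, 1} }};
    const Vec4 v = {1, 2, 3, 1};
    const Vec4 expected = mul_row_vector(v, m);

    // Uploading the rows unchanged puts a row into each register,
    // so the dot-product path actually computes v * M^T.
    const Vec4 untransposed = mul_assuming_column_registers(v, m);
    assert(untransposed != expected);

    // Option 3: transpose first, so each register really holds a column.
    const Vec4 transposed = mul_assuming_column_registers(v, transpose(m));
    return transposed == expected;
}
```

In this model the shader-side expression stays v * M; only the data handed to the registers changes.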
cdoubleplusgood Posted October 24, 2013 (edited)
Quoting Lactose: "To me, that looks like you need to change your multiplication nesting."
Matrix multiplication is associative, so the nesting cannot be wrong.
Lactose Posted October 24, 2013
Quoting cdoubleplusgood (replying to "To me, that looks like you need to change your multiplication nesting."): "Matrix multiplication is transitive, so the nesting cannot be wrong."
After sleeping and resetting my brain, I don't really know how I came up with my answer. That said, don't you mean matrix multiplication is associative?
X(YZ) = (XY)Z
cdoubleplusgood Posted October 24, 2013
Quoting Lactose: "That said, don't you mean matrix multiplication is associative? X(YZ) = (XY)Z"
Yes, of course. My mistake.
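A small C++ check of the point about nesting, using arbitrary 2x2 integer matrices chosen only for illustration: regrouping the product leaves the result unchanged, while swapping the operands does not.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat2 = std::array<std::array<int, 2>, 2>;  // indexed as m[row][col]

// Plain matrix product a * b.
Mat2 mul(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j)
            for (std::size_t k = 0; k < 2; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// True when nesting does not matter but operand order does, for these samples.
bool nesting_is_free_but_order_is_not() {
    const Mat2 x = {{ {1, 2}, {3, 4} }};
    const Mat2 y = {{ {0, 1}, {1, 0} }};
    const Mat2 z = {{ {2, 0}, {1, 3} }};

    // Associativity: (XY)Z == X(YZ).
    const bool same_when_regrouped = mul(mul(x, y), z) == mul(x, mul(y, z));
    assert(same_when_regrouped);

    // No commutativity in general: XY != YX for these matrices.
    const bool same_when_swapped = mul(x, y) == mul(y, x);
    return same_when_regrouped && !same_when_swapped;
}
```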
Hodgman Posted October 24, 2013 (edited)
Regarding compiler optimizations, it's the same across the CPU and GPU: where the GPU can pack one column per register to optimize multiplications, the CPU can also pack one column per SSE register. However, column-major isn't always the most efficient.

When you have a vector (Vector4, float4, etc.), you can interpret it either as a "column vector" (i.e. a Matrix4x1: 4 rows and 1 column) or as a "row vector" (i.e. a Matrix1x4: 1 row and 4 columns). Depending on which interpretation you use, the correct way to set up your transformation matrices is different, e.g. whether you put the translation in the 4th row or the 4th column of a transform matrix.

If you're treating your vectors as column vectors and you store your matrix values in column-major storage order, the compiler will be able to perform the above-mentioned optimizations. Likewise, if you're treating your vectors as row vectors, the compiler can produce better code if your matrix values are stored in row-major storage order.

The choice is arbitrary, but it completely changes the correct order of multiplication when concatenating transform matrices, so it's best not to mix and match which interpretation you're using.

When multiplying matrices, the number of columns in the left matrix has to match the number of rows in the right matrix. That means you can multiply a 2x3 with a 3x4, but it's not valid to multiply a 3x4 with a 2x3. So if you've got a 1x3 vector, you can transform it with a 3x4 matrix (with the vector on the left and the matrix on the right only). And if you've got a 3x1 vector, you can transform it with a 4x3 matrix (with the vector on the right and the matrix on the left only).

Typically, mathematicians are taught to use the column-vector convention, where you write your vectors as a column with x, y and z stacked vertically, but for some reason early computer graphics packages often used the row-vector convention, where you'd write v = | x y z |.
In the fixed-function days, GL chose the column-vector convention, whereas D3D chose the row-vector convention, hence D3DX's matrix classes still show this legacy. These days, though, you're free to choose either convention, and to choose either storage layout in GLSL/HLSL (you can even mix it up: use column vectors and create matrices designed to transform column vectors, but store them in row-major order). It's confusing when some researchers use one convention and others use a different one, so these days I think most people are switching over to column vectors.

Confusingly, you can mix both conventions in the one application, and many D3D applications do exactly this. If you use D3DX to generate a matrix that's designed to transform a row vector (and is stored in row-major order), and you copy it over to HLSL (which by default uses column-major storage order), then specifying the wrong storage convention has exactly the same effect as transposing the matrix. If you've got a matrix that was designed to transform row vectors and you transpose it, you end up with a matrix that's designed to transform column vectors (with the opposite multiplication order required).

All this cancelling out means that you can write CPU-side code that uses one convention (and one correct order of multiplication), but when you send the variables over to the GPU, everything still works except that the correct order of multiplication has changed, because you've basically switched the interpretation of your vectors and silently transposed your matrices. I wouldn't recommend this, as it's confusing, but many people end up doing it without realizing...
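A short C++ sketch of the "wrong storage convention equals transpose" point, assuming a matrix is just 16 consecutive floats whose meaning depends on how the reader indexes them. The two accessor functions are hypothetical helpers, not part of D3DX or HLSL:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Flat16 = std::array<float, 16>;  // 16 consecutive floats, as copied to the GPU

// Element (row, col) when the 16 floats are interpreted as row-major.
float read_row_major(const Flat16& f, std::size_t row, std::size_t col) {
    return f[row * 4 + col];
}

// Element (row, col) when the same 16 floats are interpreted as column-major.
float read_col_major(const Flat16& f, std::size_t row, std::size_t col) {
    return f[col * 4 + row];
}

// True when the column-major reading equals the transpose of the row-major one.
bool reinterpretation_is_transpose() {
    // Row-vector translation by (7, 8, 9), written row-major:
    // the translation occupies the 4th row.
    const Flat16 f = {1, 0, 0, 0,
                      0, 1, 0, 0,
                      0, 0, 1, 0,
                      7, 8, 9, 1};
    assert(read_row_major(f, 3, 0) == 7.0f);

    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            if (read_col_major(f, r, c) != read_row_major(f, c, r))
                return false;

    // Read as column-major, the translation now sits in the 4th column,
    // which is where a column-vector transform expects it.
    return read_col_major(f, 0, 3) == 7.0f &&
           read_col_major(f, 1, 3) == 8.0f &&
           read_col_major(f, 2, 3) == 9.0f;
}
```

No bytes move in this sketch; only the indexing rule changes, which is why the mix-up goes unnoticed as long as the multiplication order is flipped to match.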
Six222 Posted October 24, 2013
Thanks for the great replies guys!