How are 4x4 matrices faster?


Are 3D engines with 4x4 matrices faster than those with 3x3 matrices? If so, that isn't true in my case: both my raytracer and my rasterizer run faster with a 3x3 matrix plus a position vector than with a 4x4 matrix, simply because the 3x3 case requires far fewer multiplications. So how do the 4x4 engines end up faster? Is it because they don't allow general 4x4 matrices and instead make assumptions about their contents? Also, do such engines work with 3-component vectors (where the 4th component is implicitly 1), or with homogeneous 4-component vectors whose 4th component can be anything?

Share on other sites
Well, in the general case, transforming a 4-vector by a 4x4 matrix requires 16 muls and 12 adds.

With a 3x3 matrix, a position offset and a 3-vector, you need 9 muls and 9 adds: 9 muls and 6 adds for the matrix product, plus 3 adds for the offset. (Just calculating in my head, check with pen & paper that it really adds up.)
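The counts above can be checked with a small sketch. The helper names `transform_4x4` and `transform_3x3_offset` are made up for illustration and do not come from any engine; they do the transform in the straightforward way and tally the scalar operations as they go.

```python
# Hypothetical helpers (not from any engine) that transform a point and
# count the scalar multiplications and additions actually performed.

def transform_4x4(m, v):
    """General 4x4 matrix times 4-vector. Returns (result, muls, adds)."""
    muls = adds = 0
    out = []
    for row in m:
        acc = row[0] * v[0]
        muls += 1
        for k in range(1, 4):
            acc += row[k] * v[k]
            muls += 1
            adds += 1
        out.append(acc)
    return out, muls, adds


def transform_3x3_offset(r, t, v):
    """3x3 matrix times 3-vector, plus a position offset t."""
    muls = adds = 0
    out = []
    for i in range(3):
        acc = r[i][0] * v[0]
        muls += 1
        for k in range(1, 3):
            acc += r[i][k] * v[k]
            muls += 1
            adds += 1
        acc += t[i]  # the translation costs one extra add per component
        adds += 1
        out.append(acc)
    return out, muls, adds


translate = [[1, 0, 0, 10], [0, 1, 0, 20], [0, 0, 1, 30], [0, 0, 0, 1]]
identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(transform_4x4(translate, [1, 2, 3, 1]))
# ([11, 22, 33, 1], 16, 12)
print(transform_3x3_offset(identity3, [10, 20, 30], [1, 2, 3]))
# ([11, 22, 33], 9, 9)
```

Both paths give the same point; the difference is only in the amount of scalar work, which is what matters for a pure software engine.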

But the problem with a 3x3 matrix for the transformation part and a 3-vector for the translation part is that combining geometrical transformations is a bit more of a pain. Back in the software engine days, we always used these 3x3 / 3 matrix/translation combinations, but nowadays it's just a lot easier to use a single 4x4 matrix and a 4D homogeneous vector to do the math, since a single 4x4 matrix can represent any transformation, translation or projection we can imagine in 3D space. Using 3x3 matrices you can't represent perspective projection (you need a homogeneous vector for that).
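To make the "bit more of a pain" concrete, here is a minimal sketch of combining two matrix/translation pairs by hand, next to the same thing done as one 4x4 multiply. All function names are hypothetical, and the pair convention (apply `r`, then add `t`) is an assumption.

```python
# Sketch: composing two (3x3 matrix, offset) pairs by hand versus
# multiplying two 4x4 matrices. Convention assumed: p' = r * p + t.

def mat3_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat3_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def compose_affine(r2, t2, r1, t1):
    """Pair equivalent to applying (r1, t1) first, then (r2, t2)."""
    r = mat3_mul(r2, r1)
    t = [a + b for a, b in zip(mat3_vec(r2, t1), t2)]  # r2 * t1 + t2
    return r, t


def to_4x4(r, t):
    """Pack a (3x3, offset) pair into one 4x4 matrix."""
    return [list(r[i]) + [t[i]] for i in range(3)] + [[0, 0, 0, 1]]


def mat4_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


r1, t1 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]], [1, 2, 3]   # rotate 90 deg about z, then move
r2, t2 = [[2, 0, 0], [0, 2, 0], [0, 0, 2]], [10, 0, 0]   # scale by 2, then move

by_hand = to_4x4(*compose_affine(r2, t2, r1, t1))
single = mat4_mul(to_4x4(r2, t2), to_4x4(r1, t1))
print(by_hand == single)  # True
```

The results agree; with the 4x4 form the bookkeeping for the offset is simply folded into the matrix product.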

Usually the 3x3/3 engines work with plain old 3D x,y,z vectors and do the perspective projection manually, without any matrix representation (a 3x3 matrix can't express it anyway). No homogeneous stuff needed then.
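As a rough sketch of the two approaches, assume a pinhole camera at the origin looking down +z with the image plane at distance `d` (that setup and the function names are assumptions, not something from a particular engine). The manual version just divides by z; the 4x4 version copies z/d into w and divides by w afterwards.

```python
def project_manual(p, d):
    """Manual perspective projection onto the plane z = d (no matrix)."""
    x, y, z = p
    return (d * x / z, d * y / z)


def project_homogeneous(p, d):
    """Same projection via a 4x4 matrix followed by a divide by w."""
    m = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0 / d, 0.0],  # last row makes w = z / d
    ]
    v = [p[0], p[1], p[2], 1.0]
    x, y, z, w = (sum(m[i][k] * v[k] for k in range(4)) for i in range(4))
    return (x / w, y / w)


print(project_manual((2.0, 4.0, 8.0), 2.0))       # (0.5, 1.0)
print(project_homogeneous((2.0, 4.0, 8.0), 2.0))  # (0.5, 1.0)
```

The divide is still there in the homogeneous version; the matrix only sets up w so that the divide can be applied uniformly after any chain of transforms.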

Share on other sites
Also keep in mind that most of those engines use SIMD extensions, such as MMX, SSE(2)(3) and/or AltiVec.
With SSE, for example, a single instruction performs 4 multiplications or 4 additions at once, so it makes no difference whether you fill 3 or all 4 of the available lanes.
Now, keeping in mind what clb already wrote, that a 4x4 matrix is more convenient because it can represent all possible operations in 3D space, using 4x4 matrices instead of 3x3 matrices is really not that big of a deal...

Share on other sites
Suppose you want to rotate a bunch of points around some centre, translate them, then project them onto a viewing surface.

With 4x4 matrices, you can express the entire operation as one matrix. Pre-multiply the matrices, then apply the result to each point.

With 3x3 matrices... you can't. 3x3 matrices can't express "rotate around a point"; they can only rotate around the origin (0,0,0). They can't translate. They can't project.
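A minimal sketch of "rotate around a point" as one pre-multiplied 4x4 matrix: translate the centre to the origin, rotate, translate back. The helper names are hypothetical, and the rotation is restricted to the z axis to keep it short.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rotate_about_point(angle, centre):
    """One 4x4 matrix: move centre to origin, rotate about z, move back."""
    cx, cy, cz = centre
    return matmul(translation(cx, cy, cz),
                  matmul(rotation_z(angle), translation(-cx, -cy, -cz)))


def apply(m, p):
    """Transform a 3D point, treating its 4th component as 1."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


m = rotate_about_point(math.pi / 2, (1.0, 0.0, 0.0))
x, y, z = apply(m, (2.0, 0.0, 0.0))
print(round(x, 9), round(y, 9), round(z, 9))  # 1.0 1.0 0.0
```

The matrix `m` is built once; every point then costs a single matrix-vector product, however many steps went into building it.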

Now imagine your scene consists of a bunch of subscenes, each translated and rotated relative to points in its parent scene. With 4x4 matrix math, each subscene requires one matrix multiplication, and each point goes through one matrix multiplication to figure out where it ends up.

With 3x3 matrix multiplication, you have to do the translations separately. Every point has to be bubbled up manually through the scene graph: it requires up to O(height of graph * number of points) matrix multiplications.
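The subscene case can be sketched like this, assuming a simple chain of local 4x4 matrices from root to leaf (the names `world_matrix` and `bubble_up` are invented for the example). `world_matrix` does one matrix multiplication per level, once; `bubble_up` shows the per-point alternative of walking every point up the chain.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def scaling(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]


def apply(m, p):
    """Transform a 3D point, treating its 4th component as 1."""
    v = [p[0], p[1], p[2], 1]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


def world_matrix(chain):
    """Compose local matrices (root first, leaf last) into one matrix."""
    m = translation(0, 0, 0)  # identity
    for local in chain:
        m = matmul(m, local)
    return m


def bubble_up(chain, p):
    """Per-point alternative: apply each level in turn, leaf to root."""
    for local in reversed(chain):
        p = apply(local, p)
    return p


chain = [translation(10, 0, 0), translation(0, 5, 0), scaling(2)]
world = world_matrix(chain)          # built once per subscene
points = [(1, 1, 1), (0, 0, 0)]

print([apply(world, p) for p in points])      # [[12, 7, 2], [10, 5, 0]]
print([bubble_up(chain, p) for p in points])  # [[12, 7, 2], [10, 5, 0]]
```

Both routes give the same positions; the composed matrix just moves the per-level work out of the per-point loop.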

Share on other sites
I've never heard that 4x4 matrices are faster, and if someone told me so I'd definitely dispute it. Perhaps you misunderstood, or someone else misinterpreted their use in modern graphics APIs.

The primary purpose of using a 4x4 matrix is to represent a transformation, which encompasses all the information needed to place a 3D object. As NotAYakk mentioned, a 4x4 matrix will hold a translation (the position) and a rotation, as well as the object's final scale. So while it's not necessarily more efficient to do all your work using a 4x4 matrix (e.g. calculating a number of rotations will be cheaper using just 3x3 matrices), presenting your 3D object to the video card as a single 4x4 matrix is a sure bet (and can be pretty fast if you align its memory properly).

Share on other sites
4x4 is faster because 3D accelerator cards are designed for them,
and the reason they are designed that way is the versatility in what transforms they can represent.
