Calculating the Final Vertex Position with MVP Matrix and another Object's Trans, Rotation, Scaling Calc...

11 comments, last by haegarr 9 years, 11 months ago

1.) Situation A: We have sub-meshes. This is usually the case if the model has more than a single material and the rendering system cannot blend materials. So the model is divided into said sub-meshes where each sub-mesh has its own material. Further, each sub-mesh has its local placement S relative to the model. If we still name the model's placement in the world M, then the local-to-world matrix W of the sub-mesh is given by

W := M * S
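As a minimal sketch of situation A, assuming GLM (whose mat4 type uses the same column-vector convention as the formulas here); the function name is just for illustration:

    #include <glm/glm.hpp>

    // M: the model's placement in the world; S: the sub-mesh's placement
    // relative to the model. With column vectors, S is applied first.
    glm::mat4 subMeshWorld(const glm::mat4& M, const glm::mat4& S)
    {
        return M * S;   // W := M * S
    }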

2.) Situation B: We have sub-meshes as above, but the sub-meshes are statically transformed, so that S is already applied to the vertices during pre-processing. The computation at runtime is then just

W := M

3.) Situation C: We have models composed of parented sub-models, e.g. rigid models of arms and legs structured in a skeleton, or rigid models of turrets on a tank. So a sub-model has a parent model, and that parent model may itself be a sub-model with another parent, up until the main model is reached. Each sub-model has its own local placement S_i relative to its parent. Corresponding to the depth of parenting, we get a chain of transformations like so:

W := M * S_n * S_(n-1) * … * S_0

where S_0 belongs to the innermost sub-model (the one that is not itself a parent of any further sub-model), and S_n belongs to the sub-model whose parent is the main model itself.
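A sketch of how such a chain could be evaluated at runtime, again assuming GLM; the Node type and its fields are hypothetical:

    #include <glm/glm.hpp>

    // Hypothetical node type: each sub-model stores its local placement
    // (S_i, or M for the main model) and a pointer to its parent.
    struct Node
    {
        glm::mat4   local{1.0f};
        const Node* parent = nullptr;   // nullptr for the main model
    };

    // Walks the parent chain upwards: W = M * S_n * ... * S_0
    glm::mat4 worldTransform(const Node& node)
    {
        glm::mat4 W = node.local;                 // start with S_0
        for (const Node* p = node.parent; p != nullptr; p = p->parent)
            W = p->local * W;                     // prepend each parent's placement
        return W;
    }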

4.) View transformation: Now that W places our models in the global "world" space, we have also placed a camera with transform C relative to the world. Because we see in view space and not in world space, we need the inverse of C and hence obtain the view transform V:

V := C^-1
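In code this is just a matrix inversion, e.g. with GLM:

    #include <glm/glm.hpp>

    // C is the camera's placement in the world; the view matrix is its inverse.
    glm::mat4 viewMatrix(const glm::mat4& C)
    {
        return glm::inverse(C);   // V := C^-1
    }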

5.) The projection P is applied in view space and yields normalized device co-ordinates.
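For example, a typical perspective projection built with GLM (the field of view, aspect ratio, and clip planes below are placeholder values):

    #include <glm/gtc/matrix_transform.hpp>

    glm::mat4 P = glm::perspective(glm::radians(60.0f),   // vertical field of view
                                   16.0f / 9.0f,          // aspect ratio
                                   0.1f, 100.0f);         // near / far clip planes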

6.) All together, the transform so far looks like

P * V * W

(still using column vectors) to come from a model local space into device co-ordinates.

Herein P may never change during the entire runtime of the game, V usually changes from frame to frame (leaving things like portals and mirrors aside), and W changes from model to model during a frame. This can be exploited by computing P * V once per frame and re-using it throughout that frame. So the question is what to do with W.
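A sketch of that per-frame structure (GLM assumed; renderFrame and the worldMatrices container are made up for illustration):

    #include <glm/glm.hpp>
    #include <vector>

    // Once per frame: combine projection and view; per model only the
    // multiplication with its world matrix W remains.
    void renderFrame(const glm::mat4& P, const glm::mat4& V,
                     const std::vector<glm::mat4>& worldMatrices)
    {
        const glm::mat4 PV = P * V;           // computed once per frame
        for (const glm::mat4& W : worldMatrices)
        {
            glm::mat4 PVW = PV * W;           // per model / per draw call
            // ... upload PVW (or PV and W separately) and issue the draw call
        }
    }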

7.) Comparing situation C with situation A shows that A looks like C with just a single level. Situation A would simply allow setting the constant M once per model and applying the varying S per sub-mesh (i.e. per render call). Using parentheses to express what I mean:

( ( P * V ) * M ) * S

But thinking about the performance of render calls, where switching textures, shaders, and so on has its cost, frequently switching materials in particular may be a no-go, so render calls with the same material are to be batched. Obviously this contradicts the simple re-use of M shown above. Instead, we get a situation where each sub-mesh's own W is computed on the CPU, and the render call gets

( P * V ) * W

BTW, this is also the typical way to deal with parenting, partly for the same reasons.

Another drawback of using both M and S is that either you supply both matrices also for stand-alone models (for which M alone would be sufficient), or else you have to double your set of shaders.

So in the end it is usual to deal with all 3 situations in a common way: compute the MODEL matrix on the CPU and send it to the shader. (Just a suggestion.) ;)
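For example, uploading the two matrices per draw call could look like the following sketch (GLEW is assumed as the GL loader; the uniform names uViewProjection and uModel are hypothetical):

    #include <GL/glew.h>
    #include <glm/glm.hpp>
    #include <glm/gtc/type_ptr.hpp>

    // PV is computed once per frame, W once per model / sub-mesh on the CPU.
    void uploadMatrices(GLuint program, const glm::mat4& PV, const glm::mat4& W)
    {
        glUseProgram(program);
        glUniformMatrix4fv(glGetUniformLocation(program, "uViewProjection"),
                           1, GL_FALSE, glm::value_ptr(PV));
        glUniformMatrix4fv(glGetUniformLocation(program, "uModel"),
                           1, GL_FALSE, glm::value_ptr(W));
    }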


Thank you for this write-up!

I will go through it in detail but I wanted to say thank you first.


I was calculating the normal matrix against the entire MVP matrix.

Would I calculate this matrix against the ViewProjection matrix instead?

How exactly the normal matrix is to be computed depends on the space in which you want to transform the normals. Often normals are used in eye space for lighting calculations. This means that the projection matrix P does not play a role when computing the normal matrix.

Normals are direction vectors and as such invariant to translation. That is why dealing with the rotational and scaling part of the transformation matrix is sufficient. So let us say that

O := mat3( V * W )

defines that portion in eye space. (Maybe you want to have it in world space instead, in which case you would drop V here.)

The correct way of computing the normal matrix is to apply the transpose of the inverse:

N := ( O^-1 )^T
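In GLM the general formula is a short helper (a sketch; the function name is just illustrative):

    #include <glm/glm.hpp>

    // Normal matrix for eye-space normals: transposed inverse of the
    // upper-left 3x3 of the model-view matrix.
    glm::mat3 normalMatrix(const glm::mat4& V, const glm::mat4& W)
    {
        glm::mat3 O = glm::mat3(V * W);           // O := mat3( V * W )
        return glm::transpose(glm::inverse(O));   // N := ( O^-1 )^T
    }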

In your situation you want to support both rotation and scaling (as said, normals are invariant to translation, so there is no need to consider translation here). With some mathematical rules at hand, we get

( O^-1 )^T = ( ( R * S )^-1 )^T = ( S^-1 * R^-1 )^T = ( R^-1 )^T * ( S^-1 )^T

Considering that R is an orthonormal basis (so that ( R^-1 )^T = R) and S is a diagonal matrix with the scaling factors along the main diagonal (so that ( S^-1 )^T = S^-1), this can be simplified to

N = R * S^-1

From this you can see that ...

a) … if you have no scaling, so S == I, then the normal matrix is identical to the mat3 of the model-view matrix.

b) … if you have uniform scaling, i.e. the scaling factor is the same in all 3 principal directions, then the inverse scaling is applied to the normal vector, which can also be seen as multiplication by a scalar:

S^-1 = S( 1/s, 1/s, 1/s ) = 1/s * I

This is the case mentioned by Kaptein above. The difference is that Kaptein suggests that, instead of preserving the length of the normal vector through the matrix transformation itself, you can use the mat3 of the model-view matrix as the normal matrix and undo the scaling it introduces simply by re-normalizing the resulting vector.
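A sketch of that re-normalization approach with GLM (valid only as long as the scaling is uniform):

    #include <glm/glm.hpp>

    // modelView3 = mat3 of the model-view matrix; used directly as normal matrix.
    glm::vec3 transformNormal(const glm::mat3& modelView3, const glm::vec3& n)
    {
        return glm::normalize(modelView3 * n);   // re-normalize to undo the uniform scaling
    }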

c) … if you have non-uniform scaling, i.e. the scaling factors in the 3 principal directions are not all the same,

S^-1 = S( 1/sx, 1/sy, 1/sz )

then you have no choice but to either compute the transposed inverse of O or else compose R and S^-1 directly (if those are known).
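As a small numerical sanity check that both routes agree, here is a GLM sketch with an arbitrary example rotation and non-uniform scaling:

    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_transform.hpp>

    // Arbitrary example rotation about the y axis and non-uniform scaling.
    glm::mat3 R  = glm::mat3(glm::rotate(glm::mat4(1.0f), glm::radians(30.0f),
                                         glm::vec3(0.0f, 1.0f, 0.0f)));
    glm::mat3 S  = glm::mat3(glm::scale(glm::mat4(1.0f), glm::vec3(2.0f, 3.0f, 4.0f)));
    glm::mat3 O  = R * S;

    glm::mat3 N1 = glm::transpose(glm::inverse(O));   // transposed inverse of O
    glm::mat3 N2 = R * glm::inverse(S);               // compositing of R and S^-1
    // N1 and N2 agree up to floating-point precision.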

