fredrikhcs

OpenGL Matrices, explanation?


I tried to read this (http://www.sjbaker.org/steve/omniv/matrices_can_be_your_friends.html), but it's very confusingly written in my opinion. For example, he writes, "Well, if we neglect the translation part (the bottom row)", and the very next thing he writes is "After that, you just add the translation onto each point so that: " but he doesn't add the bottom row, he adds something else. And stuff like that seems to be abundant on that page... So I was wondering, could someone give me an example of how to use it?

IMHO the author of the cited article wants to say the following:

An (affine) transformation matrix as used by OpenGL has a principal layout like

[ x_x y_x z_x t_x ]
[ x_y y_y z_y t_y ]
[ x_z y_z z_z t_z ]
[ 0 0 0 1 ]
If only rotation and translation are used, then x_x ... z_z denote the rotation and t_x ... t_z the translation. (BTW: Scaling and shearing are also encoded in x_x ... z_z.)

Then the author claims that such a matrix can be split into the rotational part and the translational part this way:

[ x_x y_x z_x t_x ]    [ 1 0 0 t_x ]   [ x_x y_x z_x 0 ]
[ x_y y_y z_y t_y ] == [ 0 1 0 t_y ] * [ x_y y_y z_y 0 ]
[ x_z y_z z_z t_z ]    [ 0 0 1 t_z ]   [ x_z y_z z_z 0 ]
[ 0   0   0   1   ]    [ 0 0 0 1   ]   [ 0   0   0   1 ]

And due to a mathematical rule (associativity of matrix multiplication) that says
( M1 * M2 ) * M3 == M1 * ( M2 * M3 )
one can interpret that split matrix T * R, although it is applied to a point p in a single step
( T * R ) * p
as a two-step transformation
== T * ( R * p )
that rotates the point (i.e. R * p ) and then translates the result (i.e. translates the rotated point). The order of effects is important.
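
To make the split a bit more tangible, here is a minimal sketch in plain C (no OpenGL calls). The names Mat4, Vec4, mat4_mul, mat4_apply, mat4_split and check_split are made up for this post and are not part of any API. The matrices are stored as m[row][col] so they read like the formulas above, which is an assumption for readability and not how OpenGL stores them in memory.

```c
#include <assert.h>

/* 4x4 matrix written as m[row][col], i.e. the way the formulas above are
   printed. This is NOT the memory layout OpenGL expects. */
typedef struct { float m[4][4]; } Mat4;
typedef struct { float v[4]; } Vec4;

static Mat4 mat4_identity(void) {
    Mat4 r = {{{0}}};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
    return r;
}

/* Matrix product a * b. */
static Mat4 mat4_mul(Mat4 a, Mat4 b) {
    Mat4 r = {{{0}}};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

/* Matrix times column vector, a * p. */
static Vec4 mat4_apply(Mat4 a, Vec4 p) {
    Vec4 r = {{0}};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            r.v[i] += a.m[i][k] * p.v[k];
    return r;
}

/* Split an affine matrix M into a pure translation T and a matrix R that
   keeps only the upper-left 3x3 part, so that M == T * R. */
static void mat4_split(Mat4 M, Mat4 *T, Mat4 *R) {
    *T = mat4_identity();
    *R = M;
    for (int i = 0; i < 3; ++i) {
        T->m[i][3] = M.m[i][3]; /* right-most column goes into T */
        R->m[i][3] = 0.0f;      /* and is cleared in R */
    }
}

static void check_split(void) {
    /* M: rotation by 90 degrees about x, plus translation (5, 6, 7). */
    Mat4 M = mat4_identity();
    M.m[1][1] = 0.0f; M.m[1][2] = -1.0f;
    M.m[2][1] = 1.0f; M.m[2][2] = 0.0f;
    M.m[0][3] = 5.0f; M.m[1][3] = 6.0f; M.m[2][3] = 7.0f;

    Mat4 T, R;
    mat4_split(M, &T, &R);

    Vec4 p = {{1.0f, 2.0f, 3.0f, 1.0f}};
    Vec4 one_step = mat4_apply(mat4_mul(T, R), p);   /* ( T * R ) * p */
    Vec4 two_step = mat4_apply(T, mat4_apply(R, p)); /* T * ( R * p ) */
    for (int i = 0; i < 4; ++i)
        assert(one_step.v[i] == two_step.v[i]);
}
```

Calling check_split() runs the example: both ways of applying the transformation give the same point, which is all the parentheses rule claims. The values are small whole numbers on purpose, so the float comparison is exact.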

I'm not exactly an expert when it comes to matrices, but this is my take on the page;

In most programming, you would expect an array of 9 elements (let's say int array[9]) to be laid out (if you were using it as a square) like so;

[0][1][2]
[3][4][5]
[6][7][8]

So basically, your data is stored in what is known as row-centric (usually called row-major) order. This just means that the elements of one row sit next to each other in memory, e.g. element 1 is next to element 2.

Most mathematicians, and OpenGL, treat the data like so;

[0][3][6]
[1][4][7]
[2][5][8]

which is called a column-centric (usually called column-major) matrix. Notice that the data is laid out differently from the row-centric matrix.

Now, a 3x3 matrix, like above, only allows you to rotate, scale and shear your object.

To keep the math as simple as possible, the matrix needs to be kept as a square, but we need some way to store the position of the object in the matrix. To do this, we make the matrix 4x4, which gives us (keeping the OGL format);

[0] [4] [8] [12]
[1] [5] [9] [13]
[2] [6] [10] [14]
[3] [7] [11] [15]


In this way, the translation goes in elements [3],[7] and [11], as x,y and z respectively.

Have a look at the matrix the writer shows you. It looks something like this;

[1][0][0]
[0][1][0]
[0][0][1]
[0][0][0] <-- this is the part you need to notice


The author has made the matrix a 4x3 matrix, rather than keep to the 4x4 matrix he used earlier. When he says to add the translation onto each point, he means take each value of the new position, and add it to the relevant value in the last line of the matrix. I hope this helps with some of your confusion about this.

I know that matrix math isn't that easy to understand, but if you understand it you have far fewer problems interpreting how 3D works. Trust me. So please forgive me, but the following is again some matrix math ;)

If you compose a transformation matrix with OpenGL's API on OpenGL's so-called MODELVIEW matrix stack, you do something like
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glTranslatef(tx,ty,tz);
glRotatef(alpha,ax,ay,az);

If you wonder why I use glLoadIdentity: It is useful to initialize the top of OpenGL's matrix stack with a defined value. All those glRotate and glTranslate routines always _multiply_ their matrix onto what is already on the stack; see below.

Mathematically this is described by composing a transformation matrix
( I * T(tx,ty,tz) ) * R(alpha,ax,ay,az) =: M
Herein I denotes the identity matrix, R the rotation resulting from glRotatef, and T the translation resulting from glTranslatef.

Notice the order of matrices in the formula from left to right and the order of API calls from top to bottom is the same!

Then you push a vertex into the API, e.g. (using the immediate mode)
glBegin(GL_POINTS); // a single vertex only forms a complete primitive as a point
glVertex3f(px,py,pz);
glEnd();
with the above transformation being active, which is hence applied to the vertex position, yielding the transformed vertex position (I'm dropping the transformation arguments from here on since I'm too lazy to write them down, okay?)
p' := M * p == ( ( I * T ) * R ) * p

What does this mean? It means that OpenGL has a transformation on its stack, namely a composition of a translation and a rotation. This composed transformation is applied as a whole (notice the parentheses) to the vertex position.

Now let us inspect the effect of the transformation. Multiplying the identity matrix with another matrix has no effect, so that
== ( T * R ) * p

We already know that the parentheses play no role w.r.t. the result (see my previous post), so that we can _interpret_ the result (although it is not being applied in this way) as
== T * ( R * p )
which means nothing other than that the vertex position is rotated, and the rotated result is translated. Voila.

Now, isn't it the same as translating the vertex and rotating the result (i.e. the other order)? No, it isn't (in general)! You can simply construct an example: Say, you use
glTranslatef(0,1,0);
glRotatef(90,1,0,0);
glBegin(GL_POINTS);
glVertex3f(0,1,0);
glEnd();
Using what is written above, OpenGL does
rotating [ 0,1,0 ] by 90 degrees around the x axis, yielding [ 0,0,1 ]
translating [ 0,0,1 ] by [ 0,1,0 ], yielding [ 0,1,1 ]

The other way
glRotatef(90,1,0,0);
glTranslatef(0,1,0); // <-- EDIT here was an error
glBegin(GL_POINTS);
glVertex3f(0,1,0);
glEnd();
does
translating [ 0,1,0 ] by [ 0,1,0 ], yielding [ 0,2,0 ]
rotating [ 0,2,0 ] by 90 degrees around the x axis, yielding [ 0,0,2 ]
which obviously differs from the first result.
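
If you want to check the two results without a GL context, here is a small sketch in plain C. Vec3, rotate_x_90, translate, vec3_equal and check_order_example are names invented for this post. The 90 degree rotation about the x axis is hard-coded as (x, y, z) -> (x, -z, y), so no trigonometry and no rounding is involved.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Rotate by +90 degrees about the x axis: (x, y, z) -> (x, -z, y). */
static Vec3 rotate_x_90(Vec3 p) {
    Vec3 r = { p.x, -p.z, p.y };
    return r;
}

/* Translate p by t. */
static Vec3 translate(Vec3 p, Vec3 t) {
    Vec3 r = { p.x + t.x, p.y + t.y, p.z + t.z };
    return r;
}

static int vec3_equal(Vec3 a, Vec3 b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

static void check_order_example(void) {
    const Vec3 p = { 0.0f, 1.0f, 0.0f }; /* the vertex */
    const Vec3 t = { 0.0f, 1.0f, 0.0f }; /* the translation */

    /* glTranslatef, then glRotatef: M = T * R, the vertex is rotated first. */
    Vec3 a = translate(rotate_x_90(p), t);

    /* glRotatef, then glTranslatef: M = R * T, the vertex is translated first. */
    Vec3 b = rotate_x_90(translate(p, t));

    assert(vec3_equal(a, (Vec3){ 0.0f, 1.0f, 1.0f }));
    assert(vec3_equal(b, (Vec3){ 0.0f, 0.0f, 2.0f }));
    assert(!vec3_equal(a, b)); /* the order matters */
}
```

The nesting of the function calls mirrors the parentheses in T * ( R * p ) and R * ( T * p ): the innermost operation is the one that hits the vertex first.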

However, sometimes you apply a transformation that is already composed, using
glMultMatrixf(...);
instead of the particular transformations glRotatef, glTranslatef, ... Then, and that is the stuff of my previous post, one can decompose the transformation matrix and interpret it as
T * R
and that is, with the above explanations in mind, equivalent to
glTranslatef(...);
glRotatef(...);

_That_ is what the article says, as far as the cited section is concerned.

[Edited by - haegarr on August 10, 2008 1:17:01 PM]

@webwraith: You're mainly correct, but not totally.

Considering that OpenGL uses column vectors, and further uses the 4th scalar of a (homogeneous) vector as the homogeneous co-ordinate, the translation must be in the right-most column, not in the bottom row. Hence, if using a 4x4 matrix, it looks like
[ 1 0 0 t_x ]
[ 0 1 0 t_y ]
[ 0 0 1 t_z ]
[ 0 0 0 1 ]
and for a 4x3 matrix, it looks like
[ 1 0 0 t_x ]
[ 0 1 0 t_y ]
[ 0 0 1 t_z ]
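
In terms of the 16-float array that glLoadMatrixf and glMultMatrixf take (column-major), the right-most column corresponds to the array elements 12, 13 and 14. A minimal sketch, where gl_index, make_translation and check_translation_layout are helper names invented for this post:

```c
#include <assert.h>

/* Index of element (row, col) in a 16-float array laid out column by
   column, as expected by glLoadMatrixf / glMultMatrixf. */
static int gl_index(int row, int col) {
    return col * 4 + row;
}

/* Fill m with a pure translation matrix. */
static void make_translation(float m[16], float tx, float ty, float tz) {
    for (int i = 0; i < 16; ++i) m[i] = 0.0f;
    for (int i = 0; i < 4; ++i) m[gl_index(i, i)] = 1.0f; /* identity */
    m[gl_index(0, 3)] = tx; /* right-most column, row 0 */
    m[gl_index(1, 3)] = ty; /* right-most column, row 1 */
    m[gl_index(2, 3)] = tz; /* right-most column, row 2 */
}

static void check_translation_layout(void) {
    float m[16];
    make_translation(m, 5.0f, 6.0f, 7.0f);

    /* The translation ends up in elements 12, 13, 14 ... */
    assert(m[12] == 5.0f && m[13] == 6.0f && m[14] == 7.0f);
    /* ... while elements 3, 7, 11 (the bottom row) stay 0 and 15 stays 1. */
    assert(m[3] == 0.0f && m[7] == 0.0f && m[11] == 0.0f && m[15] == 1.0f);
}
```

An array filled this way could be handed to glMultMatrixf(m) and should then behave like glTranslatef(5,6,7).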

Math is tricky. You must choose a methodical approach or you'll get lost in nowhere land in the blink of an eye. Chances are you'll give up learning math altogether if you choose the wrong approach in the beginning, because sooner or later you'll reach a point where you conclude "Well, math is not my thing!". It is your thing, my friend, but you need to take a slow, methodical approach or you'll get lost sooner than you expect. You can't expect to learn 3D math (or any other field of science) by reading a tutorial or two, because math is no picnic. You can't rush it. You need to invest a lot of time in learning it. It's tough, I know, but so is life, and there is no easy way around either of them!

My apologies if that doesn't apply to you, in which case you can skip this post altogether, but if it does, please take a moment to consider it for your own good.

I don't care what those tutorials are advertising, but there is no fast way to learn math or programming or any other scientific field. I hold a BSc. in Computer Science, have read a lot of math-related books over the years and I do enjoy thinking about mathematical problems, so I'm neither lazy nor lacking proper training, but even I get stuck at math quite a lot, because as I said, math is no picnic.

I strongly suggest that you pick up a copy of "3D Math Primer for Graphics and Game Development" by Fletcher Dunn and Ian Parberry from your favorite library or bookstore and start reading it right away. It's the best elementary book on 3D math that I've come across. There IS a difference between reading 400 pages of 3D math and skimming over a couple of tutorials: you'll be much more knowledgeable if you invest more time and choose the right approach.

... and no, I'm not affiliated with the authors of this book in any way, shape or form. [smile]

Good luck!

[Edited by - Ashkan on August 10, 2008 7:28:14 PM]

Quote:
Original post by haegarr
and for a 4x3 matrix, it looks like
[ 1  0  0  t_x ]
[ 0 1 0 t_y ]
[ 0 0 1 t_z ]


Minor detail: That's a 3x4 matrix.

Quote:
Original post by Prefect
Quote:
Original post by haegarr
and for a 4x3 matrix, it looks like
[ 1  0  0  t_x ]
[ 0 1 0 t_y ]
[ 0 0 1 t_z ]


Minor detail: That's a 3x4 matrix.


I disagree. :)
Col x Row.

Quote:
Original post by haegarr
@webwraith: You're mainly correct, but not totally.
...

Thank you for pointing that out, but the example on the page is the same as the 4x3 matrix that you can see in my post. Perhaps the confusion then lies in whether the author of that page is using row- or column-centric matrices at that point?

[Edited by - webwraith on August 17, 2008 2:37:39 PM]

Quote:
Original post by Rasmadrak
I disagree. :)
Col x Row.

Every reference I've ever seen on the topic uses the convention Row x Col. I'm pretty sure this convention is used more or less without exception, but if you can provide an example to the contrary I'd be interested to see it.

Quote:
Original post by jyk
Quote:
Original post by Rasmadrak
I disagree. :)
Col x Row.
Every reference I've ever seen on the topic uses the convention RowXCol. I'm pretty sure this convention is used more or less without exception, but if you can provide an example to the contrary I'd be interested to see it.


This is also the form I have been taught, a matrix which has 4 rows (m) and 3 columns (n) is denoted an m*n or a 4*3 matrix.

Quote:
Original post by webwraith
Do your references include OpenGL? :)

I assume this is in reference to my post? If so, then yeah, of course they include OpenGL :)

Again, I've never come across a reference (OpenGL or otherwise) that uses the convention Col-Row when referring to matrix dimensions or to individual matrix elements. Can you point me to a reference that contradicts this? (And I'm not being snide - if there is such a reference, I would really like to be aware of it!).

Just to eliminate potential confusion, note that we are talking here about a specific aspect of mathematical notation: whether the row or column is listed first when describing matrix dimensions or specifying a matrix element.

Note that vector notation (row or column) and matrix storage (row- or column-major) are entirely separate and unrelated issues. (I'm guessing this is where you're getting confused...)

I didn't intend to derail this thread in such a way, but since it's at least partially on-topic... the choice of the Rows*Cols notation is not arbitrary.

Say you have two matrices A and B, where A is an m*n matrix and B is a k*l matrix. Then the matrix product A*B is defined iff n = k. When you write it down on paper, you'll see
 A  *  B
m*n   k*l

It gets even more obvious when multiplying more than two matrices. The rule is sometimes formulated as "the inner dimensions of a matrix product must agree". So once you're used to it, everything flows naturally because the definitions are very consistent. [Also, think about how this works with vector-matrix or matrix-vector multiplication.]
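
As a rough sketch of that rule in C (Dim, product_dim and check_dims are names made up for this post): a product is defined only when the inner dimensions agree, and the result takes the outer dimensions.

```c
#include <assert.h>

/* Matrix dimensions in the Rows*Cols convention. */
typedef struct { int rows, cols; } Dim;

/* Returns 1 and writes the dimensions of a * b to *out if the product is
   defined (inner dimensions agree), otherwise returns 0. */
static int product_dim(Dim a, Dim b, Dim *out) {
    if (a.cols != b.rows) return 0;
    out->rows = a.rows; /* outer dimension on the left */
    out->cols = b.cols; /* outer dimension on the right */
    return 1;
}

static void check_dims(void) {
    const Dim m34 = { 3, 4 }; /* 3 rows, 4 columns */
    const Dim m44 = { 4, 4 };
    const Dim v41 = { 4, 1 }; /* homogeneous column vector */
    Dim out;

    /* (3*4) * (4*1): inner dimensions 4 and 4 agree, result is 3*1. */
    assert(product_dim(m34, v41, &out) && out.rows == 3 && out.cols == 1);

    /* (3*4) * (4*4): defined, result is 3*4. */
    assert(product_dim(m34, m44, &out) && out.rows == 3 && out.cols == 4);

    /* (4*1) * (3*4): inner dimensions 1 and 3 differ, not defined. */
    assert(!product_dim(v41, m34, &out));
}
```

The first case is the one from this thread: a matrix with 3 rows and 4 columns can be multiplied with a homogeneous column vector, and what comes out is a plain 3D column vector.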

Hope this helps in memorizing how things work.
