belfegor

Homogenous space


I am reading the book "3D Math Primer for Graphics and Game Development", and I am stuck on this topic, as the book does not explain it well.


To understand how the standard physical 3D space is extended into 4D, let's first examine
homogenous coordinates in 2D, which are of the form (x, y, w). Imagine the standard 2D plane
existing in 3D at the plane w = 1. So the physical 2D point (x, y) is represented in homogenous
space (x, y, 1). For all points that are not in the plane w = 1, we can compute the corresponding 2D
point by projecting the point onto the plane w = 1 and dividing by w. So the homogenous coordinate (x, y, w) is mapped to the physical 2D point (x/w, y/w).

 

1. Where does this w = 1 come from?

2. If some point is outside of this plane w = 1, does it have a different w value?

3. Is w the absolute distance to the plane?

4. If I divide points by w, which is 1, then I get the same points, so what is the point?

5. Is there any other book you could recommend that explains this w thingy better?

 

Thank you for your time.


Not a mathematical explanation, but a practical one:

 

It's useful when doing the perspective transformation.

This w is what is responsible for things further away looking smaller, and things closer looking bigger.

For 3D graphics, you let w=1 be the projective plane, the screen, and points further away will have w > 1, and closer will have w < 1.

The division will map points from the projective space to screen space.
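As a rough illustration of the division described above, here is a minimal sketch. It assumes a camera at the origin with the screen plane at w = 1, so that w is simply the depth of the point; the function name `to_screen` is made up for this example.

```python
def to_screen(x, y, w):
    """Map a homogeneous point (x, y, w) to screen space by dividing by w."""
    if w == 0:
        raise ValueError("w = 0 has no position on the screen plane")
    return (x / w, y / w)


# The same offset (1, 1) seen at three different depths.
near = to_screen(1.0, 1.0, 0.5)      # closer than the screen plane, w < 1
on_plane = to_screen(1.0, 1.0, 1.0)  # exactly on the screen plane, w = 1
far = to_screen(1.0, 1.0, 4.0)       # further away, w > 1

print(near, on_plane, far)  # (2.0, 2.0) (1.0, 1.0) (0.25, 0.25)
```

The point with w < 1 lands further from the screen centre (it looks bigger), and the point with w > 1 lands closer to it (it looks smaller).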

 

Hope that gives some hint

Edited by Olof Hedman


First of all, the correct (or at least more common) name of this space is (real) projective space and not homogeneous space. A homogeneous space is usually something different in mathematics*. I've never seen the term homogeneous space used in this context.

 

The term projective space comes from the fact that it is the smallest/most natural space in which (central) projections can be defined. You can visualize this space as the set of lines passing through the origin in the (real) vector space of one dimension higher. So the projective plane (the projective space of dimension two) is, for example, the set of 3D lines of the form

 

P(t) = t * (X, Y, Z).

 

Each line can be identified by any of its non-zero points. The homogeneous coordinates of a point of the projective space are the coordinates of one of the non-zero points of the corresponding line and are written between square brackets (and often separated by colons): [X0 : X1 : ... : Xn]. These homogeneous coordinates are not unique: two sets of coordinates which differ by a non-zero scalar multiple represent the same point. The division by W is simply a way to choose a unique representative of a point. You basically represent each line by its intersection with the plane W=1. The lines parallel to that plane cannot be represented in this way. They are called points at infinity and they usually represent directions. Note that dividing by W is just a convention: it is possible to divide by any other coordinate or to use any other plane that does not pass through the origin.
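To make the "same point, different coordinates" idea concrete, here is a small sketch for the projective plane. The helper name `normalize` is an assumption for this example; it picks the representative on the plane W = 1 and returns `None` for points at infinity.

```python
def normalize(p):
    """Pick the representative of p whose last coordinate is 1 (divide by W)."""
    *xs, w = p
    if w == 0:
        return None  # point at infinity: the line never meets the plane W = 1
    return tuple(x / w for x in xs) + (1.0,)


a = (2.0, 4.0, 2.0)  # [2 : 4 : 2]
b = (3.0, 6.0, 3.0)  # 1.5 * a, so it lies on the same line through the origin
c = (1.0, 2.0, 0.0)  # W = 0: a direction, not a position

print(normalize(a))  # (1.0, 2.0, 1.0)
print(normalize(b))  # (1.0, 2.0, 1.0)
print(normalize(c))  # None
```

Both `a` and `b` reduce to the same representative, which is the sense in which they are the same projective point.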

 

So why use a space like that for projections? The main reason is that a projection transforms a point with coordinates [x : y : z : 1] into some point [X : Y : Z : W] where W is no longer equal to one. It is thus necessary to divide by W to get back a 3D point. This is the main reason homogeneous coordinates are used in computer graphics.

 

The projection matrices are usually chosen so that the W coordinate basically represents the depth of the transformed point and the view frustum is mapped, after the W divide, to the cube with all three coordinates inside [-1,1] (the conventions actually differ between DirectX and OpenGL here). But this is just a useful convention.
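The following sketch checks this on one concrete matrix. It assumes the OpenGL-style convention (camera looking down -z, depth mapped to [-1, 1]) and a symmetric frustum; the names `perspective`, `project` and the parameters `n`, `f`, `r`, `t` (near, far, right, top) are chosen for this example only.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def perspective(n, f, r, t):
    """OpenGL-style symmetric frustum matrix (an assumed convention)."""
    return [
        [n / r, 0.0, 0.0, 0.0],
        [0.0, n / t, 0.0, 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],  # copies -z (the depth) into W
    ]


def project(m, point):
    """Return the point after the W divide, together with W itself."""
    x, y, z, w = mat_vec(m, list(point) + [1.0])
    return (x / w, y / w, z / w), w


m = perspective(n=1.0, f=3.0, r=1.0, t=1.0)
ndc_near, w_near = project(m, (1.0, 1.0, -1.0))  # top-right corner, near plane
ndc_far, w_far = project(m, (3.0, 3.0, -3.0))    # top-right corner, far plane

print(ndc_near, w_near)  # (1.0, 1.0, -1.0) 1.0
print(ndc_far, w_far)    # (1.0, 1.0, 1.0) 3.0
```

The two frustum corners end up on corners of the [-1,1] cube, and W equals the depth of each input point.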

 

* It is usually used to denote some kind of space with a transitive G-action.

Edited by apatriarca


@Olof Hedman

Thank you for nice explanation.

 

@apatriarca

Well, what you have written is all nice, but I reread it multiple times and I still can't get it.

 


First of all, the correct (or at least more common) name of this space is (real) projective space and not homogeneous space. A homogeneous space is usually something different in mathematics*. I've never seen the term homogeneous space used in this context.

I don't care about math concepts; I just want to learn the basic math applications used in graphics and game development.

"Homogenous space" was the topic label in the book, so how could I know which name is correct?

 


The homogeneous coordinates of a point of the projective space are the coordinates of one of the non-zero points of the corresponding line and are written between square brackets (and often separated by colons): [X0 : X1 : ... : Xn]. These homogeneous coordinates are not unique: two sets of coordinates which differ by a non-zero scalar multiple represent the same point...

...transitive G-action..

I am totally lost from here; it is too abstract and generalized.

Do you realize this thread is in the "For Beginners" forum? Could you possibly put this in an easily understandable way?

Maybe I could visualize it if I had a practical example, so I could see in what space/ranges these points/directions lie and what the division by w means and results in.

 

I have a feeling that most of you on this forum have a need to brag about your knowledge, but that doesn't help those of us who want to learn something. Unfortunately, most of the authors who write books have that need too, which makes learning a lot harder than it needs to be. No disrespect meant.

 

Thank you for understanding and time.


First of all, the correct (or at least more common) name of this space is (real) projective space and not homogeneous space.

Yeah, AFAIK, the word "space" doesn't belong there. We're contrasting 4D homogeneous coordinates to 3D/2D Cartesian coordinates.

 

Personally, I call it post-projection space, and after the divide by w, I call it NDC-space.

 

We (programmers) generally use "blah-space" to refer to some particular "basis" (view-space, world-space, post-projection space, etc...), but "w" comes into play not just because we're using some particular basis/space, but because we've switched from one coordinate system to another.

 

I.e. when working in object-space, view-space or world-space, we generally use 3D Cartesian coordinates to identify points. When we're dealing with post-projection-space, we switch over to using 4D homogeneous coordinates to represent our points. We then transform them back to 2D Cartesian coordinates in screen/viewport-space to identify pixels.

 

The reason we switch over to using 4D homogeneous coordinates in order to implement perspective is that all our linear algebra (matrix/vector math) is... well... linear... whereas perspective is a non-linear effect! I.e. you want to scale something by "1/distance", which, when you plot it, isn't a straight line (not linear).

By working with 4D coordinates, where we say "we will divide by w later on", we can continue to use linear algebra (matrices) to operate in a "non-linear space" (layman's terms, not formal math terms).
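A small sketch of that argument, under the assumption that "distance" is just the z coordinate. All function names here are invented for the example. Scaling by 1/z directly does not respect addition, while the step that only copies z into w does, which is why that step can live in a matrix.

```python
def perspective_scale(p):
    """Direct perspective: divide x and y by the distance z (not linear)."""
    x, y, z = p
    return (x / z, y / z)


def to_homogeneous(p):
    """Linear step: copy the distance into w and postpone the division.

    Equivalent to multiplying (x, y, z, 1) by a matrix whose last row
    is (0, 0, 1, 0).
    """
    x, y, z = p
    return (x, y, z, z)


def divide_by_w(h):
    x, y, z, w = h
    return (x / w, y / w)


def add(a, b):
    return tuple(s + t for s, t in zip(a, b))


a = (1.0, 0.0, 1.0)
b = (1.0, 0.0, 4.0)

# Not linear: f(a + b) differs from f(a) + f(b).
print(perspective_scale(add(a, b)))                     # (0.4, 0.0)
print(add(perspective_scale(a), perspective_scale(b)))  # (1.25, 0.0)

# Linear: the homogeneous step commutes with addition.
print(to_homogeneous(add(a, b)) == add(to_homogeneous(a), to_homogeneous(b)))

# The postponed division recovers the perspective result.
print(divide_by_w(to_homogeneous(add(a, b))))           # (0.4, 0.0)
```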

Edited by Hodgman


I understand that "homogeneous space" is what your book calls it, but if you search for that on Google you get something completely different. "Projective space" is a much better search query. This is the main reason I introduced that name in my previous post.

 

I think you should simply search for something like "homogeneous coordinates computer graphics" on Google. There are hundreds of beginner articles about these things already, and it is not possible to explain everything in a simple forum post. If you don't care about the mathematical concepts, then you should probably just understand how perspective projections are derived and consider the W division step as a technical trick. Indeed, if you work out the perspective projection equations, you will see that it is necessary to divide by z at some point. You can't divide by a coordinate using matrices. The trick is thus to write the denominator into a fourth coordinate and then divide by it after the matrix multiplication.
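Here is one way to see the trick in code. The sketch assumes a camera at the origin looking down +z with the projection plane at distance `d`, so that similar triangles give x' = d*x/z and y' = d*y/z. The function names are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def project_direct(point, d):
    """Perspective projection written out by hand (needs a division by z)."""
    x, y, z = point
    return (d * x / z, d * y / z)


def projection_matrix(d):
    """The matrix cannot divide, so it only stores the denominator z/d in W."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0 / d, 0.0],
    ]


def project_with_matrix(point, d):
    """Matrix multiplication first, division by W afterwards."""
    x, y, z, w = mat_vec(projection_matrix(d), list(point) + [1.0])
    return (x / w, y / w)


p = (3.0, 6.0, 4.0)
print(project_direct(p, 2.0))       # (1.5, 3.0)
print(project_with_matrix(p, 2.0))  # (1.5, 3.0)
```

Both routes give the same 2D point; the matrix version just postpones the division until after the multiplication.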

 

The additional coordinate is actually also used to make it possible to use matrices to represent translations but this has nothing to do with the notion of homogeneous coordinates. 
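A minimal sketch of the translation use of the extra coordinate (the helper names are made up for this example): a 4x4 matrix can translate a vector with w = 1, and the same matrix leaves a vector with w = 0 untouched.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def translation(tx, ty, tz):
    """4x4 translation matrix; the offset sits in the last column."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


T = translation(5.0, 0.0, 0.0)
position = [1.0, 2.0, 3.0, 1.0]   # w = 1: the offset is multiplied by 1
direction = [1.0, 2.0, 3.0, 0.0]  # w = 0: the offset is multiplied by 0

moved = mat_vec(T, position)
unchanged = mat_vec(T, direction)

print(moved)      # [6.0, 2.0, 3.0, 1.0]
print(unchanged)  # [1.0, 2.0, 3.0, 0.0]
```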

Edited by apatriarca


Yeah, AFAIK, the word "space" doesn't belong there. We're contrasting 4D homogeneous coordinates to 3D/2D Cartesian coordinates.

 

Personally, I call it post-projection space, and after the divide by w, I call it NDC-space.

 

We (programmers) generally use "blah-space" to refer to some particular "basis" (view-space, world-space, post-projection space, etc...), but "w" comes into play not just because we're using some particular basis/space, but because we've switched from one coordinate system to another.

 

I.e. when working in object-space, view-space or world-space, we generally use 3D Cartesian coordinates to identify points. When we're dealing with post-projection-space, we switch over to using 4D homogeneous coordinates to represent our points. We then transform them back to 2D Cartesian coordinates in screen/viewport-space to identify pixels.

 

The reason we switch over to using 4D homogeneous coordinates in order to implement perspective is that all our linear algebra (matrix/vector math) is... well... linear... whereas perspective is a non-linear effect! I.e. you want to scale something by "1/distance", which, when you plot it, isn't a straight line (not linear).

By working with 4D coordinates, where we say "we will divide by w later on", we can continue to use linear algebra (matrices) to operate in a "non-linear space" (layman's terms, not formal math terms).

 

 

Yes, I know. But "homogeneous space" is not a particularly useful search query in any case. I've actually always used the terms clip(ping) coordinates (and thus clip(ping) space) and normalized device coordinates (NDC) (and thus NDC-space) to denote the coordinates post-projection and post W-division in the graphics pipeline. For me, "homogeneous coordinates" has always been a more general term.

 

I understand that the way we use homogeneous coordinates in the graphics pipeline is often not mathematically correct. We don't really care about what a projective space really is and what operations we can perform on these coordinates. In my opinion, however, some more advanced theory is sometimes useful to understand, for example, what is preserved by a projective transformation.

Edited by apatriarca


Hello

 

The only two things you need to know about homogeneous coordinates:

  • w=0 for direction 4D vectors (e.g. normals)
  • w=1 for position 4D vectors (e.g. vertices)

When rendering, the GPU performs the w-division itself; you don't need to do it yourself in this context.

If you need to do calculations on 4D vectors yourself, be sure to perform the w-division at the end if your transformation contains at least one perspective projection matrix.

 

Hope it's clear.

Edited by Tournicoti


What I tried to say was that if there is at least one perspective projection matrix in your transformation matrix, the w-division is needed. (Edit: because w is no longer 0 or 1!)

With projective spaces, the transformation is not only a matter of multiplying matrices; it also involves dividing the homogeneous vector by w at the end to get the equivalent 3D Cartesian vector (x, y, z).
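A sketch of that workflow, with made-up matrices: a translation is concatenated with a very simple projection (last row (0, 0, 1, 0), an assumption for this example rather than any particular API's matrix), the vertex is multiplied once by the combined matrix, and only then is the result divided by w.

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


translate = [  # move the object 5 units along +z
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 5.0],
    [0.0, 0.0, 0.0, 1.0],
]
project = [  # simplified perspective: copies z into w
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

combined = mat_mul(project, translate)  # translation is applied first
vertex = [2.0, 4.0, -1.0, 1.0]          # a position, so w = 1 on input

clip = mat_vec(combined, vertex)
cartesian = tuple(c / clip[3] for c in clip[:3])  # the final w-division

print(clip)       # [2.0, 4.0, 4.0, 4.0]
print(cartesian)  # (0.5, 1.0, 1.0)
```

After the projection, w is 4 rather than 0 or 1, so skipping the last step would leave the wrong 3D point.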

Edited by Tournicoti


If w is not 0 then it represents the point (x/w, y/w, z/w). So it is a point.

 

It's like how 1/2 and 2/4 both represent the same number, 0.5.

 

Don't worry about projecting onto a plane, that's just a way of visualising projective space in 2d using 3d homogeneous coordinates. You can't visualise the projection in 3d with 4d homogeneous coordinates because the projection is onto the entirety of 3d space instead of a plane.

The only two things you need to know about homogeneous coordinates:
  • w=0 for direction 4D vectors (e.g. normals)
  • w=1 for position 4D vectors (e.g. vertices)
[...]

 

w=0 for directions, but normals are not directions. Normals are co-vectors, which means that the way to transform them by an affine transformation is to apply to them the transpose of the inverse of the endomorphism (a.k.a., the 3x3 sub-matrix that doesn't involve the extra coordinate), and you may want to renormalize after that. If you are only dealing with orientation-preserving isometries (rotations and translations), you are lucky and both computations agree; but if you allow for non-uniform scalings or other non-isometric transforms, you'll see that your normals get messed up.
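The effect can be checked numerically. The sketch below uses a non-uniform scale, for which the inverse transpose is simply the matrix of reciprocal scale factors (a shortcut that only holds for diagonal matrices; a general matrix needs a real inverse). The vectors and names are chosen for this example.

```python
def mat_vec3(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-component vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


sx, sy, sz = 2.0, 1.0, 1.0  # non-uniform scale: stretch along x only
scale = [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, sz]]
inverse_transpose = [[1 / sx, 0.0, 0.0], [0.0, 1 / sy, 0.0], [0.0, 0.0, 1 / sz]]

tangent = (1.0, 1.0, 0.0)  # a direction lying in the surface
normal = (1.0, -1.0, 0.0)  # perpendicular to the tangent before scaling

new_tangent = mat_vec3(scale, tangent)
wrong_normal = mat_vec3(scale, normal)              # treated like a direction
right_normal = mat_vec3(inverse_transpose, normal)  # treated like a co-vector

print(dot(new_tangent, wrong_normal))  # 3.0 -> no longer perpendicular
print(dot(new_tangent, right_normal))  # 0.0 -> still perpendicular
```

Neither result has unit length, so a renormalization would still follow if unit normals are required.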

 

A proper vector (e.g., a translation, or the difference between two positions) is not strictly speaking a projective point either, but at least the arithmetic of using w=0 does work out.

 

A directional light, for example, does have a direction which is a projective point with w=0.

Edited by Álvaro


So far I have no need to scale anything that uses normals, as I do that in the modeling program, and I read somewhere else on this forum that it is not necessary to transform normals with the inverse transpose if I don't use scale. Is it necessary to renormalize after this, or do normals preserve their "unit-ness"?

Edited by belfegor


So far I have no need to scale anything that uses normals, as I do that in the modeling program, and I read somewhere else on this forum that it is not necessary to transform normals with the inverse transpose if I don't use scale. Is it necessary to renormalize after this, or do normals preserve their "unit-ness"?

 

No, no need to renormalize: If you stick to rotations and translations you should be fine.


If there is no scale involved you don't need to renormalise the normals.

 

You only need to renormalise if your upper 3x3 part of the transform matrix has a determinant that is not +1. EDIT: All 3x3 rotation matrices have determinant +1.

 

You may want to renormalise if 2 or more normals get interpolated though (e.g. in a pixel shader where the vertex normals are interpolated across the triangle).
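Both observations can be checked with a few lines. This sketch uses a rotation about the z axis and a plain linear blend as a stand-in for what the rasterizer does when it interpolates vertex normals; the function names are invented for the example.

```python
import math


def length(v):
    return math.sqrt(sum(c * c for c in v))


def normalize(v):
    n = length(v)
    return tuple(c / n for c in v)


def rotate_z(v, angle):
    """Rotate v about the z axis by angle (in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def lerp(a, b, t):
    """Linear interpolation between two vectors."""
    return tuple((1 - t) * p + t * q for p, q in zip(a, b))


n0 = (1.0, 0.0, 0.0)
n1 = (0.0, 1.0, 0.0)

rotated = rotate_z(n0, math.radians(30.0))
blended = lerp(n0, n1, 0.5)  # halfway between two unit normals
fixed = normalize(blended)

print(length(rotated))  # stays 1 (up to rounding): no renormalization needed
print(length(blended))  # about 0.707: shorter than 1
print(length(fixed))    # back to 1 (up to rounding)
```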

Edited by Paradigm Shifter



You only need to renormalise if your upper 3x3 part of the transform matrix has a determinant that is not +1.

The norm of a non-uniformly scaled unit vector is not necessarily 1, even if the matrix has unit determinant. You may still have to normalize.


 


You only need to renormalise if your upper 3x3 part of the transform matrix has a determinant that is not +1.

The norm of a non-uniformly scaled unit vector is not necessarily 1, even if the matrix has unit determinant. You may still have to normalize.

 

 

Good point, the +1 determinant is a necessary but not sufficient criterion ;) I wasn't thinking about non-uniform scaling. EDIT: Or a shear transformation matrix.
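Two small counterexamples, sketched with hand-picked matrices (the helper names are assumptions for this example): a non-uniform scale and a shear, both with determinant +1, both changing the length of a unit vector.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def mat_vec3(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-component vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def length(v):
    return math.sqrt(sum(c * c for c in v))


scale = [[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]  # 2 * 0.5 * 1 = 1
shear = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # x += y

scaled_length = length(mat_vec3(scale, (1.0, 0.0, 0.0)))
sheared_length = length(mat_vec3(shear, (0.0, 1.0, 0.0)))

print(det3(scale), scaled_length)   # 1.0 2.0
print(det3(shear), sheared_length)  # determinant 1.0, length about 1.414
```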

Edited by Paradigm Shifter

