belfegor, posted June 17, 2013:
I am reading the book "3D Math Primer for Graphics and Game Development", and I am stuck on this topic because the book does not explain it well:
"To understand how the standard physical 3D space is extended into 4D, let's first examine homogenous coordinates in 2D, which are of the form (x, y, w). Imagine the standard 2D plane existing in 3D at the plane w = 1. So the physical 2D point (x, y) is represented in homogenous space (x, y, 1). For all points that are not in the plane w = 1, we can compute the corresponding 2D point by projecting the point onto the plane w = 1 and dividing by w. So the homogenous coordinate (x, y, w) is mapped to the physical 2D point (x/w, y/w)."
1. Where does this w = 1 come from?
2. If a point is outside the plane w = 1, does it have a different w value?
3. Is w the absolute distance to the plane?
4. If I divide a point by a w of 1, I get the same point back, so what is the point?
5. Is there another book you could recommend that explains this w thing better?
Thank you for your time.
alh420, posted June 17, 2013 (edited June 17, 2013 by Olof Hedman):
Not a mathematical explanation, but a practical one: it's useful when doing the perspective transformation. This w is what makes things farther away look smaller and things closer look bigger.
For 3D graphics, you let w = 1 be the projection plane (the screen); points farther away will have w > 1, and closer points will have w < 1. The division maps points from projective space to screen space.
Hope that gives you a hint.
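A minimal sketch of the divide described in the post above, assuming a toy setup that is not from the thread: the screen is the plane w = 1 and a point's w is simply set to its depth. The function name `perspective_divide` is hypothetical.

```python
def perspective_divide(x, y, w):
    """Map the homogeneous point (x, y, w) to the 2D point (x/w, y/w)."""
    if w == 0:
        # w == 0 describes a direction, not a position, so there is nothing to divide.
        raise ValueError("cannot divide a point with w == 0")
    return (x / w, y / w)


# The same (x, y) at three different depths (assumption: w equals depth).
near = perspective_divide(2.0, 1.0, 0.5)      # in front of the plane w = 1
on_plane = perspective_divide(2.0, 1.0, 1.0)  # exactly on the plane w = 1
far = perspective_divide(2.0, 1.0, 4.0)       # behind the plane w = 1

print(near, on_plane, far)
```

The point with w < 1 lands farther from the center than the point on the plane, and the point with w > 1 lands closer to it, which is the "closer looks bigger, farther looks smaller" effect.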
apatriarca, posted June 17, 2013 (edited June 17, 2013):
First of all, the correct (or at least more common) name for this space is (real) projective space, not homogeneous space. A homogeneous space is usually something different in mathematics*, and I've never seen the term used in this context. The term projective space comes from the fact that it is the smallest, most natural space in which (central) projections can be defined.
You can visualize this space as the set of lines passing through the origin of the (real) vector space one dimension larger. So the projective plane (the projective space of dimension two) is, for example, the set of 3D lines of the form P(t) = t * (X, Y, Z). Each line can be identified by any of its non-zero points. The homogeneous coordinates of a point of the projective space are the coordinates of one of the non-zero points of the corresponding line, and are written between square brackets (often separated by colons): [X0 : X1 : ... : Xn]. These homogeneous coordinates are not unique: two sets of coordinates that differ by a non-zero scalar multiple represent the same point.
The division by W is simply a way to choose a unique representative of a point. You basically represent each line by its intersection with the plane W = 1. Lines parallel to that plane cannot be represented in this way; they are called points at infinity, and they usually represent directions. Note that dividing by W is just a convention: it is possible to divide by any other coordinate or to use any other general plane.
So why use a space like that for projections? The main reason is that a projection transforms a point with coordinates [x : y : z : 1] into some point [X : Y : Z : W] where W is no longer equal to one. It is thus necessary to divide by W to get back a 3D point. This is the main reason homogeneous coordinates have been defined and used in computer graphics. Projection matrices are usually chosen so that the W coordinate basically represents the depth of the transformed point, and so that the view frustum is mapped to the cube with all three coordinates inside [-1, 1] after the W divide (the conventions actually differ here between DirectX and OpenGL). But this is just a useful convention.
* It usually denotes some kind of space with a transitive G-action.
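A small sketch of the "not unique" and "unique representative" ideas from the post above, assuming integer coordinates so the arithmetic stays exact. The helper names `same_projective_point` and `to_cartesian` are hypothetical, not taken from the thread or from any library.

```python
from fractions import Fraction


def same_projective_point(a, b):
    """True if the non-zero tuples a and b are scalar multiples of each other.

    Two vectors are proportional exactly when every 2x2 minor vanishes.
    """
    n = len(a)
    return all(a[i] * b[j] == a[j] * b[i] for i in range(n) for j in range(i + 1, n))


def to_cartesian(h):
    """Pick the representative whose last coordinate is 1, then drop that coordinate.

    Returns None for a point at infinity (last coordinate 0).
    """
    *xs, w = h
    if w == 0:
        return None
    return tuple(Fraction(x, w) for x in xs)


print(same_projective_point((1, 2, 1), (3, 6, 3)))   # same line through the origin
print(same_projective_point((1, 2, 1), (1, 2, 2)))   # different lines
print(to_cartesian((2, 4, 2)), to_cartesian((1, 2, 1)))
print(to_cartesian((1, 2, 0)))                       # point at infinity
```

Dividing by the last coordinate is only one convention for picking a representative, as the post notes; it is used here because it matches the w = 1 plane from the book.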
belfegor, posted June 17, 2013:
@Olof Hedman Thank you for the nice explanation.
@apatriarca Well, what you have written is all very nice, but I have reread it multiple times and I still can't get it.
Quoting apatriarca: "First of all, the correct (or at least more common) name of this space is (real) projective space and not homogeneous space. An homogeneous space is usually something different in mathematics*. I've never seen the term homogeneous space used in this context."
I don't care about math concepts; I just want to learn the basic math used in graphics and game development. "Homogenous space" was the topic label in the book, so how could I know which name is correct?
Quoting apatriarca: "The homogeneous coordinates of a point of the projective space are the coordinates of one of the non-zero points of the corresponding lines and are written between square brackets (and often separated by a colon) [X0 : X1 : ... : Xn]. [...] transitive G-action."
I am totally lost from here on; it is too abstract and generalized. Do you realize this thread is in the "For Beginners" forum? Could you possibly put this in an easily understandable way? Maybe I could visualize it if I had a practical example, so I could see what space and ranges these points and directions are in and what the division by w actually produces.
I have a feeling that most of you on this forum feel a need to brag about your knowledge, but that doesn't help those of us who want to learn something. Unfortunately, most authors of books have that need too, which makes learning a lot harder than it needs to be.
No disrespect meant. Thank you for your understanding and time.
Ryan_001, posted June 17, 2013 (edited June 17, 2013):
Apatriarca is being accurate, not trying to "show off". I'll try to explain it in simpler terms.
3D linear algebra is the standard way of performing 3D math in games. It allows us to do all the fun things (rotating and scaling objects, etc.) with rather simple math. Simple is good because it's fast. The problem is that you can't actually do two important things with standard 3D linear algebra. The first is translation, and the second is perspective projection. Translations let you move objects around the scene, and perspective projection is necessary to mimic reality, where things farther away from us look smaller than things that are closer.
So to fix this problem we extend the 3-dimensional space to a 4-dimensional space, often called "homogeneous space" by programmers. We do all our math in 4D space, and then at the very end the video card "fixes" everything up for us. This fix is actually a 4D-to-3D projection. If we have a 4D vector of the form (x, y, z, w), the video card projects it to 3D by dividing the vector by w, which gives us (x/w, y/w, z/w, 1). This is its final 3D point on screen. The (x, y) is then used for the triangle coordinates on screen, the z is used for depth buffering and interpolation, and the w coordinate is generally ignored.
Apatriarca was explaining the 4D-to-3D projection in detail, and while it is good to know, to be honest it probably isn't that necessary.
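The steps in the post above can be sketched as follows. This is a toy example under stated assumptions, not any graphics API: vectors are treated as columns, `translation` and `PROJECT` are hypothetical names, and `PROJECT` is a deliberately simplified matrix that only copies z into w. Real projection matrices also remap z so that it stays useful for depth buffering.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    """4x4 matrix that moves a point with w = 1 by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


# Toy perspective matrix (assumption): leaves x, y, z alone and copies z into w.
PROJECT = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 1, 0],
           [0, 0, 1, 0]]


def divide_by_w(v):
    """The final step the post attributes to the video card."""
    x, y, z, w = v
    return [x / w, y / w, z / w, 1.0]


point = [1.0, 2.0, 3.0, 1.0]                          # a position, so w = 1
moved = mat_vec(translation(1.0, 0.0, 1.0), point)   # translation as a matrix
clip = mat_vec(PROJECT, moved)                       # w is no longer 1 here
screen = divide_by_w(clip)                           # back to w = 1

print(moved, clip, screen)
```

Both the translation and the perspective step are plain matrix multiplications; the only non-linear operation is the single division at the end.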
Hodgman, posted June 17, 2013 (edited June 17, 2013):
Quoting apatriarca: "First of all, the correct (or at least more common) name of this space is (real) projective space and not homogeneous space."
Yeah, AFAIK, the word "space" doesn't belong there. We're contrasting 4D homogeneous coordinates with 3D/2D Cartesian coordinates. Personally, I call it post-projection space, and after the divide by w, I call it NDC-space/clip-space.
We (programmers) generally use "blah-space" to refer to some particular "basis" (view-space, world-space, post-projection space, etc.), but "w" comes into play not just because we're using some particular basis/space, but because we've switched from one coordinate system to another. That is, when working in object-space, view-space or world-space, we generally use 3D Cartesian coordinates to identify points. When we're dealing with post-projection space, we switch over to 4D homogeneous coordinates to represent our points. We then transform them back to 2D Cartesian coordinates in screen/viewport-space to identify pixels.
The reason we switch to 4D homogeneous coordinates to implement perspective is that all our linear algebra (matrix/vector math) is... well... linear, whereas perspective is a non-linear effect! You want to scale something by "1/distance", which is not a straight line when you plot it. By working with 4D coordinates, where we say "we will divide by w later on", we can continue to use linear algebra (matrices) to operate in a "non-linear space" (layman's terms, not formal math terms).
apatriarca, posted June 17, 2013 (edited June 17, 2013):
I understand that "homogeneous space" is what your book calls it, but if you search for that on Google you get something completely different. "Projective space" is a much better search query, which is the main reason I introduced that name in my previous post. I think you should simply look for something like "homogeneous coordinates computer graphics" on Google. There are already hundreds of beginner articles about these things, and it is not possible to explain everything in a single forum post.
If you don't care about the mathematical concepts, then you should probably just understand how perspective projections are derived and consider the W division step a technical trick. Indeed, if you work out the perspective projection equations, you will see that it is necessary to divide by z at some point. You can't divide by a coordinate using matrices. The trick is thus to write the denominator into a fourth coordinate and then divide by it after the matrix multiplication. The additional coordinate is also used to make it possible to represent translations with matrices, but this has nothing to do with the notion of homogeneous coordinates.
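The derivation mentioned in the post above can be sketched as follows, under assumptions that are not from the thread: the eye is at the origin looking along +z, and the projection plane is at z = d. Similar triangles give the projected coordinates, which need a division by z; the matrix only stores the denominator z/d in the fourth coordinate, and the divide happens afterwards.

```latex
% Similar triangles: a point (x, y, z) projects onto the plane z = d at
x' = \frac{d\,x}{z}, \qquad y' = \frac{d\,y}{z}.

% A matrix cannot divide by z, so it writes the denominator into w instead:
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1/d & 0
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
=
\begin{pmatrix} x \\ y \\ z \\ z/d \end{pmatrix}.

% Dividing every coordinate by w = z/d recovers the projection:
\left( \frac{x}{z/d},\ \frac{y}{z/d},\ \frac{z}{z/d},\ 1 \right)
= \left( \frac{d\,x}{z},\ \frac{d\,y}{z},\ d,\ 1 \right).
```

The matrices used by real graphics APIs differ in detail (they also scale x and y and remap z), but they rely on the same trick of placing the denominator in w.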
apatriarca, posted June 17, 2013 (edited June 17, 2013):
Quoting Hodgman: "Yeah, AFAIK, the word 'space' doesn't belong there. We're contrasting 4D homogeneous coordinates to 3D/2D Cartesian coordinates. Personally, I call it post-projection space, and after the divide by w, I call it NDC-space. [...]"
Yes, I know. But "homogeneous space" is not a particularly useful search query in any case. I've always used the terms clip(ping) coordinates (and thus clip(ping) space) and normalized device coordinates (NDC, and thus NDC-space) for the coordinates post-projection and post W-division in the graphics pipeline. Homogeneous coordinates has always been a more general term for me.
I understand that the way we use homogeneous coordinates in the graphics pipeline is often not mathematically correct. We don't really care about what a projective space really is or what operations we can do on these coordinates. Some more advanced theory is, however, in my opinion sometimes useful for understanding, for example, what is preserved by a projective transformation.
Adaline, posted June 17, 2013 (edited June 17, 2013 by Tournicoti):
Hello,
The only two things you need to know about homogeneous coordinates:
w = 0 for direction 4D vectors (e.g. normals)
w = 1 for position 4D vectors (e.g. vertices)
When rendering, the GPU does the w-division itself, so you don't need to do it yourself in that context.
If you need to do calculations on 4D vectors yourself, be sure to do the w-division at the end if you have (at least) one perspective projection matrix in your transformation.
Hope that's clear.
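A short sketch of why the w = 0 / w = 1 convention above works, assuming column vectors and a hand-written translation matrix (the names `mat_vec` and `TRANSLATE` are hypothetical): the translation column is multiplied by w, so it moves positions and leaves directions alone.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


# Translation by 5 units along x; the last column is scaled by the vector's w.
TRANSLATE = [[1, 0, 0, 5],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]

position = [1, 2, 3, 1]    # w = 1: a point in space
direction = [1, 2, 3, 0]   # w = 0: a direction, with no location

moved_position = mat_vec(TRANSLATE, position)
moved_direction = mat_vec(TRANSLATE, direction)

print(moved_position, moved_direction)
```

Neither result needs a w-division here, because a translation leaves w untouched; the divide only matters once a perspective projection matrix is part of the transformation.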
belfegor, posted June 17, 2013:
Thanks. I know the difference between a direction (xyz, w = 0) and a point (xyz, w = 1), and how to multiply/transform a vector with a matrix. What I don't understand is this projecting onto a plane when w is neither 0 nor 1.
Adaline, posted June 17, 2013 (edited June 17, 2013 by Tournicoti):
What I tried to say was that if there is at least one perspective projection matrix in your transformation matrix, the w-division is needed. (Edit: because w is no longer 0 or 1!)
With projective spaces, the transformation is not only a matter of multiplying matrices. You also divide the homogeneous vector by w at the end to get the equivalent 3D Cartesian vector (x, y, z).
Paradigm Shifter, posted June 17, 2013:
If w is not 0, then (x, y, z, w) represents the point (x/w, y/w, z/w). So it is a point.
It's like how 1/2 and 2/4 both represent the same number, 0.5.
Don't worry about projecting onto a plane; that's just a way of visualising projective space in 2D using 3D homogeneous coordinates. You can't visualise the projection in 3D with 4D homogeneous coordinates, because the projection is onto the entirety of 3D space instead of a plane.
alvaro, posted June 17, 2013 (edited June 17, 2013 by Álvaro):
Quoting Adaline: "The only two things you need to know about homogeneous coordinates: w = 0 for direction 4D vectors (e.g. normals), w = 1 for position 4D vectors (e.g. vertices) [...]"
w = 0 for directions, but normals are not directions. Normals are co-vectors, which means that the way to transform them by an affine transformation is to apply to them the transpose of the inverse of the endomorphism (a.k.a. the 3x3 sub-matrix that doesn't involve the extra coordinate), and you may want to renormalize after that. If you are only dealing with orientation-preserving isometries (rotations and translations), you are lucky and both computations agree; but if you allow non-uniform scalings or other non-isometric transforms, you'll see that your normals get messed up.
A proper vector (e.g. a translation, or the difference between two positions) is not, strictly speaking, a projective point either, but at least the arithmetic of using w = 0 does work out. A directional light, for example, does have a direction that is a projective point with w = 0.
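A minimal sketch of the difference described in the post above, assuming a non-uniform scale of 2 along x. The inverse transpose is hard-coded here because the matrix is diagonal (each diagonal entry is simply replaced by its reciprocal); the names `SCALE` and `SCALE_INV_T` are hypothetical.

```python
def mat_vec3(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


SCALE = [[2.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

# Inverse transpose of SCALE, written out by hand for this diagonal case.
SCALE_INV_T = [[0.5, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]]

tangent = [1.0, 1.0, 0.0]    # lies in the surface
normal = [1.0, -1.0, 0.0]    # perpendicular to the tangent (not unit length)

new_tangent = mat_vec3(SCALE, tangent)
naive_normal = mat_vec3(SCALE, normal)          # treated like a direction
proper_normal = mat_vec3(SCALE_INV_T, normal)   # treated like a co-vector

print(dot(new_tangent, naive_normal))    # non-zero: no longer perpendicular
print(dot(new_tangent, proper_normal))   # zero: still perpendicular
```

After the inverse-transpose step the normal is still perpendicular to the surface, but its length has changed, which is why the post suggests renormalizing afterwards.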
belfegor, posted June 17, 2013 (edited June 17, 2013):
So far I have had no need to scale anything that uses normals, as I do that in the modeling program, and I read somewhere else on this forum that it is not necessary to transform normals with the inverse transpose if I don't use scale. Is it necessary to renormalize after this, or do normals preserve their "unit-ness"?
alvaro, posted June 17, 2013:
Quoting belfegor: "[...] Is it necessary to do renormalization after this, or normals preserve 'unit-ness'?"
No, there is no need to renormalize. If you stick to rotations and translations, you should be fine.
Paradigm Shifter, posted June 17, 2013 (edited June 17, 2013):
If there is no scale involved, you don't need to renormalise the normals. You only need to renormalise if the upper 3x3 part of your transform matrix has a determinant that is not +1.
EDIT: All 3x3 rotation matrices have determinant +1. You may want to renormalise if two or more normals get interpolated, though (e.g. in a pixel shader, where the vertex normals are interpolated across the triangle).
Brother Bob, posted June 17, 2013:
Quoting Paradigm Shifter: "You only need to renormalise if your upper 3x3 part of the transform matrix has a determinant that is not +1."
A non-uniformly scaled unit vector is not necessarily a unit vector, even if the matrix has unit determinant. You may still have to normalize.
Paradigm Shifter, posted June 17, 2013 (edited June 17, 2013):
Quoting Brother Bob: "A non-uniformly scaled unit vector is not necessarily a unit vector, even if the matrix has unit determinant. You may still have to normalize."
Good point, the +1 determinant is a necessary but not sufficient criterion ;) I wasn't thinking about non-uniform scaling.
EDIT: Or a shear transformation matrix.
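The counterexamples raised in the last two posts can be checked with a small sketch. The two matrices below are illustrative choices, not taken from the thread: `SQUASH` is a non-uniform scale and `SHEAR` is a shear, and both have determinant +1.

```python
import math


def mat_vec3(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def length(v):
    return math.sqrt(sum(x * x for x in v))


SQUASH = [[2.0, 0.0, 0.0],   # stretch x by 2, squash y by 1/2
          [0.0, 0.5, 0.0],
          [0.0, 0.0, 1.0]]

SHEAR = [[1.0, 1.0, 0.0],    # x picks up a copy of y
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

unit = [0.0, 1.0, 0.0]

squashed = mat_vec3(SQUASH, unit)
sheared = mat_vec3(SHEAR, unit)

print(det3(SQUASH), length(squashed))
print(det3(SHEAR), length(sheared))
```

In both cases the determinant is +1 but the transformed vector is no longer unit length, so a determinant check alone does not tell you whether renormalizing can be skipped.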