Visualizing Perspective projection

4 comments, last by myk45 11 years, 5 months ago
This is about perspective projection and homogeneous coordinates. I have read that we use homogeneous coordinates when it comes to perspective projection.

So, a very simple matrix (to project a 3D point onto a 2D plane) is as follows:

[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[0 0 1/d 0]

Where d = the distance from the origin (the center of projection) to the plane onto which the projection is made. So, now I understand the following:

1) Use the matrix above to obtain a 4D homogeneous coordinate.
2) Do a perspective divide, thus obtaining the complete projection.
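
The two steps above can be sketched in a few lines of Python. This is only an illustration: the helper names `mat_vec` and `project` and the sample values are made up for the example, and the projection plane is assumed to be z = d with the center of projection at the origin.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def project(point, d):
    """Project a 3D point onto the plane z = d (assumed setup, see above)."""
    x, y, z = point
    matrix = [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 1 / d, 0],
    ]
    # Step 1: multiply to get the homogeneous coordinate [x, y, z, z/d].
    hx, hy, hz, hw = mat_vec(matrix, [x, y, z, 1])
    if hw == 0:
        raise ValueError("z = 0: the point has no finite projection")
    # Step 2: perspective divide by the last coordinate.
    return (hx / hw, hy / hw, hz / hw)


print(project((2.0, 4.0, 8.0), d=2.0))  # (0.5, 1.0, 2.0)
```

Note that the third component of the result is always d, i.e. the projected point really does lie on the plane z = d.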

Now, my question is this: what exactly is the result we get after step (1)? That is, how exactly do I visualize [x, y, z, z/d]? I'm not able to figure this out.

Can anyone please help me out on this? Thanks in advance!
[s]I am not sure there is a good answer. You can decompose the projection as a homography followed by removing one coordinate. [x, y, z, z/d] is basically what you get after doing only the homography. Unfortunately, a diagram of what the homography does (similar to the one on the Wikipedia page) requires 4 dimensions, so perhaps it's not very good for building intuition.[/s]

Hmmm... That wasn't quite right, so please ignore it.
Luckily for you, you don't need to visualize the 4D coords since you're working in RP^3.
If you're not into topology, then the easiest way to visualize the point is to divide by the last coordinate (assumed nonzero) and treat the result as a point in R^3, because homogeneous coordinates that differ only by a nonzero scale factor represent the same point of RP^3. With this approach it's a bit hard to understand what the projection matrix actually does to points in RP^3, so you should work your way from the bottom up, i.e. derive the projection operation for an arbitrary projection plane in R^3 and see how to utilize the extra coordinate in RP^3 to make your equations "neat" (put them in a matrix).
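
As a rough illustration of the "divide by the last coordinate" view, here is a small Python sketch. The function name `dehomogenize` and the numbers are hypothetical, chosen only for the example; it shows that [x, y, z, z/d] and any nonzero multiple of it land on the same point of R^3.

```python
def dehomogenize(h):
    """Map a homogeneous 4-vector with nonzero last coordinate to R^3."""
    x, y, z, w = h
    if w == 0:
        raise ValueError("last coordinate is zero: no finite point in R^3")
    return (x / w, y / w, z / w)


d = 2.0
x, y, z = 2.0, 4.0, 8.0

h = [x, y, z, z / d]           # result of step (1) for this sample point
scaled = [4.0 * c for c in h]  # same element of RP^3, different representative

print(dehomogenize(h))       # (0.5, 1.0, 2.0)
print(dehomogenize(scaled))  # (0.5, 1.0, 2.0)
```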

Another way is to look at RP^3 as the set of all lines in R^4 that pass through the origin. What the projection matrix does is just what any other matrix does: it rotates, scales, and rotates those lines (as in a singular value decomposition). Rotation is straightforward, scaling isn't. Scaling a set of lines in a singular way is like identifying subsets of lines (in a non-singular way you just pack them closer or farther apart).
For instance, if you have two different lines [x:y:z:w] and [x:y:z:w'] with w!=w', then scaling by [1,1,1,0] identifies these two lines. This corresponds to projecting points in 3D space onto a plane, as many different points go to the same point on the plane.

Step 2 is pretty much like finding the intersection of the transformed lines with the hyperplane w=1, which results in a 3D point. But since we already identified some lines with others, we obtain points on some plane instead.

Also, a true projection matrix is indeed singular with a zeroed last column, as is yours. Projection matrices in graphics applications are a bit different, as they also compute the local coordinates of the point on the projection plane (up to the division factor) and preserve the depth information. However, these two operations have nothing to do with the projection per se.
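
To make the singularity a bit more concrete, here is a minimal Python sketch using the matrix from the original question. The helper names `apply` and `divide` and the sample points are illustrative assumptions. It checks two things: the zeroed last column means the input's w is ignored, and two different 3D points on the same ray through the origin end up at the same point on the plane.

```python
def apply(matrix, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-vector."""
    return [sum(row[c] * v[c] for c in range(4)) for row in matrix]


def divide(h):
    """Perspective divide: drop the last coordinate after dividing by it."""
    return tuple(c / h[3] for c in h[:3])


d = 2.0
P = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1 / d, 0],
]

# The last column is zero, so changing only w does not change the output.
a = apply(P, [2.0, 4.0, 8.0, 1.0])
b = apply(P, [2.0, 4.0, 8.0, 5.0])
print(a == b)  # True

# Two distinct 3D points on one ray through the origin project identically.
near = divide(apply(P, [1.0, 2.0, 4.0, 1.0]))
far = divide(apply(P, [2.0, 4.0, 8.0, 1.0]))
print(near, far)  # (0.5, 1.0, 2.0) (0.5, 1.0, 2.0)
```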
Hi max343,

Wonderful reply! Thanks a lot!

Scaling a set of lines in a singular way is like identifying subsets of lines (in a non-singular way you just pack them closer or farther apart). For instance, if you have two different lines [x:y:z:w] and [x:y:z:w'] with w!=w', then scaling by [1,1,1,0] identifies these two lines. This correlates to projecting points in 3D space onto a plane, as many different points go to the same point on the plane.

I had some difficulty following this. Could you please explain what you meant by "scaling by [1,1,1,0]"?
Also, could you please point me to some book/link that talks about the exact same thing you posted about? Most books I've read do not go into these details :(


Also, a true projection matrix is indeed singular with a zeroed last column, as is yours. Projection matrices in graphics applications are a bit different, as they also compute the local coordinates of the point on the projection plane (up to the division factor) and preserve the depth information. However, these two operations have nothing to do with the projection per se.

Yes, well, I was not referring to the typical projection matrix defined by OpenGL.

@alvaro
No problem, Thanks :)
Scaling by [1,1,1,0] is just multiplying element-wise, or in matrix notation: diag(1,1,1,0)*[x,y,z,w]^T. Since there's a zero in one of the elements, the scale is singular.
A full-fledged scaling will look something like this: [|d|, |d|, |d|, sqrt(d^2+||n||^2), 0], with 'n' being the normal of the projection plane and 'd' being the plane parameter.
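
The element-wise scaling by [1,1,1,0] can be tried out directly. This is just a small Python sketch with a made-up helper `scale` and made-up sample vectors; it shows that two vectors differing only in w become identical after the singular scale.

```python
def scale(s, v):
    """Element-wise product, i.e. diag(s) applied to v."""
    return [si * vi for si, vi in zip(s, v)]


s = [1, 1, 1, 0]          # the singular scale diag(1, 1, 1, 0)
p = [3.0, 5.0, 7.0, 1.0]
q = [3.0, 5.0, 7.0, 9.0]  # same x, y, z as p, different w

print(scale(s, p))                 # [3.0, 5.0, 7.0, 0.0]
print(scale(s, p) == scale(s, q))  # True: the two are identified
```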

I should also note that this is only one interpretation of RP^3. There are more colorful, so to speak, interpretations of it, though generally they're harder to grasp. However, the advantage of some of those interpretations is that they relate to objects which have a direct connection to R^3.

Anywho, textbooks. The kicker with projective geometry is that all textbooks (that I know of) that talk about RP^3 assume at least late undergrad (or mostly grad) level of math knowledge.
However, there's one notable book that describes these topics very well for an audience other than math students. For projective geometry they use a simpler model than RP^3, namely RP^2. But it shouldn't get in your way of understanding how projections work, since RP^2 is applicable as well to project 3D objects onto a plane.
The book is "Geometry" by David A. Brannan, Matthew F. Esplen, and Jeremy J. Gray.
Thanks a lot max343 :)

