Screen-space to view-space (un)projection?



I'm currently reading the Forward+ paper in the book GPU Pro 4. In the very first sample code, the author shows how to build the frustum for each tile. The code looks like this:

float4 frustum[4];
{ // construct frustum
    float4 v[4];
    v[0] = projToView(8 * GET_GROUP_IDX, 8 * GET_GROUP_IDY, 1.0f );
    v[1] = projToView(8 * (GET_GROUP_IDX + 1), 8 * GET_GROUP_IDY, 1.0f );
    v[2] = projToView(8 * (GET_GROUP_IDX + 1), 8 * (GET_GROUP_IDY + 1), 1.0f );
    v[3] = projToView(8 * GET_GROUP_IDX, 8 * (GET_GROUP_IDY + 1), 1.0f );

    float4 o = make_float4( 0.0f, 0.0f, 0.0f, 0.0f );

    for ( int i = 0; i < 4; ++i )
        frustum[i] = createEquation( o, v[i], v[(i + 1) & 3] );
}


Where 8 is the size of the tile and GET_GROUP_IDX / GET_GROUP_IDY are the x and y indices of the thread group.

The paper says that projToView takes screen-space pixel indices and a depth value and returns coordinates in view space.

My question is, how do I project (unproject?) a screen-space position back to view space? Do I just multiply by the inverse of the projection matrix? Something tells me this is not complete. If the screen-space position is already mapped to the viewport, do I reverse the viewport mapping first, undo the divide by w, then multiply by the inverse projection matrix? I'm confused... :\

Edited by BrentChua


Edit: Lol nevermind about this. This entire post in my 2nd message doesn't make any sense. I still also need help on this.

Also, the createEquation() function just creates a plane equation from three vertex positions. This also confuses me, as I'm assuming the first parameter is the target position, the 2nd parameter is the first point on the plane, and the 3rd parameter is the second point on the plane. If the first parameter always points at the center, (0.0f, 0.0f, 0.0f, 0.0f), and we construct the frustum for the first thread group, wouldn't all the planes in the frustum be pointing at the center of the view (see attached picture)? It doesn't seem to make much sense...
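One possible reading of createEquation(), sketched on the CPU in C++: the first argument is not a target but simply one of three points on the plane. Since every side plane of a tile frustum passes through the eye, and the eye sits at the view-space origin, passing o as the first point would be the natural choice. The Vec3 and Plane types, the helper names, and the normalization step are assumptions made for illustration; the paper does not show the function body.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Plane { float a, b, c, d; };  // a*x + b*y + c*z + d = 0

Vec3 sub(Vec3 p, Vec3 q) { return { p.x - q.x, p.y - q.y, p.z - q.z }; }
float dot(Vec3 p, Vec3 q) { return p.x * q.x + p.y * q.y + p.z * q.z; }
Vec3 cross(Vec3 p, Vec3 q) {
    return { p.y * q.z - p.z * q.y,
             p.z * q.x - p.x * q.z,
             p.x * q.y - p.y * q.x };
}

// Hypothetical stand-in for the paper's createEquation():
// builds the plane through three points, with a unit-length normal.
Plane createEquation(Vec3 p0, Vec3 p1, Vec3 p2) {
    Vec3 n = cross(sub(p1, p0), sub(p2, p0));
    float len = std::sqrt(dot(n, n));
    n = { n.x / len, n.y / len, n.z / len };
    return { n.x, n.y, n.z, -dot(n, p0) };
}

// Signed distance of a point from the plane (sign tells which side it is on).
float signedDistance(Plane pl, Vec3 p) {
    return pl.a * p.x + pl.b * p.y + pl.c * p.z + pl.d;
}
```

With the four tile corners passed in a consistent winding, as in the loop `createEquation( o, v[i], v[(i + 1) & 3] )`, a point inside the tile's frustum ends up on the same side of all four planes, so a light can be rejected as soon as it lies entirely on the other side of any one of them.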

Edited by BrentChua


My question is, how do I project (unproject?) a screen-space position back to view space?

object space -> world transform -> world space -> view transform -> camera space -> projection transform -> screen space.

do you mean back to camera space, or world space?

given a 2d point on the screen. you can cast a ray into the scene and determine intersections in world space.

without z information, which is usually discarded after the divide by z foreshortening, there's no way to back-solve (un-project) from screen space back to camera space. going from camera space back to world space is trivial: perform your camera (view) transform in reverse order: move(-cam_x,  -cam_y, -cam_z). then rotate(-cam_zrot), then  rotate(-cam_yrot) then rotate(-cam_xrot). this assumes an x,y,z rotation order is used in the game.

if you DO have z info, simply reverse the mathematical process used to transform your camera space vertex to screen coordinates and a leftover z value. this would be the projection transform and the translation to screen coordinates, applied in reverse order. i.e., convert your screen pixel and leftover z value back to a (homogeneous?) projection in the range -1,-1 to 1,1 (undo the screen coordinate translation), then apply the reverse of the math in the projection matrix (un-project). you can find the formulas for both in the directx docs; OpenGL surely has similar formulas in its docs.
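The reversal described above can be sketched as follows. This is a minimal CPU-side C++ sketch, not the paper's projToView: it assumes a Direct3D-style left-handed perspective projection (view z points forward, clip w equals view z, depth in [0, 1]) and pixel y pointing down. The Perspective struct, makePerspective, and viewToScreen are hypothetical helpers introduced only so the round trip can be checked.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Assumed projection: clip.x = x * sx, clip.y = y * sy,
// clip.z = z * a + b, clip.w = z.
struct Perspective {
    float sx, sy;  // the _11 and _22 terms of the projection matrix
    float a, b;    // the _33 and _43 terms of the projection matrix
};

Perspective makePerspective(float fovY, float aspect, float zNear, float zFar) {
    float sy = 1.0f / std::tan(fovY * 0.5f);
    float a = zFar / (zFar - zNear);
    return { sy / aspect, sy, a, -zNear * a };
}

// Forward direction: view space -> pixel coordinates plus a [0, 1] depth value.
Vec3 viewToScreen(const Perspective& p, Vec3 v, float width, float height) {
    float ndcX = v.x * p.sx / v.z;          // perspective divide by w = z
    float ndcY = v.y * p.sy / v.z;
    float depth = (v.z * p.a + p.b) / v.z;
    return { (ndcX * 0.5f + 0.5f) * width,  // viewport mapping, y flipped
             (0.5f - ndcY * 0.5f) * height,
             depth };
}

// Reverse direction: pixel coordinates plus depth -> view space.
Vec3 projToView(const Perspective& p, float px, float py, float depth,
                float width, float height) {
    // 1. Undo the viewport mapping to get back to the -1..1 range.
    float ndcX = px / width * 2.0f - 1.0f;
    float ndcY = 1.0f - py / height * 2.0f;
    // 2. Recover view-space z from depth = a + b / z.
    float viewZ = p.b / (depth - p.a);
    // 3. Undo the perspective divide and the projection scale.
    return { ndcX * viewZ / p.sx, ndcY * viewZ / p.sy, viewZ };
}
```

Multiplying the homogeneous point (ndcX, ndcY, depth, 1) by the inverse projection matrix and then dividing by the resulting w gives the same answer for this kind of projection; the closed form above just avoids the full matrix multiply.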

i don't know if the left over z values can be extracted from directx/Ogl or not. you could always apply a transform matrix on a vertex and use the result to get your z, but there you're reproducing the math that directx is already doing when drawing.

the brute force approach would be to put your vertex in a v3, then use v3transform() [i think that's the name of it] with the world, view, and projection mats. this would give you the x,y,z coords of the vertex in object space (before you start), world space (after world transform), and camera space (after view transform), as well as the 2d screen coordinates of the vertex (after projection transform). then you just use the values from the section of the pipeline you want. you could even create a projection matrix that doesn't include the screen translation, just the projection, and get the x,y,z coords for homogeneous screen space (after projection, but before translation to screen coords). but again this is solving for things that the graphics pipeline is already solving for (duplication of effort).

Edited by Norman Barrows


the brute force approach would be to put your vertex in a v3, then use v3transform() [i think that's the name of it] with the world, view, and projection mats. this would give you the x,y,z coords of the vertex in object space (before you start), world space (after world transform), and camera space (after view transform), as well as the 2d screen coordinates of the vertex (after projection transform). then you just use the values from the section of the pipeline you want. you could even create a projection matrix that doesn't include the screen translation, just the projection, and get the x,y,z coords for homogeneous screen space (after projection, but before translation to screen coords). but again this is solving for things that the graphics pipeline is already solving for (duplication of effort).

Using that technique is probably too expensive for the light collection stage of a Forward+ renderer. But I appreciate your detailed response and the info on these various transformations.

I have stumbled upon MJP's light indexed implementation, and its HLSL contains the tile frustum building code. I understand how the code works now, but I don't quite understand how he arrived at those equations. His implementation is much simpler to follow: it builds the frustum planes directly from the projection matrix, with a scale and bias offset that depend on which tile the thread group covers.

From what I recall (from memory), the code looks like this:

...

float2 tileOffset = displaySize / ( 2 * float2(tileSize, tileSize) );
float2 tileBias = tileOffset - threadGroup.xy;

float4 c1 = float4( projection._11 * tileOffset.x, 0.0f, tileBias.x, 0.0f );
float4 c2 = float4( 0.0f, -projection._22 * tileOffset.y, tileBias.y, 0.0f );
float4 c4 = float4( 0.0f, 0.0f, 1.0f, 0.0f );

float4 frustumPlanes[6];

frustumPlanes[0] = c4 - c1; // left
frustumPlanes[1] = c4 + c1; // right
frustumPlanes[2] = c4 - c2; // top
frustumPlanes[3] = c4 + c2; // bottom

// near, far
frustumPlanes[4] = float4( 0.0f, 0.0f, -1.0f, -fNear );
frustumPlanes[5] = float4( 0.0f, 0.0f, 1.0f, fFar );

// normalize planes
for ( int i = 0; i < 6; ++i )
    frustumPlanes[i] = PlaneNormalize(...);

...
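For anyone puzzling over where planes of this shape come from, here is a hedged CPU-side C++ sketch of the underlying idea. It is not MJP's code and it uses a different parameterization: instead of tileOffset and tileBias, it computes the tile's bounds in normalized device coordinates explicitly and turns each bound into a plane through the eye. It assumes a Direct3D-style left-handed projection (view z forward, clip w equal to view z) and pixel y pointing down; tileSidePlanes, TilePlanes, and insideTile are names made up for this example.

```cpp
struct Vec3 { float x, y, z; };
struct Plane { float a, b, c, d; };  // inside when a*x + b*y + c*z + d >= 0

struct TilePlanes { Plane left, right, top, bottom; };

// p11 and p22 are the _11 and _22 terms of the projection matrix, so that
// ndcX = p11 * x / z and ndcY = p22 * y / z for a view-space point (x, y, z).
TilePlanes tileSidePlanes(float p11, float p22, int tileX, int tileY,
                          float tileSize, float width, float height) {
    // Tile bounds in normalized device coordinates (-1..1, y up).
    float l = 2.0f * tileX * tileSize / width - 1.0f;
    float r = 2.0f * (tileX + 1) * tileSize / width - 1.0f;
    float t = 1.0f - 2.0f * tileY * tileSize / height;
    float b = 1.0f - 2.0f * (tileY + 1) * tileSize / height;
    return {
        {  p11, 0.0f,   -l, 0.0f },  // ndcX >= l  <=>  p11*x - l*z >= 0
        { -p11, 0.0f,    r, 0.0f },  // ndcX <= r  <=>  r*z - p11*x >= 0
        { 0.0f, -p22,    t, 0.0f },  // ndcY <= t  <=>  t*z - p22*y >= 0
        { 0.0f,  p22,   -b, 0.0f },  // ndcY >= b  <=>  p22*y - b*z >= 0
    };
}

float planeSide(Plane pl, Vec3 p) {
    return pl.a * p.x + pl.b * p.y + pl.c * p.z + pl.d;
}

// True when the view-space point lies within all four side planes of the tile.
bool insideTile(const TilePlanes& tp, Vec3 p) {
    return planeSide(tp.left, p) >= 0.0f && planeSide(tp.right, p) >= 0.0f &&
           planeSide(tp.top, p) >= 0.0f && planeSide(tp.bottom, p) >= 0.0f;
}
```

Each inequality is just the NDC bound multiplied through by z, which is why every plane has d = 0 (it passes through the eye) and why only the x, y, and z columns matter. The c4 ± c1 and c4 ± c2 form in the recalled HLSL looks like the same kind of construction, with the per-tile scale and bias folded into the projection columns before the planes are extracted.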


Sigh... I wish I were much more comfortable with math.

Edited by BrentChua