Reconstructing WorldPos from Depth Texture/Projection question

Started by Spidey March 13, 2013 08:17 AM

4 comments, last by Lightness1024 11 years, 1 month ago

766

Author

March 13, 2013 08:17 AM

I'm trying to figure out how to reconstruct a world space position in a pixel shader from a depth texture sample (stored as z/w). I eventually plan to move to linear depth, but I'm trying to get this to work with non-linear depth first. I mostly understand everything; I'm just confused about this one bit.

I've been reading online and the general way to convert a depth value to a world position seems to be something like this:

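float depth = depthTex.Sample(defaultSampler, IN.TexCoords).r; // sample depth (z over w)

// x and y are the pixel's post-projection coordinates in [-1, 1]
// (assuming the D3D convention: texture v points down, so y is flipped)
float x = IN.TexCoords.x * 2.0f - 1.0f;
float y = 1.0f - IN.TexCoords.y * 2.0f;

float4 pixelPos = float4(x, y, depth, 1.0f); // store as pixel pos (x/w, y/w, z/w, 1.0f)

float4 worldProjPos = mul(pixelPos, invWorldViewProj); // apply inverse world view proj mat to bring into world space

float3 worldPos = worldProjPos.xyz / worldProjPos.w; // <---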

My question is about the last line.

Why do we divide by the w coordinate during unprojection? I know we do a perspective divide when going from world space to projection/clip space, but why does the opposite transform, which undoes the projection, require another divide? I thought the 'w' was already in the denominator, so wouldn't we need to multiply by 'w'?

I guess I'm just confused about how projection matrices work (I never fully understood them). Any links which explain this, or tips on what I'm missing? That's the only thing I'm confused about.

Thanks!

Lightness1024

939

March 14, 2013 02:40 PM

I think it may be because the WVP matrix is inverted, so the w it 'spits' out is also inverted. Your feeling is right that you have to do the inverse of the perspective divide, which would mean multiplying. But since w is inverted here, you divide.

It's just a feeling though. I believe it shouldn't be hard to prove by writing the whole operation out on paper in detail. I'll do that in the next few days if nobody posts a better answer before then :)
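The paper argument mentioned above can be sketched in a few lines. This is only an outline under stated assumptions: the symbols are introduced here for illustration and do not come from the thread. $p$ is the original point with $w = 1$, $M$ is the combined forward matrix, $c$ is the clip-space result, $n$ is the post-divide position that the shader rebuilds from the depth texture, and row vectors are used to match `mul(pixelPos, invWorldViewProj)`.

```latex
\begin{aligned}
c &= p\,M, \qquad p = (x,\; y,\; z,\; 1) \\
n &= \frac{c}{c_w} \\
n\,M^{-1} &= \frac{1}{c_w}\,c\,M^{-1} = \frac{1}{c_w}\,p
           = \left(\frac{x}{c_w},\; \frac{y}{c_w},\; \frac{z}{c_w},\; \frac{1}{c_w}\right)
\end{aligned}
```

Because the matrix is linear, the $1/c_w$ scale introduced by the perspective divide passes straight through the inverse matrix and shows up in the fourth component of the result. Dividing xyz by that component is the same as multiplying by $c_w$, which recovers $(x, y, z)$.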

Hodgman

52,717

March 14, 2013 02:53 PM

I'm not sure if this is correct either, but I believe what we call the 'perspective divide' is actually us converting from 4D homogeneous coordinates to 3D Cartesian coordinates.
Projection matrices operate in 4D space where every 3D point becomes a 4D line -- there are many 4D points that correspond to a single 3D point. By dividing by the homogeneous coordinate, you're 'normalizing' this 4D point to a standardized format, so when you drop the 'w', you end up with a single sensible/consistent 3D 'xyz' value.
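A small numeric round trip may make the homogeneous-coordinate view concrete. The sketch below is in Python rather than HLSL so it can be run anywhere. It assumes a D3D-style perspective matrix with row vectors, and it only goes between view space and clip space (no world or view matrix). The helper names `mul`, `perspective`, `inverse_perspective` and `unproject`, and all the numbers, are hypothetical choices for illustration.

```python
# Round trip: view space -> clip -> NDC -> back, using row vectors
# as in HLSL's mul(v, M). All names and values here are illustrative.

def mul(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]

def perspective(sx, sy, near, far):
    """D3D-style projection: clip.w ends up equal to view-space z."""
    a = far / (far - near)
    b = -near * far / (far - near)
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, a, 1.0],
            [0.0, 0.0, b, 0.0]]

def inverse_perspective(sx, sy, near, far):
    """Analytic inverse of the matrix built by perspective()."""
    a = far / (far - near)
    b = -near * far / (far - near)
    return [[1.0 / sx, 0.0, 0.0, 0.0],
            [0.0, 1.0 / sy, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0 / b],
            [0.0, 0.0, 1.0, -a / b]]

def unproject(ndc_xyz, inv_proj):
    """Apply the inverse matrix, then normalize so that w == 1."""
    h = mul([ndc_xyz[0], ndc_xyz[1], ndc_xyz[2], 1.0], inv_proj)
    return [h[0] / h[3], h[1] / h[3], h[2] / h[3]], h[3]

view_pos = [2.0, -1.0, 5.0, 1.0]             # original point, w == 1
proj = perspective(1.0, 1.5, 0.1, 100.0)
inv_proj = inverse_perspective(1.0, 1.5, 0.1, 100.0)

clip = mul(view_pos, proj)                    # clip[3] is view-space z
ndc = [c / clip[3] for c in clip[:3]]         # perspective divide
recovered, w = unproject(ndc, inv_proj)       # w is 1 / clip[3]

print(recovered, w)
```

The vector that comes out of the inverse matrix is the original point scaled by `1 / clip.w`: a different 4D point on the same line through the origin. Dividing by its w picks the representative with w equal to 1, which is why the divide appears in both directions.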

Spidey

766

Author

March 15, 2013 04:48 AM

Thanks Hodgman, it makes a little more sense now. So we are basically converting from 4D -> 3D space in both directions, so the divide is necessary. Do you know of any links which explain this 4D/projection matrix math in detail?