I'm trying to figure out how to reconstruct a world-space position from a depth texture sample (stored as z/w) in a pixel shader. I eventually plan to move to linear depth, but I'm trying to get this to work with non-linear depth first. I mostly understand everything; I'm just confused about this one bit.

I've been reading online and the general way to convert a depth value to a world position seems to be something like this:

float depth = depthTex.Sample(defaultSampler, IN.TexCoords).r; //sample depth - z over w

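float2 ndc = float2(IN.TexCoords.x * 2.0f - 1.0f, 1.0f - IN.TexCoords.y * 2.0f); //x and y in [-1, 1]; assumes TexCoords cover the full screen with D3D-style v pointing down

float4 pixelPos = float4(ndc, depth, 1.0f); //store as pixel pos (x/w, y/w, z/w, 1.0f)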

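float4 worldProjPos = mul ( pixelPos, invWorldViewProj ); //apply inverse world view proj mat to undo the projection (this lands in world space only if the world matrix is identity; otherwise the inverse view-projection is needed)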

float3 worldPos = worldProjPos.xyz / worldProjPos.w; // <---

My question is about the last line.

Why do we divide by the w coordinate during unprojection? I know we do a perspective divide when going from world space to projection/clip space, but why does the opposite transform, which undoes the projection, require another divide? I thought the 'w' was already in the denominator, so wouldn't we need to multiply by 'w'?
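
Writing the algebra out with row vectors, as in the mul above, seems to point at the answer. This is only a sketch, and the symbols are my own shorthand rather than anything from the shader: p is the original position, M is the combined matrix whose inverse is invWorldViewProj (assumed invertible), c is the clip-space position, w_c is its w component, and n is the position after the perspective divide.

$$
\begin{aligned}
c &= p\,M, \qquad p = (x,\; y,\; z,\; 1) \\
n &= \frac{c}{w_c} = \left(\frac{x_c}{w_c},\; \frac{y_c}{w_c},\; \frac{z_c}{w_c},\; 1\right) \\
n\,M^{-1} &= \frac{1}{w_c}\,(p\,M)\,M^{-1} = \frac{p}{w_c} = \left(\frac{x}{w_c},\; \frac{y}{w_c},\; \frac{z}{w_c},\; \frac{1}{w_c}\right)
\end{aligned}
$$

If this is right, the w component that comes out of the inverse transform is 1/w_c, not w_c, so dividing xyz by it is the same as multiplying by the original clip-space w.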

I guess I'm just confused about how projection matrices work (I never fully understood them). Are there any links that explain this, or tips on what I'm missing? That's the only thing I'm confused about.
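
To check the behaviour outside the shader, here is a small CPU-side round trip in plain Python. It is only a sketch under assumptions that are not in my shader: a left-handed D3D-style perspective matrix with depth in [0, 1], row vectors to match mul(v, M), an identity world and view matrix (so world space equals view space), and made-up values for the field of view, aspect ratio, clip planes and test point. The helper names are hypothetical.

```python
# CPU-side check of the unprojection divide, using row vectors (v * M) like HLSL's mul(v, M).
import math

def mat_mul_vec(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def invert(m):
    """Gauss-Jordan inverse of a 4x4 matrix with partial pivoting."""
    n = 4
    # Augment each row with the matching row of the identity matrix.
    a = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def perspective(fov_y, aspect, near, far):
    """Left-handed D3D-style projection for row vectors (assumed convention)."""
    ys = 1.0 / math.tan(fov_y / 2.0)
    xs = ys / aspect
    q = far / (far - near)
    return [
        [xs, 0.0, 0.0, 0.0],
        [0.0, ys, 0.0, 0.0],
        [0.0, 0.0, q, 1.0],
        [0.0, 0.0, -near * q, 0.0],
    ]

# Illustrative values only.
proj = perspective(math.radians(60.0), 16.0 / 9.0, 0.1, 100.0)
inv_proj = invert(proj)

world = [1.0, 2.0, 5.0, 1.0]              # world and view matrices assumed identity
clip = mat_mul_vec(world, proj)           # clip.w equals the view-space z here
ndc = [c / clip[3] for c in clip]         # (x/w, y/w, z/w, 1): what the pixel shader starts from

unprojected = mat_mul_vec(ndc, inv_proj)  # the whole vector is world / clip.w
recovered = [c / unprojected[3] for c in unprojected[:3]]

print("clip w:", clip[3])
print("unprojected w:", unprojected[3])
print("recovered position:", recovered)
```

Printing unprojected before the divide shows every component, including w, scaled by the same factor, which is why the final divide by w is needed to get back a position with w equal to 1.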

Thanks!