# DX11 16 bit z-position post processing


## Recommended Posts

I have a deferred renderer (supporting DX10 and DX11), where I use R16G16B16A16_FLOAT format for view space normal vector and view space z-coordinate.

Today, I happened to notice some screen space ambient occlusion errors in the distance. After some investigating, I found that the errors grow with distance and are precision related (the problem goes away with a 32-bit float component format). 16-bit float depth does not give me enough precision to measure differences far away.

My question is what is the best option:

1) Stay with 16-bit float, but fade SSAO out with distance.
2) Change the gbuffer layout (say, R32_FLOAT for view space depth, and R16G16_FLOAT for the (x, y) components of the view space normal) and derive the normal's z.

Although option 2 doesn't increase my overall gbuffer memory footprint from what I have now, it does mean an additional render target. Do extra render targets have extra cost, or is it just the overall memory bandwidth that matters?

##### Share on other sites
Why don't you just read back the actual depth buffer?

##### Share on other sites
For some reason I can't think of right now, I thought it was better to store view space z, but maybe that was DX9 advice?

When I render my light meshes (as part of my deferred renderer), I need to reconstruct view space position, but I also need the depth buffer bound for the depth test, even though I do not need to write to it. I think DX11 supports read-only depth buffers, but I don't think DX10 does.

##### Share on other sites
Right now I'm thinking DXGI_FORMAT_R16G16B16A16_UNORM would give me better precision, and I would just have to divide by farPlaneZ when storing depth, and multiply by farPlaneZ when reading depth.

##### Share on other sites
What he meant was, why are you trying to store the normals and depth into the same target?
Normals should be 32-bit ARGB (A unused), and your depth values will be written to the standard depth buffer, from which you will later read them.

L. Spiro

##### Share on other sites
Yeah, read-only depth stencil views are D3D11 only. However you can copy a depth buffer, which is pretty cheap.

Using an integer format will definitely give you better precision than a FLOAT format for [0, 1] values.

##### Share on other sites
I made the change to use the depth buffer (24-bit), and the precision is much better than the 16-bit formats. Plus I like saving the bandwidth by not writing depth to a separate render target when I lay down my g-buffers. After I lay down my g-buffer, I am using CopyResource to copy the depth buffer. Is that the best way?

I also have one other concern. My depth is at full screen resolution. If I wanted to do SSAO at half resolution, I think I will have problems with the depth buffer. The most correct approach seems to be re-rendering depth at half resolution and using that. However, in my DX9 engine I was CPU bound on object drawing. This is much improved in my DX10/11 engine, but I am still reluctant to double my draw calls with a second pass. In this case, would it be better to just do SSAO at full resolution, but then blur it at a lower resolution?

Edit: I tried doing SSAO with linear filtering at half resolution (but depth map at full resolution), and the SSAO still works! I guess filtering depth is not a problem for SSAO, or the problems go unnoticed.

##### Share on other sites
If you are dead set on FP16 depth, I researched a lot of encodings (exp, log, sqrt, etc.) and ended up with a distance-squared based scheme, with sqrt as the decode. This was designed to have enough precision to use in a deferred renderer when reading back the depth, and a good distribution of values all the way to the edge of the level and a bit beyond:

It is meant to store raw W values, with a maximum range of around 2^21 (2097152).

We work with a world size of +-524288, which makes the diagonal sqrt(3) times the side of the cube (1048576) = 1816186 units, which can be handled by the range I chose. The magic number 65503 should probably be replaced by a more accurate exact number (65472). 65504 is dangerous on older SM3 cards (GeForce 7 and older), as they map those values to +-INF instead of 65504. A large part of the precision distribution comes from the sqrt, and using the sign bit doubles the number of values you can have. The code used to have half variables, which explains some of the silliness of renaming the parameters to locals.

```hlsl
float EncodeFloatW(float W)
{
    float Distance = W;
    float Value;
    if (W > 4096)
    {
        Value = Distance / 8192;
        Value = Value * Value;
        Value = clamp(Value, 0, 65503);
    }
    else
    {
        Value = Distance / 32;
        Value = Value * Value;
        Value = -Value;
        Value = clamp(Value, -65503, 0);
    }
    return Value;
}

float DecodeFloatW(float W)
{
    float FloatW = abs(W);
    float Value;
    if (W > 0)
        Value = sqrt(FloatW) * 8192;
    else
        Value = sqrt(FloatW) * 32;
    return Value;
}
```
