Depth buffer precision

8 comments, last by MJP 14 years ago
Lately I've been doing some research into reducing the amount of error when reconstructing position from depth for deferred rendering. My results indicate that rendering linear depth (view-space z divided by distance to the far clip plane) into a 32-bit floating-point surface gives great results, but perspective z/w from a 24-bit integer depth buffer can have a lot of error as you get closer to the far clip plane. This isn't surprising, since the precision issues due to the non-linearity of perspective z/w are well-known. I would just go with linear z in a color texture, but saving the memory and bandwidth required for an additional RT is pretty appealing for consoles.

After reading through this delicious tidbit from Humus I re-ran my tests with a floating-point depth buffer and the near and far planes swapped. The results were great, but unfortunately it looks like it isn't really doable on PCs in D3D9. The caps spreadsheet that comes with the SDK says that ATI DX10 GPUs support the D3DFMT_D24FS8 format, but nothing else does. And even if I could use it on PC, I couldn't sample from it since there are no driver hacks that use it. I don't mind that last part so much since I already have a fallback in place, and PC can handle the extra RT. However, to avoid having horrible z-testing precision I would have to put in all sorts of nasty platform-specific code that would let me use normal projection parameters/depth testing states on PC and backwards parameters on platforms that support float depth buffers. Not impossible, but potentially pretty nasty. Combining that with the added cost of recovering linear depth from z/w depth buffer values makes me start to lean towards the option of just using a separate RT with linear depth on all platforms.

So here's what I'd like to discuss:

1. Are there any ways to use floating-point depth buffers on the PC that I don't know of? How about in D3D10/D3D11? I don't even see a DXGI_FORMAT that specifies a 24-bit fp depth format.
2. Are there any other options for getting a better distribution of precision in a depth buffer?

Thanks in advance. [smile]

EDIT: I'm going to write a blog post real soon with the results of my testing, but if anyone would like to see some of the images I can post them here.

EDIT 2: Blog post is up

[Edited by - MJP on March 23, 2010 2:44:44 AM]
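The comparison described in the post can be approximated offline. The following is a minimal sketch, not the original test: the clip planes (0.1 and 1000), the D3D-style z/w mapping, and the use of a 32-bit float for the reversed buffer are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import struct

NEAR, FAR = 0.1, 1000.0  # hypothetical clip planes, not taken from the post


def perspective_depth(z_view, near=NEAR, far=FAR):
    """D3D-style z/w in [0, 1]: 0 at the near plane, 1 at the far plane."""
    return far / (far - near) * (1.0 - near / z_view)


def view_z_from_perspective(d, near=NEAR, far=FAR):
    """Invert perspective_depth to recover view-space z."""
    return near * far / (far - d * (far - near))


def reversed_depth(z_view, near=NEAR, far=FAR):
    """z/w with the planes swapped: 1 at the near plane, 0 at the far plane."""
    return near / (far - near) * (far / z_view - 1.0)


def view_z_from_reversed(d, near=NEAR, far=FAR):
    """Invert reversed_depth to recover view-space z."""
    return near * far / (near + d * (far - near))


def to_unorm24(d):
    """Round to the nearest value a 24-bit integer depth buffer can hold."""
    scale = (1 << 24) - 1
    return round(d * scale) / scale


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def reconstruction_errors(z_view):
    """Absolute view-space z error for three storage schemes."""
    unorm24 = view_z_from_perspective(to_unorm24(perspective_depth(z_view)))
    linear32 = to_float32(z_view / FAR) * FAR
    reversed32 = view_z_from_reversed(to_float32(reversed_depth(z_view)))
    return {
        "24-bit integer z/w": abs(unorm24 - z_view),
        "32-bit float linear": abs(linear32 - z_view),
        "32-bit float reversed z/w": abs(reversed32 - z_view),
    }


if __name__ == "__main__":
    for z in (1.0, 100.0, 900.0, 999.0):
        print(z, reconstruction_errors(z))
```

With these assumed planes, one 24-bit step near the far plane spans a sizeable fraction of a world unit, while both float schemes stay several orders of magnitude tighter. The exact numbers depend heavily on the near/far ratio, so treat the output as indicative only.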
1:

In D3D10 and D3D11, you can create a real 32-bit float depth buffer.

If you create the backing resource as a typeless 32-bit surface (DXGI_FORMAT_R32_TYPELESS), you can create both depth stencil and shader resource views against it so you can directly access it as a texture after you've used it as an actual depth buffer.

If you use the DXGI_FORMAT_D32_FLOAT (which is common), you can't create a shader resource view for it but its usage as a depth buffer can potentially be optimized better by the driver.

There are other formats that you can use for this purpose, but I'm not aware of a 24-bit floating point format. With creative use of typeless resources, it is possible to interpret any values as any bitness floating point in the shaders, but the hardware's depth compare logic can't natively handle anything that can't be resolved as the predefined depth formats in the end. Even if AMD's implementation might allow D24F_S8, DXGI doesn't support such a format directly because it isn't that common. The nearest format is DXGI_FORMAT_D24_UNORM_S8_UINT but UNORM implies just an unsigned integer normalized to 0...1.

2:

Depends entirely on what your app's requirements actually are. You can bias the precision manually in shaders to near, mid-range or far depending on what depth range you want to have the most definition (though this is not easy in all situations). Also, it is still possible to use the classic technique of drawing separate depth ranges in separate passes, clearing the depth buffer in between them so they don't collide with each other.
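The multi-pass technique needs some rule for choosing where one depth range ends and the next begins. The reply does not give one, so the sketch below assumes a geometric split, where every slice has the same far/near ratio and therefore a similar precision profile. The function name and the example planes are hypothetical.

```python
def split_depth_ranges(near, far, passes):
    """Split [near, far] into `passes` slices with equal far/near ratios.

    Returns a list of (slice_near, slice_far) pairs ordered from the
    closest slice to the farthest.
    """
    if passes < 1 or not 0.0 < near < far:
        raise ValueError("need passes >= 1 and 0 < near < far")
    ratio = (far / near) ** (1.0 / passes)
    edges = [near * ratio ** i for i in range(passes + 1)]
    edges[-1] = far  # avoid drift from repeated floating-point multiplication
    return list(zip(edges[:-1], edges[1:]))


if __name__ == "__main__":
    # Draw the farthest slice first, clearing depth before each closer slice.
    for slice_near, slice_far in reversed(split_depth_ranges(0.1, 1000.0, 2)):
        print(f"clear depth, then render with near={slice_near:g}, far={slice_far:g}")
```

Each slice would get its own projection matrix built from its near and far values, with a depth clear between slices as the reply describes.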

Niko Suni

Quote:Original post by Nik02
1:

In D3D10 and D3D11, you can create a real 32-bit float depth buffer.

If you create the backing resource as a typeless 32-bit surface (DXGI_FORMAT_R32_TYPELESS), you can create both depth stencil and shader resource views against it so you can directly access it as a texture after you've used it as an actual depth buffer.

If you use the DXGI_FORMAT_D32_FLOAT (which is common), you can't create a shader resource view for it but its usage as a depth buffer can potentially be optimized better by the driver.



Indeed, I missed that at first. As I think about it more, it seems to make more sense that this format would be included instead of a weird 24-bit format. It would be nice to have bits for stencil...although it appears you can use DXGI_FORMAT_D32_FLOAT_S8X24_UINT if you're willing to use double the memory/bandwidth for your depth-stencil buffer.

Quote:Original post by Nik02
There are other formats that you can use for this purpose, but I'm not aware of a 24-bit floating point format. With creative use of typeless resources, it is possible to interpret any values as any bitness floating point in the shaders, but the hardware's depth compare logic can't natively handle anything that can't be resolved as the predefined depth formats in the end.


Right, but the main issue I'm dealing with is the precision of the storage format. Being able to interpret as a float is convenient though...a lot nicer than having to manually reconstruct from 8-bit UINTs. [smile]

Quote:Original post by Nik02
2:

Depends entirely on what your app's requirements actually are. You can bias the precision manually in shaders to near, mid-range or far depending on what depth range you want to have the most definition (though this is not easy in all situations). Also, it is still possible to use the classic technique of drawing separate depth ranges in separate passes, clearing the depth buffer in between them so they don't collide with each other.


Well I'm mainly interested in using depth for reconstructing position in deferred rendering, or for SSAO and other post-processing. This pretty much kills the idea of using multiple depth ranges, unless I want to also do those things with multiple passes (which isn't really appealing). I suspect I'd end up with a lot of wasted pixels that wouldn't get caught by early z-cull.
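For readers unfamiliar with position reconstruction from linear depth, here is a minimal sketch of one common approach: scale a ray from the eye to the far plane by the stored depth (view-space z divided by the far distance). It assumes a D3D-style view space with +z pointing forward and a symmetric frustum. The function names and parameters are hypothetical, and a real shader would usually interpolate the ray from the frustum corners instead of computing it per pixel.

```python
import math


def far_plane_ray(ndc_x, ndc_y, fov_y, aspect, far):
    """View-space vector from the eye to the far plane through an NDC point.

    ndc_x and ndc_y are in [-1, 1]; fov_y is the vertical field of view
    in radians; aspect is width / height.
    """
    half_height = math.tan(fov_y / 2.0) * far
    half_width = half_height * aspect
    return (ndc_x * half_width, ndc_y * half_height, far)


def position_from_linear_depth(ndc_x, ndc_y, linear_depth, fov_y, aspect, far):
    """Reconstruct view-space position from linear depth (view z / far)."""
    ray = far_plane_ray(ndc_x, ndc_y, fov_y, aspect, far)
    return tuple(component * linear_depth for component in ray)


if __name__ == "__main__":
    # A view-space point (1, 2, 50) seen with a 90 degree FOV, far plane at 100.
    print(position_from_linear_depth(0.02, 0.04, 0.5, math.pi / 2.0, 1.0, 100.0))
```

With a perspective z/w buffer the same reconstruction works once the stored value has been converted back to view-space z, which is the extra cost mentioned in the first post.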
I hate to butt in on your thread MJP. But seeing as you've done a bunch of research into the zBuffer, maybe you could answer a few of my questions.

I've never really understood the need for having a non-linear depth buffer. The reply you generally get is that it gives better precision for close-up rendering. Sure, that's nice, but then you get problems for far rendering. Why have excellent close-up rendering and bad far rendering when you can have "good" precision for all distances, and also have a z-buffer that can be used for many other effects without having to do any conversions?

This obviously depends on the application. But with regard to gaming, where as far as I know all games use a non-linear z-buffer, when are you ever viewing things that close up? If anything, the big thing these days is wide open environments with lush scenery, requiring lots of long-distance rendering.
Quote:Original post by maya18222
I hate to butt in on your thread MJP. But seeing as you've done a bunch of research into the zBuffer, maybe you could answer a few of my questions.

I've never really understood the need for having a non-linear depth buffer. The reply you generally get is that it gives better precision for close-up rendering. Sure, that's nice, but then you get problems for far rendering. Why have excellent close-up rendering and bad far rendering when you can have "good" precision for all distances, and also have a z-buffer that can be used for many other effects without having to do any conversions?

This obviously depends on the application. But with regard to gaming, where as far as I know all games use a non-linear z-buffer, when are you ever viewing things that close up? If anything, the big thing these days is wide open environments with lush scenery, requiring lots of long-distance rendering.


It's not a problem at all, any discussion on the topic is okay with me.

Anyway with a z-buffer it's typically not desirable to have the non-linear distribution of precision (in fact for z-testing and for position reconstruction life would be a lot easier if it were linear), however storing z/w means that rasterization and interpolation actually work correctly. It's also what allows early z-cull to work efficiently. Humus has a good explanation here, and there's some more background here.

With that article, I assume when he's referring to 'Z' or the "ZBuffer", that's 'z/w', which is non-linear 0-1, giving better precision closer up than further away. And when he's referring to 'w' or the "WBuffer", that's zView (z in view space), which is linear, i.e. zView/far plane to give 0-1, giving equal precision from zNear to zFar?

"While W is linear in view space it's not linear in screen space. Z, which is non-linear in view space, is on the other hand linear in screen space."

I don't understand that. What difference does mapping a buffer that contains z/w or zView/far to screen space make to whether they are linear or non-linear?

And finally, he mentions using linearZ causing problems with rasterization. How is this so? Both z/w and zView/zfar will be 0-1. So how does that interfere with anything?
Quote:Original post by maya18222
And finally, he mentions using linearZ causing problems with rasterization. How is this so? Both z/w and zView/zfar will be 0-1. So how does that interfere with anything?


The problem is that in order to have perspective-correct interpolation, the rasterizer interpolates Z/W, texcoord/W and 1/W linearly, and computes per-pixel z and texcoords by dividing the two values ((Z/W)/(1/W)). This has little to do with the mapping of the computed depth value to the Z-buffer; it could have been something different, but in practice it's taken from this and mapped linearly.
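The "linear in screen space" point can be checked numerically. The sketch below is an illustration under assumptions (a 2D view space with points given as (x, z), w equal to view z, and hypothetical clip planes of 1 and 100), not a model of any particular rasterizer. It finds the point on a 3D edge that projects to the screen-space midpoint, then compares three ways of getting its depth.

```python
NEAR, FAR = 1.0, 100.0  # hypothetical clip planes


def depth_z_over_w(z_view, near=NEAR, far=FAR):
    """D3D-style z/w in [0, 1] for a given view-space z."""
    return far / (far - near) * (1.0 - near / z_view)


def lerp(a, b, t):
    return a + (b - a) * t


def view_z_on_edge(p0, p1, t):
    """View z of the point on edge p0-p1 that projects to screen fraction t.

    Points are (x, z) in view space and screen x is x / z. Solved
    geometrically, without relying on any interpolation rule. Assumes the
    edge does not pass through the eye.
    """
    (x0, z0), (x1, z1) = p0, p1
    screen_x = lerp(x0 / z0, x1 / z1, t)
    s = (screen_x * z0 - x0) / ((x1 - x0) - screen_x * (z1 - z0))
    return z0 + s * (z1 - z0)


def perspective_correct_z(z0, z1, t):
    """Interpolate 1/w linearly in screen space, then invert."""
    return 1.0 / lerp(1.0 / z0, 1.0 / z1, t)


if __name__ == "__main__":
    p0, p1 = (-1.0, 2.0), (3.0, 10.0)
    z_mid = view_z_on_edge(p0, p1, 0.5)
    print("geometric view z:        ", z_mid)
    print("naive lerp of view z:    ", lerp(p0[1], p1[1], 0.5))
    print("via interpolated 1/w:    ", perspective_correct_z(p0[1], p1[1], 0.5))
    print("lerp of endpoint z/w:    ", lerp(depth_z_over_w(p0[1]), depth_z_over_w(p1[1]), 0.5))
    print("z/w of the geometric z:  ", depth_z_over_w(z_mid))
```

Linearly interpolating view z across the screen gives the wrong depth at the midpoint, while linearly interpolating z/w (or 1/w followed by a divide) matches the geometric answer.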

You may be interested to see the logarithmic depth buffer trick and its comparison with floating point depth buffer. The rasterization problem is also being dealt with there.

There's also Ysaneya's blog post about it.
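As a rough idea of what a logarithmic depth mapping looks like, the sketch below uses one commonly cited form, depth = log(C*w + 1) / log(C*far + 1). This formula, the constant C, and the example far plane are assumptions for illustration; the linked articles may use a different variant, and the function names are hypothetical.

```python
import math


def log_depth(w, far, c=1.0):
    """Map view-space distance w in [0, far] to a depth value in [0, 1]."""
    return math.log(c * w + 1.0) / math.log(c * far + 1.0)


def w_from_log_depth(d, far, c=1.0):
    """Invert log_depth to recover the view-space distance."""
    return (math.exp(d * math.log(c * far + 1.0)) - 1.0) / c


def step_size_unorm24(w, far, c=1.0):
    """View-space distance covered by one 24-bit depth step at distance w."""
    step = 1.0 / ((1 << 24) - 1)
    return w_from_log_depth(log_depth(w, far, c) + step, far, c) - w


if __name__ == "__main__":
    far = 1000.0
    for w in (1.0, 100.0, 900.0):
        print(w, step_size_unorm24(w, far))
```

Because the mapping is no longer linear in screen space, it has to be evaluated per pixel to be exact for large triangles, which is the depth-write trade-off discussed in the replies that follow.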
Quote:Original post by cameni

The problem is that in order to have perspective-correct interpolation, the rasterizer interpolates Z/W, texcoord/W and 1/W linearly, and computes per-pixel z and texcoords by dividing the two values ((Z/W)/(1/W)). This has little to do with the mapping of the computed depth value to the Z-buffer; it could have been something different, but in practice it's taken from this and mapped linearly.


Well like I mentioned before, the reason that the depth value in the z-buffer is taken directly from z/w is because this format has some nice advantages from a hardware perspective. Mainly that the gradients of the depth value are constant in screen space for any particular face, which is what allows early-z and z compression to work efficiently. This is why I've been having a lot of trouble with the issue...it's hard to get a good distribution of precision and make sure you don't get in the way of hardware optimizations.

Quote:Original post by MJP
Well like I mentioned before, the reason that the depth value in the z-buffer is taken directly from z/w is because this format has some nice advantages from a hardware perspective. Mainly that the gradients of the depth value are constant in screen space for any particular face, which is what allows early-z and z compression to work efficiently. This is why I've been having a lot of trouble with the issue...it's hard to get a good distribution of precision and make sure you don't get in the way of hardware optimizations.

Surely it depends on the situation, but for example we didn't observe any performance hit when using per-pixel writes to depth component and thus effectively disabling the z buffer optimizations. Ysaneya didn't, either.
So it's still a viable option in some cases.
Quote:Original post by cameni
Surely it depends on the situation, but for example we didn't observe any performance hit when using per-pixel writes to depth component and thus effectively disabling the z buffer optimizations. Ysaneya didn't, either.
So it's still a viable option in some cases.


Yeah, definitely. We get some huge performance gains from early-z on consoles (since we're often fill or fragment bound), so I don't think we'd be able to do without it.

This topic is closed to new replies.
