Conservative depth output, 1-z/w depth, earlydepthstencil and early z

2 comments, last by Tonkytonk 11 years, 8 months ago
I've got two questions about early z handling:

1) I am under the impression that conservative depth output using either SV_DepthLessEqual or SV_DepthGreaterEqual from the pixel shader will allow early z culling to occur so long as the opposite inequality is used for the DepthStencilState depth comparison.
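The reasoning behind that expectation can be sketched in a few lines of Python. This is only a model of the logic, assuming just the Less and Greater comparisons; the names `early_reject_is_safe` and `can_cull_early` are hypothetical, and nothing here says what a given driver or GPU actually does.

```python
import operator

# Depth comparisons, keyed by the comparison names used in the post.
COMPARE = {"Less": operator.lt, "Greater": operator.gt}


def early_reject_is_safe(compare, conservative):
    """True if a failed test on the rasterizer depth implies the shader depth fails too."""
    # GreaterEqual output can only raise the depth, so a failed Less test stays failed.
    # LessEqual output can only lower the depth, so a failed Greater test stays failed.
    return (compare, conservative) in {
        ("Less", "GreaterEqual"),
        ("Greater", "LessEqual"),
    }


def can_cull_early(raster_depth, stored_depth, compare, conservative):
    """True if the fragment could be rejected before the pixel shader runs."""
    passes = COMPARE[compare](raster_depth, stored_depth)
    return early_reject_is_safe(compare, conservative) and not passes
```

Under this model both pairings are symmetric, which is why the asymmetric test results are surprising.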

SV_DepthGreaterEqual combined with Comparison.Less works as expected in testing. However, SV_DepthLessEqual combined with Comparison.Greater seems to prevent early Z culling; drawing front to back versus back to front has no effect on performance. This seems to pose a problem for a 1 - z/w style depth buffer.
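For context on why a 1 - z/w buffer needs the Greater comparison, here is a small sketch of the depth mapping, assuming a D3D-style perspective projection with near and far planes. The function names and the plane values in the usage note are illustrative, not taken from the attached sample.

```python
def standard_depth(z_view, near, far):
    """z/w depth for a D3D-style projection: 0 at the near plane, 1 at the far plane."""
    return far * (z_view - near) / (z_view * (far - near))


def reversed_depth(z_view, near, far):
    """1 - z/w depth: 1 at the near plane, 0 at the far plane."""
    return 1.0 - standard_depth(z_view, near, far)
```

With near = 0.1 and far = 100, a surface at view depth 1 gets a larger reversed depth than one at view depth 10, so "closer wins" becomes a Greater test, and a shader that wants to stay conservative has to use SV_DepthLessEqual.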

I modified a SharpDX sample to show the effect and attached it. It has only been tested on a 670 and a 660 ti.
[attachment=10907:MiniCube.zip]

Is my expectation of early Z culling with SV_DepthLessEqual and Comparison.Greater faulty for some reason? Is my implementation flawed? Or, is the GPU under no requirement to actually take advantage of it reliably since it is merely an optimization?


2) Using [earlydepthstencil] certainly succeeds in forcing an early depth test, but it also appears to write depth early. Clipping/discarding in the pixel shader successfully clips everything but the depth buffer write. Writing conservative depth out from the pixel shader does not change the depth buffer value.

Is this early depth write expected and well-defined behavior for all compatible GPUs? Is there a way to poke and prod it into merely testing the value early while still allowing clips/discards of depth writes or a conservative depth write from the pixel shader?
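The behavior described in question 2 can be modeled as a difference in ordering. The sketch below is a toy model of the observation only, assuming a Less depth test and a one-element list standing in for the depth buffer; `shade_fragment` is a hypothetical name, and this is not a statement of what the API guarantees.

```python
def shade_fragment(depth_buffer, frag_depth, discard, early):
    """Process one fragment with a Less test; return True if color would be written."""
    if early:
        # Forced early path: depth is tested and written before the shader runs.
        if not frag_depth < depth_buffer[0]:
            return False
        depth_buffer[0] = frag_depth
        # A discard in the shader now only suppresses the color write.
        return not discard
    # Late path: the shader runs first, so a discard also cancels the depth write.
    if discard or not frag_depth < depth_buffer[0]:
        return False
    depth_buffer[0] = frag_depth
    return True
```

In the early path a discarded fragment still leaves its depth behind, which matches what the post reports seeing with [earlydepthstencil].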

Thanks!
According to this you won't get full perf anyway: https://mynameismjp.wordpress.com/2010/11/

Also, this early z cull behavior has never been documented or specified, so there is no real relying on it :) only hoping.
I was happy that you posted that stuff on 1-z/w style depth buffers, great article.

It was explained to me that early Z cull works by storing the min/max of the Z values already present per compute unit block; then, if your VS/GS streams a prim whose 3 vertices all fall outside this min/max range on the failing side (at or beyond the max with Less, at or below the min with Greater), it gets culled before rasterization.
Also, as somebody mentioned in a comment on your article, if you want true early Z culling, just use classic Z/W and no SV_Depth fiddling.
Finally, drivers could change and handle this better some day.
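A rough sketch of the min/max rejection idea, assuming each block stores the smallest and largest depth currently in it and that a primitive is summarized by the depths of its vertices. The function name and data layout are hypothetical; real hardware is undocumented and certainly differs in detail.

```python
def tile_can_reject(tile_min, tile_max, prim_depths, compare):
    """True if every depth of the primitive must fail against every depth stored in the tile."""
    if compare == "Less":
        # Nothing at or beyond the largest stored depth can pass a Less test.
        return min(prim_depths) >= tile_max
    if compare == "Greater":
        # Nothing at or below the smallest stored depth can pass a Greater test.
        return max(prim_depths) <= tile_min
    raise ValueError(f"unsupported comparison: {compare}")
```

If the pixel shader may move depth in the passing direction, this check is no longer conservative, which is why the allowed direction of SV_Depth output matters.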

I don't understand your second part, it's late... :)
Yeah there's still no documentation for SV_DepthLessEqual/SV_DepthGreaterEqual/etc. So it's possible that your setup is invalid, but who the hell knows. More likely it's a driver bug.

As for your second question, as far as I know the documentation doesn't say anything about how earlydepthstencil interacts with discard. It might be defined somewhere in the driver-side D3D spec, it might not. And even if it is defined, there's always the possibility that the driver won't do it right anyway.
I scrounged up a few more testers with interesting results. Here's the complete set so far:

GTX 670: Early z only works with regular depth; 1-z/w depth performs the same with either back-to-front or front-to-back drawing.
GTX 660 ti: Same as GTX 670.
GTX 460: Early z seems to work for both regular and reversed depth.
GTX 460 SE: Same as GTX 460. (this user is known to have the same driver version as the computer with the 670)
GTX 560 ti: Same as GTX 460.
Radeon HD 6870: Same as GTX 460.
Radeon HD 6450: Crashes on launch.

So, it looks like early z with 1-z/w depth and conservative depth output usually works except on the 600 series. It's reassuring that I don't seem to be missing something obvious, but I guess I need to leave the fringes of undocumented wilderness and return to safer, more commonly used territory.

Thanks guys!

