Forcing early Z, which extension to use?

Started by
11 comments, last by dpadam450 10 years, 9 months ago

As far as my research has shown, there are two different extensions that can be used to force an early Z test in OpenGL 3.0: GL_ARB_conservative_depth and GL_ARB_shader_image_load_store. According to the specs, they can be used to force early Z in GLSL like so:

GL_ARB_conservative_depth:


layout(depth_unchanged) out float gl_FragDepth;

GL_ARB_shader_image_load_store:


layout(early_fragment_tests) in;

My question is: do these have exactly the same effect? If so, can I use them interchangeably? If not, which one should I use?


http://www.opengl.org/wiki/Image_Load_Store

It does not look like image_load_store has anything to do with the early depth test. So no, you should use conservative depth.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

http://www.opengl.org/wiki/Image_Load_Store

It does not look like image_load_store has anything to do with the early depth test. So no, you should use conservative depth.

Yeah, I know that the extension is not specifically meant for early Z tests, but it can be used to enable them, as I read here:

http://www.opengl.org/wiki/Early_Depth_Test#Explicit_specification

What I was hoping was that cards that don't support GL_ARB_conservative_depth might support GL_ARB_shader_image_load_store, so that one could be used as a fallback for the other.
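The fallback idea can be sketched on the host side roughly as follows. This is only an illustration, not something taken from either spec: `earlyZPreamble` is a hypothetical helper, the extension list is assumed to have been collected already (for example via `glGetStringi(GL_EXTENSIONS, i)`), and the preference order is an assumption. Note that, as far as the two extension specs describe it, the qualifiers are not equivalent: `early_fragment_tests` forces the per-fragment tests to run before the shader, while `depth_unchanged` is a promise about the value the shader writes to `gl_FragDepth`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: given the extension names reported by the driver,
// return GLSL lines to paste into the fragment shader right after its
// #version line. Returns an empty string if neither extension is present.
std::string earlyZPreamble(const std::vector<std::string>& extensions) {
    auto has = [&](const std::string& name) {
        return std::find(extensions.begin(), extensions.end(), name)
               != extensions.end();
    };

    // Assumption: prefer the explicit early-test qualifier when available.
    if (has("GL_ARB_shader_image_load_store")) {
        return "#extension GL_ARB_shader_image_load_store : require\n"
               "layout(early_fragment_tests) in;\n";
    }
    // Fallback: only a hint about gl_FragDepth, not a forced early test.
    if (has("GL_ARB_conservative_depth")) {
        return "#extension GL_ARB_conservative_depth : require\n"
               "layout(depth_unchanged) out float gl_FragDepth;\n";
    }
    return "";
}
```

With neither extension available the shader is left unmodified, and whether early Z happens is then up to the driver.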

I don't know. I just read that, and nothing there suggests anything you are talking about. It is talking about how one thing affects the other. It says nothing about how textures (image load store) will affect depth testing. It does, however, say how depth testing will affect image load store.

FYI from that page: "Thus the first restriction on early depth tests is that they cannot happen if the fragment shader writes gl_FragDepth. If the fragment shader modifies the depth, then the depth test must wait until after the fragment shader executes."

In GL 2.0 (and it seems to have carried on to newer versions), if you don't write the depth in the shader, early z-cull and depth writing already take place.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

FYI from that page: "Thus the first restriction on early depth tests is that they cannot happen if the fragment shader writes gl_FragDepth. If the fragment shader modifies the depth, then the depth test must wait until after the fragment shader executes."

I'm sorry, but you clearly didn't read that entire article. Using that syntax to force early Z requires GL_ARB_shader_image_load_store:

More recent hardware can force early depth tests, using a special fragment shader layout qualifier:

layout(early_fragment_tests) in;

This will also perform early stencil tests.

...

This feature exists to ensure proper behavior when using Image Load Store or other incoherent memory writing.

It's mentioned in the spec too:

http://www.opengl.org/registry/specs/ARB/shader_image_load_store.txt

An explicit control is provided to allow fragment shaders to enable early
fragment tests. If the fragment shader specifies the
"early_fragment_tests" layout qualifier, the per-fragment tests described
in Section 3.X will be performed prior to fragment shader execution.
Otherwise, they will be performed after fragment shader execution.


In GL 2.0 (and it seems to have carried on to newer versions), if you don't write the depth in the shader, early z-cull and depth writing already take place.

Yep.

I don't think OpenGL 2 even has the notion of early depth test at all (much like how it doesn't specify the exact algorithm for defining the shape of triangles). It was an optimization done by the hardware and as long as it gave the expected results it could do anything it wanted, so early depth tests worked by default simply because there was nothing against it. I imagine that disabling it if you modify the depth in a pixel shader has to do with caching (it invalidates the value in the cache).

Don't pay much attention to "the hedgehog" in my nick, it's just because "Sik" was already taken =/ By the way, Sik is pronounced like seek, not like sick.

I don't think OpenGL 2 even has the notion of early depth test at all (much like how it doesn't specify the exact algorithm for defining the shape of triangles). It was an optimization done by the hardware and as long as it gave the expected results it could do anything it wanted, so early depth tests worked by default simply because there was nothing against it. I imagine that disabling it if you modify the depth in a pixel shader has to do with caching (it invalidates the value in the cache).

Using blending will also disable this hardware optimization, for PowerVR at least. They call this feature "Tile Based Deferred Rendering" in case anyone wants to look it up. Those little machines can handle a lot until you turn on blending, and presumably pixel shader depth writes. Once you do this they slow to a crawl.

I haven't gone to any trouble to see if the other embedded system manufacturers have similar schemes running under the hood.

Consider it pure joy, my brothers and sisters, whenever you face trials of many kinds, because you know that the testing of your faith produces perseverance. Let perseverance finish its work so that you may be mature and complete, not lacking anything.


Well, as this article (which I didn't initially read) suggests, some cards support this, and they use the word "explicitly", which implies that there is an "implicit" case.
ATI and NVIDIA are going to keep supporting this early optimization. I don't know of anything that suggests otherwise, and I have read some internal docs.

This extension also provides the capability to explicitly enable "early"
per-fragment tests, where operations like depth and stencil testing are
performed prior to fragment shader execution.  In unextended OpenGL,
fragment shaders never have any side effects and implementations can
sometimes perform per-fragment tests and discard some fragments prior to
executing the fragment shader.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

If you look at this page, it further supports my claim:
http://gamedev.stackexchange.com/questions/16588/computing-gl-fragdepth

It has existed for a long time. Also, it's not just early z-cull, it's hierarchical early z-cull. Look up hierarchical occlusion culling. Graphics cards support this on a per-triangle level, which would not be possible if the shader executed first.

I believe the extension is able to perform the depth test explicitly by reading the depth buffer. Look up "discard". You can discard any fragment in GL, but to explicitly discard if (z < depthBuffer.z) you would need direct access to the depth buffer, which you were not allowed to have. I don't know for sure, but I am assuming that you are now allowed to read it. This may only be the case if you are using an FBO, though...

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

Using blending will also disable this hardware optimization, for PowerVR at least. They call this feature "Tile Based Deferred Rendering" in case anyone wants to look it up. Those little machines can handle a lot until you turn on blending, and presumably pixel shader depth writes. Once you do this they slow to a crawl.

Is this tested? Have you tried enabling and then disabling the depth test on enough blended fragments to check that the performance is actually different? It seems strange that this would happen, since GL is a state machine.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

This topic is closed to new replies.
