Early-Z

8 comments, last by Suen 12 years, 1 month ago
I'm not sure whether I should ask this here or in the Graphics Programming & Theory forum, but here goes:

I'm slightly confused about Early-Z. Conceptually depth testing would be performed after the fragments are processed from what I understand. But early-z is supposed to let one do the depth test before the fragments are processed and through that test discard fragments which fail the test. This would give an increased running performance as a fragment shader with a heavy implementation wouldn't need to be executed as many times.

After searching the net and reading about it, I've come to understand that hardware vendors already build this optimization in, provided that one fulfills some requirements (such as not altering the depth value in the fragment shader, because the shader can't be evaluated at that stage). So does this actually mean that if I have allocated a depth buffer (either for the default framebuffer or for my own FBO) and enabled GL_DEPTH_TEST, the hardware is actually performing early-z?

I ask because I've seen a few implementations that render in multiple passes: the first pass is a z-pass that stores the result of the depth test in the depth buffer, and the second pass merely reuses that buffer, with glDepthMask set on the client side so that no further values are written to it.

So I'm slightly confused about whether you actually have to do it yourself or whether the hardware does it internally for you when you are running your app.

As for the latter part, where people implemented it themselves, I got curious about one more thing. In the first pass you would want depth testing enabled to get the data for your depth buffer. But in the second pass this data is not supposed to change (it would only change when a new frame is rendered and you clear the values). Would you still need to have depth testing enabled for the second pass as well? Or can it be turned off?
Usually there are three things that are included in the term "early Z", and they're all hardware optimisations that you shouldn't be concerned about too much (except in cases where you might inadvertently switch them off): Z Compression, Fast Z Clear and Hierarchical Z-Buffer. I read some time ago that doing something like a kill operation in a fragment shader would switch it off, but I'm not 100% sure.

Now doing a depth only pass might make sense in some circumstances, but in others not (as it implies a second pass, unless you're doing something like deferred shading). The key here is to profile I guess.
I can't say much about this, because in GLSL you can change the depth value in the fragment shader (through gl_FragDepth, which by default is equal to gl_FragCoord.z). (I am only using an old graphics card, so I can't say whether "early depth testing" is possible on it.)

But if we are talking about 2-pass rendering, it could look like this:
1st pass: render the scene to a depth texture (depth test enabled).
2nd pass: render the expensive part of the scene, using the depth texture (from the 1st pass) in the shader:

// fragment shader
uniform sampler2DRect depthTexture; // depth texture written by the 1st pass

void main()
{
    // Do your own depth test: a fragment that lies behind the stored depth
    // is discarded (this means the hardware's depth test is not needed here).
    if (texture2DRect(depthTexture, gl_FragCoord.xy).r < gl_FragCoord.z)
        discard;

    // ...............
    // render a pixel
    // ...............
}


Best wishes, FXACE.
You seem to be conflating a few different concepts here, so let's break things up a little.

Forget about the prepass for now; let's look at early-Z in complete isolation.

With depth buffering there are two operations that can happen, either of which can be enabled/disabled independently of the other:

- Writing new values to the depth buffer.
- Testing an incoming fragment's depth against the depth stored in the buffer, and rejecting it based on a condition.

In a traditional pipeline the depth test (the second operation I mention) happens after the fragment shader stage. This is well specified by both OpenGL and D3D.

However, the hardware will know the incoming depth value before the fragment shader runs, and - in some cases - can make use of this knowledge to reject the fragment before the fragment shader stage. These cases are actually the common case in much 3D drawing - opaque depth-tested geometry.

Note that neither OpenGL nor D3D specify any kind of early-Z operation - all of this is hardware-specific and you don't enable or disable it, nor do you need to do any kind of special tricksiness in your code to get it to work. All you need is to satisfy the case it operates under, and the hardware will do it automatically for you.

As you've noted, if a fragment shader needs to output Z, or if you have a state (or combination of states) set that depends on the depth test running after the fragment shader, then the hardware will disable early-Z optimizations. Again, no other programmatic intervention on your part is required.

So in summary, for early-Z, if you meet the conditions in which it can run, and if your hardware supports it, then it is running.
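To make the difference concrete, here is a rough single-pixel sketch in Python. It is a toy model, not how any particular GPU works: the function name `draw`, the `(depth, color)` fragment tuples and the "smaller depth is closer" convention are all assumptions made for illustration. It shows that testing before shading gives the same final pixel as testing after shading, just with fewer shader invocations.

```python
# Toy single-pixel model (an illustration only, not real GPU behaviour).
# Fragments arrive in draw order as (depth, color); smaller depth = closer.

def draw(fragments, early_z):
    depth_buffer = float("inf")   # cleared depth buffer (far plane)
    color_buffer = None
    shader_runs = 0
    for depth, color in fragments:
        if early_z and not depth < depth_buffer:
            continue              # rejected before the fragment shader runs
        shader_runs += 1          # the "expensive" fragment shader runs here
        if depth < depth_buffer:  # depth test after shading (late Z)
            depth_buffer = depth
            color_buffer = color
    return color_buffer, shader_runs

fragments = [(1.0, "A"), (2.0, "B"), (3.0, "C")]  # drawn front to back
print(draw(fragments, early_z=False))  # ('A', 3)
print(draw(fragments, early_z=True))   # ('A', 1)
```

Both runs end with "A" in the color buffer; only the number of shader runs differs, which is why the optimization can be applied automatically whenever the shader does not change the depth.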

For a Z-only prepass we initially write Z values to the depth buffer, but for subsequent passes we don't need to write, we just need to test (the written values would be the very same as for the prepass, as the same input positions are used). This can give something of a fillrate saving, and early-Z optimizations can help with the subsequent passes in cases where you have high levels of overdraw. But it's important to note that early-Z is not required to do a Z-only prepass (some hardware may offer a fillrate saving for this even if you don't have early-Z), and a Z-only prepass is not required to get early-Z (another example where it can be useful is if you're drawing a skybox after all other scene objects). They can and do interact in situations where they're both used, but they're really two completely separate concepts.
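As a hedged illustration of how the two interact, the following Python sketch models one pixel covered by three objects that are drawn back to front, which is the worst case for overdraw. The helper names `depth_prepass` and `shade_pass` are made up for this example. The shading pass assumes an early depth test with a less-or-equal comparison, because the visible surface has exactly the depth the prepass stored and a strict less-than would reject it.

```python
# Toy single-pixel model; helper names are hypothetical, smaller depth = closer.

def depth_prepass(fragments):
    """Depth test + depth write, no shading: keeps the closest depth."""
    depth_buffer = float("inf")
    for depth, _ in fragments:
        if depth < depth_buffer:
            depth_buffer = depth
    return depth_buffer

def shade_pass(fragments, depth_buffer, depth_write):
    """Shading pass with an early LEQUAL depth test. Returns (color, shader_runs)."""
    color, shader_runs = None, 0
    for depth, name in fragments:
        if depth > depth_buffer:
            continue                  # rejected before shading
        shader_runs += 1              # expensive fragment shader runs
        color = name
        if depth_write:
            depth_buffer = depth
    return color, shader_runs

back_to_front = [(3.0, "C"), (2.0, "B"), (1.0, "A")]

# Single pass: every fragment passes the test when it arrives, so all are shaded.
without_prepass = shade_pass(back_to_front, float("inf"), depth_write=True)

# Prepass first, then shade with depth writes off: only the visible fragment is shaded.
with_prepass = shade_pass(back_to_front, depth_prepass(back_to_front), depth_write=False)

print(without_prepass)  # ('A', 3)
print(with_prepass)     # ('A', 1)
```

In this model the prepass only pays off because the draw order is unfavourable; with a front-to-back order the single pass would also shade just one fragment, which is one reason profiling is the real test.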

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Hi,

The depth optimizations mentioned by RobinsionUK are explained here by Persson.
Depth optimizations only work under certain conditions. Hierarchical z-culling, for instance, only works when the stencil test is set correctly (fail and zfail must be GL_KEEP). If you turn on depth writing you can still use it, provided the depth test is GL_LESS and you guarantee via conservative depth that the written depth value will only become larger than the current value (likewise for GL_GREATER).

It may be worth doing a pre-depth pass anyway if your fragment shaders are expensive. Without one, you spend a lot of time processing fragments that are overwritten later. With the pre-depth pass you avoid this, because the fragment shader never runs for those fragments in the first place.
By the way, most drivers will have some extra acceleration when color writes are disabled and depth writes are enabled.

When shading fragments with an already built depth buffer you still need the depth test, otherwise the object rendered last will render atop everything else.
Thanks for all the answers. They certainly help clarify things. To begin with, I wasn't aware that early-z included more than what I mentioned (Z Compression, HiZ etc.), as RobinsionUK pointed out, but I will look into these concepts as well through the link given by Tsus, so thanks to both.

Also big thanks to mhagain, you basically answered most of my questions, including some other ones I wondered about with regard to the subject. To summarize what we've discussed: as long as I meet the conditions, early-z is used internally by my hardware, and it has nothing in itself to do with a multi-pass technique (a z-pass and then a regular pass, for example).

Also, both you and Tsus mentioned that if I already have a built depth buffer (say from a z-pass) I would still need to do the depth test. This might sound stupid, but I'm still confused with regard to this part. So I will just take a really simple example and see if I can get a better understanding of this:

Assume we render two triangles where the second one is exactly behind the first one. If we have depth test enabled, then the depth buffer should at first be filled with the z-values of the first triangle. But then, as the second triangle is rendered, it should fail the depth test due to being behind the first one. So the depth buffer should still have the corresponding z-values generated by the first rendered triangle (accounting for the projections, perspective divisions and those things happening in the pipeline).

Now if I were to simply do exactly this by filling the depth buffer in a z-pass (which would be the first pass) and then use the buffer for subsequent passes, why would I still need to perform depth testing in the subsequent passes if I already have the correct values from the z-pass? mhagain mentioned that we don't write in any subsequent pass but only test, so I fail to understand why we would need to do a depth test for any of the following passes.

For example with regard to what Tsus said, the last object would be rendered atop everything else but if I just were to disable writing anything to the depth buffer after the z-pass (and enable writing to it again before I clear the buffer) then wouldn't that work?

Again, the question might be really stupid, or I might have forgotten something really basic here. Still, thanks for all the great answers.
There's no such thing as a stupid question. :)

Maybe it would help some if you thought of depth writing as building a wall and depth testing as checking if something is behind that wall.

At the start of the frame you tear the wall down. We're at a totally clean state here, and that's clearing the depth buffer.

Now you build the wall - that's your Z-only prepass and that's your depth-writing.

When you're checking to see if objects are visible you don't need to rebuild the wall each time - the wall is already there, so instead you just need to check "is this object behind the wall or in front of it?" That's depth-testing with writing disabled.

Sometimes you might want to build an extension to the wall. This extension might be behind the wall, it might be in front of it, or it might be to one side. If it's in front of it then objects behind the extension also get rejected. If it's off to one side then more objects get rejected. If it's behind it however, then there's no need - you won't see the extension anyway and anything that was rejected by the first wall is still going to be rejected anyway. That's a fairly simplified and crude explanation of when you've got both testing and writing enabled.


It does make sense, and that's how I usually think of it (not specifically the wall example, but depth writing and depth testing in general). But what I take from this is that in the z-prepass we merely write and never perform any depth test, is that correct?

Here's how I thought of it at first:

What I was thinking was that we were doing the same thing twice. In the z-only prepass we would perform depth test with writing enabled. This would give us the correct z-values for our rendered scene in the depth buffer.

Then when using this depth buffer in the subsequent pass(es) we would test for values which we already have tested for before.

Going by your quite nicely explained wall example it would be like building the wall and checking what objects are in front of it and behind it. Let's say that we know there is an object behind the wall and we reject it as we would not see it. Then when we get to the next pass we would check the very same object again, despite knowing from our previous pass that the object is behind the wall.

If we just assume that two objects cover the SAME random position (x,y) with one having the z-value as 15.0 and the other as 16.0 it would be like:

1. Write 15.0 to the depth buffer (depth writing)
2. Compare 16.0 to 15.0 (depth test)
3. Reject 16.0

That's the prepass. Then in the next pass for the same two objects:

1. Compare 15.0 to 15.0 (depth test)
2. Compare 16.0 to 15.0 (depth test)

What's the benefit of the depth testing in the second pass? Of course, this ONLY applies if we actually perform a depth test in the z-prepass.
Hi!

You have to write and test in the z-prepass. Otherwise you would not end up with the depth values of the objects in front.
Alright, short example.
Imagine you have three objects A (depth=1), B (depth=2), C (depth=3) (so they are sorted front to back.)

Let’s look at the z-prepass only.
Depth write enabled, depth test disabled (this is wrong, but I’d like to show you why.)
Render A (depth = 1) -> depth buffer = 1
Render B (depth = 2) -> depth buffer = 2
Render C (depth = 3) -> depth buffer = 3
As you can see the objects rendered last simply overwrote the value in the depth buffer.

Now, with depth write and depth test enabled.
Render A (depth = 1) -> depth buffer = 1
Render B (depth = 2) -> depth buffer = 1 (failed depth test)
Render C (depth = 3) -> depth buffer = 1 (failed depth test)
Now, we got the correct result. In the z-prepass we want to determine the depth of the object closest to the camera (for each pixel).
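The two runs above can be reproduced with a few lines of Python. This is only a sketch of a single depth-buffer entry; the function name `z_prepass` and the use of plain numbers for depths are assumptions for illustration.

```python
# Toy model of one depth-buffer entry; smaller depth = closer to the camera.

def z_prepass(depths, depth_test):
    depth_buffer = float("inf")      # cleared depth buffer
    for depth in depths:             # objects in draw order
        if depth_test and not depth < depth_buffer:
            continue                 # failed depth test, nothing is written
        depth_buffer = depth         # depth write
    return depth_buffer

depths = [1, 2, 3]                   # A, B, C drawn front to back
print(z_prepass(depths, depth_test=False))  # 3 (last object overwrote the buffer)
print(z_prepass(depths, depth_test=True))   # 1 (closest object is kept)
```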

In the second pass, all you have is the depth value of the object closest to the camera. You don’t know anymore which object this was. If you rendered the second pass without the depth test, you wouldn’t use the depth buffer you just built at all. Instead you would simply draw the objects in the order they arrive on the screen (ignoring their depth entirely). Objects behind a wall would simply be drawn atop it, just because they are drawn later.
In the second pass you only want to shade the objects closest to the camera, which is why you found out in the first pass *where* the closest objects are, and now you just ask: Does the object I’m looking at right now have the depth stored in the depth buffer (= depth test)? Or in other words: Is it the object closest to the camera? If so, shade it; if not, reject it.

So to summarize it:
In z-prepass: enable depth test, enable depth write, disable color write.
In second pass: enable depth test, disable depth write, enable color write.
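A small Python sketch of the second pass shows why the depth test has to stay on. It models a single pixel with the same three objects A, B and C; the function name `second_pass` is hypothetical, and the comparison is assumed to be less-or-equal so that the closest object matches its own stored depth.

```python
# Toy model of one pixel in the second (shading) pass; depth writes are off.

def second_pass(objects, depth_buffer, depth_test):
    color = None
    for depth, name in objects:           # objects in draw order
        if depth_test and depth > depth_buffer:
            continue                      # behind the stored depth: rejected
        color = name                      # color write
    return color

objects = [(1, "A"), (2, "B"), (3, "C")]
prepass_depth = min(depth for depth, _ in objects)  # result of the z-prepass: 1

print(second_pass(objects, prepass_depth, depth_test=True))   # A
print(second_pass(objects, prepass_depth, depth_test=False))  # C
```

Without the test, the depth buffer built by the prepass is never consulted, so whatever is drawn last ends up on screen.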

I hope that clears things up for you.
That definitely cleared things up! Big thanks to Tsus and mhagain for taking the time to explain both these parts to me, and thanks to everyone else who replied as well. This thread has been quite helpful!

Best regards!
