Is texkill disabling Early Stencil on the Xbox 360?

Started by
12 comments, last by Beholder 13 years, 8 months ago
Hi all,

I am doing an edge detection pass, creating a stencil mask by means of the texkill instruction (clip). Then in the following passes I use the stencil buffer to only process edge pixels. The curious thing is that I am not getting any performance gain from stenciling, but a performance drop.

After reading this:

Quote: As a result this impacts other functionality that kills fragments such as alpha test and texkill (called “clip” in HLSL and “discard” in GLSL). If Early Z would be left on and the alpha test kills a fragment, the depth- and/or stencil-buffer would have been incorrectly updated for the killed fragments. Therefore, Early Z is disabled for these cases. However, if depth and stencil writes are disabled there are no updates to the depth-stencil buffer anyway, so in this case Early Z will be enabled. On the Radeon HD 2000 series, Early Z works in all cases.

(from http://developer.amd.com/media/gpu_assets/Depth_in-depth.pdf)
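
The rule in the quoted passage can be written down as a small predicate. This is only a sketch of what the AMD text describes, in Python for readability: the function and parameter names are made up for illustration, it covers only the fragment-kill case the quote talks about, and it says nothing about how the Xbox 360 GPU actually behaves.

```python
def early_z_enabled(shader_kills_fragments, depth_write, stencil_write,
                    always_early=False):
    """Model of the quoted rule, restricted to the fragment-kill case.

    All names are hypothetical. `always_early` stands for hardware where,
    per the quote, Early Z works in all cases (Radeon HD 2000 series).
    """
    if always_early:
        return True
    if not shader_kills_fragments:
        # No alpha test / texkill, so the quoted restriction does not apply.
        return True
    # Fragments may be killed: early testing is only safe when nothing
    # would be written to the depth-stencil buffer.
    return not (depth_write or stencil_write)


# Mask pass: uses texkill and writes stencil, so early testing is off.
mask_pass = early_z_enabled(True, depth_write=False, stencil_write=True)

# Later pass: no texkill, only tests against the stencil mask.
masked_pass = early_z_enabled(False, depth_write=False, stencil_write=False)
```

Under this model the mask-generating pass loses early testing, while a later pass that only reads the stencil buffer keeps it.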

It seems to me that using the clip operation is disabling Early Stencil, so the stencil test runs only after the pixel shader has already been executed. I suppose the Xbox 360 GPU is older than the Radeon HD 2000 series.

Any opinions about this? Can someone confirm this information? Is there a workaround? Basically, I want to perform edge detection on the color buffer and fill the stencil buffer so that it can be used later for optimization purposes.

Thanks!
I'm not sure of a workaround that will let you use tex-kill but keep early-Z.

An alternative that you could try is to write your edge mask into a regular render-target, and then later on perform dynamic branching based on the contents of that 'edge buffer'.
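
A CPU-side sketch of that idea, in Python purely for illustration: a cheap pass writes an edge mask into an ordinary buffer, and the expensive pass branches on it per pixel. The function names, the single-channel image, and the neighbour-difference edge test are all assumptions, not the poster's actual shaders. On a real GPU the saving from the branch also depends on how coherent it is across neighbouring pixels.

```python
def edge_mask(img, threshold):
    """Cheap pass: flag a pixel as an edge when it differs from its right
    or bottom neighbour by more than `threshold` (hypothetical edge test)."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            c = img[y][x]
            right = img[y][x + 1] if x + 1 < w else c
            down = img[y + 1][x] if y + 1 < h else c
            mask[y][x] = abs(c - right) > threshold or abs(c - down) > threshold
    return mask


def expensive_pass(img, mask, shade):
    """Expensive pass: run `shade` only where the mask is set, mimicking a
    dynamic branch on the edge buffer. Returns the result and the number
    of times the expensive path was taken."""
    out = [row[:] for row in img]
    invocations = 0
    for y, row in enumerate(mask):
        for x, is_edge in enumerate(row):
            if is_edge:
                out[y][x] = shade(img, x, y)
                invocations += 1
    return out, invocations


# 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
mask = edge_mask(image, threshold=0.5)
result, count = expensive_pass(image, mask, lambda im, x, y: 0.5)
```

Here the expensive path runs for 4 of the 16 pixels, one per row at the step edge.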

[Edited by - Hodgman on July 18, 2010 9:35:20 PM]
Hodgman, do you know if the Xbox 360 GPU behaves like GPUs older than the Radeon HD 2000 series (that is, whether it disables Early Stencil when texkill is used)?

The results I'm getting are telling me so, but I would like to be sure.
Quote:Original post by IrYoKu1
Any opinions about this? Can someone confirm this information? Is there a workaround? Basically, I want to perform edge detection on the color buffer and fill the stencil buffer so that it can be used later for optimization purposes.

I can't confirm it for the specific Xbox 360 case, although I think I remember someone once complaining that the Xbox 360 had this limitation too. But my memory is fuzzy.

As for a workaround, no.
However, depending on what you're trying to do and how you're doing it, you may be able to avoid the texkill and use some Z or stencil trick so that it behaves the same way (sometimes even modifying the Z depth in the vertex shader!). But it varies case by case. Sometimes this just can't be solved.

For example, in your case it might work to write a special stencil value that means "ignored" instead of killing the fragment with texkill. I haven't seen your code or your full compositor, so I'm just giving it a shot with this example.

Cheers
Dark Sylinc
Quote:Original post by Matias Goldberg
For example, in your case it might work to write a special stencil value that means "ignored" instead of killing the fragment with texkill. I haven't seen your code or your full compositor, so I'm just giving it a shot with this example.

As I am using a full-screen quad for a post-processing effect, I can't change the stencil value per pixel other than by using texkill. So I think I am screwed.

If someone can confirm the Xbox 360 behavior with respect to Early Stencil, it would be really great.

Thanks!
Are you using tex-kill to skip unwanted calculations, or just to fill out the stencil buffer?

Am I right that you've got two shaders: a cheap one that generates the mask via tex-kill, and an expensive one that should be able to be optimised by only processing masked regions?

If this is correct, then it shouldn't really matter that your first shader (the tex-kill one) isn't using early-Z -- what matters is that the second shader uses early-Z (and it should, because it doesn't use tex-kill).
I'm using texkill to fill the stencil buffer.

And yes, I have a cheap shader generating the mask (the edge detection pass), and a heavier pass that uses the stencil.

I assumed that when you use texkill, early stencil gets disabled until the next stencil clear. Is that not the case?

If early stencil is really being used, I don't get why I'm not getting a performance improvement (I double checked and the stencil buffer is working properly).
I have seen someone who has a similar problem:
http://forums.xna.com/forums/p/34167/196128.aspx

Maybe it's just that Early Stencil doesn't work in XNA?
For stencil culling to occur before the pixel shader, you need to enable the hierarchical stencil buffer ("hi-stencil").

I don't know how it works through XNA (or if you can control it at all), but in C++ we have to manually enable hi-stencil testing (and enable writing to the hi-stencil) by setting the appropriate render-states. If XNA doesn't expose any "hi-stencil" states, you might be out of luck on the 360 :(
Thanks Hodgman!

I am asking about it here, as I think it's more of an XNA issue than anything else:
http://forums.xna.com/forums/p/57394/350740.aspx#350740

Just in case someone has the same problem and ends up here.

This topic is closed to new replies.
