Why does discarding pixels take a noticeable performance hit?

Started by
7 comments, last by xlope01 10 years ago

Something as simple as this:


if(textureColor.a < 0.5f)
	discard;

takes around a 25% hit in frame rate (assuming all the objects in your scene use it).


Normally, GPUs can do depth testing and writing before running a fragment shader. When there's a possibility that a fragment might be discarded, the GPU has to do the depth test/write after running the fragment shader (otherwise it risks writing pixels to the z-buffer that end up discarded). This means that this simple addition can cause a lot of extra work to be done.

Also, GPUs have various optimisations for depth buffers, such as hierarchical depth and depth-buffer compression, going on under the hood; it's possible that these are also being bypassed by the possibility of a discard.

That's just a simple explanation. For more detail this article is great: http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/ (parts 7, 8 and 9 look relevant)
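As an aside not raised in the thread: Shader Model 5 HLSL has an [earlydepthstencil] attribute that forces the depth/stencil test to run before the shader. A hedged sketch of how it interacts with discard (the texture, sampler, and struct names here are illustrative, not from the thread):

```hlsl
Texture2D diffuseTexture : register(t0);
SamplerState linearSampler : register(s0);

struct PixelInput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
};

// [earlydepthstencil] forces the depth/stencil test before this shader runs.
// Beware: combined with discard, depth is then written even for pixels the
// shader later discards, which is usually NOT what alpha-tested foliage wants.
[earlydepthstencil]
float4 PSMain(PixelInput input) : SV_Target
{
    float4 textureColor = diffuseTexture.Sample(linearSampler, input.uv);
    if (textureColor.a < 0.5f)
        discard;
    return textureColor;
}
```

In other words, you can get early-z back, but only by changing what discard means for the depth buffer, so it's rarely a drop-in fix.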

For a discarded pixel you've already paid the cost of its fragment shader even though it doesn't write anything. Further, you'll still have to run the fragment shader for the visible geometry behind it, so that's an added cost. This really adds up if there are lots of overlapping discarded pixels: lots of fragment shader computation performed, with the results thrown away.

Things that you may try in this case:

- use discard / clip only with objects that absolutely need it; fully opaque objects should use a pixel shader without discard / clip in it

- optimize meshes - this works best with particles, but trimming the mesh's size / shape to fit the texture may reduce pixel shader invocations

Cheers!

Some other tips:

  • If you can replace the discard call with an alpha blending function you'll be able to remove it entirely. You'll pay some overhead for blending of course, but it should balance out in your favour. Look at the step function for one way of handling "< 0.5" cases. For your example, something like "output.a *= step(0.5, textureColor.a)" with an alpha blending mode would be appropriate.
  • If you have a GS active, and if you can move the comparison you're making the discard on to the GS, you could have that output a zero-area triangle instead of discarding in the pixel shader. You'll pay some overhead for the GS branching here, but the triangle won't even get rasterized, so again you should win.
  • If you're always comparing texture alpha to a fixed value then you could just hack the texture alpha values at load time - if they're less than 0.5 slam them to 0 there.
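The step-based replacement from the first tip might look like the following in HLSL. This is only a sketch of the idea; the texture, sampler, and struct names are illustrative, and it assumes an alpha blend state (SrcAlpha / InvSrcAlpha) with depth writes disabled:

```hlsl
Texture2D diffuseTexture : register(t0);
SamplerState linearSampler : register(s0);

struct PixelInput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
};

float4 PSFoliage(PixelInput input) : SV_Target
{
    float4 textureColor = diffuseTexture.Sample(linearSampler, input.uv);
    // step(0.5, a) is 0 when a < 0.5 and 1 otherwise, so texels the original
    // shader would have discarded become fully transparent instead.
    textureColor.a *= step(0.5f, textureColor.a);
    return textureColor;
}
```

The shader itself no longer branches or discards; the "kill" is moved into the blend stage.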
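The load-time version of the last tip could be a simple pass over the decoded RGBA8 pixels. A minimal sketch, assuming tightly packed RGBA8 data; the function name and the byte threshold of 128 (≈ 0.5) are illustrative:

```c
#include <stddef.h>

/* Hypothetical load-time fixup: for an RGBA8 image, force alpha values below
 * the 0.5 cutoff (128 as a byte) down to fully transparent, baking the
 * shader's comparison result into the texture data itself. */
void slam_alpha_below_half(unsigned char *rgba, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        if (rgba[i * 4 + 3] < 128)  /* alpha is the 4th byte of each pixel */
            rgba[i * 4 + 3] = 0;
    }
}
```

This only helps if the shader's threshold is fixed; a per-material cutoff would still need the comparison at draw time.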

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

One reason to do what the OP does is that it can give smooth edges on magnified textures, and it can easily be used regardless of rendering order. The reasons for the slowdown seem well outlined by others; I just wanted to point out that I have seen the opposite behaviour, where adding discard for alpha < 0.5 increased performance for alpha-blended triangles that have large areas with alpha = 0. However, the alpha-blended geometry in my case was drawn back to front, so there was no need for depth writes and hence no conditional depth writes to pay for.

Using the technique only on triangles that need it (and if the reason for it is not rendering order, possibly combined with drawing affected geometry last) should limit the performance impact.

  • If you can replace the discard call with an alpha blending function you'll be able to remove it entirely. You'll pay some overhead for blending of course, but it should balance out in your favour. Look at the step function for one way of handling "< 0.5" cases. For your example, something like "output.a *= step (0.5, textureColor.a)" with an alpha blending mode would be appropriate.

I'm not sure I agree with this advice. The main issue with switching to alpha blending isn't the blending itself, but the fact that you can no longer write to your depth buffer while using it. This can result in overdraw, which increases your shader costs, and it also poses problems for deferred techniques or post-processing stages that rely on the depth buffer.

That depends on exactly what the OP is trying to achieve. It would be useful to have that info; right now it feels like we've jumped into the middle of a problem-solving exercise and are trying to assist with the OP's chosen solution, without knowing the original objective that prompted it. There may be other, completely different approaches.

I need to discard the pixels so that the alpha part of the texture is not visible; I use this for foliage. I can minimize the performance hit by using "discard" only on the parts that need it, but doing that means breaking objects apart, creating more objects and therefore more CPU load. (For example, a building with some foliage on it is currently one object; now it would become two objects.)

This topic is closed to new replies.
