Does the "discard" statement incur slowdown?


Started by Eric F. January 15, 2017 09:17 AM

4 comments, last by Eric F. 7 years, 3 months ago

Eric F.

264

Author

January 15, 2017 09:17 AM

I've been reading in a number of articles that using the discard statement in a pixel shader is a bad idea.

My question is this: if I already have an if() statement in my shader, will adding discard inside the if block cause any slowdown? Or should I just set the alpha of that pixel to zero?

I'm using DX11 for displaying 2D graphics, so I do not do anything fancy and the shader I will be using the discard statement in will only be used to display text in a specific instance. My tests have shown no difference, but I'm afraid I might be missing something.

Thanks!

Mr_Fox

806

January 15, 2017 09:34 AM

I am not sure whether it is relevant to what you asked, but from what I learned during my internship: on GCN, killing a pixel in the shader disables EarlyZ, which means your shader will be executed even for occluded pixels (unless you use ReZ, which is still slower than EarlyZ). That will be slow if your PS is expensive and you have a lot of overlapping pixels.
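To make the overdraw cost concrete, here is a rough CPU-side sketch of a single pixel covered by several fragments. It is a toy model, not how GCN or any real GPU is implemented: the function name `countShaderRuns`, the "less than" depth test, and the assumption that no fragment is actually discarded are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: one pixel, fragments submitted in the given order.
// Depth test is "less than" (smaller depth = closer to the camera).
// Assumption: with early-Z the depth test runs before the pixel shader;
// without it (e.g. the shader uses discard) the test runs after the shader.
// Worst case is modelled: no fragment is actually discarded.
std::size_t countShaderRuns(const std::vector<float>& fragmentDepths, bool earlyZ) {
    float storedDepth = 1.0f;  // depth buffer cleared to the far plane
    std::size_t shaderRuns = 0;
    for (float depth : fragmentDepths) {
        assert(depth >= 0.0f && depth <= 1.0f);
        const bool passes = depth < storedDepth;
        if (earlyZ && !passes) {
            continue;  // rejected before shading, so no shader cost
        }
        ++shaderRuns;  // the pixel shader executes for this fragment
        if (passes) {
            storedDepth = depth;  // surviving fragment updates the depth buffer
        }
    }
    return shaderRuns;
}

// Usage example: three overlapping layers drawn front to back.
void earlyZExample() {
    const std::vector<float> frontToBack = {0.2f, 0.5f, 0.8f};
    assert(countShaderRuns(frontToBack, true) == 1);   // occluded layers skipped
    assert(countShaderRuns(frontToBack, false) == 3);  // every layer is shaded
}
```

In this model the two paths only differ when something is occluded, which matches the point above: the cost shows up with an expensive shader and a lot of overlap.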

Adam_42

3,664

January 15, 2017 09:42 AM

It varies. In some cases it can speed things up, and in others it can reduce performance. You need to test for your specific usage, on the hardware that you care about, and see what happens.

One way it can speed things up is when you're hitting memory bandwidth limits on writing to the frame buffer. In my experience this mostly affects lower end graphics cards. In those cases using discard to implement an alpha test so you don't write out fully transparent pixels can improve performance. This might apply to the rendering of a particle system, for example.

On the other hand, in shaders that write to the depth buffer, using discard can hurt performance. This is because using discard has a side effect of disabling some of the hardware's depth buffer optimizations, because it makes the depth output of the shader impossible to predict before the shader runs. Disabling those optimizations can make future draw calls go more slowly, especially ones that would fail the depth test.

In addition note that enabling alpha blending can also have a performance cost - it uses more memory bandwidth than opaque rendering, because it has to do a read-modify-write of the frame buffer instead of just a write.
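As a back-of-the-envelope illustration of the bandwidth argument, the sketch below counts color-buffer bytes touched per pixel for the three cases described: an opaque write, a blended read-modify-write, and blending where fully transparent pixels are discarded. The 4 bytes per pixel (an RGBA8 target), the `Mode` enum, and `colorBufferBytes` are assumptions for illustration; real hardware has caches and compression that this ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: RGBA8 render target, 4 bytes per pixel.
constexpr std::size_t kBytesPerPixel = 4;

enum class Mode { OpaqueWrite, Blend, BlendWithDiscard };

// Counts frame-buffer bytes read plus written for a run of pixels,
// given the alpha value each pixel's shader would output.
std::size_t colorBufferBytes(const std::vector<std::uint8_t>& alpha, Mode mode) {
    std::size_t bytes = 0;
    for (std::uint8_t a : alpha) {
        switch (mode) {
        case Mode::OpaqueWrite:
            bytes += kBytesPerPixel;  // write only
            break;
        case Mode::Blend:
            bytes += 2 * kBytesPerPixel;  // read destination, then write
            break;
        case Mode::BlendWithDiscard:
            if (a != 0) {
                bytes += 2 * kBytesPerPixel;  // visible pixel: read and write
            }  // alpha == 0: discarded, frame buffer untouched
            break;
        }
    }
    return bytes;
}

// Usage example: four pixels, two of them fully transparent.
void trafficExample() {
    const std::vector<std::uint8_t> alpha = {0, 0, 255, 128};
    assert(colorBufferBytes(alpha, Mode::OpaqueWrite) == 16);
    assert(colorBufferBytes(alpha, Mode::Blend) == 32);
    assert(colorBufferBytes(alpha, Mode::BlendWithDiscard) == 16);
}
```

The saving from discard scales with the fraction of fully transparent pixels, which is why it tends to matter for things like particles and not for mostly opaque geometry.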

Hodgman

52,717

January 15, 2017 10:32 AM

Typically a shader binary will include a flag internally that specifies whether the program contains a discard instruction or not, because it impacts the GPU at a pretty high level.

Mobile and desktop have very different perf characteristics here. Mobile GPUs tend to use "deferred" architectures internally, which don't actually execute pixel shaders immediately after rasterization - instead they record which triangle covers each pixel, and then run one PS for each pixel at the end.
'Discard' completely messes with this, so these mobile GPUs have to fall back to a less optimized approach.

On PC, 'discard' often messes with early-Z / Hi-Z optimizations, which affects the impact of overdraw in your scenes.

In both cases, it's common to try to render all your opaque objects first, followed by your objects that make use of the discard instruction, followed by transparent/blended objects.
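A minimal sketch of that submission order, assuming the renderer keeps a flat list of draw items tagged by category. The `Pass` enum, the `DrawItem` struct, and `sortForSubmission` are hypothetical names, not part of any D3D11 API.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical categories; enumerator order is the intended submission order.
enum class Pass { Opaque = 0, AlphaTested = 1, Blended = 2 };

struct DrawItem {
    std::string name;  // illustrative label only
    Pass pass;
};

// Groups items as opaque, then discard-using, then blended.
// stable_sort keeps the existing relative order inside each group,
// e.g. a back-to-front order already applied to the blended items.
void sortForSubmission(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.pass < b.pass;
                     });
}

// Usage example.
void submissionOrderExample() {
    std::vector<DrawItem> items = {
        {"smoke", Pass::Blended},
        {"fence", Pass::AlphaTested},
        {"wall", Pass::Opaque},
        {"glass", Pass::Blended},
    };
    sortForSubmission(items);
    assert(items[0].name == "wall");
    assert(items[1].name == "fence");
    assert(items[2].name == "smoke");
    assert(items[3].name == "glass");
}
```

Drawing the opaque items first lets them fill the depth buffer with the depth optimizations intact, so the discard-using items that follow are at least tested against finished opaque depth.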

Despite these impacts, as mentioned above, discard can help in other cases. If a pixel is discarded early, any texture fetches that occur after the discard statement should have no memory bandwidth impact, and the final PS output should also have no bandwidth impact, assuming you're not using a write-mask.

. 22 Racing Series .

nbertoa

1,022

January 15, 2017 01:18 PM

Here is a good article about pixel shaders in which the author also describes discard.

It says:

Another pixel shader specific is the discard instruction. A pixel shader can decide to “kill” the current pixel, which means it won’t get written. Again, if all pixels inside a batch get discarded, the shader unit can stop and go to another batch; but if there’s at least one thread left standing, the rest will be dragged along. DX11 adds more fine-grained control here by way of writing the output pixel coverage from the pixel shader (this is always ANDed with the original triangle/Z-test coverage, to make sure that a shader can’t write outside its primitive, for sanity). This allows the shader to discard individual samples instead of whole pixels; it can be used to implement Alpha-to-Coverage with a custom dithering algorithm in the shader, for example.

I bolded the relevant part for your question.
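The two ideas in that quote can be sketched with plain bit masks. This is only an illustration: the 32-bit mask width and the names `applyDiscard`, `batchCanStop`, and `finalCoverage` are assumptions, and real batch sizes vary by GPU. In HLSL the shader-written coverage corresponds to the SV_Coverage output.

```cpp
#include <cassert>
#include <cstdint>

// One bit per thread (pixel) in a batch; 1 = still alive.
// Assumption: batches of at most 32 threads.
using LaneMask = std::uint32_t;

// Threads that execute discard are cleared from the live mask.
LaneMask applyDiscard(LaneMask live, LaneMask wantsDiscard) {
    return live & ~wantsDiscard;
}

// The shader unit can drop the batch only when no thread is left alive.
bool batchCanStop(LaneMask live) {
    return live == 0;
}

// One bit per MSAA sample. The shader's coverage is ANDed with the
// rasterizer's, so the shader cannot enable samples outside its primitive.
std::uint32_t finalCoverage(std::uint32_t rasterCoverage, std::uint32_t shaderCoverage) {
    return rasterCoverage & shaderCoverage;
}

// Usage example: a 4-thread batch where 3 threads discard.
void batchExample() {
    const LaneMask live = applyDiscard(0xFu, 0x7u);
    assert(live == 0x8u);
    assert(!batchCanStop(live));  // one thread left, the rest are dragged along
    assert(batchCanStop(applyDiscard(0xFu, 0xFu)));
    assert(finalCoverage(0x6u, 0xFu) == 0x6u);  // cannot exceed raster coverage
    assert(finalCoverage(0x6u, 0x3u) == 0x2u);  // shader drops one sample
}
```

For the original question this suggests that a discard inside an existing if() only saves shader work when whole batches take that branch; otherwise the surviving threads keep the batch running.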

My Blog

nbertoa.wordpress.com

Eric F.

264

Author

January 15, 2017 10:19 PM

Many thanks for the link and information. I will read up and run tests on some different machines.

Much appreciated.
