Will clip interrupt the pixel shader?

Started by
15 comments, last by db123 12 years, 1 month ago
clip (DirectX HLSL)

Discards the current pixel if the specified value is less than zero. clip(x)

Use this function to simulate clipping planes if each component of the x parameter represents the distance from a plane.
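For illustration, here is a minimal sketch of the clipping-plane use described above. The constant buffer name, the `clipPlane` layout, and the `worldPos` interpolant are assumptions made for this example, not part of the documentation quoted here.

```hlsl
// Hypothetical constant buffer: plane equation in world space,
// xyz = plane normal (assumed unit length), w = plane offset.
cbuffer ClipPlaneData : register(b0)
{
    float4 clipPlane;
};

// Hypothetical pixel shader input; worldPos is passed down by the vertex shader.
struct PSInput
{
    float4 position : SV_Position;
    float3 worldPos : TEXCOORD0;
};

float4 main(PSInput input) : SV_Target
{
    // Signed distance from the plane; negative means the pixel lies behind it.
    float signedDistance = dot(float4(input.worldPos, 1.0f), clipPlane);

    // Discards the pixel when signedDistance is less than zero.
    clip(signedDistance);

    return float4(1.0f, 1.0f, 1.0f, 1.0f);
}
```

Passing a vector whose components are distances to several planes should clip against all of them at once, since the pixel is discarded if any component is negative.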


==========================================
If x is less than zero, will the code that follows the clip be interrupted?
clip just prevents the pixel from being written to the render targets or depth buffer. The shader still runs to completion, though hardware typically avoids fetching textures 'for real' after the clip.
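As a rough sketch of what that means in practice, consider a typical alpha-test shader. The resource names and the 0.5 threshold are assumptions chosen for the example; how much of the code after the clip actually executes depends on the hardware.

```hlsl
// Hypothetical resources for this example.
Texture2D    diffuseMap    : register(t0);
SamplerState linearSampler : register(s0);

struct PSInput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
};

float4 main(PSInput input) : SV_Target
{
    float4 color = diffuseMap.Sample(linearSampler, input.uv);

    // Behaves like: if (color.a - 0.5f < 0) discard;
    // The pixel's outputs are thrown away, but this is not guaranteed
    // to act as an immediate return from the shader.
    clip(color.a - 0.5f);

    // Code placed here may still run for a clipped pixel;
    // its result simply never reaches the render target or depth buffer.
    color.rgb *= 0.8f;

    return color;
}
```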
http://www.gearboxsoftware.com/

Thank you very much!
To be more precise, as long as there is one pixel in the SIMT block that wasn't clipped, execution will still continue (and the "clipped" pixel's operations will be mostly masked out). If they are all clipped execution should cease.
Execution of the clipped pixel might in fact stop even if other pixels in the same block are still live, but the net effect is the same. Since only complete wavefronts are scheduled, it will of course still have to wait for the neighbouring pixel shader invocations to finish.

However, some clever hardware might (hypothetically?) be intelligent enough to take the idle cores offline in the meantime to conserve power. I don't know if any hardware does that, but in my opinion it would really make sense to implement the clip/kill instruction that way -- seeing how power hungry GPUs are and how much of a problem heat is (and since everyone wants to be "green").

Basically, the GPU would only need to use something like the x86 MONITOR/MWAIT instruction pair for this...


Interesting thought, but since GPUs rapidly swap in/out SIMT blocks in order to hide latency, I doubt it would make sense to power gate at that granularity.
What about a situation in which a PS5.0 shader writes (potentially conditionally) to a read/write resource after the clip instruction?

Niko Suni

It depends on the hardware. Older hardware executes the shader to the end but no longer performs any texture fetches or framebuffer writes. That way you can optimize particles: areas that are completely transparent (commonly ~30%-40%) won't cost you fillrate bandwidth.


On newer hardware it's an early out for the shader. Since such hardware schedules instruction by instruction anyway and has special hardware that handles branching, a clip is nothing more than a return (branch out) for those shaders. It works at the granularity of the thread group size, which is from 4x4 to 8x8 fragments, depending on the hardware.

After a successful clip/kill, none of the following instructions read or write anymore; in SM5 you could otherwise end up writing out of bounds.
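To make the SM5 case concrete, here is a minimal sketch of a pixel shader that writes to a read/write resource after the clip. The resource name, the `u1` slot, and the `alpha` interpolant are assumptions for the example. Going by the description above, the write would be suppressed for clipped pixels, but that is worth verifying on your target hardware.

```hlsl
// Hypothetical UAV. In a pixel shader, UAV slots share numbering with
// render targets, so with one render target bound the UAV goes in u1.
RWTexture2D<uint> coverageMask : register(u1);

struct PSInput
{
    float4 position : SV_Position;
    float  alpha    : TEXCOORD0;
};

float4 main(PSInput input) : SV_Target
{
    // Discard pixels whose alpha is below 0.5.
    clip(input.alpha - 0.5f);

    // UAV write issued after the clip. For a clipped pixel this write
    // is expected to be dropped rather than land in the resource.
    coverageMask[uint2(input.position.xy)] = 1;

    return float4(input.alpha, input.alpha, input.alpha, 1.0f);
}
```

If the write must happen even for clipped pixels, moving it above the clip is the safer arrangement.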
The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough), but it's something you need to be aware of. AMD at least were still advising against it as recently as DX10, and of course earlier too: http://developer.amd.com/media/gpu_assets/ATI-DX9_Optimization.pdf and http://developer.amd...10_Hardware.pdf

If you're considering using it as a performance optimization (which I suspect may be the case from your OP) - don't. If you're using it because its behaviour is what you need - do.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.


The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. ...
It depends on the situation. It might break the hardware's z-buffer optimizations (not really early-Z), but if you only render transparent objects at the end of your pipeline, it might speed things up. There is no general "makes it slower" or "makes it faster"; this really needs to be profiled on a case-by-case basis (the case depends on what you're rendering, when you're rendering it, and on what hardware).
