db123

Will clip interrupt the pixel shader?

db123    222
[b]clip (DirectX HLSL)[/b]

Discards the current pixel if the specified value is less than zero.

clip([i]x[/i])

Use this function to simulate clipping planes if each component of the [i]x[/i] parameter represents the distance from a plane.
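As a rough illustration of the clipping-plane use described in the documentation excerpt, here is a minimal sketch. The constant buffer `ClipPlaneCB`, the `gClipPlane` variable, and the `PSInput` layout are hypothetical names chosen for this example, and the plane normal is assumed to be unit length so that the dot product is a true distance.

```hlsl
// Hypothetical constant buffer; name and layout are assumptions for this sketch.
cbuffer ClipPlaneCB : register(b0)
{
    float4 gClipPlane; // xyz = unit plane normal, w = plane offset (world space)
};

// Hypothetical interpolants passed from the vertex shader.
struct PSInput
{
    float4 position : SV_Position;
    float3 worldPos : TEXCOORD0;
    float4 color    : COLOR0;
};

float4 PSMain(PSInput input) : SV_Target
{
    // Signed distance from the plane; negative means the point lies behind it.
    float dist = dot(float4(input.worldPos, 1.0f), gClipPlane);

    // Discards the pixel when dist < 0; otherwise this has no effect.
    clip(dist);

    return input.color;
}
```

Passing a vector instead of a scalar tests several planes at once, since clip discards the pixel if any component is negative.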


==========================================
If x is less than zero, will the code that follows the clip call be interrupted?

Zoner    232
clip just prevents the pixel from being written to the render targets or depth buffer. The shader still runs to completion, though hardware typically avoids fetching textures 'for real' after the clip.

db123    222
[quote name='Zoner' timestamp='1330154030' post='4916442']
clip just prevents the pixel from being written to the render targets or depth buffer. The shader still runs to completion, though hardware typically avoids fetching textures 'for real' after the clip.
[/quote]
Thank you very much!

Crowley99    194
To be more precise, as long as there is one pixel in the SIMT block that wasn't clipped, execution will still continue (and the "clipped" pixel's operations will be mostly masked out). If they are all clipped execution should cease.

samoth
[quote name='Crowley99' timestamp='1330197463' post='4916554']To be more precise, as long as there is one pixel in the SIMT block that wasn't clipped, execution will still continue (and the "clipped" pixel's operations will be mostly masked out). If they are all clipped execution should cease.[/quote]
Execution [i]might[/i] in fact stop even if there are pixels left in the same block, but to the same net effect. Since only complete wavefronts are scheduled, the clipped pixel will of course still have to wait for the sync with the execution of the nearby pixel shaders.

However, some clever hardware might be intelligent enough (hypothetically?) to take the cores that do nothing offline in the meantime to conserve power. I don't know if any hardware does that, but in my opinion it would really make sense to implement the clip/kill instruction in such a way, seeing how power-hungry GPUs are and how much of a problem heat is (and since everyone wants to be "green").

Basically, the GPU would only need to use something like the x86 MONITOR/MWAIT instruction pair for this...

Crowley99    194
[quote name='samoth' timestamp='1330201741' post='4916562']
However, some clever hardware might be intelligent enough (hypothetically?) to take the cores that do nothing offline in the mean time to conserve power. I don't know if any hardware does that, but it would really make sense to implement the clip/kill instruction in such a way, in my opinion -- seeing how power hungry GPUs are and how much heat is a problem (and since everyone wants to be "green").
[/quote]

Interesting thought, but since GPUs rapidly swap in/out SIMT blocks in order to hide latency, I doubt it would make sense to power gate at that granularity.

Krypt0n    4721
It depends on the hardware. Older hardware executes the shader to the end but no longer performs any texture fetches or framebuffer writes. That way you can optimize particles: they won't cost you fill-rate bandwidth in areas that are completely transparent (in a common case that's ~30%-40%).


On newer hardware it's an early-out for the shader. Since that hardware schedules instruction by instruction anyway and has dedicated hardware for branching, a clip is nothing more than a return (branch out) for those shaders. It works at the granularity of the thread group size, which ranges from 4x4 to 8x8 fragments depending on the hardware.

After a successful clip/kill, the remaining instructions no longer write or read anything; in SM5 they could otherwise end up writing out of bounds.

mhagain    13430
The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough), but it's something you need to be aware of. AMD at least has advised against using it as recently as the DX10 era, and earlier as well: http://developer.amd.com/media/gpu_assets/ATI-DX9_Optimization.pdf and [url="http://developer.amd.com/media/gpu_assets/Ultimate_Graphics_Performance_for_DirectX_10_Hardware.pdf"]http://developer.amd...10_Hardware.pdf[/url]

If you're considering using it as a performance optimization (which I suspect may be the case from your OP) - don't. If you're using it because its behaviour is what you need - do.

Krypt0n    4721
[quote name='mhagain' timestamp='1330272211' post='4916733']
The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. ...[/quote]
It depends on the situation. It might break the hardware z-buffer optimization (not really early-z), but if you render only transparent objects at the end of your pipeline, it might speed things up. There is no general "makes it slower" or "makes it faster"; this really needs to be profiled case by case (the case depends on what you're rendering, when you're rendering it, and on what hardware).

MJP    19755
[quote name='Nik02' timestamp='1330256973' post='4916704']
What about a situation in which a PS5.0 shader writes (potentially conditionally) to a read/write resource after the clip instruction?
[/quote]

I haven't tried it, but I assume that the hardware would be forced to execute enough of the shader to successfully write to the UAV. This is because the spec says that discard should just indicate that the pixel should not be output to the render/depth target. If the hardware decides to early-out the shader as the result of a discard, that is purely an optimization and not part of the behavior specified for that instruction in the spec.

There's a similar situation with UAVs and early z/stencil testing. If you use a UAV, the hardware has to turn off early z, since the logical pipeline dictates that the z/stencil testing should actually happen [i]after[/i] the pixel shader and thus shouldn't prevent UAV writes from occurring. This is why there's an earlydepthstencil attribute, which you can use to force early z/stencil tests so that failing pixels don't perform UAV writes.
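A minimal sketch of the earlydepthstencil attribute mentioned above. The resource name `gCoverage` and the register slot are assumptions for illustration; in D3D11 pixel shaders UAV slots share numbering with render targets, so `u1` assumes a single render target in slot 0.

```hlsl
// Hypothetical UAV used only for this sketch.
RWTexture2D<uint> gCoverage : register(u1);

// Forces depth/stencil testing to run before the pixel shader executes.
[earlydepthstencil]
float4 PSMain(float4 position : SV_Position) : SV_Target
{
    // With the attribute above, pixels that fail the depth/stencil test
    // never reach this point, so they never perform this write.
    gCoverage[uint2(position.xy)] = 1u;

    return float4(1.0f, 1.0f, 1.0f, 1.0f);
}
```

Without the attribute, the same shader would be expected to run (and write to the UAV) for pixels that later fail the depth test, which is the logical-pipeline behavior described above.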

EDIT: So I tried writing to a RWTexture2D with a pixel shader that uses discard, and it turns out that the discard will prevent the UAV write on my 6970. Interestingly enough, if I put the write [i]before[/i] the discard then the write always goes through. I have no idea whether or not this is defined behavior.
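For anyone who wants to repeat the experiment, a shader along these lines could be used. This is a hypothetical reconstruction, not the shader actually used in the test above; the resource names, register slots, and the 0.5 threshold are all assumptions.

```hlsl
// Hypothetical resources; u1 assumes one render target bound at slot 0.
RWTexture2D<float4> gDebug   : register(u1);
Texture2D           gDiffuse : register(t0);
SamplerState        gSampler : register(s0);

float4 PSMain(float4 position : SV_Position,
              float2 uv       : TEXCOORD0) : SV_Target
{
    uint2  pixel = uint2(position.xy);
    float4 color = gDiffuse.Sample(gSampler, uv);

    // Write issued BEFORE the discard: mark the pixel red.
    gDebug[pixel] = float4(1.0f, 0.0f, 0.0f, 1.0f);

    if (color.a < 0.5f)
        discard;

    // Write issued AFTER the discard: mark the pixel green.
    gDebug[pixel] = float4(0.0f, 1.0f, 0.0f, 1.0f);

    return color;
}
```

On hardware that behaves as described in the edit above, discarded pixels would be expected to stay red in gDebug and surviving pixels to end up green. Other hardware or drivers may behave differently, so the result should be checked rather than assumed.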

Krohm    5030
[quote name='mhagain' timestamp='1330272211' post='4916733']
The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough) but it's something you need to be aware of. AMD at least have been advising not to use it during reasonably current times (DX10) and of course since earlier[/quote]
I had hoped this would go away with a hardware revision; this is pretty bad news to me. Considering D3D1x has no alpha test functionality, how am I supposed to emulate discard?

Crowley99    194
If these draw calls take up a large chunk of your frame, you should consider doing a z prepass. Early-Z will be off during the first pass, but this doesn't matter since you are not doing any shading, lighting, shadowing, etc. Then on your second pass you can disable z-writes, which will in most cases give you back early-Z.
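The shader side of such a z prepass might look roughly like this. It is a sketch under assumptions: the resource names, the 0.5 alpha threshold, and the entry point names are hypothetical, and the second pass is assumed to run with depth writes disabled and the depth comparison set to EQUAL on the application side, so that only pixels that survived the first pass get shaded.

```hlsl
// Hypothetical resources shared by both passes.
Texture2D    gDiffuse : register(t0);
SamplerState gSampler : register(s0);

struct PSInput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
};

// Pass 1: depth only (no render target bound). The only work is the
// texture lookup and the clip, so losing early-Z here is cheap.
void PSDepthOnly(PSInput input)
{
    float alpha = gDiffuse.Sample(gSampler, input.uv).a;
    clip(alpha - 0.5f);
}

// Pass 2: full shading without clip. Assumes depth writes are off and the
// depth test is EQUAL, so clipped pixels from pass 1 are rejected by depth.
float4 PSShade(PSInput input) : SV_Target
{
    float4 albedo = gDiffuse.Sample(gSampler, input.uv);

    // Placeholder for the expensive lighting and shadowing work.
    return float4(albedo.rgb, 1.0f);
}
```

Because the second shader contains no clip or discard and does not write depth, the hardware is in most cases free to reject hidden pixels before running it.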

Krypt0n    4721
[quote name='MJP' timestamp='1330308948' post='4916873']
So I tried writing to a RWTexture2D with a pixel shader that uses discard, and it turns out that the discard will prevent UAV write on my 6970. Interestingly enough, if I put the write [i]before[/i] the discard then the write always goes through. I have no idea whether or not this is defined behavior.
[/quote]
Like I already said :)
[quote]All instructions after a successful clip/kill are not writing or reading anymore, in SM5 it could otherwise mean you'd write out of bounds.[/quote]
It's just the normal behavior. discard doesn't say
[quote]the spec says that discard should just indicate that the pixel should not be output to the render/depth target[/quote]
It clearly says the pixel is discarded from the point you call clip/kill. Per-pixel early-z is nothing else than a clip on a z-compare ;)

MJP    19755
[quote name='Krypt0n' timestamp='1330338782' post='4916946']
it clearly says, the pixel is discarded from the point you call clip/kill.
[/quote]

Where does it say this? The [url="http://msdn.microsoft.com/en-us/library/windows/desktop/bb943995%28v=vs.85%29.aspx"]docs for the discard statement[/url] merely say: "Do not output the result of the current pixel"

EDIT: never mind, the docs for the [url="http://msdn.microsoft.com/en-us/library/windows/desktop/hh446968%28v=vs.85%29.aspx"]discard assembly instruction[/url] are pretty clear on this:

"This instruction flags the current pixel as terminated, while continuing execution, so that other pixels executing in parallel may obtain derivatives if necessary. Even though execution continues, all Pixel Shader output writes before or after the [b]discard[/b] instruction are discarded."

However it doesn't say anything about read operations being disabled.

mhagain    13430
[quote name='Krohm' timestamp='1330326792' post='4916910']
[quote name='mhagain' timestamp='1330272211' post='4916733']
The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough) but it's something you need to be aware of. AMD at least have been advising not to use it during reasonably current times (DX10) and of course since earlier[/quote]
I had hoped this would go away with a hardware revision; this is pretty bad news to me. Considering D3D1x has no alpha test functionality, how am I supposed to emulate discard?
[/quote]

If you have to do it, you have to do it. Alpha test ultimately has the same effect as clip, so you end up losing nothing that you wouldn't have lost anyway.
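A minimal sketch of emulating the old fixed-function alpha test with clip. The constant buffer `AlphaTestCB` and the `gAlphaRef` value are hypothetical names, and the comparison modelled here is "greater or equal"; other comparison functions would need a different expression.

```hlsl
// Hypothetical constant buffer holding the alpha reference value.
cbuffer AlphaTestCB : register(b0)
{
    float gAlphaRef; // e.g. 0.5
};

Texture2D    gDiffuse : register(t0);
SamplerState gSampler : register(s0);

float4 PSMain(float4 position : SV_Position,
              float2 uv       : TEXCOORD0) : SV_Target
{
    float4 color = gDiffuse.Sample(gSampler, uv);

    // clip discards when its argument is negative, i.e. when alpha < gAlphaRef.
    // Pixels with alpha >= gAlphaRef pass, like a GREATEREQUAL alpha test.
    clip(color.a - gAlphaRef);

    return color;
}
```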

db123    222
I want to use clip to draw some transparent objects.
