


Will clip interrupt the pixel shader?


#1 db123   Members   -  Reputation: 103


Posted 25 February 2012 - 12:26 AM

clip (DirectX HLSL)

Discards the current pixel if the specified value is less than zero. clip(x)

Use this function to simulate clipping planes if each component of the x parameter represents the distance from a plane.


==========================================
If x is less than zero, will the code that follows the clip call be interrupted?

#2 Zoner   Members   -  Reputation: 212


Posted 25 February 2012 - 01:13 AM

clip just prevents the pixel from being written to the render targets or depth buffer. The shader still runs to completion, though hardware typically avoids fetching textures 'for real' after the clip.
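
A minimal sketch of what that looks like in a shader, assuming a hypothetical constant `gClipPlane` and input struct `PSInput` (neither is from the docs quoted above). The code after `clip` is still perfectly legal; only the pixel's output is dropped when the value is negative.

```hlsl
// Hypothetical names: PerFrame, gClipPlane, PSInput, PSMain.
cbuffer PerFrame : register(b0)
{
    float4 gClipPlane; // assumed layout: xyz = plane normal, w = plane offset
};

struct PSInput
{
    float4 pos      : SV_Position;
    float3 worldPos : TEXCOORD0;
    float4 color    : COLOR0;
};

float4 PSMain(PSInput input) : SV_Target
{
    // Signed distance from the plane; negative means the pixel is behind it.
    float dist = dot(float4(input.worldPos, 1.0f), gClipPlane);

    // Has the same effect as: if (dist < 0.0f) discard;
    clip(dist);

    // For a clipped pixel this return value never reaches the render target.
    // Whether the instructions here are actually executed is up to the hardware.
    return input.color;
}
```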

#3 db123   Members   -  Reputation: 103


Posted 25 February 2012 - 02:20 AM

Zoner, on 25 February 2012 - 01:13 AM, said:

clip just prevents the pixel from being written to the render targets or depth buffer. The shader still runs to completion, though hardware typically avoids fetching textures 'for real' after the clip.
Thank you very much!

#4 Crowley99   Members   -  Reputation: 117


Posted 25 February 2012 - 01:17 PM

To be more precise, as long as there is one pixel in the SIMT block that wasn't clipped, execution will still continue (and the "clipped" pixel's operations will be mostly masked out). If they are all clipped execution should cease.

#5 samoth   Members   -  Reputation: 690


Posted 25 February 2012 - 02:29 PM

Crowley99, on 25 February 2012 - 01:17 PM, said:

To be more precise, as long as there is one pixel in the SIMT block that wasn't clipped, execution will still continue (and the "clipped" pixel's operations will be mostly masked out). If they are all clipped execution should cease.
Execution might in fact stop for the clipped pixel even if other pixels in the same block are still live, but the net effect is the same: since only complete wavefronts are scheduled, it still has to wait for the neighbouring pixels' shaders to finish.

However, some clever hardware might be intelligent enough (hypothetically?) to take the cores that do nothing offline in the meantime to conserve power. I don't know if any hardware does that, but in my opinion it would really make sense to implement the clip/kill instruction that way, seeing how power-hungry GPUs are and how much of a problem heat is (and since everyone wants to be "green").

Basically, the GPU would only need to use something like the x86 MONITOR/MWAIT instruction pair for this...

#6 Crowley99   Members   -  Reputation: 117


Posted 25 February 2012 - 07:37 PM

samoth, on 25 February 2012 - 02:29 PM, said:

However, some clever hardware might be intelligent enough (hypothetically?) to take the cores that do nothing offline in the mean time to conserve power. I don't know if any hardware does that, but it would really make sense to implement the clip/kill instruction in such a way, in my opinion -- seeing how power hungry GPUs are and how much heat is a problem (and since everyone wants to be "green").

Interesting thought, but since GPUs rapidly swap in/out SIMT blocks in order to hide latency, I doubt it would make sense to power gate at that granularity.

#7 Nik02   Members   -  Reputation: 1299


Posted 26 February 2012 - 05:49 AM

What about a situation in which a PS5.0 shader writes (potentially conditionally) to a read/write resource after the clip instruction?
Niko Suni
Software developer

#8 Krypt0n   Members   -  Reputation: 444


Posted 26 February 2012 - 06:59 AM

It depends on the hardware. Older hardware executes the shader to the end but no longer performs any texture fetches or framebuffer writes. That way you can optimize particles: areas that are completely transparent (commonly ~30%-40%) won't cost you fillrate bandwidth.


On newer hardware it's an early-out for the shader. Since they schedule instruction by instruction anyway and have special hardware that handles branching, a clip is nothing more than a return (branch out) for those shaders. It works at the granularity of the thread group size, which is from 4x4 to 8x8 fragments, depending on the hardware.

All instructions after a successful clip/kill are not writing or reading anymore, in SM5 it could otherwise mean you'd write out of bounds.

#9 mhagain   Members   -  Reputation: 701


Posted 26 February 2012 - 10:03 AM

The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough), but it's something you need to be aware of. AMD at least has been advising against using it as recently as the DX10 era, and of course earlier as well: http://developer.amd...ptimization.pdf and http://developer.amd...10_Hardware.pdf

If you're considering using it as a performance optimization (which I suspect may be the case from your OP), don't. If you're using it because its behaviour is what you need, do.

#10 Krypt0n   Members   -  Reputation: 444


Posted 26 February 2012 - 12:24 PM

mhagain, on 26 February 2012 - 10:03 AM, said:

The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. ...
It depends on the situation. It might break the hardware z-buffer optimization (not really early-z), but if you render just transparent objects at the end of your pipeline, it might speed things up. There is no general "makes it slower" or "makes it faster"; this really needs to be profiled case by case (the case depends on what you're rendering, when you're rendering it, and on what hardware).

#11 MJP   Moderators   -  Reputation: 2128


Posted 26 February 2012 - 08:15 PM

Nik02, on 26 February 2012 - 05:49 AM, said:

What about a situation in which a PS5.0 shader writes (potentially conditionally) to a read/write resource after the clip instruction?

I haven't tried it, but I assume that the hardware would be forced to execute enough of the shader to successfully write to the UAV. This is because the spec says that discard should just indicate that the pixel should not be output to the render/depth target. If the hardware decides to early-out the shader as the result of a discard, this is purely an optimization, and not part of the behavior specified by that instruction in the spec.

There's a similar situation with UAVs and early z/stencil testing. If you use a UAV, the hardware has to turn off early z, since the logical pipeline dictates that z/stencil testing actually happens after the pixel shader and thus shouldn't prevent UAV writes from occurring. This is why there's an earlydepthstencil attribute, which you can use to force early z/stencil tests so that they do prevent UAV writes.
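
As a rough sketch of how the attribute is applied (the resource name `gOverdraw`, the entry point name and the u1 slot are assumptions for illustration; in D3D11 pixel shader UAV slots are shared with render targets, so with one render target bound the first free slot would be u1):

```hlsl
// Hypothetical per-pixel counter; assumes an R32_UINT texture bound as a UAV.
RWTexture2D<uint> gOverdraw : register(u1);

// Forces depth/stencil testing to run before the shader, so fragments that
// fail the test never execute and never touch the UAV.
[earlydepthstencil]
float4 PSCountVisible(float4 pos : SV_Position) : SV_Target
{
    // Only reached by fragments that passed the early depth/stencil test.
    InterlockedAdd(gOverdraw[uint2(pos.xy)], 1);
    return float4(1.0f, 1.0f, 1.0f, 1.0f);
}
```

Without the attribute, the logical pipeline order applies and the write may happen even for fragments that later fail the depth test.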

EDIT: So I tried writing to a RWTexture2D with a pixel shader that uses discard, and it turns out that the discard will prevent UAV write on my 6970. Interestingly enough, if I put the write before the discard then the write always goes through. I have no idea whether or not this is defined behavior.
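
A shader along these lines reproduces that kind of test. This is a sketch, not the exact shader used; the names `gDebug`, `gTex`, `gSamp` and the 0.5 threshold are assumptions. The comments only record what was observed on the 6970 mentioned above, which may not hold on other hardware.

```hlsl
// Hypothetical resources for the experiment.
RWTexture2D<float4> gDebug : register(u1);
Texture2D           gTex   : register(t0);
SamplerState        gSamp  : register(s0);

float4 PSMain(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    float4 c = gTex.Sample(gSamp, uv);
    uint2 coord = uint2(pos.xy);

    // Write placed before the discard: observed to always go through.
    gDebug[coord] = float4(1.0f, 0.0f, 0.0f, 1.0f);

    if (c.a < 0.5f)
        discard;

    // Write placed after the discard: observed to be suppressed for
    // discarded pixels, so they keep the red value written above.
    gDebug[coord] = float4(0.0f, 1.0f, 0.0f, 1.0f);

    return c;
}
```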

#12 Nik02   Members   -  Reputation: 1299


Posted 27 February 2012 - 12:02 AM

That was my point :)
Niko Suni
Software developer

#13 Krohm   Members   -  Reputation: 560


Posted 27 February 2012 - 01:13 AM

mhagain, on 26 February 2012 - 10:03 AM, said:

The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough) but it's something you need to be aware of. AMD at least have been advising not to use it during reasonably current times (DX10) and of course since earlier
I hoped this would go away with a hardware revision; this is pretty bad news to me. Considering D3D1x has no alpha test functionality, how do I emulate discard?

#14 Crowley99   Members   -  Reputation: 117


Posted 27 February 2012 - 02:11 AM

If these draw calls take up a large chunk of your frame, you should consider doing a z prepass. EarlyZ will be off on the first pass, but this doesn't matter since you are not doing any shading, lighting, shadowing, etc. Then on your second pass you can disable z-writes, which will in most cases give you back earlyZ.
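
Roughly, the two pixel shaders could look like the sketch below. All names (`gDiffuse`, `gSampler`, `gAlphaRef`, `PSDepthPrepass`, `PSShade`) are made up for illustration, and the depth state is set on the API side, not in the shader. The sketch assumes the second pass uses an EQUAL depth comparison with z-writes off, which is an assumption on top of the description above; under that assumption the holes cut in the prepass fail the depth test and the second pass does not need clip at all.

```hlsl
Texture2D    gDiffuse : register(t0);
SamplerState gSampler : register(s0);

cbuffer Material : register(b0)
{
    float gAlphaRef; // hypothetical alpha-test threshold, e.g. 0.5
};

struct PSInput
{
    float4 pos : SV_Position;
    float2 uv  : TEXCOORD0;
};

// Pass 1: depth only, no render target output. The clip here costs early-Z,
// but the shader does almost no work.
void PSDepthPrepass(PSInput input)
{
    float alpha = gDiffuse.Sample(gSampler, input.uv).a;
    clip(alpha - gAlphaRef); // drop the pixel when alpha < gAlphaRef
}

// Pass 2: full shading with z-writes disabled (depth func EQUAL assumed).
// No clip/discard here, so early-Z can stay enabled.
float4 PSShade(PSInput input) : SV_Target
{
    float4 albedo = gDiffuse.Sample(gSampler, input.uv);
    // Expensive lighting and shadowing would go here.
    return albedo;
}
```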

#15 Krypt0n   Members   -  Reputation: 444


Posted 27 February 2012 - 04:33 AM

MJP, on 26 February 2012 - 08:15 PM, said:

So I tried writing to a RWTexture2D with a pixel shader that uses discard, and it turns out that the discard will prevent UAV write on my 6970. Interestingly enough, if I put the write before the discard then the write always goes through. I have no idea whether or not this is defined behavior.
Like I already said :) "All instructions after a successful clip/kill are not writing or reading anymore, in SM5 it could otherwise mean you'd write out of bounds." It's just the normal behavior. The spec doesn't say that "discard should just indicate that the pixel should not be output to the render/depth target"; it clearly says the pixel is discarded from the point you call clip/kill. Per-pixel early-z is nothing else than a clip on a z-compare ;)

#16 MJP   Moderators   -  Reputation: 2128


Posted 27 February 2012 - 04:25 PM

Krypt0n, on 27 February 2012 - 04:33 AM, said:

it clearly says, the pixel is discarded from the point you call clip/kill.

Where does it say this? The docs for the discard statement merely say: "Do not output the result of the current pixel"

EDIT: never mind, the docs for the discard assembly instruction are pretty clear on this:

"This instruction flags the current pixel as terminated, while continuing execution, so that other pixels executing in parallel may obtain derivatives if necessary. Even though execution continues, all Pixel Shader output writes before or after the discard instruction are discarded."

However it doesn't say anything about read operations being disabled.

#17 mhagain   Members   -  Reputation: 701


Posted 27 February 2012 - 07:56 PM

Krohm, on 27 February 2012 - 01:13 AM, said:

mhagain, on 26 February 2012 - 10:03 AM, said:

The dangerous thing about clip is that it may also interfere with your hardware's early-Z rejection, which may kill your performance. This may of course not be an issue (you may already be running fast enough) but it's something you need to be aware of. AMD at least have been advising not to use it during reasonably current times (DX10) and of course since earlier
I hoped this would go away with a hardware revision; this is pretty bad news to me. Considering D3D1x has no alpha test functionality, how do I emulate discard?

If you have to do it, you have to do it. Alpha test ultimately has the same effect as clip, so you end up losing nothing that you wouldn't have lost anyway.

#18 db123   Members   -  Reputation: 103


Posted 27 February 2012 - 09:10 PM

I want to use clip to draw some transparent objects.





