BearishSun

Disabling color writes performance?

This topic is 2641 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I'm wondering: if I disable color writes, does the pixel/fragment shader still run normally? If it differs from card to card, please mention that as well, but I'm primarily interested in DirectX 9 hardware :)


I'm wondering: if I disable color writes, does the pixel/fragment shader still run normally? If it differs from card to card, please mention that as well, but I'm primarily interested in DirectX 9 hardware :)


I'm actually not sure whether drivers always skip the fragment shader when color writes are disabled... I always set the shader to NULL myself whenever I'm rendering shadows or something like that. It seems like drivers could do that pretty easily, as long as they detect whether the shader uses discard or writes depth (in either case the shader can still change what ends up in the depth buffer, so it can't be skipped).

Either way you should still get benefits from disabling color writes on any Nvidia DX9 or higher GPU, as well as on any ATI DX10 or higher GPU.
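For reference, the setup described above (color writes masked off, pixel shader set to NULL yourself rather than trusting the driver) might look roughly like the sketch below. Everything named here is an assumption for illustration: `RecordingDevice`, `RenderState`, `beginDepthOnlyPass` and `endDepthOnlyPass` are made-up stand-ins so the snippet compiles without the DirectX SDK. On a real `IDirect3DDevice9` the corresponding calls are `SetRenderState(D3DRS_COLORWRITEENABLE, 0)` and `SetPixelShader(NULL)`.

```cpp
#include <map>

// Hypothetical render-state names mirroring D3DRS_ZENABLE,
// D3DRS_ZWRITEENABLE and D3DRS_COLORWRITEENABLE.
enum class RenderState { ZEnable, ZWriteEnable, ColorWriteEnable };

// R|G|B|A, the same bit layout as the D3DCOLORWRITEENABLE_* flags.
const unsigned kColorWriteAll = 0xF;

// Hypothetical stand-in for the device: it only records what was set.
struct RecordingDevice {
    std::map<RenderState, unsigned> states;
    const void* pixelShader = nullptr;

    void SetRenderState(RenderState state, unsigned value) { states[state] = value; }
    void SetPixelShader(const void* shader) { pixelShader = shader; }
};

// Configure a depth-only pass (shadow map, Z pre-pass, ...).
template <class Device>
void beginDepthOnlyPass(Device& dev) {
    dev.SetRenderState(RenderState::ZEnable, 1);
    dev.SetRenderState(RenderState::ZWriteEnable, 1);
    dev.SetRenderState(RenderState::ColorWriteEnable, 0);  // mask off all channels
    dev.SetPixelShader(nullptr);  // unbind explicitly instead of relying on the driver
}

// Restore color writes and the shader used by the normal color pass.
template <class Device>
void endDepthOnlyPass(Device& dev, const void* colorPassShader) {
    dev.SetRenderState(RenderState::ColorWriteEnable, kColorWriteAll);
    dev.SetPixelShader(colorPassShader);
}
```

Unbinding the shader only makes sense if it neither discards pixels nor outputs depth; an alpha-tested shadow caster would still need its pixel shader bound.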

I would think it does run since there is no other way to get the pixel Z value to the depth buffer after rasterization. Typically I just have a "return float4(1,1,1,1);" as my depth-only pixel shader to keep things simple. Early-Z will keep some fragments from running if the depth test fails, but in general I would say the pixel shader executes. I have no documentation to back this up, though.


I would think it does run since there is no other way to get the pixel Z value to the depth buffer after rasterization.


Unless your fragment shader explicitly outputs depth, the depth value is just the interpolated triangle depth from rasterization (which is how early-z is able to work).
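To make that concrete, here is a toy software rasterizer that fills a depth buffer purely from interpolated vertex depths, with no per-pixel shading function anywhere. It is only a sketch under simplifying assumptions (screen-space vertices, a "less" depth test, no top-left fill rule, no hierarchical Z), and all names in it are made up; real hardware does this in fixed-function units.

```cpp
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };  // screen-space position, z in [0, 1]

// 2D cross product of (b - a) and (p - a); the basis of barycentric weights.
static float edge(const Vertex& a, const Vertex& b, float px, float py) {
    return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

struct DepthBuffer {
    int width, height;
    std::vector<float> depth;  // cleared to the far plane (1.0)
    DepthBuffer(int w, int h)
        : width(w), height(h), depth(static_cast<std::size_t>(w) * h, 1.0f) {}
    float& at(int x, int y) { return depth[static_cast<std::size_t>(y) * width + x]; }
};

// Rasterizes one triangle depth-only and returns how many depth values were written.
int rasterizeDepthOnly(DepthBuffer& db, const Vertex& v0, const Vertex& v1, const Vertex& v2) {
    const float area = edge(v0, v1, v2.x, v2.y);
    if (area == 0.0f) return 0;  // degenerate triangle

    int written = 0;
    for (int y = 0; y < db.height; ++y) {
        for (int x = 0; x < db.width; ++x) {
            const float px = x + 0.5f, py = y + 0.5f;  // sample at the pixel center
            const float w0 = edge(v1, v2, px, py) / area;
            const float w1 = edge(v2, v0, px, py) / area;
            const float w2 = edge(v0, v1, px, py) / area;
            if (w0 < 0.0f || w1 < 0.0f || w2 < 0.0f) continue;  // outside the triangle

            // Depth comes straight from interpolation; no shader is involved.
            const float z = w0 * v0.z + w1 * v1.z + w2 * v2.z;
            float& stored = db.at(x, y);
            if (z < stored) {  // depth test, done before any shading could happen
                stored = z;
                ++written;
            }
        }
    }
    return written;
}
```

Because `z` is known before any shading, the depth test can reject a pixel up front, which is the whole idea behind early-z.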


I'm wondering: if I disable color writes, does the pixel/fragment shader still run normally? If it differs from card to card, please mention that as well, but I'm primarily interested in DirectX 9 hardware :)

No, it does not run. Some GPUs even have a "double Z" mode in which they can output twice as many Z samples per cycle, so this saves more than just pixel shader work.

Does double-z mode still exist?
It made sense for the fixed-function pipeline (FFP), but with today's "unified cores" approach, it seems to me the scheduler would just dispatch as fast as it could...


Does double-z mode still exist?
It made sense for the fixed-function pipeline (FFP), but with today's "unified cores" approach, it seems to me the scheduler would just dispatch as fast as it could...

Double Z is a ROP mode; it's barely related to the unified cores or the thread scheduler.

ROPs are usually sized to fit the memory bandwidth, e.g. both can sustain about 20 GPixel/s plus 20 GZexel/s. Doubling some units (in this case the ROPs) would not speed anything up; in the usual rendering cases you'd just hit the bandwidth limit of 20 GPixel/s + 20 GZexel/s.

BUT if you switch to a different workload, in this case Z-only, you obviously won't write 20 GZexel/s and 20 GPixel/s. You'll occupy only 50% of the available memory bandwidth, so it would make sense to speed that up. At the same time, the part of the ROPs that handles color writes would sit idle, so again it makes sense to reconfigure it for some Z work.

This is especially true if you think about the general workload in this case. Usually you are quite fragment limited, but in a Z-only pass all unified shaders handle vertex transformation, which outputs nothing more than a position and is quite cheap (in the worst case you'll do some skinning, but mostly it's just a matrix x vector multiply). All you're limited by is quite probably the fillrate, so doubling it, maybe by adding some special "blend mode" to the color part of the ROPs, is quite a low-hanging fruit.


I don't claim all hardware does it, but I know even some mobile hardware works that way.
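The arithmetic behind that argument can be written out as a tiny model. It is only a back-of-the-envelope sketch: the 20 + 20 figures are the example numbers from this post, and the assumption that the idle color half of a ROP can process Z at the same rate as its depth half is mine, not something documented for any particular GPU. The names `RopConfig`, `bandwidthUtilization` and `depthFillRate` are made up.

```cpp
// Example ROP throughput, balanced against memory bandwidth.
struct RopConfig {
    double colorRate;  // GPixel/s of color writes
    double depthRate;  // GZexel/s of depth writes
};

// Fraction of the balanced color + depth throughput that a pass actually uses.
double bandwidthUtilization(const RopConfig& rop, bool colorWrites, bool doubleZ) {
    const double peak = rop.colorRate + rop.depthRate;
    double used = rop.depthRate;
    if (colorWrites || doubleZ) {
        // Either real color traffic, or the color half repurposed for Z work.
        used += rop.colorRate;
    }
    return used / peak;
}

// Depth samples per second the ROPs can retire in the given mode.
double depthFillRate(const RopConfig& rop, bool colorWrites, bool doubleZ) {
    if (!colorWrites && doubleZ) {
        return rop.depthRate + rop.colorRate;  // assumed: color half does Z at full rate
    }
    return rop.depthRate;
}
```

With `RopConfig{20.0, 20.0}`, a plain Z-only pass uses half of the balanced throughput, while the double-Z case brings utilization back to 100% and doubles the depth fill rate.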
