Shader branching ruins performance

I think Lutz is on to something - my SSAO loop runs a very similar type of calculation (two nested for loops with some arithmetic in the center), and even on an 8600M it runs significantly faster than 30 fps for a similar number of iterations. I've also used dynamic branching with for loops in a parallax occlusion calculation and there weren't any big problems with performance... Have you tried his suggested test out yet?

Quote:
Original post by Jason Z
I think Lutz is on to something - my SSAO loop runs a very similar type of calculation (two nested for loops with some arithmetic in the center), and even on an 8600M it runs significantly faster than 30 fps for a similar number of iterations. I've also used dynamic branching with for loops in a parallax occlusion calculation and there weren't any big problems with performance... Have you tried his suggested test out yet?


Not yet, I'm busy with other stuff right now, but I'll try it very soon.
Thanks for all your help, guys.

As already mentioned, Nvidia cards can't do register indexing. IIRC they use scratch memory to do it, which explains all those movs.

Quote:
Original post by Lutz
I assume that rays and spheres are constant registers? Then that's the problem! I had the same issue with either a 7800 or a 9800, don't remember, but this hardware does not support constant register indexing in a pixel shader! At least not in an efficient way.


You were absolutely right, Lutz. It was the array indexing that was causing the major slowdown. I replaced the arrays with textures holding the same information and the array indexing with texture reads, and now I get 30 FPS.
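
For readers who want to picture the change, here is a minimal sketch of replacing an indexed array with texture reads. It is not the original shader, which the thread does not show: the names `sphereTex`, `fetchSphere` and `NUM_SPHERES`, the texture layout and the placeholder arithmetic are all assumptions made for illustration.

```glsl
#version 120

// Hypothetical names and sizes; the thread does not show the original shader.
const int NUM_SPHERES = 4;

// Assumed layout: a NUM_SPHERES x 1 floating-point texture, one sphere
// center per texel in .xyz, sampled with nearest filtering.
uniform sampler2D sphereTex;

vec3 fetchSphere(int i)
{
    // Address the center of texel i so the lookup returns that texel exactly.
    float u = (float(i) + 0.5) / float(NUM_SPHERES);
    return texture2D(sphereTex, vec2(u, 0.5)).xyz;
}

void main()
{
    vec3 sum = vec3(0.0);
    for (int i = 0; i < NUM_SPHERES; ++i) {
        sum += fetchSphere(i);   // texture read replaces spheres[i]
    }
    // Placeholder output; a real shader would run its ray/sphere test here.
    gl_FragColor = vec4(sum / float(NUM_SPHERES), 1.0);
}
```

The cost of this approach is that the application has to pack the data into a texture first, and the shader has to unpack it again if the values do not fit the texture format directly.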

But what scares me the most is that I already had several shaders making heavy use of array indexing for the SSAO and soft shadowing effects. Although their performance wasn't bad, I wonder how much I will increase the frame rate of my application just by removing the array indexing.

Hmmm, I overcomplicated the solution. Instead of using textures I'm now using a uniform array of vec3's, and it also works fine. It also saves me the trouble of encoding/decoding the needed information in textures.
What is really surprising is that the GPU accepts the uniform array as a fast input method, but doesn't handle a simple constant array declared in the shader the same way.
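
The difference between the two declarations might look roughly like the sketch below. Again this is illustrative only: the array name `spheres`, the count `NUM_SPHERES`, the sample values and the loop body are assumptions, not code from the thread, and how each variant performs depends on the hardware and driver.

```glsl
#version 120

// Hypothetical sphere count; the thread does not give the real value.
const int NUM_SPHERES = 4;

// Variant A (commented out): constant array declared in the shader.
// Per the thread, indexing this in a loop was slow on the poster's hardware.
// const vec3 spheres[NUM_SPHERES] = vec3[NUM_SPHERES](
//     vec3( 0.0, 0.0, -5.0), vec3(2.0, 0.0, -6.0),
//     vec3(-2.0, 0.0, -6.0), vec3(0.0, 2.0, -7.0));

// Variant B: the same data supplied by the application as a uniform array.
uniform vec3 spheres[NUM_SPHERES];

void main()
{
    vec3 sum = vec3(0.0);
    for (int i = 0; i < NUM_SPHERES; ++i) {
        sum += spheres[i];   // indexed read from the uniform array
    }
    // Placeholder output; a real shader would run its ray/sphere test here.
    gl_FragColor = vec4(sum / float(NUM_SPHERES), 1.0);
}
```

On the application side, a uniform array like this would typically be filled with `glUniform3fv`, passing the location returned by `glGetUniformLocation` for `spheres` and the number of vec3 elements as the count.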
