coderchris

Question about dynamic branching efficiency


So, I'm thinking about switching my shader system to an "ubershader" approach, and I'm curious about how efficient dynamic branching is on modern cards. I know, I know, profile :) (I will profile, but I don't actually have this implemented yet and want to do some research before I take the plunge.)

I had heard that it is particularly bad on NVIDIA cards (especially pre-G80), and that ATI has always had pretty efficient dynamic branching. From what I understand, dynamic branching is so bad on some cards because they process pixels in blocks; on early GeForce cards the block size was huge, and if any pixels in a block took different paths, the whole block had to evaluate both paths.

For what I want to do (ubershader), the dynamic branching will happen based on some shader parameters; however, they will be uniform for each object (so object A will have the same parameters for each pixel of A drawn). In my case, I will be passing lighting parameters, shadow maps, etc. Does this mean that I won't have the problem I mentioned above, since all pixels for that object will take the exact same path?

Also, how long do the actual branching instructions take relative to, say, a texture fetch?

Here's a quick pseudocode example of the kind of ubershader I'm talking about. Do you guys think this type of thing is viable (and by viable, I mean really fast) on modern cards? I am targeting shader model 4 cards, but I'm curious about dynamic branching performance on SM 3 cards as well.
int numLights;
int lightType[8];
bool hasShadow[8];
sampler shadowMaps[8];

pixelShader(...) {
    for (int i = 0; i < numLights; i++) {
        if (lightType[i] == DIRECTIONAL) {
            if (hasShadow[i]) {
                // shadow stuff
            }
            // other complicated stuff
        }
        else if (lightType[i] == SPOT) {
            // some complicated stuff like above
        }
        // else if (...) and so on
    }

    // more stuff not involving branching
}
Thanks, Chris

Quote:
Original post by coderchris
For what I want to do (ubershader), the dynamic branching will happen based on some shader parameters; however, they will be uniform for each object (so object A will have the same parameters for each pixel of A drawn). In my case, I will be passing lighting parameters, shadow maps, etc.
Does this mean that I won't have the problem I mentioned above, since all pixels for that object will take the exact same path?

Also, how long do the actual branching instructions take relative to, say, a texture fetch?

What you intend to do is not dynamic branching but rather static branching. It's practically free because the graphics driver will extract an optimized shader for each of your objects.

For true dynamic branching, the instruction itself can take anywhere from zero to about four clock cycles, depending on the hardware and the specific instruction.

A graphics processor is a massively parallel SIMD machine. As such, a branch that goes the same way for every pixel in a batch executes only the taken path, while a branch that diverges within a batch requires the processor to evaluate both paths.
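
To make the distinction concrete, here is a minimal HLSL sketch, assuming the ps_3_0 profile. The names `g_useSpot`, `maskMap`, `ShadeA` and `ShadeB` are made up for illustration and do not come from the posts above. The first `if` tests a value the application sets once per object, so every pixel agrees on it. The second tests a value read from a texture, so pixels in the same batch can disagree. Note that the compiler is free to flatten small branches like these, and `[branch]` is only a hint understood by newer HLSL compilers.

```hlsl
// Illustrative sketch only; all identifiers are hypothetical.
bool      g_useSpot;   // shader constant, set once per object by the application
sampler2D maskMap;     // assumed texture whose red channel varies from pixel to pixel

float3 ShadeA(float3 n) { return saturate(n.z) * float3(1.0, 0.9, 0.8); }
float3 ShadeB(float3 n) { return saturate(n.y) * float3(0.2, 0.3, 1.0); }

float4 main(float2 uv : TEXCOORD0, float3 n : TEXCOORD1) : COLOR
{
    float3 c = 0;

    // Uniform condition: identical for every pixel of the draw call,
    // so only one side ever needs to run.
    if (g_useSpot) c += ShadeA(n);
    else           c += ShadeB(n);

    // Per-pixel condition: the fetch happens outside the branch,
    // then pixels in one batch may take different sides.
    float mask = tex2D(maskMap, uv).r;
    [branch] if (mask > 0.5) c += ShadeA(n);
    else                     c += ShadeB(n);

    return float4(c, 1);
}
```

How much the second branch costs depends on how coherent `mask` is across the screen: large regions with the same value behave almost like the uniform case.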

The critical thing to remember is that when sampling a texture inside a dynamic branch you must specify the mipmap level explicitly, using tex2Dgrad or tex2Dlod for example. An ordinary tex2D needs screen-space derivatives computed from neighboring pixels, which may not have taken the same branch, so otherwise the branch can't be dynamic. Correctly used, dynamic branching can save a lot of cycles.
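
A minimal sketch of that pattern, assuming ps_3_0; `detailMap` and `fade` are hypothetical names chosen for the example. The derivatives are taken before the branch and handed to `tex2Dgrad`, with a `tex2Dlod` alternative shown in a comment.

```hlsl
// Illustrative sketch only; identifiers are hypothetical.
sampler2D detailMap;

float4 main(float2 uv : TEXCOORD0, float fade : TEXCOORD1) : COLOR
{
    // Compute the derivatives outside the branch, where every pixel is active.
    float2 dx = ddx(uv);
    float2 dy = ddy(uv);

    float4 c = float4(0, 0, 0, 1);

    [branch] if (fade > 0.0)
    {
        // Explicit gradients: the fetch no longer depends on neighboring
        // pixels having entered the branch.
        c = tex2Dgrad(detailMap, uv, dx, dy);

        // Alternative: choose the mip level yourself (w component = LOD).
        // c = tex2Dlod(detailMap, float4(uv, 0, 0));
    }

    return c;
}
```

`tex2Dgrad` keeps normal mip selection and filtering, while `tex2Dlod` is cheaper to set up but leaves the choice of level to you.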

Quote:

What you intend to do is not dynamic branching but rather static branching. It's practically free because the graphics driver will extract an optimized shader for each of your objects.


Static branching? I thought that was used when you're passing in uniform variables, like the kind of thing you would have different techniques for. The variables I'm talking about are actual shader constants set per object, so that I can have just one technique in my FX file. Is the compiler smart enough to actually compile the hundreds of variations of shader parameters and then choose the right one based on the values I set?

Quote:

A graphics processor is a massive SIMD machine. As such, any uniform branching evaluates only the used branch, and any non-uniform branching requires the processor to evaluate both branches.


OK, so assuming it is in fact dynamic branching I'm looking at here, it should be safe to assume that only one of the many branches will actually be taken, since the condition is uniform.

What you're talking about is static branching, which typically compiles down to static branch instructions in the D3D asm-level shaders. The driver may implement that behind the scenes by creating multiple cached copies of hardware-level shaders, one for each combination you use, or it may support it with actual static branching in hardware, which is generally 'free'. I put 'free' in quotes because static branching and an uber-shader approach do have costs: the compiler may have a harder time optimizing a shader that uses static branching, and the shader may end up using more general-purpose registers (GPRs) than if you compiled several unique shaders, which can lead to performance problems.

Dynamic branching is where you branch on values that may vary per pixel or per vertex, and it can be expensive.

Using multiple techniques, each with a different shader compiled by passing literal values for uniform parameters, isn't branching at all. It's just a way of using the FX system to generate a bunch of different hard-coded shaders without too much code duplication. If you look at the generated asm shader output you won't see any branch instructions with that approach. From the API/hardware point of view there's no branching going on at all; only the HLSL compiler sees the branches.
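
A rough sketch of the two approaches side by side, assuming D3D9 effect (FX) syntax and ps_3_0. All names (`g_hasShadow`, `shadowMap`, `Shadow`, the technique names) are invented for the example, and vertex shaders are left out for brevity, so a real effect would need them added. `PS_Specialized` is compiled once per literal value, so the `if` disappears at compile time; `PS_Uber` keeps a single shader and branches on a bool constant the application sets per object.

```hlsl
// Illustrative sketch only; identifiers are hypothetical.
bool      g_hasShadow;   // shader constant set by the application per object
sampler2D shadowMap;

float Shadow(float2 uv)
{
    // Explicit LOD so the fetch is safe inside any kind of branch.
    return tex2Dlod(shadowMap, float4(uv, 0, 0)).r;
}

// Compile-time specialization: 'useShadow' is a literal at each compile
// site below, so the HLSL compiler removes the 'if' entirely.
float4 PS_Specialized(float2 uv : TEXCOORD0, uniform bool useShadow) : COLOR
{
    float lit = 1;
    if (useShadow) lit *= Shadow(uv);
    return float4(lit, lit, lit, 1);
}

// Static branch: the condition is a bool constant, identical for every
// pixel of a draw call, and can survive as a branch in the asm.
float4 PS_Uber(float2 uv : TEXCOORD0) : COLOR
{
    float lit = 1;
    if (g_hasShadow) lit *= Shadow(uv);
    return float4(lit, lit, lit, 1);
}

technique Shadowed   { pass P0 { PixelShader = compile ps_3_0 PS_Specialized(true);  } }
technique Unshadowed { pass P0 { PixelShader = compile ps_3_0 PS_Specialized(false); } }
technique Uber       { pass P0 { PixelShader = compile ps_3_0 PS_Uber(); } }
```

Comparing the asm that the compiler emits for the three techniques is a quick way to check which kind of branch, if any, actually ended up in each shader.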

Quote:
Original post by coderchris
Static branching? I thought that was used when you're passing in uniform variables,
Quote:
Original post by coderchris
dynamic branching will happen based on some shader parameters, however, they will be uniform for each different object
