cephalo

branch and flatten attributes in SM 5


This will stop the compiler's "magic" tricks. The compiler tries to avoid branching, but sometimes you know that a real branch is needed and the compiler's choice is wrong.
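For reference, here is a minimal sketch of where the two attributes go in HLSL. The function names are made up for illustration. Both functions return the same value; the attribute only changes what kind of code the compiler is asked to generate.

```hlsl
// Illustrative helpers; the names are not from this thread.

float ShadeWithBranch(float x, bool lit)
{
    float result;
    // [branch] asks for a real dynamic jump: only the chosen side is executed.
    [branch]
    if (lit)
        result = pow(saturate(x), 8.0f);
    else
        result = x * 0.1f;
    return result;
}

float ShadeFlattened(float x, bool lit)
{
    float result;
    // [flatten] asks for both sides to be evaluated and one result selected.
    [flatten]
    if (lit)
        result = pow(saturate(x), 8.0f);
    else
        result = x * 0.1f;
    return result;
}
```

With no attribute at all, the compiler picks one of these two forms on its own.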

Edited by imoogiBG


Why does the compiler try to avoid that in SM 5? Without knowing how it works internally, it would seem that branching would always require less code being executed. Why would you want to flatten it?


Because the same code is executed at the same time for multiple instances of the program (single program, multiple data).

GPUs run many instances of your shader program at once on a shared instruction stream, which saves instruction cache.

This brings a limitation: all instances that are executed together must process the same instruction at the same time. If some of them pick the TRUE path of the branch and others pick the FALSE path, then both paths of the branch are executed for that whole group of instances.

In other cases some threads may *stall* because the path they picked is *shorter* than the others.

The compiler will try to flatten your branch to avoid stalls.

That is why we try to avoid branches as much as possible. There are other reasons to avoid branching, but this is the main one.
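To make the divergence point concrete, here is a rough sketch of two compute shader entry points that do the same work but split the threads differently. It assumes the hardware executes consecutive thread IDs together in groups of at most 64 (the group size varies by GPU), and the buffer and function names are invented for the example.

```hlsl
// Hypothetical output buffer; the name is illustrative.
RWStructuredBuffer<float> gOut : register(u0);

// Two stand-in "expensive" paths of equal length.
float PathA(float x)
{
    [loop]
    for (int i = 0; i < 64; ++i)
        x = x * 1.0001f + 0.5f;
    return x;
}

float PathB(float x)
{
    [loop]
    for (int i = 0; i < 64; ++i)
        x = x * 0.9999f - 0.5f;
    return x;
}

[numthreads(256, 1, 1)]
void CSDivergent(uint3 id : SV_DispatchThreadID)
{
    float x = (float)id.x;
    // Neighbouring threads alternate between the two paths, so every group
    // that runs in lockstep contains both outcomes and walks through both.
    [branch]
    if ((id.x & 1u) == 0u)
        x = PathA(x);
    else
        x = PathB(x);
    gOut[id.x] = x;
}

[numthreads(256, 1, 1)]
void CSCoherent(uint3 id : SV_DispatchThreadID)
{
    float x = (float)id.x;
    // Runs of 64 consecutive threads agree on the condition, so under the
    // assumption above each lockstep group follows only one path.
    [branch]
    if (((id.x / 64u) & 1u) == 0u)
        x = PathA(x);
    else
        x = PathB(x);
    gOut[id.x] = x;
}
```

Both entry points execute each path for half of the threads overall; the difference is only in how those threads are distributed across the groups that run together.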


So the more parallelism that is going on, the less there is to be gained by true branching, or when you say 'stall', do you mean that it's actually slower than simply executing all branches?


So the more parallelism that is going on, the less there is to be gained by true branching, or when you say 'stall', do you mean that it's actually slower than simply executing all branches?

No, I don't mean that. By stalling I mean that some of the programs won't do anything.

If you want a good explanation, look at the CUDA tutorials by NVIDIA or the OpenCL ones by ATI.


The compiler certainly won't always try to avoid branching, since always doing that would be bad. Whether or not a dynamic branch instruction is a good idea depends on what's in the branch, what's in the "else" (if there is one), and whether the branch will be coherent among neighboring threads. The compiler uses heuristics that evaluate the first two criteria and attempts to make an intelligent decision about whether to use a branch or flatten it. However, it can't know about the third criterion (coherency), since that largely depends on the data fed into your shader, so it can't factor that into its decision. So the compiler gives you the "branch" and "flatten" attributes, which let you override its heuristics in case you have additional knowledge that leads you to believe one will be faster than the other.
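As an illustration of that kind of additional knowledge, here is a hedged sketch of a pixel shader with two conditions. The resource names, the constant buffer layout, and the fog math are all assumptions made up for the example. The first condition comes from a constant buffer, so every pixel in the draw agrees on it; the second comes from a texture and may differ from pixel to pixel.

```hlsl
// Hypothetical resources; none of these names come from the thread.
Texture2D    gAlbedo  : register(t0);
Texture2D    gMask    : register(t1);
SamplerState gSampler : register(s0);

cbuffer PerDraw : register(b0)
{
    uint   gFogEnabled;   // identical for every pixel of the draw call
    float3 gFogColor;
};

float4 PSMain(float4 pos : SV_Position, float2 uv : TEXCOORD0) : SV_Target
{
    // Both texture samples happen outside the conditionals, so no
    // derivative-based sampling sits inside divergent control flow.
    float4 color = gAlbedo.Sample(gSampler, uv);
    float  mask  = gMask.Sample(gSampler, uv).r;

    // Coherent condition: all threads take the same side, so a real
    // branch lets the whole group skip the fog work when it is off.
    [branch]
    if (gFogEnabled != 0)
    {
        float fog = saturate(pos.z * pos.z);   // stand-in for heavier math
        color.rgb = lerp(color.rgb, gFogColor, fog);
    }

    // Possibly incoherent condition with a tiny body: evaluating it for
    // everyone and selecting the result is likely cheaper than jumping.
    [flatten]
    if (mask > 0.5f)
        color.rgb *= 0.25f;

    return color;
}
```

Which attribute actually wins still depends on the hardware and the data, so treat the choices above as a starting guess to be measured.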


...and you can still profile it to see if your beliefs were correct. Also, I've run into cases where NOT using [branch] (for example) led to non-compilable code because of too-deep function calls, nested loops and the like. So in the end, do try it and see if it helps :)
