Android Shader Question: Android GL ES 2.0

Started by
1 comment, last by rockseller 11 years, 8 months ago
Hi there,


If my shader code has many lines of code, performance goes down a lot, even when those lines are inside an if() condition that is always going to be false. Example:



void main() {
    if (false) {
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
        // lines of code that are never reached
    }
}


This shader example is very slow on Android even though the body of the if statement is never reached.
Is shader size a performance issue even when some of the code will never execute?
Yes, it can be. On some GPUs, all the instructions are executed, even those inside a test that is false; they are simply guarded out so their results don't matter. GPUs do this because it lets the same code run on many execution units at the same time: there is only one program counter and one set of instruction fetch/decode logic, but multiple execution units running in parallel. This architecture is also used on desktop systems, but their throughput is so high these days that you probably won't notice the speed loss.
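To make the guarded, lockstep execution described above a bit more concrete, here is a toy model in plain Java. It is only a sketch and not how any particular GPU works: the class name LockstepModel, the method runGuardedBlock, and the stand-in arithmetic are all made up for illustration. The point it tries to show is that the number of instructions issued depends on the length of the guarded block, not on whether any lane's condition is true.

```java
import java.util.Arrays;

// Toy model of guarded (predicated) lockstep execution. Hypothetical names,
// not real GPU code: it only illustrates the cost argument.
class LockstepModel {

    // Runs a guarded block of `bodyLength` instructions over all lanes.
    // Every instruction is issued once for the whole group, whatever the mask
    // says; the mask only decides which lanes keep the result.
    // Returns the number of instructions issued. `lanes` is updated in place.
    static int runGuardedBlock(float[] lanes, boolean[] mask, int bodyLength) {
        int issued = 0;
        for (int pc = 0; pc < bodyLength; pc++) {
            issued++; // one shared fetch/decode per instruction
            for (int lane = 0; lane < lanes.length; lane++) {
                float result = lanes[lane] * 2.0f + 1.0f; // work done on every lane
                if (mask[lane]) {
                    lanes[lane] = result; // write-back only where the guard is true
                }
            }
        }
        return issued;
    }

    public static void main(String[] args) {
        float[] lanes = {1f, 2f, 3f, 4f};
        boolean[] allFalse = new boolean[lanes.length]; // models if (false)
        int issued = runGuardedBlock(lanes, allFalse, 9);
        // The data is untouched, yet all 9 instructions were still issued.
        System.out.println("issued = " + issued + ", lanes = " + Arrays.toString(lanes));
    }
}
```

In this model a block that is never "taken" costs exactly as much as one that is always taken, which matches the slowdown described in the question.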

Another possible cause is that the code now contains jumps (past the unexecuted code), which on most embedded GPU chips cause pipeline flushes, because there is no power or die-space budget for branch prediction, register renaming, or any of the other fancy hardware desktop CPUs use to avoid all this. On particular GPU chips it is better to write code assuming branches will, or will not, be taken in order to avoid the flushes.

Or the extra code makes your program span multiple cache lines, and the GPU cannot hold the whole program in its (relatively few) instruction cache lines, so there are lots of instruction fetch misses.

It can particularly affect mobile GPU code because the compiler for shader code simply can't be as sophisticated as a desktop shader compiler: it has a much smaller memory and CPU budget to operate in. Typically these compilers will not be able to do the static analysis needed to detect unreachable code.
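One way to avoid relying on the compiler's static analysis is to let the GLSL preprocessor drop the unwanted code before compilation, using #ifdef instead of a runtime if. The sketch below is only an illustration: the class ShaderVariants, the helper withDefines, the USE_FOG symbol, and the shader itself are hypothetical, and it assumes the shader source does not begin with a #version line (which would have to stay first).

```java
// Builds shader source variants by prepending #define lines, so optional code
// is removed by the GLSL preprocessor rather than skipped at run time.
// All names here are hypothetical and exist only for illustration.
class ShaderVariants {

    // Example fragment shader with an optional block guarded by #ifdef.
    static final String FRAGMENT_SOURCE =
        "precision mediump float;\n" +
        "uniform vec4 uColor;\n" +
        "void main() {\n" +
        "    vec4 color = uColor;\n" +
        "#ifdef USE_FOG\n" +
        "    color.rgb = mix(color.rgb, vec3(0.5), 0.25);\n" +
        "#endif\n" +
        "    gl_FragColor = color;\n" +
        "}\n";

    // Returns `source` with one "#define NAME" line per entry placed in front.
    // With no defines, the source is returned unchanged.
    static String withDefines(String source, String... defines) {
        StringBuilder sb = new StringBuilder();
        for (String name : defines) {
            sb.append("#define ").append(name).append('\n');
        }
        return sb.append(source).toString();
    }

    public static void main(String[] args) {
        // Variant without fog: the guarded line never reaches the compiler proper.
        System.out.println(withDefines(FRAGMENT_SOURCE));
        // Variant with fog enabled.
        System.out.println(withDefines(FRAGMENT_SOURCE, "USE_FOG"));
    }
}
```

On Android, each resulting string could then be handed to GLES20.glShaderSource and compiled as a separate program, with the right variant chosen on the CPU side instead of branching inside the shader.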

Moral of the story: express the actual thing you want the GPU to do as succinctly as possible.
Thank you, Katie, very good explanation.

This topic is closed to new replies.
