So the provided example may not issue any branch instructions at all. If the conditions depend only on uniforms, every pixel in a draw call takes the same path, which makes this a candidate for uniform branching: the runtime or driver may choose to produce multiple compilations of the shader in which all of the branches have been resolved and the loops unrolled. This was really common before we had hardware branching, in the 2.x days. I'm not sure to what extent it's still used now, but you can hint the compiler to unroll loops and avoid branches.
Uber shaders give you much more precise control over compilation though.
That sounds really nice. So it basically knows which compiled version to use depending on what arguments I send it?
So I would call glUseProgram(someShader), and based on the parameters I set, it would actually select the real compiled shader I want? In many cases, the branches are super obvious, like this material is not using normal maps, or this light is not casting specular reflections...
Right now I have a system that uses bit masks to figure out which permutation of the uber shader to load. It's cool and all, but if I could have something much simpler, that would be awesome.
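For reference, a bit-mask permutation cache along those lines might look roughly like this in C++. This is only a sketch: the feature bits, the USE_* define names, and the compile callback are all made up for illustration, and the callback stands in for whatever actually compiles and links the program. It also assumes the shader body has no #version line, since in GLSL that line has to come before any defines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical feature bits: one bit per compile-time switch in the uber shader.
enum Feature : std::uint32_t {
    kNormalMap = 1u << 0,
    kSpecular  = 1u << 1,
    kShadows   = 1u << 2,
};

// Turn a feature mask into the #define lines prepended to the shader body.
// The USE_* names are illustrative, not taken from any real shader.
std::string buildPreamble(std::uint32_t mask) {
    std::string out;
    if (mask & kNormalMap) out += "#define USE_NORMAL_MAP 1\n";
    if (mask & kSpecular)  out += "#define USE_SPECULAR 1\n";
    if (mask & kShadows)   out += "#define USE_SHADOWS 1\n";
    return out;
}

// Caches one program handle per mask. The compile callback is a placeholder
// for the real compile-and-link path and returns the resulting handle.
class PermutationCache {
public:
    using CompileFn = std::function<unsigned(const std::string& source)>;

    PermutationCache(std::string body, CompileFn compile)
        : body_(std::move(body)), compile_(std::move(compile)) {}

    // Returns the program for this mask, compiling it on first use only.
    unsigned get(std::uint32_t mask) {
        auto it = programs_.find(mask);
        if (it != programs_.end()) return it->second;
        assert(compile_ && "a compile callback is required");
        unsigned program = compile_(buildPreamble(mask) + body_);
        programs_.emplace(mask, program);
        return program;
    }

    std::size_t size() const { return programs_.size(); }

private:
    std::string body_;
    CompileFn compile_;
    std::unordered_map<std::uint32_t, unsigned> programs_;
};
```

At draw time the renderer would build the mask from the material and light state (say kNormalMap | kSpecular), call get(mask), and pass the returned handle to glUseProgram. Each permutation is compiled once, the first time it is requested.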
And yeah, I can see the problem with using too many registers now. It's always good to know how something actually works.