Efficiently Adding "Optional" Features to Shaders

Started by
5 comments, last by Hodgman 12 years, 11 months ago
I'm writing some graphics code that will load models up in OpenGL, and render them using GLSL shaders. What I'd like to do is use the same shader to render all of my models to reduce shader swaps per frame. I'd like to have these shaders process specular and normal mapping if specular and normal maps are provided for the models' meshes. If not, I'd like my shaders to skip processing these. These shaders MUST also be compliant with OpenGL ES 2.0 since my game is being developed for iPad as well. I haven't figured out a way to do this efficiently though. Does anyone have an idea on how a game engine would do this?
Use a simple uniform variable in your ubershader that declares which 'mode' the shader should operate in, then create multiple cases that most efficiently combine the desired graphical effects into the output fragment. You control this by setting the mode variable from your engine.


At least this has been my plan thus far until that point comes in my game dev adventures.
That sounds reasonable. I haven't used switch statements in GLSL yet -- didn't even know they were supported, haha! Another idea I had was to just support a bare-minimum shader that renders textured geometry. Once I come up with a standard lighting procedure, I'll implement that as well (with projective shadows too O_o). Then I could also add custom shader support, so that my model renderer doesn't have to use my base code's stock shader for rendering. I'm doing that right now with a few other classes I have set up.

Hmm... Your switch statement idea is a good one though.
Branching statements inside shaders are extremely slow compared to your traditional x86 CPUs.

Switching shaders is a small one-time cost on the CPU, whereas branching in a fragment shader is a large GPU cost incurred by every rendered pixel (or, if in a vertex shader, by every single vertex).

For optional features, I use #ifdef/#else/#endif statements, and then compile my shader source file many different times (once for each of the permutations of the shader's options).
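One way the permutation approach could look on the engine side, as a rough sketch: prepend a `#define` per enabled feature to a shared shader body, then compile each resulting string as its own program. The feature bits, macro names (`USE_SPECULAR_MAP`, `USE_NORMAL_MAP`) and function names are assumptions for illustration, and the actual GL compile calls are left out so the snippet stays self-contained.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bits; one bit per optional shader feature.
enum ShaderFeature : std::uint32_t {
    kSpecularMap  = 1u << 0,
    kNormalMap    = 1u << 1,
    kFeatureCount = 2
};

// Shared GLSL ES 1.00 body (illustrative). It has no #version line, so the
// #defines can safely be placed in front of it.
const char* const kFragmentBody = R"glsl(
precision mediump float;
uniform sampler2D u_diffuseMap;
#ifdef USE_SPECULAR_MAP
uniform sampler2D u_specularMap;
#endif
varying vec2 v_uv;
void main() {
    vec3 color = texture2D(u_diffuseMap, v_uv).rgb;
#ifdef USE_SPECULAR_MAP
    color += texture2D(u_specularMap, v_uv).rgb;
#endif
    gl_FragColor = vec4(color, 1.0);
}
)glsl";

// Prepend one #define per enabled feature to the shared body.
std::string buildPermutationSource(std::uint32_t features, const std::string& body) {
    std::string src;
    if (features & kSpecularMap) src += "#define USE_SPECULAR_MAP\n";
    if (features & kNormalMap)   src += "#define USE_NORMAL_MAP\n";
    src += body;
    return src;
}

// Build the source for every combination of features (2^kFeatureCount strings),
// indexed by the feature bitmask.
std::vector<std::string> buildAllPermutations(const std::string& body) {
    std::vector<std::string> sources;
    const std::uint32_t count = 1u << kFeatureCount;
    for (std::uint32_t mask = 0; mask < count; ++mask) {
        sources.push_back(buildPermutationSource(mask, body));
    }
    return sources;
}
```

At draw time the renderer would pick the program whose bitmask matches the maps a mesh actually provides. The number of permutations doubles with each option, so this works best when the set of options stays small.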
Branching is less expensive as long as you do it at the end of the program. The reason is that the processors which take the branch are "parked"; the remainder run to completion, and then the parked ones are restarted.

(This means that the total run time of a shader is approximately proportional to the length of the program raised to the power of the number of branches in it.)

So it's often cheaper to compute multiple things and pick one at the end, than to choose a route at the start.

If you can choose by maths instead of branching, do it. Instead of using an if or a switch, pass in a bunch of variables that are each 1 or 0, multiply the results by them (or by 1 minus them), and sum the answers. Doing the computations and discarding them can be less expensive. It's worth doing a couple of extra maths operations to avoid a branch.
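The arithmetic behind that trick, written out as a tiny C++ function so it can be checked on the CPU. The function name is made up for illustration. In a shader the same expression would be written inline, and it matches what GLSL's built-in `mix(whenZero, whenOne, flag)` computes.

```cpp
// Select between two already-computed values without branching.
// `flag` is assumed to be exactly 0.0f or 1.0f (e.g. a uniform set by the engine).
// Note that both inputs must be finite: 0 * infinity yields NaN, so a
// "discarded" value can still poison the result.
float selectByMath(float flag, float whenOne, float whenZero) {
    return flag * whenOne + (1.0f - flag) * whenZero;
}
```

For an additive term such as specular, the simpler form `color + flag * specular` is enough, since the "off" case just contributes zero.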

"Branching statements inside shaders are extremely slow compared to your traditional x86 CPUs."

Well, yes and no... it does depend somewhat on the hardware but if all threads in a wavefront/warp take the same path then the cost is pretty close to 'zero' beyond the requirement to check. The branching itself is generally dealt with in an instruction sequencer outside of the ALU units themselves so it won't cost any ALU time to setup.
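A toy cost model of that behaviour, purely as an illustration: it assumes a warp pays for a side of an `if`/`else` only when at least one of its threads takes that side, and it ignores the cost of the check itself. Real hardware differs in the details, and the function name and cost units are invented.

```cpp
#include <vector>

// Cost for one warp/wavefront to execute `if (cond) A else B`, where
// takesIf[i] says whether thread i takes the `if` side.
// A coherent warp pays for one side; a divergent warp pays for both.
int warpBranchCost(const std::vector<bool>& takesIf, int costIf, int costElse) {
    bool anyIf = false;
    bool anyElse = false;
    for (bool t : takesIf) {
        if (t) anyIf = true;
        else   anyElse = true;
    }
    int cost = 0;
    if (anyIf)   cost += costIf;    // at least one thread runs the `if` side
    if (anyElse) cost += costElse;  // at least one thread runs the `else` side
    return cost;
}
```

Under this model a branch on a uniform never diverges, because every thread in the draw call sees the same value.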

But, as I said, this does depend on the hardware somewhat, so when deciding whether to use branching or not, the advice I'd give is to find out what your target hardware does in this regard.
Yeah, on a modern PC (~DX10/11 level) it's not as much of a concern, but on the OP's iPad I'd be quite concerned.

Apple's 'best practices' doc says that if you have to branch, then radioteeth's branch-on-uniform technique is okay.
Avoid Branching
Branches are discouraged in shaders, as they can reduce the ability to execute operations in parallel on 3D graphics processors. If your shaders must use branches, follow these recommendations:
  • Best performance: Branch on a constant known when the shader is compiled.
  • Acceptable: Branch on a uniform variable.
  • Potentially slow: Branching on a value computed inside the shader.
Instead of creating a large shader with many knobs and levers, create smaller shaders specialized for specific rendering tasks. There is a tradeoff between reducing the number of branches in your shaders and increasing the number of shaders you create. Test different options and choose the fastest solution.

At the same time, though, PowerVR notes that you may be able to group more geometry into a draw call if it all uses the same shader, while also pointing out the costs of branching, so the best path is quite situation dependent:
Batch your primitives to keep the number of draw calls low.
Try to minimise the number of calls used to render your scene as these can be expensive. Using branching in your shaders may help to have better batching.
...
Static flow control can be used to combine many shaders into one big shader. This is not generally a win, though, and you should benchmark thoroughly when deciding whether to put multiple paths into one shader.

http://developer.app...forShaders.html
http://www.imgtec.co...1f.External.pdf

This topic is closed to new replies.
