I'm currently trying my hand at writing a flexible multitexture/multi-UV shader, and I'm trying to decide the best way to go about it. I'd like to support up to 16 textures, 4 UV sets, stencil maps, specular maps, and so on, so it needs to be very flexible.
I'm imagining that this shader would be compiled separately for each different kind of material that uses it, so when it actually runs there should be no loops or if statements, just a set of texture lookups and floating point operations.
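To make "compiled separately for each material" concrete, here is a rough host-side sketch in Python that builds a block of #define lines for one material, to be prepended to the shader source before compiling. The function name, the numeric IDs for the texture types, and the `uv0`..`uv3` variable names are all assumptions for illustration. If the shader has a #version line, the defines have to go after it.

```python
# Hypothetical numeric IDs for the texture types; any distinct integers would do.
TEX_TYPE_IDS = {"TEX_COLOR3": 1, "TEX_STENCIL": 2}


def build_define_prelude(layers):
    """Return the #define lines describing one material.

    `layers` is a list of (texture_type_name, uv_set_index) tuples,
    one entry per texture slot, in slot order.
    """
    # Define the type constants first so "#if TEXTYPE0 == TEX_COLOR3" can compare them.
    lines = [f"#define {name} {value}" for name, value in TEX_TYPE_IDS.items()]
    for slot, (tex_type, uv_set) in enumerate(layers):
        if tex_type not in TEX_TYPE_IDS:
            raise ValueError(f"unknown texture type: {tex_type}")
        lines.append(f"#define TEXTYPE{slot} {tex_type}")
        # Assumes the shader declares UV variables named uv0..uv3 (hypothetical names).
        lines.append(f"#define TEXUV{slot} uv{uv_set}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A material with a color layer on UV set 0 and a stencil layer on UV set 1.
    print(build_define_prelude([("TEX_COLOR3", 0), ("TEX_STENCIL", 1)]))
```

Each distinct list of layers would then produce its own compiled program, which could be cached by its prelude string.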
I can either try to write this with defines, like so:
#ifdef TEXTYPE0
#if TEXTYPE0 == TEX_COLOR3
layercolor = texture2D(S_tex0,TEXUV0).rgb;
#endif
#if TEXTYPE0 == TEX_STENCIL
basecolor = mix(basecolor,layercolor,mask);
mask = texture2D(S_tex0,TEXUV0).r;
#endif
#endif
#ifdef TEXTYPE1 ......
Or write it (much more easily) with a loop statement:
for (int i = 0; i < texsize; i++) {
    if (TexTypes[i] == TEX_COLOR3) {
        layercolor = texture2D(S_tex[i], TexUVMap[i]).rgb;
    } else if (TexTypes[i] == TEX_STENCIL) {
        basecolor = mix(basecolor, layercolor, mask);
        mask = texture2D(S_tex[i], TexUVMap[i]).r;
    }
}
However, I certainly don't want my actual runtime shader to contain the loops or if statements. Can I be sure, or at least reasonably confident, that the compiler will optimize away all this control flow, given that the loop count and texture types are always constant for a given material?
I've read a lot of stories about less-than-smart GLSL compilers, so I'm afraid that the defines approach is the only way to guarantee good performance on a wide range of GPUs.
As a third option, I could write my own pre-pre-processor that handles a #LOOP statement and manually unrolls it before the source is actually sent to the GPU. This seems like a good tradeoff between performance and ease of coding, although I would lose the ability to port my shader to any other program (ShaderDesigner, etc.), and it's just extra work to worry about.
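A minimal sketch of what I have in mind for that third option, written in Python for brevity. The `#LOOP <var> <count>` / `#ENDLOOP` syntax and the `$i` placeholder are made up for this example, and nesting is not handled. The output is plain defines-style source, so the regular GLSL preprocessor still does the per-slot #if selection.

```python
import re

# Hypothetical directive syntax: "#LOOP i 4" ... "#ENDLOOP".
LOOP_START = re.compile(r"^\s*#LOOP\s+(\w+)\s+(\d+)\s*$")
LOOP_END = re.compile(r"^\s*#ENDLOOP\s*$")


def unroll_loops(source):
    """Expand each #LOOP block by repeating its body `count` times.

    Inside the body, "$<var>" is replaced with the iteration number.
    Nested loops are rejected rather than handled.
    """
    out = []
    body = None          # lines collected inside the current #LOOP, None when outside
    var, count = None, 0
    for line in source.splitlines():
        start = LOOP_START.match(line)
        if start:
            if body is not None:
                raise ValueError("nested #LOOP is not supported in this sketch")
            var, count, body = start.group(1), int(start.group(2)), []
        elif LOOP_END.match(line):
            if body is None:
                raise ValueError("#ENDLOOP without a matching #LOOP")
            for n in range(count):
                out.extend(l.replace("$" + var, str(n)) for l in body)
            body = None
        elif body is not None:
            body.append(line)
        else:
            out.append(line)
    if body is not None:
        raise ValueError("#LOOP without a matching #ENDLOOP")
    return "\n".join(out)


if __name__ == "__main__":
    template = "\n".join([
        "#LOOP i 2",
        "#if TEXTYPE$i == TEX_COLOR3",
        "layercolor = texture2D(S_tex$i,TEXUV$i).rgb;",
        "#endif",
        "#ENDLOOP",
    ])
    print(unroll_loops(template))
```

Slots whose TEXTYPE is not defined would still need the `#ifdef TEXTYPEn` guard around each unrolled body, as in the defines version.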
I'm sure many others have come across this situation before, so I'm curious: how have you solved it?