So I have this pixel shader for texture splatting a "patch" of terrain using up to 4 blend maps and 12 "splat" textures (layer textures).
I wrote a "super" shader that handles the maximum case: 4 blend map samples and 12 splat samples = 16 texture samples per pixel, plus close to 64 arithmetic operations per pixel. That puts me right at the edge of ps_2_0, which allows 16 samplers and 64 arithmetic instruction slots. ps_3_0 raises the instruction limit, but it still exposes only 16 samplers, so I can't bind more than 16 textures there either.
My question is whether a "super-shader" that uses global 0.0f/1.0f floats to zero out the blends/splats that aren't used is faster than splitting this into 12 separate pixel shaders, one per splat count.
The art resources I'm using specify the number of splat textures and blend maps on a small per-patch basis, and not every patch uses the maximum (4 blend maps / 12 splats); some use considerably fewer.
Here is the "super" splat pixel shader:
float4 DAOCSplatTerrainPS( float2 tiledTexC    : TEXCOORD0,
                           float2 nonTiledTexC : TEXCOORD1,
                           float  shade        : TEXCOORD2,
                           float  fogLerpParam : TEXCOORD3) : COLOR
{
    // Splat (layer) textures. Layer 0 is the base; layers 1..11 are
    // multiplied by a 0.0f/1.0f flag so unused layers contribute black.
    float3 c0  = tex2D(SplatTex0S,  tiledTexC * gTexScale[ 0 ]).rgb;
    float3 c1  = tex2D(SplatTex1S,  tiledTexC * gTexScale[ 1 ]).rgb * g_hasSplat[ 1 ];
    float3 c2  = tex2D(SplatTex2S,  tiledTexC * gTexScale[ 2 ]).rgb * g_hasSplat[ 2 ];
    float3 c3  = tex2D(SplatTex3S,  tiledTexC * gTexScale[ 3 ]).rgb * g_hasSplat[ 3 ];
    float3 c4  = tex2D(SplatTex4S,  tiledTexC * gTexScale[ 4 ]).rgb * g_hasSplat[ 4 ];
    float3 c5  = tex2D(SplatTex5S,  tiledTexC * gTexScale[ 5 ]).rgb * g_hasSplat[ 5 ];
    float3 c6  = tex2D(SplatTex6S,  tiledTexC * gTexScale[ 6 ]).rgb * g_hasSplat[ 6 ];
    float3 c7  = tex2D(SplatTex7S,  tiledTexC * gTexScale[ 7 ]).rgb * g_hasSplat[ 7 ];
    float3 c8  = tex2D(SplatTex8S,  tiledTexC * gTexScale[ 8 ]).rgb * g_hasSplat[ 8 ];
    float3 c9  = tex2D(SplatTex9S,  tiledTexC * gTexScale[ 9 ]).rgb * g_hasSplat[ 9 ];
    float3 c10 = tex2D(SplatTex10S, tiledTexC * gTexScale[ 10 ]).rgb * g_hasSplat[ 10 ];
    float3 c11 = tex2D(SplatTex11S, tiledTexC * gTexScale[ 11 ]).rgb * g_hasSplat[ 11 ];

    // Blend maps. Each map supplies up to three layer weights in r, g, b.
    float3 B0 = tex2D(BlendMap0S, nonTiledTexC).rgb;
    float3 B1 = (tex2D(BlendMap1S, nonTiledTexC).rgb) * g_hasBlend[ 1 ];
    float3 B2 = (tex2D(BlendMap2S, nonTiledTexC).rgb) * g_hasBlend[ 2 ];
    float3 B3 = (tex2D(BlendMap3S, nonTiledTexC).rgb) * g_hasBlend[ 3 ];

    // Start from the shaded base layer, then blend each layer over the
    // running result using its weight from the blend maps.
    float3 color = (c0 * shade);
    color = (B0.g * c1)  + (1 - B0.g) * color;
    color = (B0.b * c2)  + (1 - B0.b) * color;
    color = (B1.r * c3)  + (1 - B1.r) * color;
    color = (B1.g * c4)  + (1 - B1.g) * color;
    color = (B1.b * c5)  + (1 - B1.b) * color;
    color = (B2.r * c6)  + (1 - B2.r) * color;
    color = (B2.g * c7)  + (1 - B2.g) * color;
    color = (B2.b * c8)  + (1 - B2.b) * color;
    color = (B3.r * c9)  + (1 - B3.r) * color;
    color = (B3.g * c10) + (1 - B3.g) * color;
    color = (B3.b * c11) + (1 - B3.b) * color;

    // Fade the final color toward the fog color.
    return (lerp(float4(color, 1.0f), gFogColor, fogLerpParam));
}
And here is an example of the splatting in action:

So what do you guys think: is it better to use the super shader, or to cut-n-paste out 12 shaders that handle the different combos and put them in an array like this?
PixelShader psSplatArray20[ MAX_SPLATS ] = { compile ps_2_0 DAOCSplat1TerrainPS(),
                                             compile ps_2_0 DAOCSplat2TerrainPS(),
                                             compile ps_2_0 DAOCSplat3TerrainPS(),
                                             compile ps_2_0 DAOCSplat4TerrainPS(),
                                             compile ps_2_0 DAOCSplat5TerrainPS(),
                                             compile ps_2_0 DAOCSplat6TerrainPS(),
                                             compile ps_2_0 DAOCSplat7TerrainPS(),
                                             compile ps_2_0 DAOCSplat8TerrainPS(),
                                             compile ps_2_0 DAOCSplat9TerrainPS(),
                                             compile ps_2_0 DAOCSplat10TerrainPS(),
                                             compile ps_2_0 DAOCSplat11TerrainPS(),
                                             compile ps_2_0 DAOCSplat12TerrainPS()
                                           };
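For reference, here is a rough sketch of how the per-count variants might be generated from a single function instead of being cut-n-pasted. It assumes the same samplers, gTexScale and gFogColor as the shader above. The names DAOCSplatNTerrainPS, numSplats, psSplatSketch20, gNumSplats and SplatSketch are made up for illustration and are not part of my current effect file. The idea is that a uniform parameter given as a literal in the compile statement should let the compiler drop the untaken branches, so each variant only samples the layers it needs. I have only written out the first 6 layers and have not profiled this.

```hlsl
// Sketch only. numSplats is a hypothetical uniform parameter that is
// supplied as a literal at each compile site below.
float4 DAOCSplatNTerrainPS( float2 tiledTexC    : TEXCOORD0,
                            float2 nonTiledTexC : TEXCOORD1,
                            float  shade        : TEXCOORD2,
                            float  fogLerpParam : TEXCOORD3,
                            uniform int numSplats ) : COLOR
{
    // Base layer, shaded exactly as in the super shader.
    float3 color = tex2D(SplatTex0S, tiledTexC * gTexScale[ 0 ]).rgb * shade;

    // lerp(a, b, t) computes t*b + (1 - t)*a, the same blend as above.
    if (numSplats > 1)
    {
        float3 B0 = tex2D(BlendMap0S, nonTiledTexC).rgb;
        color = lerp(color, tex2D(SplatTex1S, tiledTexC * gTexScale[ 1 ]).rgb, B0.g);
        if (numSplats > 2)
            color = lerp(color, tex2D(SplatTex2S, tiledTexC * gTexScale[ 2 ]).rgb, B0.b);
    }
    if (numSplats > 3)
    {
        float3 B1 = tex2D(BlendMap1S, nonTiledTexC).rgb;
        color = lerp(color, tex2D(SplatTex3S, tiledTexC * gTexScale[ 3 ]).rgb, B1.r);
        if (numSplats > 4)
            color = lerp(color, tex2D(SplatTex4S, tiledTexC * gTexScale[ 4 ]).rgb, B1.g);
        if (numSplats > 5)
            color = lerp(color, tex2D(SplatTex5S, tiledTexC * gTexScale[ 5 ]).rgb, B1.b);
    }
    // Layers 6..11 would follow the same pattern with BlendMap2S and BlendMap3S.

    return lerp(float4(color, 1.0f), gFogColor, fogLerpParam);
}

// One compiled variant per splat count (only 6 shown in this sketch).
PixelShader psSplatSketch20[ 6 ] = { compile ps_2_0 DAOCSplatNTerrainPS(1),
                                     compile ps_2_0 DAOCSplatNTerrainPS(2),
                                     compile ps_2_0 DAOCSplatNTerrainPS(3),
                                     compile ps_2_0 DAOCSplatNTerrainPS(4),
                                     compile ps_2_0 DAOCSplatNTerrainPS(5),
                                     compile ps_2_0 DAOCSplatNTerrainPS(6)
                                   };

// Hypothetical global, set by the application per patch (1..6 here).
int gNumSplats = 1;

technique SplatSketch
{
    pass P0
    {
        // Vertex shader assignment omitted from this sketch.
        PixelShader = (psSplatSketch20[ gNumSplats - 1 ]);
    }
}
```

With something like this, the application would set gNumSplats before drawing each patch, and the g_hasSplat / g_hasBlend multiplies would no longer be needed in the specialized variants.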