Hi,
I've been looking into shader performance a bit. Obviously, the fewer operations the GPU has to execute, the better. I checked the assembly output from FX Composer. I don't know much about optimization, so I tested a simple line:
// 1.
float3 value = tex2D( tex, texcoord );
value = 2 * value -1;
// Compiled, 3 operations:
def c0, -1,-1,-1,-1
tex t0
add r0, t0, t0
add r0, r0, c0
// 2.
float3 value = tex2D( tex, texcoord );
value = 0.5 * value -1;
// Compiled, 2 operations:
def c0, 0.5,0.5,0.5,0.5
def c1, -1,-1,-1,-1
tex t0
mad r0, t0, c0, c1
The second formula takes only 2 operations, even though formula 1 looks simpler to me. MAD combines two operations (a multiply and an add), so I would expect it to be slower than a plain ADD. Is there an overview of how many cycles each instruction takes? I looked at the FX Composer Perf tool, but it reports "cycles:1.00" for both formulas.
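To make the comparison concrete, here is a rough Python sketch of what the two listings compute for a single channel. The helper names (`op_add`, `op_mad`, `formula_1`, `formula_2`) are made up for illustration. It only mimics the arithmetic of the instructions, not the hardware, so it says nothing about real cycle counts.

```python
# Hypothetical single-channel emulation of the two compiled listings.
# It mimics only the arithmetic; it does not model GPU timing.

def op_add(a, b):
    """add dest, src0, src1  ->  dest = src0 + src1"""
    return a + b

def op_mad(a, b, c):
    """mad dest, src0, src1, src2  ->  dest = src0 * src1 + src2"""
    return a * b + c

def formula_1(t0):
    """2 * value - 1, compiled as two adds."""
    c0 = -1.0
    r0 = op_add(t0, t0)   # add r0, t0, t0
    r0 = op_add(r0, c0)   # add r0, r0, c0
    return r0

def formula_2(t0):
    """0.5 * value - 1, compiled as one mad."""
    c0, c1 = 0.5, -1.0
    return op_mad(t0, c0, c1)  # mad r0, t0, c0, c1

if __name__ == "__main__":
    for texel in (0.0, 0.5, 1.0):
        print(texel, formula_1(texel), formula_2(texel))
```

Note that the two formulas are not the same mapping: formula 1 sends [0, 1] to [-1, 1], while formula 2 sends it to [-1, -0.5]. They are only being compared for instruction count here.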
Speaking of optimizations, I compared two other lines:
float3 normal = 2 * tex2D( normalMap, texcoord ) -1;
float3 normal = 2 * ( tex2D( normalMap, texcoord ) -0.5 );
I would have expected the first line to be faster, since it only uses integer constants. But the second line compiled to 2 instructions instead of 3:
tex t0
add r0, t0_bias, t0_bias
t0_bias? Doesn't that need to be calculated somehow first?
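If I read the pixel shader 1.x documentation correctly, `_bias` is a source register modifier that subtracts 0.5 from the value as it is read, so it would not show up as a separate instruction. Under that assumption, here is a small Python sketch (single channel, function names are hypothetical) showing that both normal-unpacking lines give the same result:

```python
# Sketch of the two normal-unpacking lines, one channel at a time.
# Assumption: the _bias source modifier subtracts 0.5 from the value as it is read.

def bias(x):
    """Mimics a _bias source modifier: x - 0.5."""
    return x - 0.5

def unpack_a(t0):
    """2 * tex - 1"""
    return 2.0 * t0 - 1.0

def unpack_b(t0):
    """2 * (tex - 0.5), written as a single add of two _bias operands."""
    return bias(t0) + bias(t0)

if __name__ == "__main__":
    for texel in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(texel, unpack_a(texel), unpack_b(texel))
```

So the two lines are algebraically identical, and the difference in instruction count would come only from how the compiler maps them onto the modifier.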
Greetings,
Rick