I reread the same documentation page and noticed the following passage just below the quote I gave in my last message:
Quote:
Samplers may also be used in array, although no back end currently supports dynamic array access of samplers. Therefore, the following is valid because it can be resolved at compile time:
tex2D(s[0],tex)
However, this example is not valid.
tex2D(s[a],tex)
Dynamic access of samplers is primarily useful for writing programs with literal loops. The following code illustrates sampler array accessing:
sampler sm[4];
float4 main(float4 tex[4] : TEXCOORD) : COLOR
{
    float4 retColor = 1;
    for(int i = 0; i < 4; i++)
    {
        retColor *= tex2D(sm[i],tex[i]);
    }
    return retColor;
}
I think it works as feal87 says: this code compiles because the compiler can unroll the loop. It knows at compile time that it will read sm[0], sm[1], sm[2] and sm[3], in that order.
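To check my understanding, here is what I assume the loop looks like after unrolling. This is my own hand-written sketch, not actual compiler output, and the explicit .xy swizzles are my addition:

```hlsl
sampler sm[4];

float4 main(float4 tex[4] : TEXCOORD) : COLOR
{
    float4 retColor = 1;

    // Every sampler index is a literal here, so each access
    // can be resolved at compile time.
    retColor *= tex2D(sm[0], tex[0].xy);
    retColor *= tex2D(sm[1], tex[1].xy);
    retColor *= tex2D(sm[2], tex[2].xy);
    retColor *= tex2D(sm[3], tex[3].xy);

    return retColor;
}
```

If that is right, the loop counter never reaches the hardware as a sampler index at all.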
But I still have a few questions:
1-
Quote:You can't do proper dynamic access of textures by an array until SM40.
If "numSplits" from the NVIDIA example were an integer constant set by the application via the SetPixelShaderConstantI function, would the code work? I think it wouldn't, because the compiler couldn't unroll the for loop without knowing the value of numSplits at compile time.
2-
Quote:With SM30 you can use dynamic branching, but of course that can have potential performance pitfalls.
How is dynamic branching implemented in SM3.0? I am asking because I have the impression that pixel shaders somehow execute both sides of a branch. For example, suppose I have the following code:
if (value1 > variable) { DoActionA(); } else { DoActionB(); }
If DoActionA() and DoActionB() each cost about 5 fps on their own, this piece of code costs nearly 9-10 fps, as if both were executed. I have tested this branching issue in pixel shaders on many occasions and the result was always the same. (By the way, my graphics card is a GeForce 8600 GS.) The DirectX documentation says the following about this:
Quote:
The most familiar branching support is dynamic branching. With dynamic branching, the comparison condition resides in a variable, which means that the comparison is done for each vertex or each pixel at run time (as opposed to the comparison occurring at compile time, or between two draw calls). The performance hit is the cost of the branch plus the cost of the instructions on the side of the branch taken. Dynamic branching is implemented in shader model 3 or higher. Optimizing shaders that work with these models is similar to optimizing code that runs on a CPU.
But I am not fully convinced by that. Besides, if I remember correctly, I have read somewhere that shaders execute both branch paths and then keep only the result of the correct path. My observations seem to confirm this, too. So I need to learn what is really going on in dynamic branches, especially for SM3.0.
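One experiment I am considering is forcing each behaviour with the HLSL [branch] and [flatten] attributes and comparing frame rates. Below is a minimal sketch, assuming a ps_3_0 target and that the compiler honors the attribute. The names threshold, ActionA and ActionB are hypothetical stand-ins for my real code, and I kept texture fetches out of the branches on purpose:

```hlsl
// Hypothetical constant set by the application.
float threshold;

// Hypothetical stand-ins for DoActionA() / DoActionB().
float4 ActionA(float2 uv) { return float4(uv, 0, 1); }
float4 ActionB(float2 uv) { return float4(0, uv, 1); }

float4 main(float2 uv : TEXCOORD0) : COLOR
{
    float4 result;

    // [branch] asks for a real dynamic branch.
    // Swapping it for [flatten] asks the compiler to evaluate
    // both sides and then select one result.
    [branch]
    if (uv.x > threshold)
        result = ActionA(uv);
    else
        result = ActionB(uv);

    return result;
}
```

If both variants cost the same on my card, that would support my impression that both paths are being executed either way.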