# Are shaders really that limited?

I can't believe the following error message. I declared an array of 24 floats:

    float fArrayOfFloats[24];

I then fill the whole array with zeros:

    for (int nFill = 0; nFill < 24; nFill++)
    {
        fArrayOfFloats[nFill] = 0.0f;
    }

Then I want to write the 4 values of one float4 into the array, at the positions given by another float4:

    float4 valuesToWriteIntoPositionsIndicatedByMyIndexFloat4 = float4(0.125, 0.125, 0.125, 0.125);
    float4 myIndexFloat4 = float4(0, 3, 5, 10);
    int nWhereToWrite0 = myIndexFloat4.x;
    fArrayOfFloats[nWhereToWrite0] = valuesToWriteIntoPositionsIndicatedByMyIndexFloat4[0];

This should be a common task, shouldn't it? Compiling gives me:

    error C5025: lvalue in assignment too complex

How can I do this another way?

##### Share on other sites
The operation you're trying to do is called a "dynamically addressed scattered write", which means that the destination memory addresses (in this case, those corresponding to the array indices) cannot be determined until runtime.

This is not at all well suited to parallel stream processors (GPUs), which generally expect the destination registers to be fixed at compile time. This is the price of performance as compared to CPUs.

The GPU is perfectly capable of reading from many variable addresses during primitive processing (scattered read, also called gather), so the best course of action would be to refactor your algorithm to match this capability.
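To make the two access patterns concrete, here is a minimal CPU-side sketch in plain C++. It is not shader code, and the function names `scatter` and `gather` are made up for illustration. Both functions assume every entry of `idx` is a valid index. The only point is where the runtime-computed index ends up: on the left-hand side of the assignment (scatter) or on the right-hand side (gather).

```cpp
#include <cstddef>
#include <vector>

// Scatter: the DESTINATION index is only known at runtime.
// Performs dst[idx[i]] = src[i]. Assumes every idx[i] < dstSize.
std::vector<float> scatter(const std::vector<float>& src,
                           const std::vector<std::size_t>& idx,
                           std::size_t dstSize)
{
    std::vector<float> dst(dstSize, 0.0f);
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[idx[i]] = src[i];   // runtime index on the write side
    }
    return dst;
}

// Gather: the destination index is just the loop counter;
// only the SOURCE index is computed at runtime.
// Performs dst[i] = src[idx[i]]. Assumes every idx[i] < src.size().
std::vector<float> gather(const std::vector<float>& src,
                          const std::vector<std::size_t>& idx)
{
    std::vector<float> dst(idx.size(), 0.0f);
    for (std::size_t i = 0; i < idx.size(); ++i) {
        dst[i] = src[idx[i]];   // runtime index on the read side
    }
    return dst;
}
```

The assignment in the original post has the shape of the first function, which is the form the compiler rejects.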

D3D11 introduces limited scatter write capabilities, but it still is not as flexible as a general-purpose CPU in this context.

##### Share on other sites
Thanks for the quick answer. I tried to solve it by moving the copy operation into an existing loop that I want to run 24 times.

Unfortunately, the operations in this loop also seem to be too much for Cg. I could run the loop 9 times, but that is not sufficient for me.

The error message is now:

(0) : error C6001: Temporary register limit of 32 exceeded; 79 registers needed to compile program

C:\Program Files\NVIDIA Corporation\Cg\bin\cgc.exe -profile vp40 $(ItemPath)

Any ideas how I can solve that?

##### Share on other sites
There is no straightforward way to solve your problem from the angle you're using now if the index array has to be dynamic (which, I guess, is the case; correct me if I'm wrong).

Rather, you should try to flip the problem upside down: since you cannot dynamically choose the destination addresses (given fixed source values) at shader runtime, resolve the source addresses dynamically for fixed destinations instead. How to do this, and whether it is possible to begin with, depends entirely on the complete algorithm.
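As a rough illustration of that flip applied to the snippet from the first post, here is a CPU-side sketch in plain C++ (again not Cg). The names `fillByGather`, `kSlots` and `kWrites` are hypothetical, and the four indices and values stand in for the components of the two float4s. Instead of writing to a runtime-chosen slot, the loop visits every slot in a fixed order and asks which source value, if any, belongs there.

```cpp
#include <array>

constexpr int kSlots  = 24;  // size of the destination array in the question
constexpr int kWrites = 4;   // number of (index, value) pairs, one per float4 component

// For each fixed destination slot, search the index list for a match and
// take the corresponding value. Every write goes to result[slot], where
// slot is the loop counter, so no write address depends on runtime data.
std::array<float, kSlots> fillByGather(const std::array<int, kWrites>& indices,
                                       const std::array<float, kWrites>& values)
{
    std::array<float, kSlots> result{};  // value-initialized, so all zeros
    for (int slot = 0; slot < kSlots; ++slot) {
        for (int k = 0; k < kWrites; ++k) {
            if (indices[k] == slot) {
                result[slot] = values[k];  // later matches overwrite earlier ones
            }
        }
    }
    return result;
}
```

The trade-off is 24 × 4 comparisons in place of 4 writes. The expectation is that a shader compiler can unroll loops with constant bounds like these so that each destination becomes a fixed register, but whether the result fits within the vp40 temporary register limit has not been checked here.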

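The concepts of lvalue and rvalue effectively represent the destination and source registers (more accurately, the expressions that resolve to them). So "lvalue in assignment too complex" is the compiler saying it cannot resolve the destination of the write.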