
Understanding the “sampler array index must be a literal expression” error in ComputeShaders



Let's say I have a compute shader that retrieves data from a Texture2DArray using the Id of the group like this:

Texture2DArray<float4> gTextureArray[2];
[numthreads(32, 1, 1)]
void Kernel(uint3 GroupID : SV_GroupID, uint3 GroupThreadID : SV_GroupThreadID)
{
    float3 tmp = gTextureArray[GroupID.x].Load(int4(GroupThreadID.x,GroupThreadID.x,0,0)).rgb;

    ....
}

And let's say I launch it like this:

deviceContext->Dispatch(2, 1, 1);

So, 2 groups of 32 threads each, reading pixel values from the Texture2DArrays. All the threads with GroupID.x = 0 will read values from gTextureArray[0], and all the threads with GroupID.x = 1 will read values from gTextureArray[1]. It turns out I can't compile that simple code; instead I get this compile error (cs_5_0):

error X3512: sampler array index must be a literal expression

 

Now, I know I can do this instead:

Texture2DArray<float4> gTextureArray[2];
[numthreads(32, 1, 1)]
void Kernel(uint3 GroupID : SV_GroupID, uint3 GroupThreadID : SV_GroupThreadID)
{
    float3 tmp = float3(0,0,0);
    if(GroupID.x == 0)
        tmp = gTextureArray[0].Load(int4(GroupThreadID.x,GroupThreadID.x,0,0)).rgb;
    else if(GroupID.x == 1)
        tmp = gTextureArray[1].Load(int4(GroupThreadID.x,GroupThreadID.x,0,0)).rgb;

    ....
}

Or use a switch if I have lots of groups, so it doesn't look quite as awful (it still does).

Notice that there is no warp divergence, since all threads in a group take the same branch. My question is: am I missing something here? Why doesn't HLSL support this kind of indexing, given that I can't see any divergence or other problems, at least in this case?

Edited by joystick-hero


First of all, thanks for your answer and help. I hadn't thought about the syntactic sugar possibility.

"However, this is why Texture2DArray was invented, instead of just Texture2D. At the moment you've got an array of two Texture2DArrays.

"Did you actually want just one Texture2DArray with two Texture2Ds inside it?

"If so, the syntax is: tmp = gTextureArray.Load(int4(x,y, mip, arrayIndex)).rgb;"

The problem is that I kind of need an array of Texture2DArrays, because I need to bind 10240 Texture2Ds (64x64 each) to this compute shader, which greatly exceeds the 2048-slice limit of a single Texture2DArray. The shader runs faster the more data you give it to process, unless you run out of memory, but that's not the case so far xD. This was the only idea I had to get around the problem; well, this and maybe TextureCubeArrays? I don't know how those work, though, and I already had all my Texture2DArray C++ code in place.

I think the right syntax is tmp = gTextureArray.Load(int4(x,y, arrayIndex, mip)).rgb; ?

Unless the documentation is really confusing :c
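To make the 2048-slice arithmetic concrete, here is a minimal host-side sketch of how a flat texture index could be split into an (array, slice) pair. The names TextureLocation, ArraysNeeded and Locate are hypothetical helpers made up for illustration, not part of Direct3D; the only fact taken from D3D11 is the 2048-slice cap per Texture2DArray.

```cpp
#include <cstdint>

// D3D11 caps a Texture2DArray at 2048 slices
// (D3D11_REQ_TEXTURE2D_ARRAY_AXIS_DIMENSION).
constexpr std::uint32_t kMaxSlicesPerArray = 2048;

// Hypothetical helper type: which Texture2DArray a texture lives in,
// and which slice inside that array.
struct TextureLocation {
    std::uint32_t arrayIndex;  // index into the array of Texture2DArrays
    std::uint32_t slice;       // slice within that Texture2DArray
};

// Number of Texture2DArrays needed to hold textureCount textures
// (ceiling division by the slice limit).
constexpr std::uint32_t ArraysNeeded(std::uint32_t textureCount) {
    return (textureCount + kMaxSlicesPerArray - 1) / kMaxSlicesPerArray;
}

// Split a flat texture index into (arrayIndex, slice).
constexpr TextureLocation Locate(std::uint32_t flatIndex) {
    return { flatIndex / kMaxSlicesPerArray, flatIndex % kMaxSlicesPerArray };
}
```

With these assumptions, 10240 textures need 5 Texture2DArrays, and texture 2048 is slice 0 of the second array. On the shader side the slice would go in the third component of the int4 passed to Load, with the mip level in the fourth.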

 

 

"Waterfalled code is really ugly stuff, so it looks like the compiler authors have decided that instead of doing it automatically behind the scenes (which might make you think that indexing arrays of textures is supported in hardware), they'd instead force you to write it yourself so you're aware of what's going on."

Aside from the fact that it doesn't look very pretty, are there any performance reasons not to use waterfalled code? If the threads in a warp don't diverge, I can't think of any other bad consequences.

Again thanks for your help!

Edited by joystick-hero


The textures' contents are the result of previous scene renders from several viewpoints. I don't know if there's an easy way to tell DirectX to render to a sub-region of a Texture2D rather than to the whole texture :c


You can do it quite easily by setting an appropriate viewport. The D3D11_VIEWPORT structure specifies the width and height of the viewport, as well as its X and Y offset. So for instance, let's say you had a 256x256 render target and you wanted to render to the top-left quadrant. You would set TopLeftX = 0 and TopLeftY = 0, and then set Width and Height to 128. Then, if you wanted to render to the top-right quadrant, you would keep the same Width and Height but set TopLeftX = 128. And so on, until you had rendered all 4 quadrants.
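As a rough sketch of that idea generalized to a grid of tiles: the Viewport struct below is a stand-in with the same fields as D3D11_VIEWPORT so the snippet builds without the Windows SDK, and TileViewport is a hypothetical helper, not a Direct3D function. It assumes square tiles packed left to right, top to bottom.

```cpp
#include <cstdint>

// Stand-in with the same fields as D3D11_VIEWPORT, so the sketch builds
// without the Windows SDK. Real code would use the struct from <d3d11.h>.
struct Viewport {
    float TopLeftX;
    float TopLeftY;
    float Width;
    float Height;
    float MinDepth;
    float MaxDepth;
};

// Hypothetical helper: viewport covering tile `tileIndex` of an atlas that
// packs `tilesPerRow` square tiles of `tileSize` pixels into each row,
// filled left to right, then top to bottom.
inline Viewport TileViewport(std::uint32_t tileIndex,
                             std::uint32_t tilesPerRow,
                             std::uint32_t tileSize) {
    const std::uint32_t column = tileIndex % tilesPerRow;
    const std::uint32_t row = tileIndex / tilesPerRow;

    Viewport vp;
    vp.TopLeftX = static_cast<float>(column * tileSize);
    vp.TopLeftY = static_cast<float>(row * tileSize);
    vp.Width = static_cast<float>(tileSize);
    vp.Height = static_cast<float>(tileSize);
    vp.MinDepth = 0.0f;  // full depth range
    vp.MaxDepth = 1.0f;
    return vp;
}
```

For the 256x256 example, TileViewport(1, 2, 128) gives TopLeftX = 128 and TopLeftY = 0, i.e. the top-right quadrant. For 64x64 tiles, a 16384-wide texture (the D3D11 maximum 2D dimension) fits 256 tiles per row, so 10240 tiles would take 40 rows, a 16384x2560 atlas. In real code the result would be passed to RSSetViewports before each render.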

Edited by MJP


Thanks MJP. I guess that simple-why-I-didn't-think-about-that solution solves all my problems. Gonna try that. Now I'm glad I don't have a picture of me in my profile pic. So there are no performance problems with my original code I suppose? Only it's painfully ugly and not scalable at all xD
