Understanding the “sampler array index must be a literal expression” error in ComputeShaders


6 replies to this topic

#1 joystick-hero   Members   -  Reputation: 148


Posted 24 April 2014 - 02:47 PM

Let's say I have a compute shader that retrieves data from a Texture2DArray using the Id of the group like this:

Texture2DArray<float4> gTextureArray[2];
[numthreads(32, 1, 1)]
void Kernel(uint3 GroupID : SV_GroupID, uint3 GroupThreadID : SV_GroupThreadID)
{
    float3 tmp = gTextureArray[GroupID.x].Load(int4(GroupThreadID.x,GroupThreadID.x,0,0)).rgb;

    ....
}

And let's say I launch it like this:

 

deviceContext->Dispatch(2, 1, 1);

 

So, 2 groups of 32 threads each, reading pixel values from a Texture2DArray. All the threads with GroupID.x = 0 will read values from gTextureArray[0] and all the threads with GroupID.x = 1 will read values from gTextureArray[1]. It turns out I can't compile that simple code; instead I get this compile error (cs_5_0):

 

error X3512: sampler array index must be a literal expression

 

Now, I know I can do this instead:

Texture2DArray<float4> gTextureArray[2];
[numthreads(32, 1, 1)]
void Kernel(uint3 GroupID : SV_GroupID, uint3 GroupThreadID : SV_GroupThreadID)
{
    float3 tmp = float3(0,0,0);
    if(GroupID.x == 0)
        tmp = gTextureArray[0].Load(int4(GroupThreadID.x,GroupThreadID.x,0,0)).rgb;
    else if(GroupID.x == 1)
        tmp = gTextureArray[1].Load(int4(GroupThreadID.x,GroupThreadID.x,0,0)).rgb;

    ....
}

Or I could use a switch if I have lots of groups, so it doesn't look quite so awful (it still does).

Notice how there is no warp divergence, since all the threads in each group take the same branch. My question is, am I missing something here? Why does HLSL not support that kind of indexing, given that I can't see any divergence or other problems, at least in this case?


Edited by joystick-hero, 24 April 2014 - 02:52 PM.



#2 Hodgman   Moderators   -  Reputation: 40050


Posted 24 April 2014 - 11:28 PM

AFAIK, arrays of textures (different to TextureArray's!) are just "syntactic sugar", used at compile time; there isn't really a runtime equivalent for them under the hood. 

 

So at compile time, imagine the compiler is basically transforming

Texture2DArray<float4> gTextureArray[2];

into

Texture2DArray<float4> gTextureArray0;

Texture2DArray<float4> gTextureArray1;

 

If you want to index into the array at compile time, it's fine. The compiler can replace gTextureArray[0] with gTextureArray0 as it parses the code.

But if you want to index into the array at runtime (e.g. with gTextureArray[x], where x is a variable), then the only option the compiler has is to emit "waterfalled" code

(e.g. if( x==0 ) gTextureArray0 else if ( x==1 ) gTextureArray1 else error).

 

Waterfalled code is really ugly stuff, so it looks like the compiler authors have decided that instead of doing it automatically behind the scenes (which might make you think that indexing arrays of textures is supported in hardware), they'd instead force you to write it yourself so you're aware of what's going on.
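
To make the "waterfalled" idea concrete, here is a rough C++ analogue rather than real shader code. FakeTexture, the stored values and LoadWaterfalled are made-up names for illustration only; the point is just that with two separately named resources and no real array behind them, a runtime index can only be honoured by testing every possible value in turn.

```cpp
#include <cassert>

// Stand-in for a bound texture resource. Purely illustrative, not a D3D type.
struct FakeTexture {
    float value;
    float Load() const { return value; }
};

// Two separately named resources, the way the compiler is imagined to see
// "Texture2DArray<float4> gTextureArray[2];" after expansion.
static const FakeTexture gTextureArray0{1.0f};
static const FakeTexture gTextureArray1{2.0f};

// "Waterfalled" selection: one branch per possible index, because there is
// no actual array to index into at runtime.
float LoadWaterfalled(unsigned index) {
    if (index == 0) return gTextureArray0.Load();
    if (index == 1) return gTextureArray1.Load();
    return 0.0f;  // index out of range: fall back to a default value
}
```

With N resources this grows to N branches, which is roughly the hand-written if/else chain from the first post.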

 

However, this is why Texture2DArray was invented, instead of just Texture2D. At the moment you've got an array of two Texture2DArray's.

Did you actually want just one Texture2DArray with two Texture2D's inside it?

If so, the syntax is: tmp = gTextureArray.Load(int4(x,y, mip, arrayIndex)).rgb;


Edited by Hodgman, 24 April 2014 - 11:34 PM.


#3 joystick-hero   Members   -  Reputation: 148


Posted 25 April 2014 - 12:25 PM

First of all thanks for your answer and help. I didn't think about the syntactic sugar possibility.

 

"However, this is why Texture2DArray was invented, instead of just Texture2D. At the moment you've got an array of two Texture2DArray's.

Did you actually want just one Texture2DArray with two Texture2D's inside it?

If so, the syntax is: tmp = gTextureArray.Load(int4(x,y, mip, arrayIndex)).rgb;"

 

The problem is I kinda needed an array of Texture2DArrays because I need to bind 10240 Texture2D's (64x64 each) to this ComputeShader, and that greatly exceeds the 2048 length limit for Texture2DArray's. The shader runs faster the more data you give it to process, unless you run out of memory, but that's not the case so far xD. This was the only idea I had to sort out the problem, well, this and TextureCubeArray's maybe? I don't know how those work though, and I already had all my Texture2DArray C++ code in place.

Also, I think the right syntax is tmp = gTextureArray.Load(int4(x, y, arrayIndex, mip)).rgb; ?

Unless the documentation is really confusing :c
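
For reference, a small C++ sketch of the bookkeeping described here, under the assumption that the 10240 images are spread over several Texture2DArrays of at most 2048 slices each. SliceAddress, AddressOf, LoadLocation and MakeLoadLocation are hypothetical helper names, not part of any API. The LoadLocation struct follows the component order I understand the documentation to give for Texture2DArray::Load: x, y, array slice, then mip level.

```cpp
#include <cassert>
#include <cstdint>

// Slice limit per Texture2DArray mentioned in the post above.
constexpr std::uint32_t kMaxSlicesPerArray = 2048;

// Hypothetical helper type: which array an image lives in, and which slice.
struct SliceAddress {
    std::uint32_t arrayIndex;  // which Texture2DArray (0..4 for 10240 images)
    std::uint32_t slice;       // slice inside that array (0..2047)
};

// Split a global image index into (array, slice).
SliceAddress AddressOf(std::uint32_t imageIndex) {
    return { imageIndex / kMaxSlicesPerArray, imageIndex % kMaxSlicesPerArray };
}

// Mirrors the int4 passed to Load on a Texture2DArray:
// x, y, array slice, mip level (in that order).
struct LoadLocation { std::int32_t x, y, slice, mip; };

LoadLocation MakeLoadLocation(std::int32_t x, std::int32_t y,
                              const SliceAddress& a, std::int32_t mip = 0) {
    return { x, y, static_cast<std::int32_t>(a.slice), mip };
}
```

With these numbers, 10240 images need five arrays, so the shader still has to pick one of five resources somehow; only the slice part can be indexed freely at runtime.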

 

 

"Waterfalled code is really ugly stuff, so it looks like the compiler authors have decided that instead of doing it automatically behind the scenes (which might make you think that indexing arrays of textures is supported in hardware), they'd instead force you to write it yourself so you're aware of what's going on."

 

Aside from the fact that it doesn't look very pretty, are there any performance reasons not to use waterfalled code? If the threads in the warps don't diverge, I can't think of any other bad consequences.

Again thanks for your help!


Edited by joystick-hero, 25 April 2014 - 12:29 PM.


#4 phantom   Moderators   -  Reputation: 8697


Posted 25 April 2014 - 03:50 PM

Is there any reason you can't pack your source data into a larger texture and just calculate the indices in terms of 64*64 'pages' instead?
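
A minimal C++ sketch of that page arithmetic, assuming one square atlas of 8192x8192 texels (128x128 = 16384 pages, enough for the 10240 images mentioned earlier). The atlas size and the names Texel and AtlasTexel are assumptions for illustration; the same arithmetic would be done inside the shader before the Load.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kPageSize    = 64;    // each source image is 64x64
constexpr std::uint32_t kAtlasSize   = 8192;  // assumed atlas width and height
constexpr std::uint32_t kPagesPerRow = kAtlasSize / kPageSize;  // 128

struct Texel { std::uint32_t x, y; };

// Map (page index, texel inside that page) to a texel in the big atlas.
Texel AtlasTexel(std::uint32_t page, std::uint32_t localX, std::uint32_t localY) {
    assert(page < kPagesPerRow * kPagesPerRow);
    assert(localX < kPageSize && localY < kPageSize);
    const std::uint32_t pageX = page % kPagesPerRow;  // column of the page
    const std::uint32_t pageY = page / kPagesPerRow;  // row of the page
    return { pageX * kPageSize + localX, pageY * kPageSize + localY };
}
```

Since the page index is plain arithmetic on a coordinate, it can be any runtime value, such as a group ID, without hitting the literal-index restriction.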

#5 joystick-hero   Members   -  Reputation: 148


Posted 25 April 2014 - 04:40 PM

The textures' contents are the result of previous scene renders from several viewpoints. I don't know if there's an easy way to tell DirectX to render to just one area of a Texture2D as opposed to the entire texture :c



#6 MJP   Moderators   -  Reputation: 14018


Posted 25 April 2014 - 06:11 PM

You can do it quite easily by setting an appropriate viewport. The D3D11_VIEWPORT structure specifies the width and height of the viewport, as well as the X and Y offset. So for instance, let's say you had a 256x256 render target and you wanted to render to the top left corner. You would set TopLeftX = 0 and TopLeftY = 0, and then set Width and Height to 128. Then if you wanted to render to the top right corner you would keep the same Width and Height, but set TopLeftX = 128. And so on, until you rendered all 4 corners.
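
As a sketch of how that generalizes to a grid of tiles: the Viewport struct below only mirrors the field names of D3D11_VIEWPORT so the snippet builds without d3d11.h, and TileViewport is a hypothetical helper, not a D3D function.

```cpp
#include <cassert>
#include <cstdint>

// Same fields as D3D11_VIEWPORT, redeclared here so the sketch is standalone.
struct Viewport {
    float TopLeftX, TopLeftY, Width, Height, MinDepth, MaxDepth;
};

// Viewport covering tile number `tile` in a render target laid out as rows
// of `tilesPerRow` square tiles, each `tileSize` pixels wide and high.
Viewport TileViewport(std::uint32_t tile, std::uint32_t tilesPerRow,
                      std::uint32_t tileSize) {
    assert(tilesPerRow > 0);
    const std::uint32_t col = tile % tilesPerRow;
    const std::uint32_t row = tile / tilesPerRow;
    return { static_cast<float>(col * tileSize),   // TopLeftX
             static_cast<float>(row * tileSize),   // TopLeftY
             static_cast<float>(tileSize),         // Width
             static_cast<float>(tileSize),         // Height
             0.0f, 1.0f };                         // full depth range
}
```

For the 256x256 example, TileViewport(1, 2, 128) gives the top right corner (TopLeftX = 128, TopLeftY = 0). In real code the values would go into a D3D11_VIEWPORT and be set with RSSetViewports before each scene render.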


Edited by MJP, 26 April 2014 - 12:32 PM.


#7 joystick-hero   Members   -  Reputation: 148


Posted 25 April 2014 - 08:29 PM

Thanks MJP. I guess that simple-why-I-didn't-think-about-that solution solves all my problems. Gonna try that. Now I'm glad I don't have a picture of me in my profile pic. So there are no performance problems with my original code I suppose? Only it's painfully ugly and not scalable at all xD





