Compute Shader problem

Let's say I have a Compute Shader that decides whether or not to write to an array based on some stochastic calculation.

When the shader decides that it will write to the array, I want it to do so at the index equal to the number of times the array has already been written to during the current Dispatch call.

Is the above possible? If so, how?

Is there memory that can be shared across all dispatched threads and not just across the threads in each group?

(EDIT: For example, memory declared as deviceshared as opposed to groupshared?)

(If so, then the idea in the OP would be possible by having the shader add one to an integer stored in the shared memory every time it writes to the array.)

If you don't need precise ordering on the output buffer, you could use the append buffer type. If you do need precise order, use InterlockedAdd to increment an index variable across all threads and use that variable to refer to the buffer offsets manually. If you choose the latter approach, take your write index from the original value that InterlockedAdd hands back in its optional third argument rather than reading the counter separately, as other threads could also increment the index between your read and your write.
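As a rough sketch of the InterlockedAdd approach, assuming a one-element counter buffer that the application clears to zero before each Dispatch: the names Output, WriteCount, CSMain and ShouldWrite are made up for illustration, and ShouldWrite is only a placeholder for the stochastic test.

```hlsl
// Hypothetical resource names; bind them however your application prefers.
RWStructuredBuffer<float4> Output     : register(u0); // destination array
RWStructuredBuffer<uint>   WriteCount : register(u1); // one element, cleared to 0 before Dispatch

// Placeholder for the stochastic decision; replace with your own test.
bool ShouldWrite(uint3 id)
{
    return (id.x & 1) == 0;
}

[numthreads(64, 1, 1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    if (!ShouldWrite(id))
        return;

    // Atomically add 1 and receive the value the counter held before the add.
    uint slot;
    InterlockedAdd(WriteCount[0], 1, slot);

    // slot is unique to this thread, so this write cannot collide with another thread's.
    Output[slot] = float4(1, 1, 1, 1);
}
```

Each writing thread gets a distinct slot, but which thread gets which slot is not deterministic from one run to the next. Output has to be large enough for the worst case, since nothing here checks slot against the buffer size.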

The append buffer approach generally performs better because the hardware can optimize the write pattern as it sees fit.

The official documentation on append buffer is relatively thin, but the D3D team has written about its usage in various places on the 'net, including these forums. It's not incredibly hard to use, once you manage to actually create one.
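For comparison, the append buffer version might look roughly like this. Again a sketch rather than a definitive implementation: Output, CSMain and ShouldWrite are hypothetical names, and ShouldWrite stands in for the stochastic test.

```hlsl
// Hypothetical name; the hidden element counter is managed by the runtime.
AppendStructuredBuffer<float4> Output : register(u0);

// Placeholder for the stochastic decision; replace with your own test.
bool ShouldWrite(uint3 id)
{
    return (id.x & 1) == 0;
}

[numthreads(64, 1, 1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    // Append picks the next free slot itself; the order of elements is unspecified.
    if (ShouldWrite(id))
        Output.Append(float4(1, 1, 1, 1));
}
```

On the application side, the buffer is a structured buffer whose unordered access view is created with D3D11_BUFFER_UAV_FLAG_APPEND. The hidden counter can be reset through the pUAVInitialCounts argument of CSSetUnorderedAccessViews, and the number of appended elements can be copied out afterwards with CopyStructureCount.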


Thank you very much for your reply Nik02.

I think I will need to use the InterlockedAdd method because I'm actually wanting to write to a Texture3D, though I can see that for a 1D array appending would be ideal.

In general, for what I'm trying to do, would the following shader code be OK?

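#define NumThreadsX 32
#define NumThreadsY 32
#define NumThreadsZ 1

// Texel type is float4 directly, standing in for the single-member MyStruct.
RWTexture3D<float4> WriteToUAV : register(u0);
// Per-(x, y) write counters; assumed to be cleared to 0 before each Dispatch.
RWTexture2D<int> WriteCountUAV : register(u1);

[numthreads(NumThreadsX, NumThreadsY, NumThreadsZ)]
void MyShader(uint3 ThreadIndex : SV_DispatchThreadID)
{
    // x and y are arbitrary integers; taken from the thread ID here as a placeholder.
    uint x = ThreadIndex.x;
    uint y = ThreadIndex.y;

    // Atomically bump the counter and receive its previous value as this thread's slot.
    // Reading the counter first and adding afterwards would let two threads share a slot.
    int slot;
    InterlockedAdd(WriteCountUAV[uint2(x, y)], 1, slot);

    WriteToUAV[uint3(x, y, slot)] = float4(1, 1, 1, 1);
}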
