ComputeShader ConsumeStructuredBuffer Problem

8 comments, last by tobr 11 years, 1 month ago

The task is to compute the lengths of 64 3D vectors in a compute shader, using a ConsumeStructuredBuffer as input and an AppendStructuredBuffer as output.
I always get 0 for the length. Even when I try to read the contents of the ConsumeStructuredBuffer in the shader, I always get 0. I checked the dimensions of the consume buffer and the append buffer in the shader, and they are correct.

Here is the code:


struct InputData
{
	float3 data;	
};

struct OutputData
{
	float data;
};

ConsumeStructuredBuffer<InputData> gInput : register(u0);
AppendStructuredBuffer<OutputData> gOutput : register(u1);

[numthreads(64, 1, 1)]
void CS(int3 dtid : SV_DispatchThreadID)
{
	OutputData output;
	InputData input = gInput.Consume();	
	float len = length(input.data);
	output.data = len;
	gOutput.Append(output);
}

technique11 VecAdd
{
	pass P0
	{
		SetVertexShader( NULL );
		SetPixelShader( NULL );
		SetComputeShader( CompileShader(cs_5_0, CS() ) );
	}
}

And here is the C++ code that creates and sets the buffers:


struct InputData
{
	//XMFLOAT3 data;
	float data[3];
};

struct OutputData
{
	float data;
};

std::vector<InputData> vecbuffer(64);

	// Initialize the vecbuffer with values [1,10]

for(UINT i = 0; i < vecbuffer.size(); ++i)
{
  // Index the element explicitly; the vector itself has no data[] member
  vecbuffer[i].data[0] = MathHelper::RandF(1,10);
  vecbuffer[i].data[1] = MathHelper::RandF(1,10);
  vecbuffer[i].data[2] = MathHelper::RandF(1,10);
}

mNumElements = vecbuffer.size();

	// Create a buffer to be bound as a shader input
	D3D11_BUFFER_DESC inputDesc;
	inputDesc.Usage = D3D11_USAGE_DEFAULT;
	inputDesc.ByteWidth = sizeof(InputData) * mNumElements;
	inputDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS;
	inputDesc.CPUAccessFlags = 0;
	inputDesc.StructureByteStride = sizeof(InputData);
	inputDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;

	D3D11_SUBRESOURCE_DATA vinitInputData;
	vinitInputData.pSysMem = &vecbuffer[0];

	ID3D11Buffer* inputBuffer = 0;
	HR(md3dDevice->CreateBuffer(&inputDesc, &vinitInputData, &inputBuffer));

	// Create a buffer the compute shader can write to
	D3D11_BUFFER_DESC outputDesc;
	outputDesc.Usage = D3D11_USAGE_DEFAULT;
	outputDesc.ByteWidth = sizeof(OutputData) * mNumElements;
	outputDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_UNORDERED_ACCESS;
	outputDesc.CPUAccessFlags = 0;
	outputDesc.StructureByteStride = sizeof(OutputData);
	outputDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;

	HR(md3dDevice->CreateBuffer(&outputDesc, 0, &mOutputBuffer));

	D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc;
	uavDesc.Format = DXGI_FORMAT_UNKNOWN;
	uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
	uavDesc.Buffer.FirstElement = 0;
	uavDesc.Buffer.Flags =  D3D11_BUFFER_UAV_FLAG_APPEND;
	uavDesc.Buffer.NumElements = mNumElements;

	HR(md3dDevice->CreateUnorderedAccessView(inputBuffer, &uavDesc, &mInputUAV));		

	HR(md3dDevice->CreateUnorderedAccessView(mOutputBuffer, &uavDesc, &mOutputUAV));

Does anyone have a solution or see an error? I already tried setting the BindFlags to only D3D11_BIND_UNORDERED_ACCESS, but the result is the same.



When you bind the buffers to the compute shader, do you specify the initial counts properly to reflect the number of items in them? This is important, or else your compute shader will think the buffer is empty, and when it tries to consume the values it will just get all zeros.

The way that Append/Consume buffers work is that there's a "hidden" counter on the buffer that holds the number of items in it. This number of items is <= the number of elements in the buffer when you create it. There are three ways to change this counter: by calling Append (increments the counter), by calling Consume (decrements the counter), or by manually specifying the count when calling CSSetUnorderedAccessViews. In your particular case, if you don't specify the count when binding the UAV it will be 0 and Consume won't give you back an element from the buffer.
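
To make the hidden counter concrete, here is a minimal CPU-side sketch in plain C++. It is a toy model, not the D3D11 API: the names CounterBufferModel, bindWithInitialCount, consume and append are made up for illustration, and returning 0 from an empty buffer is an assumption chosen to mirror the symptom in this thread, not documented hardware behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of an Append/Consume buffer (hypothetical, CPU only).
struct CounterBufferModel {
    std::vector<float> storage;  // fixed storage, like NumElements in the UAV desc
    std::size_t counter = 0;     // hidden counter: how many elements are "in use"

    // Stands in for passing an initial count when the UAV is bound.
    void bindWithInitialCount(std::size_t count) {
        assert(count <= storage.size());
        counter = count;
    }

    // Stands in for Consume(): decrement the counter, return that element.
    // Assumption: an empty buffer yields 0, as observed in this thread.
    float consume() {
        if (counter == 0) return 0.0f;
        --counter;
        return storage[counter];
    }

    // Stands in for Append(): write at the counter, then increment it.
    void append(float value) {
        assert(counter < storage.size());
        storage[counter] = value;
        ++counter;
    }
};
```

In this model, a buffer whose storage holds data but whose counter was never set behaves as if it were empty: consume() keeps returning 0 until bindWithInitialCount() is called with the number of valid elements.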

In general, you should be careful to consider whether you actually need to use an Append or Consume buffer. In any case where you know the number of elements ahead of time you typically don't need to use them. For instance if you run 100 threads and only some of them may output a value, then you want to use an Append buffer since it's unknown how many you will end up with. However if you then run another compute shader that reads in the N elements that were output and does some processing on them, then you don't need to use a Consume buffer since you can just copy the hidden counter out of the buffer and into a constant buffer. Then you can just access the elements without calling Consume, which is actually faster since the hardware doesn't need to do a global atomic decrement.
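
As a rough illustration of the "known count" case, the sketch below is plain C++ on the CPU, not shader code. The loop index stands in for the thread ID, and Float3 and lengthsByIndex are hypothetical names used only here. Each "thread" reads its own element by index, so no counter is involved.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the float3 input struct.
struct Float3 { float x, y, z; };

// One "thread" per element: iteration i reads input[i] and writes output[i],
// the way a shader could index a plain structured buffer by thread ID.
std::vector<float> lengthsByIndex(const std::vector<Float3>& input) {
    std::vector<float> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        const Float3& v = input[i];
        output[i] = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    }
    return output;
}
```

For the 64-vector task in this thread the element count is fixed up front, so this indexed pattern would apply directly.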

The count is specified in the UAV description:

(uavDesc.Buffer.NumElements = mNumElements;)

I checked both buffers in the shader with the GetDimensions() function (http://msdn.microsoft.com/de-de/library/windows/desktop/ff471461%28v=vs.85%29.aspx) and both have the right dimensions (numStructs and stride). But they seem to be empty...

Maybe there is an error in how I bind them. Do I have to bind both the consume buffer and the append buffer with an UnorderedAccessView?


That "NumElements" is the maximum number of elements in the buffer. Append/Consume buffers maintain a separate counter indicating how many elements out of the total number are in use at any given time. It's conceptually similar to the difference between capacity() and size() in std::vector: the former tells you how much memory was allocated for the internal array of elements, while the latter tells you how many elements you've actually added to the array. Like I said before, to change the counter you either need to Append to the buffer, or you need to specify the count when binding the UAV.
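
The std::vector analogy can be checked directly. This is a small sketch with made-up names (CapacityAndSize, reserveThenFill); it only demonstrates the capacity/size split and says nothing about how D3D11 stores the counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CapacityAndSize {
    std::size_t capacity;  // room that was allocated (compare: NumElements)
    std::size_t size;      // elements actually added (compare: hidden counter)
};

// Reserve room for maxElements, then add `used` values and report both numbers.
CapacityAndSize reserveThenFill(std::size_t maxElements, std::size_t used) {
    assert(used <= maxElements);
    std::vector<float> v;
    v.reserve(maxElements);            // storage exists, but nothing is "in" it yet
    for (std::size_t i = 0; i < used; ++i)
        v.push_back(1.0f);             // like Append: grows the in-use count
    return {v.capacity(), v.size()};
}
```

Calling reserveThenFill(64, 0) reports a size of 0 with a capacity of at least 64, which is the situation a consume buffer is in when it has room for 64 elements but no initial count.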

Maybe there is an error in how I bind them. Do I have to bind both the consume buffer and the append buffer with an UnorderedAccessView?

Yes, you do need to use UAVs for both of them. In your code above you are creating both of them, so I would assume you are using both... It might help if you just post the code where you set the UAVs in the compute shader, and we can point out what is going on more directly.

MJP's description is very precise and clear - please read through it again if it still isn't completely clear to you. In order for you to get something back when your shader code calls 'Consume', the internal counter has to think there are some elements already in the buffer. If you don't use the 'initial counts' parameter when you bind the UAV to the shader stage, then you have to actually append values into that buffer with shader code.

If you want an example of this in action, you can look at the ParticleStorm demo from Hieroglyph 3. In that case, I append values directly in a shader to build up the number of particles in my append buffer. Those are then run through another shader pass that consumes each particle, updates it, then appends it again to another buffer. Follow along in your own code and determine what your structure count is at each point in your program. You can use the ID3D11DeviceContext::CopyStructureCount(...) method to find out what values are in your buffer's structure count.

Ok thanks, I understand. The problem is setting the initial counter for the buffer. I use the Effects11 framework to set the UnorderedAccessViews (http://msdn.microsoft.com/en-us/library/windows/desktop/ff476783(v=vs.85).aspx). The methods seem to have no parameter for setting the counter. Does that mean I can't use the Effects11 framework to bind an UnorderedAccessView to a consume buffer? Do I have to use the CSSetUnorderedAccessViews method?

I have now tried to set the UnorderedAccessViews like so:


UINT counters[2] = {mNumElements,0};
ID3D11UnorderedAccessView* uavs[2] = { mInputUAV, mOutputUAV };
md3dImmediateContext->CSSetUnorderedAccessViews(0, 2, uavs, counters);

I still get 0 back. I can't find any error.

How are you checking the result of your calculations? I would suggest checking the value you get from the copy structure count method for each of your buffers before and after this pass is executed. If you are still unable to get it to return anything other than 0 for the result, then I would write a small compute shader that fills your append buffer with data (i.e. just add vectors that reflect the threadID of the compute shader thread). That would help you rule out that your data is not getting initialized properly, and it will also help you in debugging how the counter works when you are querying its value.

Is there any debug output in the console window when you execute your code? Have you enabled the debug layer when you are creating your device? Both of these will help you in debugging any issues if you haven't already checked them.
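
One way to make the result check less ambiguous is to replace the random input with predictable data. The following is a CPU-side sketch under that assumption; makeIndexedInput and matchesIndexedLengths are hypothetical helper names, and reading the output buffer back from the GPU is not shown. Element i is (i, 0, 0), so its length is exactly i. The values are sorted before comparing because appended results are not guaranteed to arrive in thread order.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the float3 input struct.
struct Float3 { float x, y, z; };

// Build test input where element i is (i, 0, 0), so length(element i) == i.
std::vector<Float3> makeIndexedInput(std::size_t count) {
    std::vector<Float3> input(count);
    for (std::size_t i = 0; i < count; ++i)
        input[i] = Float3{static_cast<float>(i), 0.0f, 0.0f};
    return input;
}

// Check values read back from the output buffer against 0, 1, 2, ...
// The copy is sorted first, so the order of the appends does not matter.
bool matchesIndexedLengths(std::vector<float> readback, float tolerance = 1e-4f) {
    std::sort(readback.begin(), readback.end());
    for (std::size_t i = 0; i < readback.size(); ++i) {
        if (std::fabs(readback[i] - static_cast<float>(i)) > tolerance)
            return false;
    }
    return true;
}
```

With this input, an output of all zeros fails the check immediately, while a correct output in any order passes.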

Ok it works now. I rewrote the whole test program. Thanks guys!

