[HLSL] Copy content of Consume Buffer to Structured Buffer

7 comments, last by neroziros 9 years, 9 months ago

Hi all!

I was wondering if it is possible to copy the content from a consume buffer to a structured buffer.

A little more context: I'm working with a dynamic particle system and, to accomplish inter-particle interaction, I need to compare each particle of the system against each other particle of the same system. The problem is that since I cannot index a consume buffer (to just peek at the values instead of consuming them), I don't know how to do this without copying its content to a structured buffer each frame.

Cheers and thanks in advance for your time!

PS: The reason I don't store all particles in a structured buffer in the first place is that the system is of variable size, creating and destroying particles each frame, which was easy to implement with append and consume buffers.

A consume buffer is just a special case of a structured buffer. If you want to read values from the buffer instead of consuming, just create a shader resource view for your buffer and bind it to your context. You can then just access that shader resource view as a StructuredBuffer in your shader.
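
To picture why this works, here is a minimal single-threaded C++ analogy (it is not D3D11 or HLSL code). It assumes a consume buffer behaves like an ordinary element array plus a separate counter; CounterBufferModel, Particle and the method names are made up for illustration, and there is no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical particle layout, for illustration only.
struct Particle { float x, y, z; };

// CPU-side analogy: element storage plus a separate counter.
struct CounterBufferModel {
    std::vector<Particle> storage;  // the structured buffer's elements
    std::uint32_t counter = 0;      // the counter that append/consume adjust

    explicit CounterBufferModel(std::size_t capacity) : storage(capacity) {}

    // Like Append(): write at the counter position, then advance the counter.
    void append(const Particle& p) { storage[counter++] = p; }

    // Like Consume(): step the counter back and return that element.
    Particle consume() { return storage[--counter]; }

    // Like indexing a StructuredBuffer view: read without touching the counter.
    const Particle& peek(std::size_t i) const { return storage[i]; }
};
```

Reading through peek leaves the counter alone, which is the behaviour you get from the StructuredBuffer view. On the GPU the order of appends and consumes across threads is not guaranteed, so treat this purely as a mental model.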

Thanks for the advice, it worked! Now I have another problem, though: I can't bind the same buffer as a consume buffer and as a StructuredBuffer at the same time. When I try, the values read through the consume buffer view are fine, but the ones read through the StructuredBuffer view are null. I suppose this is for memory safety reasons.

If I'm correct, I could solve this by copying the content of the consume buffer to an auxiliary structured buffer with CopyResource. Will try and report results.

If I'm correct, I could solve this by copying the content of the consume buffer to an auxiliary structured buffer with CopyResource. Will try and report results.

That's right - and if you have the debug device activated, then you should also get warnings / errors about binding the same resource for input/output (UAV) and input (SRV) at the same time.

Hmm, it works, but it is badly hurting the performance of the program (it went from 60 fps to 30 fps). I'm copying the content of the consume buffer with the CopyResource function. Is this happening because I am doing the copy every frame? Does anyone know a workaround?

Thanks again!

Don't make assumptions, profile. It could be that the copy isn't the problem at all (see below).

If you can, use a GPU profiler (e.g. NVIDIA's Nsight). MJP also has a blog post about using timer queries, though unfortunately I never got them to work for compute shaders on my NVIDIA card.

That being said, I'm a bit concerned about this:

I need to compare each particle of the system against each other particle of the same system


This sounds like an O(N^2) approach, which is already bad in itself. Do you use a spatial acceleration structure at all? Here's an example using a uniform grid (though with a classic pipeline, not compute shaders). And here's some collision detection with CUDA. Or google for GPGPU implementations of, or articles on, SPH (smoothed particle hydrodynamics).
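
For a rough idea of what a uniform grid buys you, here is a CPU-side C++ sketch (not a compute shader, and not taken from the linked examples). It assumes the cell size equals the interaction radius, so each particle only needs to look at its own cell and the 26 cells around it. Particle, cellKey, countPairsGrid and the other names are hypothetical, and the key packing assumes cell coordinates stay within roughly +/- 2^20.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Particle { float x, y, z; };

// Squared distance between two particles.
static float dist2(const Particle& a, const Particle& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Integer cell coordinate of a position along one axis.
static int cellCoord(float v, float cellSize) {
    return static_cast<int>(std::floor(v / cellSize));
}

// Packs three cell coordinates into one key, 21 bits per axis (assumed range).
static std::uint64_t cellKey(int cx, int cy, int cz) {
    const std::uint64_t mask = (std::uint64_t{1} << 21) - 1;
    const auto pack = [mask](int c) {
        return static_cast<std::uint64_t>(static_cast<std::int64_t>(c) + (1 << 20)) & mask;
    };
    return (pack(cx) << 42) | (pack(cy) << 21) | pack(cz);
}

// Reference O(N^2) version: counts unordered pairs closer than radius.
std::size_t countPairsBruteForce(const std::vector<Particle>& p, float radius) {
    const float r2 = radius * radius;
    std::size_t count = 0;
    for (std::size_t i = 0; i < p.size(); ++i)
        for (std::size_t j = i + 1; j < p.size(); ++j)
            if (dist2(p[i], p[j]) < r2) ++count;
    return count;
}

// Grid version: bin particles by cell, then test only the 27 nearby cells.
std::size_t countPairsGrid(const std::vector<Particle>& p, float radius) {
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> grid;
    for (std::size_t i = 0; i < p.size(); ++i)
        grid[cellKey(cellCoord(p[i].x, radius), cellCoord(p[i].y, radius),
                     cellCoord(p[i].z, radius))].push_back(i);

    const float r2 = radius * radius;
    std::size_t count = 0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        const int cx = cellCoord(p[i].x, radius);
        const int cy = cellCoord(p[i].y, radius);
        const int cz = cellCoord(p[i].z, radius);
        for (int dz = -1; dz <= 1; ++dz)
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    const auto it = grid.find(cellKey(cx + dx, cy + dy, cz + dz));
                    if (it == grid.end()) continue;
                    for (std::size_t j : it->second) {
                        if (j <= i) continue;  // count each pair only once
                        if (dist2(p[i], p[j]) < r2) ++count;
                    }
                }
    }
    return count;
}
```

Both functions should return the same count; the grid version simply skips pairs that are too far apart to interact. On the GPU the same idea is usually built with a sort or per-cell counters instead of a hash map.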

I ran the profiler and the problem is indeed CopyResource, and now I know why. According to the docs: "Doing this frequently at runtime will severely degrade performance... The application needs to wait long enough for the command buffer to be emptied and thus have all of those commands finish executing before it tries to map the corresponding staging resource."

I managed to minimize this problem by doing the copy every 5 frames instead of every frame. It is not perfect, but I think I should be able to get good approximate results this way.

Thanks for those articles, I will check them out! Indeed, the way I am doing it right now is an O(N^2) approach, and it is killing the performance at around 100k particles.
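
To put a number on that, here is a small C++ sketch of the comparison count for an all-pairs test. The function names are hypothetical and it only does the arithmetic, assuming every particle is tested against every other one.

```cpp
#include <cstdint>

// Tests performed when each of n particles is compared against every other particle.
constexpr std::uint64_t orderedPairTests(std::uint64_t n) { return n * (n - 1); }

// Unique pairs, if each pair were only tested once.
constexpr std::uint64_t uniquePairs(std::uint64_t n) { return n * (n - 1) / 2; }
```

For n = 100,000 this gives 9,999,900,000 ordered tests per pass, or 4,999,950,000 unique pairs.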

Just a quick suggestion regarding your performance problem with CopyResource. The performance warning in the docs only relates to DstResources created with their usage set to D3D11_USAGE_STAGING.

Assuming you only need CPU-write access to your consume buffer, then so long as you create your consume buffer (the pSrcResource) with D3D11_USAGE_DYNAMIC and your second structured buffer (the pDstResource) with D3D11_USAGE_DEFAULT, the CopyResource shouldn't cause a pipeline stall. Apologies if I'm mistaken, but it might be worth checking those usage flags on the resource creation.

Thanks for the advice, but it didn't work. It seems I can't create the consume buffer with D3D11_USAGE_DYNAMIC, because I need to consume particles from that buffer and D3D11_USAGE_DYNAMIC doesn't allow write operations from the GPU side (only from the CPU).

Cheers!
