
[HLSL] Copy content of Consume Buffer to Structured Buffer



Hi all!

 

I was wondering if it is possible to copy the content from a consume buffer to a structured buffer.

 

A little more context: I'm working with a dynamic particle system and, to accomplish inter-particle interaction, I need to compare each particle of the system against each other particle of the same system. The problem is that since I cannot index a consume buffer (to just peek at the values instead of consuming them), I don't know how to do this without copying its content to a structured buffer each frame.

 

Cheers and thanks in advance for your time!

 

PS: The reason I don't store all particles in a structured buffer in the first place is that the system is of variable size, creating and destroying particles each frame, which was easy to do with append and consume buffers.

Edited by neroziros


Thanks for the advice, it worked! Though now I have a problem: I can't bind the same buffer as a consume buffer and as a StructuredBuffer at the same time. If I try this, the values read through the consume buffer view are fine, but the values read through the StructuredBuffer view come back empty (all zeros). I suppose this is for memory safety reasons.

 

If I'm correct I could solve this by copying the content of the consume buffer to an auxiliary structured buffer with CopyStructureCount. Will try and report results.



"If I'm correct I could solve this by copying the content of the consume buffer to an auxiliary structured buffer with CopyStructureCount. Will try and report results"

That's right - and if you have the debug device activated, then you should also get warnings / errors about binding the same resource for input/output (UAV) and input (SRV) at the same time.
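
One detail worth keeping in mind here: in D3D11, CopyStructureCount copies only the hidden element counter of an append/consume view (as a 32-bit value at a byte offset in the destination), while CopyResource copies the buffer contents and knows nothing about the counter. The following is a minimal CPU-side model of that split, not D3D11 code. The names `CounterBufferModel`, `copyContentsModel` and `copyStructureCountModel` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Particle { float x, y, z; };

// Hypothetical CPU stand-in for an append/consume buffer: raw storage plus a
// hidden element counter that lives beside the data, not inside it.
struct CounterBufferModel {
    std::vector<Particle> storage;  // fixed capacity, like the GPU buffer
    std::uint32_t hiddenCount = 0;  // number of live elements

    explicit CounterBufferModel(std::size_t capacity) : storage(capacity) {}

    void append(const Particle& p) {
        assert(hiddenCount < storage.size());
        storage[hiddenCount++] = p;
    }

    Particle consume() {
        assert(hiddenCount > 0);
        return storage[--hiddenCount];  // the slot keeps its stale value
    }
};

// Models a whole-resource copy: every element is copied, live or stale,
// and the counter is not transferred.
void copyContentsModel(std::vector<Particle>& dst, const CounterBufferModel& src) {
    dst = src.storage;
}

// Models a counter-only copy: writes the 32-bit count at a byte offset.
void copyStructureCountModel(std::vector<std::uint8_t>& dst,
                             std::size_t byteOffset,
                             const CounterBufferModel& src) {
    assert(byteOffset + sizeof(src.hiddenCount) <= dst.size());
    std::memcpy(dst.data() + byteOffset, &src.hiddenCount, sizeof(src.hiddenCount));
}
```

In this model, a reader of the copied contents needs the separately copied count to know how many of the elements are still live, which is why the two copies tend to be used together.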


Hmm, it works, but it is badly hurting the performance of the program (it went from 60 fps to 30 fps). I'm copying the content of the consume buffer with the CopyResource function. Is this happening because I am doing the copy every frame? Does anyone know a workaround?

 

Thanks again!

Edited by neroziros

Don't make assumptions, profile. It could be that the copying isn't the problem at all (see below).

If you can, use a GPU profiler (e.g. NVIDIA's Nsight). MJP also has a blog post about using timer queries, though unfortunately I never managed to get them working for compute shaders on my NVIDIA card.
 
That being said, I'm a bit concerned about this:

"I need to compare each particle of the system against each other particle of the same system"

 
This sounds like an O(N^2) approach, which is already bad in itself. Do you use a spatial acceleration structure at all? Here's an example using a uniform grid (though with a classic pipeline, not compute shaders). And here's some collision detection with CUDA. Or search for GPGPU implementations of/articles on SPH (smoothed particle hydrodynamics).
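
To make the uniform grid idea concrete, here is a minimal CPU-side sketch in C++ (not a compute shader, and not the code from the linked examples). It assumes the interaction has a fixed cutoff `radius` and uses that as the cell size, so each particle only has to look at its own cell and the 26 neighbouring cells. `Vec3`, `cellKey`, `cellCoord` and `findPairsGrid` are names invented for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

// Pack integer cell coordinates into one 64-bit key.
// Assumption: each coordinate fits in 21 bits (roughly -2^20 .. 2^20).
static std::uint64_t cellKey(int cx, int cy, int cz) {
    const std::uint64_t mask = (1ull << 21) - 1;
    return ((static_cast<std::uint64_t>(cx) & mask) << 42) |
           ((static_cast<std::uint64_t>(cy) & mask) << 21) |
            (static_cast<std::uint64_t>(cz) & mask);
}

static int cellCoord(float v, float cellSize) {
    return static_cast<int>(std::floor(v / cellSize));
}

// Returns every index pair (i < j) whose distance is at most `radius`.
std::vector<std::pair<int, int>> findPairsGrid(const std::vector<Vec3>& pts, float radius) {
    assert(radius > 0.0f);
    const int n = static_cast<int>(pts.size());

    // Pass 1: bin particle indices by cell (cell size == radius).
    std::unordered_map<std::uint64_t, std::vector<int>> grid;
    for (int i = 0; i < n; ++i) {
        grid[cellKey(cellCoord(pts[i].x, radius),
                     cellCoord(pts[i].y, radius),
                     cellCoord(pts[i].z, radius))].push_back(i);
    }

    // Pass 2: for each particle, test only the 3x3x3 block of cells around it.
    std::vector<std::pair<int, int>> pairs;
    const float r2 = radius * radius;
    for (int i = 0; i < n; ++i) {
        const int cx = cellCoord(pts[i].x, radius);
        const int cy = cellCoord(pts[i].y, radius);
        const int cz = cellCoord(pts[i].z, radius);
        for (int oz = -1; oz <= 1; ++oz)
        for (int oy = -1; oy <= 1; ++oy)
        for (int ox = -1; ox <= 1; ++ox) {
            auto it = grid.find(cellKey(cx + ox, cy + oy, cz + oz));
            if (it == grid.end()) continue;
            for (int j : it->second) {
                if (j <= i) continue;  // report each unordered pair once
                const float dx = pts[j].x - pts[i].x;
                const float dy = pts[j].y - pts[i].y;
                const float dz = pts[j].z - pts[i].z;
                if (dx * dx + dy * dy + dz * dz <= r2) pairs.emplace_back(i, j);
            }
        }
    }
    return pairs;
}
```

With roughly uniform density the work per particle depends on how many particles share its neighbourhood rather than on N, which is the point of the structure. A GPU version would typically replace the hash map with a sorted cell list, but that is beyond this sketch.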


I ran the profiler and indeed the problem is CopyResource. However, now I know why. According to the docs: "Doing this frequently at runtime will severely degrade performance... The application needs to wait long enough for the command buffer to be emptied and thus have all of those commands finish executing before it tries to map the corresponding staging resource."

 

I managed to reduce this problem by doing the copy every 5 frames instead of every frame. It is not perfect, but I think I should be able to get good approximate results this way.

 

Thanks for those articles, I will check them out! Indeed, the way I am doing it right now is an O(N^2) approach, and it is killing the performance when I am at around 100k particles.
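
For a sense of scale at that particle count, assuming every particle is tested against every other particle once per frame (a rough count of distance tests only, not a measurement):

```latex
N = 10^{5}: \qquad N(N-1) = 10^{5}\,(10^{5}-1) = 9{,}999{,}900{,}000 \approx 10^{10} \ \text{tests per frame}

\text{at 60 fps:} \qquad 60 \times 10^{10} = 6 \times 10^{11} \ \text{tests per second}
```

Testing each unordered pair only once halves this to N(N-1)/2 = 4,999,950,000, which is still about 5 x 10^9 tests per frame.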

Edited by neroziros


Just a quick suggestion regarding your performance problem with CopyResource. The performance warning in the docs only relates to destination resources created with their usage set to D3D11_USAGE_STAGING.

 

Assuming you only need CPU-write access to your consume buffer, then so long as you create your consume buffer (the pSrcResource) with D3D11_USAGE_DYNAMIC and your second structured buffer (the pDstResource) with D3D11_USAGE_DEFAULT, the CopyResource shouldn't cause a pipeline stall. Apologies if I'm mistaken, but it might be worth checking those usage flags on the resource creation.


Thanks for the advice, but it didn't work. It seems I can't create the consume buffer with D3D11_USAGE_DYNAMIC, because I need to consume particles from that buffer and D3D11_USAGE_DYNAMIC doesn't allow write operations from the GPU side (only from the CPU).

 

Cheers!
