Tispe

UpdateSubresource on StructuredBuffer


Hello

 

I am having some trouble trying to update a StructuredBuffer with new data.

 

Currently during initialization of the application I create a CComPtr<ID3D11ShaderResourceView> pStructuredBuffer that encapsulates a CComPtr<ID3D11Buffer> pBuffer that I create first with all the proper flags. I forget about pBuffer after creating the resource view and let pBuffer go out of scope.

 

Later when I want to update the resource I do this:

ID3D11Resource *pBuffer;
pStructuredBuffer->GetResource(&pBuffer);
pDeviceContext->UpdateSubresource(pBuffer, 0, 0, VectorOfStructures.data(), 0, 0);
pDeviceContext->VSSetShaderResources(0, 1, &pStructuredBuffer.p);
pBuffer->Release();

However, I get an unhandled exception and Visual Studio breaks at UpdateSubresource(). I suspect it might have something to do with SrcRowPitch or SrcDepthPitch.

 

Unhandled exception at 0x74D83E28 (KernelBase.dll) in app.exe: 0x0000087D (parameters: 0x00000000, 0x00C1E0A4, 0x00C1D4DC).

 

Any clues on what I might do wrong? What should SrcRowPitch and SrcDepthPitch be if I have a vector of structures?


Got it working by using Map/Unmap instead of UpdateSubresource. I guess the latter does not work on DYNAMIC buffers?

UpdateSubresource only works for resources created with DEFAULT usage. For DYNAMIC you should Map and Unmap.
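
As a rough illustration of the Map/Unmap path, the copy step in the middle might look like the sketch below. CopyToMapped, UploadToFakeBuffer and the Particle struct are hypothetical names, not part of D3D11. In a real application the destination would be the pData pointer of the D3D11_MAPPED_SUBRESOURCE filled in by ID3D11DeviceContext::Map, and the capacity would be the ByteWidth the buffer was created with. Here a plain byte vector stands in for the mapped memory so the snippet runs without a device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical element type; it only needs to match the layout of the
// StructuredBuffer element declared in the shader.
struct Particle {
    float position[3];
    float size;
};

// Hypothetical helper: copies the vector into already-mapped memory.
// Returns false instead of overrunning when the data does not fit.
template <typename T>
bool CopyToMapped(void* dst, std::size_t dstBytes, const std::vector<T>& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "elements are copied with memcpy");
    const std::size_t srcBytes = src.size() * sizeof(T);
    if (srcBytes == 0) {
        return true;  // nothing to copy
    }
    if (dst == nullptr || srcBytes > dstBytes) {
        return false;
    }
    std::memcpy(dst, src.data(), srcBytes);
    return true;
}

// Stand-in for the real sequence, which would be roughly:
//   Map(buffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
//   CopyToMapped(mapped.pData, byteWidth, data);
//   Unmap(buffer, 0);
bool UploadToFakeBuffer(std::vector<unsigned char>& fakeMapped,
                        const std::vector<Particle>& data) {
    return CopyToMapped(fakeMapped.data(), fakeMapped.size(), data);
}
```

The size check is the part worth keeping: writing past the end of mapped memory is an easy mistake when the vector grows after the buffer was created.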


Something I wanted to know: when mapping a resource, what does that entail on the driver side? MSDN says it "denies the GPU access to that subresource", but does it also block the GPU from doing other things?

 

Since it is a deviceContext method, I get the idea that everything on the GPU side kinda stops, since I don't do any draw calls while copying. When I map/memcpy/unmap 120 MB of data over to the GPU, it takes about 20 ms to do so. During that time memcpy blocks the calling thread. Is this 20 ms of GPU computation going to waste, or does the driver do tricks behind the scenes to pipeline this sort of stuff?

 

Is this a good place to double-buffer and do a map/async(memcpy)/unmap, so that the main thread can issue draw calls while it's copying?

Mapping a DYNAMIC resource with D3D11_MAP_WRITE_DISCARD is meant to prevent any kind of GPU synchronization and stalls. Typically the GPU won't be executing commands until quite some time after the CPU issues D3D commands. The D3D user-mode drivers will typically buffer things so that they can be executed on a separate thread, and the driver will send off packets of work to the GPU at some later point. In practice you can end up having the GPU be up to 3 frames behind the CPU, although it's usually closer to 1 frame.

Because of that lag, you have a potential issue with updating GPU resources from the CPU. If the CPU just modified a resource with no synchronization (which is effectively what happens when you use D3D11_MAP_WRITE_NO_OVERWRITE), the CPU might be changing it while the GPU is still using it, or hasn't used it yet. This is obviously bad, since you want the GPU to work with the data that you originally specified for the frame that it's working on. To get around this, DISCARD allows the driver to silently hand you a new resource behind the scenes, which is known as "buffer renaming". By giving you a new piece of memory to work with, you can write to that one while the GPU is still using the old piece of memory from a previous frame.

Doing this can add a fair bit of overhead, since the driver might implement this by having some sort of pool where it frees up old allocations by waiting on labels to ensure that the GPU has finished using them. It may also decide to block you if insufficient memory is available, so that it can wait for the GPU in order to free up more memory. And then of course once the driver has given you the memory to write to, it will probably take a while to actually fill such a large buffer. Even at peak CPU bandwidth, it will surely take at least a few milliseconds to touch 120 MB of memory. It can also be slower in some cases, since the memory you get back from Map will typically be in uncached, write-combined memory so that it can be visible to the GPU.
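
To make the renaming idea a bit more concrete, here is a toy, CPU-only model of a pool that hands out a fresh allocation whenever the old one may still be in use. It is only a sketch of the general idea and says nothing about how any actual driver is implemented. RenamingBuffer, MapDiscard and MarkUsedByGpu are made-up names, and the "fence" values are plain counters standing in for the labels the GPU would signal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of "buffer renaming". All names here are hypothetical.
class RenamingBuffer {
public:
    explicit RenamingBuffer(std::size_t bytes) : bytes_(bytes) {}

    // Models a discard-style map: returns memory the GPU is not reading.
    // An old allocation is reused only if the GPU has already passed the
    // fence value recorded for its last use; otherwise the pool grows.
    unsigned char* MapDiscard(std::uint64_t completedFence) {
        for (std::size_t i = 0; i < pool_.size(); ++i) {
            if (pool_[i].lastUseFence <= completedFence) {
                current_ = i;
                return pool_[i].memory.data();
            }
        }
        pool_.push_back(Allocation{std::vector<unsigned char>(bytes_), 0});
        current_ = pool_.size() - 1;
        return pool_[current_].memory.data();
    }

    // Models submitting GPU work that reads the current allocation.
    void MarkUsedByGpu(std::uint64_t fenceValue) {
        assert(!pool_.empty());
        pool_[current_].lastUseFence = fenceValue;
    }

    std::size_t AllocationCount() const { return pool_.size(); }

private:
    struct Allocation {
        std::vector<unsigned char> memory;
        std::uint64_t lastUseFence;
    };

    std::size_t bytes_;
    std::vector<Allocation> pool_;
    std::size_t current_ = 0;
};
```

With a 120 MB buffer, every extra allocation in such a pool costs another 120 MB, which is one way to see why this mechanism might not scale well to very large resources.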

The first thing I would probably do here is try to profile how much of your overhead is coming from Map(), and how much of it is coming from just filling the buffer with data. If Map() is taking a long time, you may want to consider an alternative approach. DYNAMIC is usually used for small, frequently-updated resources like constant buffers. The driver's internal mechanisms may not be scaling particularly well for this case. Another approach you can try is to have your own pool of buffers that have STAGING usage. You can cycle through these (2-3 should be enough), and then once you've filled them you can use CopyResource to copy their contents to a GPU-accessible buffer with DEFAULT usage.
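
A low-tech way to get the first measurement is to time the two phases separately on the CPU. The sketch below assumes nothing about D3D: TimeMs, TimeUpload and UploadTimings are hypothetical names, and the acquire callable stands in for the Map() call (it just has to return the destination pointer). Keep in mind that this only measures CPU-side cost, not what the GPU is doing.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical result type: milliseconds spent in each phase.
struct UploadTimings {
    double acquireMs;  // time spent obtaining the destination (Map stand-in)
    double fillMs;     // time spent copying the data
};

// Hypothetical helper: runs a callable and returns elapsed wall time in ms.
template <typename Fn>
double TimeMs(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Times "get the pointer" and "fill the memory" as two separate phases.
template <typename AcquireFn>
UploadTimings TimeUpload(AcquireFn&& acquire, const void* src, std::size_t bytes) {
    UploadTimings timings{};
    void* dst = nullptr;
    timings.acquireMs = TimeMs([&] { dst = acquire(); });
    assert(dst != nullptr);
    timings.fillMs = TimeMs([&] { std::memcpy(dst, src, bytes); });
    return timings;
}
```

If acquireMs dominates, the cost is likely on the driver side; if fillMs dominates, it is mostly the raw copy.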


To follow on MJP's great advice, have you tried using the performance tools in the latest versions of Visual Studio? They can show you a pretty good representation of the parallelism between the CPU and GPU, and will likely give you some insight into what is costing you time in your overall frame time.


I just recently installed Win10 and VS Community. I did notice a graph in the Diagnostics window during debugging, but I have not studied up on how to use these tools properly. Do you know of any good places or videos on YouTube?

 

Would be nice to know where the bottlenecks are.

 


Typically the GPU won't be executing commands until quite some time after the CPU issues D3D commands. The D3D user-mode drivers will typically buffer things so that they can be executed on a separate thread, and the driver will send off packets of work to the GPU at some later point. In practice you can end up having the GPU be up to 3 frames behind the CPU, although it's usually closer to 1 frame.

 

Great to know. I hope this means that in my case I don't have to worry about keeping the GPU busy while mapping with D3D11_MAP_WRITE_DISCARD, and that it all buffers up and only when everything is ready does the GPU go bananas and produce a frame. The interface kinda lets you believe that a DrawCall is executed when called.


The interface kinda lets you believe that a DrawCall is executed when called.


Indeed, it does make it appear like that is the case. That's actually one of the major changes for D3D12: with D3D12 you build up one or more command lists, and then you must explicitly submit them to the GPU. This makes it very clear that you're buffering up commands in advance, and also lets you make the choice as to how much latency you want between building command lists and having the GPU execute them. It also completely exposes the memory synchronization to the programmer. So instead of having something like D3D11_MAP_WRITE_DISCARD where the driver is responsible for doing things behind the scenes to avoid stalls, it's up to you to make sure that you don't accidentally write to memory that the GPU is currently using.
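
The record-now, execute-later split can be pictured with a small toy model. This is not the D3D12 API, just a plain C++ sketch with hypothetical names (ToyCommandList, ToyQueue) where "commands" are ordinary callables that do nothing until the list is explicitly handed to the queue.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy model of deferred submission. All names here are hypothetical.
class ToyCommandList {
public:
    // Recording only stores the command; nothing is executed yet.
    void Record(std::function<void()> command) {
        assert(command != nullptr);
        commands_.push_back(std::move(command));
    }

    std::size_t Size() const { return commands_.size(); }

private:
    friend class ToyQueue;
    std::vector<std::function<void()>> commands_;
};

class ToyQueue {
public:
    // Recorded work runs only when the list is explicitly submitted here.
    void Execute(ToyCommandList& list) {
        for (auto& command : list.commands_) {
            command();
        }
        list.commands_.clear();
    }
};
```

Recording two "draws" and checking a counter before and after Execute shows the gap: the counter stays at zero until the list is submitted. In this model, anything the commands read must stay untouched between Record and Execute, which is the same responsibility D3D12 hands to the programmer for GPU memory.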
