[D3D12] Copying between upload buffers


I'm porting over my D3D11 dynamic vertex buffer class. That worked basically like std::vector: I'd create a USAGE_DYNAMIC buffer, map it with WRITE_DISCARD, and if the size turned out to be insufficient, I'd unmap it, create a new larger buffer, CopySubresourceRegion() over the old data, and map the new buffer. Of course, other approaches would be to immediately draw the contents of the old buffer and discard it, or to keep the old buffer and not copy over the data (a more deque-like approach), but this worked for me.

For D3D12 I'm creating a vertex buffer on an UPLOAD heap and mapping it, and if it's too small, I create a new buffer. Now comes the issue. I'm trying to copy over the old data using the ID3D12GraphicsCommandList::CopyBufferRegion() function, and it doesn't seem to work properly. Note that the debug runtime doesn't complain about anything, which it does if I try a plain CopyResource() with an UPLOAD heap as the destination.
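Roughly, the relevant calls look like this (a simplified sketch; names, sizes, and error handling are placeholders):

// Create the vertex buffer on an UPLOAD heap; such resources must start in GENERIC_READ.
D3D12_HEAP_PROPERTIES heapProps = {};
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;

D3D12_RESOURCE_DESC bufferDesc = {};
bufferDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
bufferDesc.Width = newSizeInBytes;
bufferDesc.Height = 1;
bufferDesc.DepthOrArraySize = 1;
bufferDesc.MipLevels = 1;
bufferDesc.Format = DXGI_FORMAT_UNKNOWN;
bufferDesc.SampleDesc.Count = 1;
bufferDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

ID3D12Resource* newBuffer = nullptr;
device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &bufferDesc,
                                D3D12_RESOURCE_STATE_GENERIC_READ, nullptr,
                                IID_PPV_ARGS(&newBuffer));

// Try to carry the old contents over on the GPU; this is the call that
// doesn't seem to do anything:
commandList->CopyBufferRegion(newBuffer, 0, oldBuffer, 0, oldSizeInBytes);

// Map the new buffer and keep writing into it.
void* mapped = nullptr;
D3D12_RANGE noRead = { 0, 0 }; // we never read from the mapping on the CPU
newBuffer->Map(0, &noRead, &mapped);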

One issue I can imagine is that the GPU tries to draw from my buffer before the copy is complete. However, I'm unable to insert a resource usage barrier, as for upload heap resources only D3D12_RESOURCE_STATE_GENERIC_READ usage is accepted; if I try to use anything else the debug runtime complains.

Any idea what I'm doing wrong? Do I need to use a fence (I'm not too clear on when a barrier is enough and when a fence is required), or copy to a default usage buffer?

I'm not quite clear what you're doing.
You fill UploadA, use ID3D12GraphicsCommandList::CopyBufferRegion to move it to UploadB, fill some more data into UploadB, and then use ID3D12GraphicsCommandList::CopyBufferRegion to move it to final buffer C?

If you're copying between two upload buffers, it's probably better for the CPU to do it immediately rather than queueing up a GPU copy command.

Alternatively, you could use Reserved Resources: create a very large reserved buffer at some sane (but large enough) size, e.g. 100MB. Remember this is just virtual memory you're allocating, not actual physical memory. You can then commit physical memory (heap space) to the reserved resource as required, effectively increasing its size on demand without the need to copy the data to a larger buffer. If you want to reduce the size of the buffer at some point in the future, you can decommit the heap space from the end of the buffer.

For this you only need Tiled Resources Tier 1 support which is supported on all cards that I know of.
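A minimal sketch of committing and growing the backing memory (untested; names are made up, error handling is omitted, and whether the reserved buffer can then be mapped for CPU writes exactly like a plain UPLOAD buffer is worth verifying on your target hardware). Buffer tiles are 64KB each:

static const UINT64 kTileSize    = 64 * 1024;
static const UINT64 kVirtualSize = 100 * 1024 * 1024; // 100MB of address space, no physical memory yet

D3D12_RESOURCE_DESC desc = {};
desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
desc.Width = kVirtualSize;
desc.Height = 1;
desc.DepthOrArraySize = 1;
desc.MipLevels = 1;
desc.Format = DXGI_FORMAT_UNKNOWN;
desc.SampleDesc.Count = 1;
desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

ID3D12Resource* reservedBuffer = nullptr;
device->CreateReservedResource(&desc, D3D12_RESOURCE_STATE_GENERIC_READ, nullptr,
                               IID_PPV_ARGS(&reservedBuffer));

// Commit physical memory for the first N 64KB tiles from a CPU-writable heap.
const UINT tilesNeeded = 16; // however many tiles cover the bytes currently required

D3D12_HEAP_DESC heapDesc = {};
heapDesc.SizeInBytes = tilesNeeded * kTileSize;
heapDesc.Properties.Type = D3D12_HEAP_TYPE_UPLOAD;
heapDesc.Flags = D3D12_HEAP_FLAG_ALLOW_ONLY_BUFFERS;

ID3D12Heap* heap = nullptr;
device->CreateHeap(&heapDesc, IID_PPV_ARGS(&heap));

D3D12_TILED_RESOURCE_COORDINATE startCoord = {}; // tile 0 of the buffer
D3D12_TILE_REGION_SIZE regionSize = {};
regionSize.NumTiles = tilesNeeded;

D3D12_TILE_RANGE_FLAGS rangeFlags = D3D12_TILE_RANGE_FLAG_NONE;
UINT heapRangeStart = 0;
UINT rangeTileCount = tilesNeeded;

commandQueue->UpdateTileMappings(reservedBuffer, 1, &startCoord, &regionSize,
                                 heap, 1, &rangeFlags, &heapRangeStart,
                                 &rangeTileCount, D3D12_TILE_MAPPING_FLAG_NONE);

// When the buffer fills up, create another heap and map additional tiles the
// same way; data already written stays where it is, so nothing is copied.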

Adam Miles - Principal Software Development Engineer - Microsoft Xbox Advanced Technology Group

I fill UploadA, when it's full I allocate larger buffer UploadB, then I use CopyBufferRegion to copy UploadA into UploadB. Then any new data gets written into UploadB, and I destroy UploadA once the current frame has been rendered. So in any case my issue isn't UploadA being destroyed too early, though it might be that I'm using (writing other data to the end of, and drawing) UploadB before the data has been copied.

By having the CPU copy the data, do you mean just using memcpy? In D3D11, (accidentally) reading from DYNAMIC buffers was extremely slow, though I guess the solution for that would just be to mark it as CPU readable. I could see that being faster than having the GPU perform the copy, depending on whether the data is kept in CPU memory and only read by the GPU on drawing, or whether it's actually sent over the bus immediately.

You don't want to copy between upload buffers on the CPU. It can be very slow, since the memory will probably be write combined (uncached reads). There's really no way for you to figure out how big of a buffer you need ahead of time? Personally I would try to avoid having to create new buffers in the middle of rendering setup.


Unfortunately it's not possible to do so; I don't have control over the application that drives the renderer. The suggestion of using Reserved Resources sounds like it could work, but I'd still like to know what I'm doing wrong. When uploading textures (i.e. copying from an upload to a default heap), is the resource barrier enough, or do you also need a fence to know when the texture is ready for use?

You shouldn't need a fence unless you're synchronizing between different queues. However, the fact that the runtime complains about trying to transition an UPLOAD resource to a write state suggests that it's not a supported operation. In general, UPLOAD memory is meant to be read once by the GPU (with low bandwidth compared to DEFAULT); it's not intended to be writable. It should be possible to make your vertex buffer a DEFAULT resource and then copy to it through intermediate UPLOAD buffers, though I'm not sure whether that's more or less complicated than using reserved resources.
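A rough sketch of that approach (names are placeholders; assumes a persistently mapped UPLOAD staging buffer and everything running on a single direct queue):

// 1. Write the new vertex data into the mapped staging (UPLOAD) buffer.
memcpy(stagingPtr, vertexData, vertexBytes);

// 2. Transition the DEFAULT vertex buffer into COPY_DEST.
D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Transition.pResource = defaultVB;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_COPY_DEST;
commandList->ResourceBarrier(1, &barrier);

// 3. Record the copy from the UPLOAD buffer into the DEFAULT buffer.
commandList->CopyBufferRegion(defaultVB, 0, staging, 0, vertexBytes);

// 4. Transition back before drawing from it. Within one queue the barrier is
//    enough; a fence is only needed when another queue (or the CPU) has to wait.
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_COPY_DEST;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER;
commandList->ResourceBarrier(1, &barrier);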

Oetker,

In D3D12, CopyBufferRegion should complain about an UPLOAD destination resource, just like CopyResource does. UPLOAD resources are always in the GENERIC_READ state and cannot be transitioned out. Perhaps the errors have already been muted elsewhere?

Are you sure you actually have to copy over the contents of the old buffer?

That worked basically like std::vector: I'd create a USAGE_DYNAMIC buffer, map it with WRITE_DISCARD, and if the size turned out to be insufficient, I'd unmap it, create a new larger buffer, CopySubresourceRegion() over the old data, and map the new buffer.

The CopySubresourceRegion call is actually discarded in the scenario you describe. Map in D3D11 requires WRITE_DISCARD to be passed when called on USAGE_DYNAMIC vertex buffers, meaning the current data in the buffer becomes undefined.

I hope you don't actually have to copy over the data at all, and your scenario is much simpler than you originally believed. But if you really must address the issue, there is another option besides reserved resources to consider: you can resort to a CUSTOM heap type. UPLOAD is just an abstraction that can be removed if it gets in the way. Using a CUSTOM heap will allow you to transition the destination resource to COPY_DEST for CopyBufferRegion, then back to GENERIC_READ for further normal usage.

See https://msdn.microsoft.com/en-us/library/windows/desktop/dn770374(v=vs.85).aspx for more info.
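A minimal sketch of the CUSTOM-heap route (untested; names are placeholders and the exact heap properties may need adjusting for your hardware). The properties below mimic UPLOAD, but because the heap type is CUSTOM the resource isn't locked to GENERIC_READ:

D3D12_HEAP_PROPERTIES heapProps = {};
heapProps.Type = D3D12_HEAP_TYPE_CUSTOM;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_WRITE_COMBINE;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_L0; // system memory

D3D12_RESOURCE_DESC desc = {};
desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
desc.Width = newSizeInBytes;
desc.Height = 1;
desc.DepthOrArraySize = 1;
desc.MipLevels = 1;
desc.Format = DXGI_FORMAT_UNKNOWN;
desc.SampleDesc.Count = 1;
desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

ID3D12Resource* newBuffer = nullptr;
device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                                D3D12_RESOURCE_STATE_COPY_DEST, nullptr,
                                IID_PPV_ARGS(&newBuffer));

// Copy the old contents over on the GPU, then transition back for normal use.
commandList->CopyBufferRegion(newBuffer, 0, oldBuffer, 0, oldSizeInBytes);

D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Transition.pResource = newBuffer;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_COPY_DEST;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_GENERIC_READ;
commandList->ResourceBarrier(1, &barrier);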

Given the choice between the reserved-resource and CUSTOM-heap options, the reserved resource technique should result in better runtime efficiency: overall, it avoids the copy and the larger spike in residency that comes from keeping both buffers alive for a while. Don't forget you'll likely need an aliasing barrier when using reserved resources. There are less obvious downsides to reserved resources: good D3D tool support for them will likely take longer to arrive than for CUSTOM heaps, and they are currently a less thoroughly tested code path in drivers.

-Brian Klamik

...

Brian,

Thanks for your reply. On the D3D11 case: after copying the old data to the new buffer, I map with MAP_NO_OVERWRITE, and this works fine...

As for D3D12, I have no idea why the copy operation doesn't generate an error message. I'll just have to look for another approach. I've tried a custom heap but didn't manage to find a combination of flags that worked for me; maybe I'll give it another try. Or maybe I'll keep around the old buffers and draw those instead of copying over the data. In that case, the cost for a buffer that's too small is multiple draw calls for a single frame, after which the new, larger buffer will be used. The only thing I'll have to watch out for is that primitives and triangle strips can't span multiple buffers.
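Roughly what I have in mind for the no-copy route (just a sketch; names are made up, and vertexStride, commandList, etc. are assumed to exist):

// Each chunk remembers which upload buffer it lives in and how many vertices it holds.
struct VertexChunk
{
    ID3D12Resource* buffer;
    UINT            vertexCount;
};
std::vector<VertexChunk> chunks; // oldest first; normally just one entry

// On draw, issue one draw call per chunk. Since a primitive must never straddle
// two chunks, new geometry always starts at the beginning of the newest buffer.
for (const VertexChunk& chunk : chunks)
{
    D3D12_VERTEX_BUFFER_VIEW view = {};
    view.BufferLocation = chunk.buffer->GetGPUVirtualAddress();
    view.SizeInBytes    = chunk.vertexCount * vertexStride;
    view.StrideInBytes  = vertexStride;
    commandList->IASetVertexBuffers(0, 1, &view);
    commandList->DrawInstanced(chunk.vertexCount, 1, 0, 0);
}

// Once the GPU has finished the frame that used them, the older buffers can be
// released and only the newest, larger one kept.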

