
[DirectX10] texture update: UpdateSubresource vs staging intermediate ring buffer


Hi,

Many papers and programming guides (the NVIDIA GPU Programming Guide: http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf, the GDC08 DX10 performance tips: http://www.docstoc.com/docs/3670084/DirectX-10-Performance-Tips, etc.) advise using a ring buffer of intermediate staging textures, followed by a GPU copy to the destination texture, instead of calling UpdateSubresource() directly. With this ring buffer we always get a CPU copy (a Map() call on the staging texture with the D3D10_MAP_FLAG_DO_NOT_WAIT flag) followed by an asynchronous GPU copy (a CopySubresourceRegion() call).

Could somebody explain to me how this can be more efficient than using UpdateSubresource()? The official documentation of the method says:

* When there is contention for the resource, UpdateSubresource will perform 2 copies of the source data. First, the data is copied by the CPU to a temporary storage space accessible by the command buffer. This copy happens before the method returns. A second copy is then performed by the GPU to copy the source data into non-mappable memory. This second copy happens asynchronously because it is executed by GPU when the command buffer is flushed.
* When there is no resource contention, the behavior of UpdateSubresource is dependent on which is faster (from the CPU's perspective): copying the data to the command buffer and then having a second copy execute when the command buffer is flushed, or having the CPU copy the data to the final resource location. This is dependent on the architecture of the underlying system.

So what I understand (but I'm probably wrong) is that:

- In the first case, we get the same behavior as with a ring buffer (a CPU copy plus an asynchronous GPU copy), without implementing the system ourselves; the DirectX layer handles it. I don't know what this layer does, or in what kind of memory the CPU copy takes place. The only advantage I can see in a manual implementation is that the ring buffer's staging resources are created early in the application, but I'm really not convinced this gives a performance gain.
- In the second case, depending on the underlying system, we may have only one CPU copy, and in the worst case the same as in the first case.

So could an expert enlighten me on this? If we're sure that all (or at least the majority of) UpdateSubresource() calls will occur on resources without contention (the GPU isn't using them anymore), do we need to implement a manual substitute?

Thanks.

(PS: my assumption is that Microsoft's graphics engineers are better than me and may have designed a better UpdateSubresource() than any substitute I can write.)
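To make the ring-buffer scheme I'm asking about concrete, here is a rough, portable C++ model of it. It makes no real Direct3D calls: StagingSlot, StagingRing, try_upload and count_stalls are hypothetical names of my own, the GPU is reduced to a counter (gpu_now), and the fixed copy_latency is an assumption. The comments mark where the real code would call Map() with D3D10_MAP_FLAG_DO_NOT_WAIT and CopySubresourceRegion().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for one staging texture (really an ID3D10Texture2D created with
// D3D10_USAGE_STAGING and CPU write access).
struct StagingSlot {
    std::vector<std::uint8_t> pixels;  // CPU-visible memory
    std::uint64_t busy_until = 0;      // GPU time at which the pending copy out of this slot is done
};

class StagingRing {
public:
    StagingRing(std::size_t slot_count, std::size_t bytes_per_slot)
        : slots_(slot_count) {
        assert(slot_count > 0);
        for (auto& s : slots_) s.pixels.resize(bytes_per_slot);
    }

    // Tries to push `size` bytes to `dest` through the next staging slot.
    // Returns false instead of blocking if that slot is still in use, which
    // models Map() with D3D10_MAP_FLAG_DO_NOT_WAIT failing with
    // DXGI_ERROR_WAS_STILL_DRAWING.
    bool try_upload(const std::uint8_t* src, std::size_t size,
                    std::vector<std::uint8_t>& dest,
                    std::uint64_t gpu_now, std::uint64_t copy_latency) {
        StagingSlot& slot = slots_[next_];
        if (size > slot.pixels.size()) return false;  // does not fit in a slot
        if (gpu_now < slot.busy_until) return false;  // GPU still reads this slot

        // CPU copy into the "mapped" staging memory.
        std::memcpy(slot.pixels.data(), src, size);

        // Stand-in for CopySubresourceRegion(): the real call is only queued
        // and runs asynchronously on the GPU. Here the copy is done at once
        // and we just record when the slot becomes reusable.
        dest.assign(slot.pixels.data(), slot.pixels.data() + size);
        slot.busy_until = gpu_now + copy_latency;

        next_ = (next_ + 1) % slots_.size();  // advance the ring
        return true;
    }

private:
    std::vector<StagingSlot> slots_;
    std::size_t next_ = 0;
};

// Attempts one upload per frame and counts how many attempts found the
// current slot still busy (the frames where a blocking Map() would stall).
std::size_t count_stalls(std::size_t slot_count, std::uint64_t copy_latency,
                         std::size_t frames) {
    StagingRing ring(slot_count, 16);
    std::vector<std::uint8_t> dest;
    const std::uint8_t texels[16] = {};
    std::size_t stalls = 0;
    for (std::uint64_t frame = 0; frame < frames; ++frame) {
        if (!ring.try_upload(texels, sizeof texels, dest, frame, copy_latency))
            ++stalls;
    }
    return stalls;
}
```

In this toy model, count_stalls(1, 3, 10) reports 6 busy frames out of 10, while count_stalls(3, 3, 10) reports none, which is the whole argument for having several slots. It says nothing about whether the driver's own temporary storage inside UpdateSubresource() already achieves the same thing, which is exactly my question.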
