

Member Since 13 Nov 2008
Offline Last Active Mar 06 2017 08:57 AM

Topics I've Started

Asynchronous constant buffer update

17 July 2013 - 03:13 AM



Currently I'm rethinking the way I handle shader constants in our engine. What I currently do is keep a local backing store for each constant buffer, which gets filled by the shader constant provider(s). After all constants are assembled, the constant buffers are mapped and the whole memory chunk is simply copied via memcpy. Additionally, I'm doing other things to keep the number of updates as low as possible (sharing buffers, packing constants by update frequency).
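To make the setup above concrete, here is a minimal sketch of such a backing store. It is an illustration, not the actual engine code: the class name `ConstantBackingStore`, the `UploadFn` callback and the dirty flag are all assumptions. The callback stands in for the real upload (Map with discard, memcpy, Unmap), so the sketch compiles without any D3D headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical CPU-side backing store for a single constant buffer.
class ConstantBackingStore {
public:
    // Stand-in for the real upload path (Map with discard, memcpy, Unmap).
    using UploadFn = std::function<void(const void* data, std::size_t size)>;

    // Starts dirty so that the first flush always uploads.
    explicit ConstantBackingStore(std::size_t sizeInBytes)
        : bytes_(sizeInBytes, 0), dirty_(true) {}

    // Called by the constant providers. The store is only marked dirty
    // when the written bytes differ from what is already cached.
    void set(std::size_t offset, const void* src, std::size_t size) {
        assert(offset + size <= bytes_.size());
        if (std::memcmp(bytes_.data() + offset, src, size) != 0) {
            std::memcpy(bytes_.data() + offset, src, size);
            dirty_ = true;
        }
    }

    // Uploads the whole chunk once if anything changed.
    // Returns true if an upload actually happened.
    bool flush(const UploadFn& upload) {
        if (!dirty_) {
            return false;
        }
        upload(bytes_.data(), bytes_.size());
        dirty_ = false;
        return true;
    }

private:
    std::vector<unsigned char> bytes_;
    bool dirty_;
};
```

With this shape, a buffer whose constants did not change since the last flush costs one memcmp per `set` call and no upload at all. Whether that trade pays off depends on how often values really change between draws.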


This doesn't seem to be efficient, though: the GPU seems to keep renaming buffers, which stalls the CPU far too often. I thought about doing the update asynchronously so other operations can be done in the meantime. However, the device context is not thread-safe, so I would have to do the synchronization myself. Does anyone have experience with this topic? Or maybe I'm doing it all wrong and somebody can give me a hint.



DirectX11 performance problems

05 July 2013 - 01:20 PM



I'm currently working on a DirectX11 port of an old DX9 renderer and am facing the problem that DX11 seems really slow in comparison to the old code. I've already checked all the 'best practice' slides that are available around the internet (covering things such as creating new resources at runtime, update frequency for constant buffers, etc.), but none of these seems to be a real problem. Other engines I checked are much more careless in most of these cases but don't seem to have similar problems.


Profiling shows that the code is highly CPU bound, since the GPU seems to be starving. GPUView confirms this: the CPU queue is empty most of the time and only occasionally has a packet pushed onto it. The weird thing is that the main thread isn't stalling but is active nearly the whole time. VTune shows that most of the samples are taken in DirectX API calls, which are taking far too much time (the main bottlenecks seem to be DrawIndexed/Instanced, Map and IASetVertexBuffers).


The next thing I thought about was sync points, but the only source I can imagine is the update of the constant buffers, of which there are quite a few per frame. What I'm essentially doing is caching the shader constants in a buffer and pushing the whole memory chunk into my constant buffers. The buffers are all dynamic and are mapped with 'discard'. I also tried creating 'default' buffers and updating them with UpdateSubresource, as well as a mix of both ('per frame' buffers dynamic and the rest default), but this resulted in the same performance.
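For reference, the 'per frame' versus 'the rest' split could look roughly like the sketch below. The struct names and their members are made up for illustration and are not taken from the renderer. The one hard rule used here is that D3D11 requires a constant buffer's ByteWidth to be a multiple of 16 bytes, so the layouts are padded and checked at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: constants that change once per frame.
struct PerFrameConstants {
    float viewProjection[16]; // 4x4 matrix
    float time;
    float padding[3];         // pad the struct to a multiple of 16 bytes
};

// Hypothetical layout: constants that change for every draw call.
struct PerObjectConstants {
    float world[16];          // 4x4 matrix
    float tint[4];
};

// Rounds a size up to the next multiple of 16 bytes, which is the
// granularity D3D11 expects for a constant buffer's ByteWidth.
constexpr std::size_t alignConstantBufferSize(std::size_t sizeInBytes) {
    return (sizeInBytes + 15) & ~static_cast<std::size_t>(15);
}

static_assert(sizeof(PerFrameConstants) % 16 == 0,
              "PerFrameConstants must be a multiple of 16 bytes");
static_assert(sizeof(PerObjectConstants) % 16 == 0,
              "PerObjectConstants must be a multiple of 16 bytes");
```

Splitting like this means the per-frame buffer is written once per frame, while only the smaller per-object data is rewritten for each draw, whichever update path (dynamic with discard, or default with UpdateSubresource) is used.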


The weird thing is that the old DX9 renderer produces much better results with the same render code. Maybe somebody has experienced similar behaviour and can give me a hint.