Hi,
To better understand the behavior of ExecuteIndirect, I took the MS D3D12ExecuteIndirect sample and simply changed the number of triangles to 320,000. I expected the CPU load to stay roughly the same, since I thought ExecuteIndirect offloaded execution of the draw list entirely to the GPU, leaving the CPU with a more or less fixed overhead. (Note that the MS sample also offloads draw-list culling to a compute shader, so there is no per-frame culling work on the CPU side.) To my surprise, however, I immediately became CPU bound, to much the same extent as if I had stashed all the draw commands in a bundle. To be clear, the single indirect draw call shown below is responsible for all the cycles.
I can only assume the MS sample is not at issue (at least, it looks OK to me). So, is it that (a) my understanding of ExecuteIndirect is flawed, (b) the current NVIDIA driver on my hardware (353.62 on a GTX 970M) is emulating it, or (c) <insert your opinion here>?
Thanks for the insights!
Thanks,
Jason
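m_commandList->ExecuteIndirect(
    m_commandSignature.Get(),                       // command signature describing each indirect command
    TriangleCount,                                  // maximum number of commands to execute
    m_processedCommandBuffers[m_frameIndex].Get(),  // argument buffer holding the commands
    0,                                              // byte offset of the first command
    m_processedCommandBuffers[m_frameIndex].Get(),  // count buffer (same resource)
    CommandBufferSizePerFrame);                     // byte offset of the actual command count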
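
For scale, here is a minimal sketch of the per-frame buffer arithmetic implied by that call. It assumes each per-draw record is a 64-bit constant-buffer address followed by the four 32-bit arguments of a non-indexed draw (24 bytes in total). The struct, constant and helper names are illustrative and are not copied from the sample.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed per-draw record: a root CBV GPU address followed by the four
// 32-bit fields of a non-indexed draw. Names are illustrative only.
struct IndirectCommand {
    std::uint64_t cbvGpuAddress;
    std::uint32_t vertexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startVertexLocation;
    std::uint32_t startInstanceLocation;
};
static_assert(sizeof(IndirectCommand) == 24, "layout assumption: 8 + 4 * 4 bytes");

constexpr std::uint64_t kTriangleCount = 320000;

// Bytes of command records per frame. In the call above this value is also
// the byte offset at which the actual command count is read.
constexpr std::uint64_t commandBytesPerFrame(std::uint64_t commandCount) {
    return commandCount * sizeof(IndirectCommand);
}

// Minimum bytes the buffer needs: all records plus one 32-bit counter
// (ignores any extra alignment padding a real allocation might add).
constexpr std::uint64_t bufferBytesPerFrame(std::uint64_t commandCount) {
    return commandBytesPerFrame(commandCount) + sizeof(std::uint32_t);
}

static_assert(commandBytesPerFrame(kTriangleCount) == 7680000,
              "320,000 records * 24 bytes");
```

Under these assumptions, 320,000 triangles come to roughly 7.7 MB of command records per frame, with the count stored immediately after them.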