If you've noticed a difference in GPU idling between those two cases, I would guess it's because you're syncing every frame instead of following the recommended practice of keeping 1 frame of latency between the CPU and GPU. This is mostly down to your presentation code.
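As a rough illustration of that "1 frame of latency" idea, here is a minimal CPU-only sketch. It does not call D3D12 at all: MockGpuQueue, Signal, WaitFor and RunFrames are hypothetical stand-ins for a command queue plus fence (in real code the rough equivalents would be ID3D12CommandQueue::Signal, ID3D12Fence::GetCompletedValue and ID3D12Fence::SetEventOnCompletion). The point is only the ordering: after submitting frame N, the CPU waits on the fence value of frame N-1, not on the frame it just submitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a command queue plus fence. Nothing here touches
// the real D3D12 API; it only models fence values on the CPU.
struct MockGpuQueue {
    std::deque<std::uint64_t> pending; // fence values signaled but not yet reached
    std::uint64_t completed = 0;       // plays the role of GetCompletedValue()

    // Queue a fence signal behind the work submitted so far.
    void Signal(std::uint64_t value) { pending.push_back(value); }

    // "Block" until the fence reaches `value`. The simulation just retires
    // queued work in submission order until that value is reached.
    void WaitFor(std::uint64_t value) {
        while (completed < value && !pending.empty()) {
            completed = pending.front();
            pending.pop_front();
        }
    }
};

// Runs `frames` frames and lets the GPU lag up to `maxLatency` frames behind.
// Returns the largest number of frames that were ever queued at once.
std::size_t RunFrames(MockGpuQueue& queue, std::uint64_t frames, std::uint64_t maxLatency) {
    std::size_t maxInFlight = 0;
    for (std::uint64_t frame = 1; frame <= frames; ++frame) {
        // Record and submit this frame's command lists here (omitted), then signal.
        queue.Signal(frame);
        if (queue.pending.size() > maxInFlight) {
            maxInFlight = queue.pending.size();
        }
        // Wait for the frame submitted `maxLatency` frames ago. With
        // maxLatency == 0 this degenerates into syncing every frame.
        if (frame > maxLatency) {
            queue.WaitFor(frame - maxLatency);
        }
    }
    return maxInFlight;
}
```

With maxLatency set to 1 the queue always has the next frame lined up while the previous one finishes, so neither side sits idle waiting for the other. With maxLatency set to 0 the CPU stalls on every frame, which is the "syncing every frame" case.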
You should have quite a few draws per submission because there's a large CPU cost involved in submission. One of D3D12's advantages is that draws are cheap, but doing one submit per draw-call will negate this benefit.
In my tests I found that ~500 draws per command list performed best in my game.
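To make the batching point concrete, here is a small sketch that splits a frame's draws into command lists of a bounded size, so that they can all go to the queue in one submission. DrawCall, CommandList and BuildCommandLists are hypothetical names for illustration, not D3D12 types; in a real renderer each CommandList would be an ID3D12GraphicsCommandList and the whole batch would be passed to a single ID3D12CommandQueue::ExecuteCommandLists call. The 500 figure is just the number from my own tests, so treat it as a starting point to profile against.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical draw record; a real engine would store pipeline state,
// buffers, and so on.
struct DrawCall {
    int meshId;
};

// Hypothetical stand-in for a recorded command list.
struct CommandList {
    std::vector<DrawCall> draws;
};

// Splits `draws` into command lists holding at most `maxDrawsPerList` draws
// each, preserving the original draw order.
std::vector<CommandList> BuildCommandLists(const std::vector<DrawCall>& draws,
                                           std::size_t maxDrawsPerList) {
    std::vector<CommandList> lists;
    if (maxDrawsPerList == 0) {
        return lists; // nothing sensible to build
    }
    for (std::size_t begin = 0; begin < draws.size(); begin += maxDrawsPerList) {
        const std::size_t end = std::min(begin + maxDrawsPerList, draws.size());
        CommandList list;
        list.draws.assign(draws.begin() + static_cast<std::ptrdiff_t>(begin),
                          draws.begin() + static_cast<std::ptrdiff_t>(end));
        lists.push_back(std::move(list));
    }
    return lists;
}
```

A frame with 1200 draws and a limit of 500 gives three lists (500, 500 and 200 draws), all submitted together, instead of 1200 separate submissions.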
Thanks Hodgman, could you elaborate on how to do the 1 frame of latency, or point me to some resources about it? I have 5 frame buffers, but I guess I get confused and lost somewhere. My engine is based on Microsoft's MiniEngine; I've modified lots of things, but the main framework is almost the same. So basically I record the command list and submit it before Present within the same frame...
My current project is more of an academic research project, so I don't have a ton of stuff to draw; typically I only have around 50 draw/dispatch calls per frame. I mean, currently I can have the CPU wait for the GPU, but definitely not the other way around. So what's your suggestion?
Also, since my project has the following logic per frame, I feel it's very tricky to adopt the '1 frame of latency' strategy.
do {
    m = CPU_ICPSolver( result ); // Nothing to do with GPU inside
    GPU_PrepareWorkingBuffer(
        depth_and_normalmap1, // input as SRV
        depth_and_normalmap2, // input as SRV
        matrix,               // input as CBV
        workingBuf );         // output as UAV (all 7 buffers)
    for (int i = 0; i < 7; ++i) {
        GPU_Reduction::Process( workingBuf[i] ); // reduction to 1 float4 value on the GPU, not yet copied to the readback buffer
    }
    GPU_Reduction::Readback( result ); // read the reduction result: copy from default heap to readback heap, needs to wait for the GPU inside
    reprojection_error = GetReprojectionError( result );
    ++iterations; // increment implied by the loop condition below
} while (iterations < 20 && reprojection_error > threshold);
So my project has a long chain of GPU/CPU work dependencies here. Any suggestions? Thanks.