It seems to be caused by the CPU calling Present too many times before the GPU has actually executed the draw calls and swapped the buffers.
I have tried using the DXGI_PRESENT_DO_NOT_WAIT flag, but Present then returns a failing HRESULT with a "GPU is busy" message (presumably DXGI_ERROR_WAS_STILL_DRAWING). If I use this flag and skip further draw calls whenever the HRESULT indicates failure, the throttling on the GPU is gone and the CPU is no longer waiting on anything. However, this is obviously not a solution, because I'm just ignoring everything until my Present call succeeds and only continuing to Update/Render after that.
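To make the workaround concrete, here is a minimal sketch of what I mean, not my actual code. It assumes the "GPU is busy" result is DXGI_ERROR_WAS_STILL_DRAWING and copies the numeric values from the Windows SDK headers so it compiles without the SDK. `FrameLoop`, `tryPresent`, `FrameResult` and `FakeSwapChain` are hypothetical names made up for this illustration; in the real application the swap chain would be an IDXGISwapChain.

```cpp
#include <cstdint>

// Stand-ins for Windows SDK definitions, so the sketch builds without the SDK.
using HResult = std::int32_t;                             // stands in for HRESULT
constexpr std::uint32_t kPresentDoNotWait = 0x00000008u;  // DXGI_PRESENT_DO_NOT_WAIT
constexpr HResult kOk = 0;                                // S_OK
constexpr HResult kWasStillDrawing =
    static_cast<HResult>(0x887A000Au);                    // DXGI_ERROR_WAS_STILL_DRAWING

enum class FrameResult { Presented, GpuBusy, Failed };

// Non-blocking Present. SwapChain is any type exposing
// Present(syncInterval, flags), which is the shape of IDXGISwapChain::Present.
template <typename SwapChain>
FrameResult tryPresent(SwapChain& swapChain, std::uint32_t syncInterval) {
    const HResult hr = swapChain.Present(syncInterval, kPresentDoNotWait);
    if (hr == kWasStillDrawing) {
        return FrameResult::GpuBusy;  // GPU busy: nothing was queued
    }
    return hr >= 0 ? FrameResult::Presented : FrameResult::Failed;
}

// Hypothetical frame loop: a new frame is rendered only after the previous
// Present has been accepted; while the GPU is busy the tick does no drawing.
struct FrameLoop {
    bool presentPending = false;
    int framesRendered = 0;
    int presentsSkipped = 0;

    template <typename SwapChain>
    FrameResult tick(SwapChain& swapChain) {
        if (!presentPending) {
            // Update() and Render() would be called here.
            ++framesRendered;
            presentPending = true;
        }
        const FrameResult result = tryPresent(swapChain, 1);
        if (result == FrameResult::GpuBusy) {
            ++presentsSkipped;       // retry the same frame on the next tick
        } else {
            presentPending = false;  // accepted, or failed for another reason
        }
        return result;
    }
};

// Fake swap chain for illustration only: reports "GPU busy" for the first
// busyCalls calls made with the do-not-wait flag, then succeeds.
struct FakeSwapChain {
    int busyCalls;
    HResult Present(std::uint32_t /*syncInterval*/, std::uint32_t flags) {
        if ((flags & kPresentDoNotWait) != 0 && busyCalls > 0) {
            --busyCalls;
            return kWasStillDrawing;
        }
        return kOk;
    }
};
```

With a `FakeSwapChain` that reports busy twice, three calls to `tick` render a single frame and skip two Presents. That is exactly the "ignore everything until Present succeeds" behaviour I'm unhappy with. A real loop would also have to handle the `Failed` case (for example a removed device) instead of just moving on.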
I'm still stumped as to why the "Wait" takes so long, considering that a single frame generally takes about 14 ms to render. If anything, the "Draw" row in the attached picture suggests that my previous draw calls and Present have already completed along the timeline, so I'm not sure what this "Wait" is actually waiting for.
Any help would be appreciated.