Today I decided to measure the performance of my application the way I always did with Win32 applications: via xperf and custom markers.
I placed ID3D11DeviceContext2::BeginEventInt()/EndEvent() calls in all significant places, ran xperf, and collected data. Here are the timings for one frame:
Line #  Label (Field 3)         Time (s)
2       Cleanup                 46.07790765
3       Deferred GeometryPass   46.077912
4       SSAO pass               46.0780287
5       SSAO Blur               46.0780344
6       SSAO Blur1::Hor         46.07803541
7       SSAO Blur1::Vert        46.07803742
8       SSAO Blur2::Hor         46.0780391
9       SSAO Blur2::Vert        46.0780401
10      RtVisualizer            46.07804178
11      Wnd3d::Present          46.07804513
12      Cleanup                 46.10182305  (the next frame)
The biggest gap, about 23.8 ms, is between Present() and the first marker of the next frame.
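To make the gaps explicit, here is a small sketch that subtracts consecutive marker timestamps from the table. The names Marker, intervals_ms, longest_interval, frame_markers and print_frame_report are made up for this post; they are not part of xperf or Direct3D. Only the labels and timestamps come from the capture above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// One marker row from the xperf export: label plus absolute time in seconds.
// (Hypothetical helper type, not an xperf or D3D11 structure.)
struct Marker {
    std::string label;
    double time_s;
};

// The frame shown in the table above, plus the first marker of the next frame.
const std::vector<Marker>& frame_markers() {
    static const std::vector<Marker> markers = {
        {"Cleanup", 46.07790765},
        {"Deferred GeometryPass", 46.077912},
        {"SSAO pass", 46.0780287},
        {"SSAO Blur", 46.0780344},
        {"SSAO Blur1::Hor", 46.07803541},
        {"SSAO Blur1::Vert", 46.07803742},
        {"SSAO Blur2::Hor", 46.0780391},
        {"SSAO Blur2::Vert", 46.0780401},
        {"RtVisualizer", 46.07804178},
        {"Wnd3d::Present", 46.07804513},
        {"Cleanup", 46.10182305},  // the next frame
    };
    return markers;
}

// Gap between each marker and the one that follows it, in milliseconds.
std::vector<double> intervals_ms(const std::vector<Marker>& m) {
    std::vector<double> out;
    for (std::size_t i = 0; i + 1 < m.size(); ++i) {
        assert(m[i + 1].time_s >= m[i].time_s);  // markers must be in time order
        out.push_back((m[i + 1].time_s - m[i].time_s) * 1000.0);
    }
    return out;
}

// Index of the largest gap; returns 0 for an empty list.
std::size_t longest_interval(const std::vector<double>& gaps) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < gaps.size(); ++i) {
        if (gaps[i] > gaps[best]) best = i;
    }
    return best;
}

// Prints one line per gap: "from -> to", followed by the gap in ms.
void print_frame_report(const std::vector<Marker>& m) {
    const std::vector<double> gaps = intervals_ms(m);
    for (std::size_t i = 0; i < gaps.size(); ++i) {
        std::printf("%-22s -> %-22s %10.5f ms\n",
                    m[i].label.c_str(), m[i + 1].label.c_str(), gaps[i]);
    }
}
```

Calling print_frame_report(frame_markers()) from main() lists every gap. With these numbers, everything from the first Cleanup up to Present() adds up to roughly 0.14 ms, while Present() to the next Cleanup takes roughly 23.8 ms.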
In Process Explorer I can see 50% GPU utilization.
I assume procexp's 50% corresponds to 100% of real GPU utilization (the GPU fan gets very loud).
So the GPU does not sleep after Present().
With these two pieces of information in mind, I've drawn two conclusions:
- The immediate context is not immediate: it only collects the draw commands, and the Present() call starts executing them.
- With my current knowledge I can’t measure GPU performance. =(
It means I can't see the difference between 16 samples and 64 samples in the SSAO pass.
Right now the only tools I can use are the noise of my video card's cooler, procexp's GPU graph, and the FPS counter in my application.
That is not very precise.
I would like to see, with high precision, what changes when I tune my renderer's parameters.
Are there any suggestions on how to measure GPU (shader) performance in DX11?
Thanks in advance.