I've implemented a postprocess effect and measured its performance with DX queries, using double buffering. I got 2 ms.
Next, I wanted to make my scene more complex, so I loaded a few more meshes and a whole lot of textures (previously 1 draw call, now 400) and measured the postprocess time again. It should be the same, right? It's independent of the geometry rendering phase. But to my surprise, the DX queries gave me 3.2 ms.
I thought that maybe the problem was that I was using too few queries, and that reading the queries from frame n-1 in frame n was not good enough. So I switched to quadruple buffering, and now I get 1.42 ms.
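For context, here is a minimal sketch of how N-buffered query readback and the tick-to-millisecond conversion are commonly arranged. It is not the code measured above. `TimestampSample`, `elapsed_ms`, `write_slot` and `read_slot` are hypothetical names, and the struct only mirrors the values that D3D11 reports through `D3D11_QUERY_TIMESTAMP` (tick counts) and `D3D11_QUERY_TIMESTAMP_DISJOINT` (`Frequency`, `Disjoint`), so the arithmetic can be run without a device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical plain-data mirror of one timed pass. In a real D3D11 app these
// values would come from two timestamp queries and one disjoint query.
struct TimestampSample {
    std::uint64_t begin_ticks;  // timestamp issued before the pass
    std::uint64_t end_ticks;    // timestamp issued after the pass
    std::uint64_t frequency;    // ticks per second (Frequency)
    bool disjoint;              // true if the counter was unreliable (Disjoint)
};

// Convert a sample to milliseconds, rejecting samples that cannot be trusted.
inline std::optional<double> elapsed_ms(const TimestampSample& s) {
    if (s.disjoint || s.frequency == 0 || s.end_ticks < s.begin_ticks) {
        return std::nullopt;
    }
    const double ticks = static_cast<double>(s.end_ticks - s.begin_ticks);
    return ticks * 1000.0 / static_cast<double>(s.frequency);
}

// Slot whose queries are issued during this frame (ring of n query sets).
inline std::size_t write_slot(std::uint64_t frame, std::size_t n) {
    assert(n > 0);
    return static_cast<std::size_t>(frame % n);
}

// Oldest slot in the ring: it was issued n-1 frames ago, so its result is the
// most likely to be ready. Only meaningful once frame >= n-1.
inline std::size_t read_slot(std::uint64_t frame, std::size_t n) {
    assert(n > 0);
    return static_cast<std::size_t>((frame + 1) % n);
}
```

With `n = 2` this reads the results of frame n-1 during frame n, and with `n = 4` it reads the results of frame n-3. A sample flagged as disjoint is dropped instead of being averaged in, on the assumption that a disjoint interval makes the tick difference meaningless.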
Is there a way to get consistent timings from DX queries? When I was working on similar things at work, where I had access to a PS4, I profiled only there because the PS4 *always* gave me consistent measurements. But now I'm stuck with DX11 on PC and want to get correct timings.
Also, I get different timings depending on whether I run my app in Debug or Release mode: they are 2x worse in Debug. I'm not sure they should differ that much, since I'm measuring GPU time.