Dargal

OpenGL SwapBuffers blocking


While doing performance analysis with NVIDIA Nsight I found that calls to SwapBuffers can start to block. I managed to reproduce the same behavior in a small test program by simulating slow CPU frames with a 6 ms delay. After around 200 frames one can see that SwapBuffers starts to block, almost as if OpenGL switches into a stricter synchronization mode. After 250 frames the 6 ms delay is removed, but SwapBuffers continues to block and never returns to the normal mode. Is this a feature of OpenGL trying to be smart and produce a smoother visualization?

Tested on Windows 10 with a GeForce GTX 980.

 

[attachment=30365:Figure1.png]

 

[attachment=30366:Figure2.png]
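
For reference, the test loop has roughly the following shape. This is only a sketch, not the actual test code: it uses GLFW purely to keep the window/context setup short, and the frame counts and delay are the ones described above.

// Sketch of the test: time SwapBuffers every frame, and fake a slow CPU frame
// with a Sleep(6) after the swap for the first 250 frames, then remove it.
// GLFW is used here only to keep window/context creation short.
#include <windows.h>   // Sleep
#include <GLFW/glfw3.h>
#include <chrono>
#include <cstdio>

int main()
{
    if (!glfwInit()) return 1;
    GLFWwindow* window = glfwCreateWindow(800, 600, "SwapBuffers test", nullptr, nullptr);
    if (!window) { glfwTerminate(); return 1; }
    glfwMakeContextCurrent(window);
    glfwSwapInterval(0); // vsync off, so any blocking comes from the driver/queue

    for (int frame = 0; frame < 1000 && !glfwWindowShouldClose(window); ++frame)
    {
        glClearColor(0.1f, 0.2f, 0.3f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        // ... a trivial amount of real rendering goes here ...

        auto t0 = std::chrono::high_resolution_clock::now();
        glfwSwapBuffers(window); // wraps SwapBuffers on Windows
        auto t1 = std::chrono::high_resolution_clock::now();

        if (frame < 250)
            Sleep(6); // fake high CPU load; removed after frame 250

        std::printf("frame %4d  swap %6.2f ms\n", frame,
                    std::chrono::duration<double, std::milli>(t1 - t0).count());

        glfwPollEvents();
    }

    glfwTerminate();
    return 0;
}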

If your GPU frametime is higher than your CPU frametime, then this is expected. Otherwise, you'd just end up with an ever-growing amount of work in the GPU's command queue!
What's the total frametime on the GPU, and on the CPU without counting SwapBuffers?

[edit] Actually, in your pictures it looks like there's initially a healthy amount of latency between the CPU and GPU -- while the CPU is issuing commands for frame N, the GPU is busy working on frame N-1.
However, the GPU is slowly catching up to the CPU and eroding that latency buffer.
Once the GPU catches up to the CPU, you're in the situation where the CPU is generating commands for frame N and the GPU has nothing to do, as it's waiting for frame N's commands to arrive.
This is akin to watching an online video that has emptied its buffer and needs to pause for buffering.

When the GPU runs out of work to do, Windows likely steps in and tells it to sleep for a moment to avoid wasting power, and to let the command queue fill back up. When the CPU finally produces a new frame's worth of GL commands, Windows has to flush them through to the GPU queue and wake it back up.

I would guess that if you increase the CPU=>GPU latency (by using a realistic workload, and by being GPU-bound), throughput will increase.
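
If you want numbers for that, a rough way to get them is a GL_TIME_ELAPSED timer query for the GPU side and a CPU timer that stops just before SwapBuffers for the CPU side. A minimal sketch along those lines (GLFW/GLEW used only to keep the setup short; reading the query result right after the swap stalls, which is fine for a measurement test):

// Sketch: GPU frame time via GL_TIME_ELAPSED, CPU frame time measured up to
// (but not including) SwapBuffers. GLFW/GLEW are used only to keep setup short.
#include <GL/glew.h>
#include <GLFW/glfw3.h>
#include <chrono>
#include <cstdio>

int main()
{
    if (!glfwInit()) return 1;
    GLFWwindow* window = glfwCreateWindow(800, 600, "frametime", nullptr, nullptr);
    if (!window) { glfwTerminate(); return 1; }
    glfwMakeContextCurrent(window);
    if (glewInit() != GLEW_OK) return 1;
    glfwSwapInterval(0);

    GLuint query = 0;
    glGenQueries(1, &query);

    while (!glfwWindowShouldClose(window))
    {
        auto cpuStart = std::chrono::high_resolution_clock::now();

        glBeginQuery(GL_TIME_ELAPSED, query); // measures GPU time for the commands inside
        glClearColor(0.1f, 0.2f, 0.3f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);
        // ... the frame's draw calls go here ...
        glEndQuery(GL_TIME_ELAPSED);

        auto cpuEnd = std::chrono::high_resolution_clock::now(); // CPU time, excluding the swap
        glfwSwapBuffers(window);

        // Reading the result here stalls until the GPU has finished the frame;
        // acceptable for a measurement test, not for a real renderer.
        GLuint64 gpuNs = 0;
        glGetQueryObjectui64v(query, GL_QUERY_RESULT, &gpuNs);

        std::printf("CPU (excl. swap) %6.2f ms   GPU %6.2f ms\n",
                    std::chrono::duration<double, std::milli>(cpuEnd - cpuStart).count(),
                    gpuNs / 1.0e6);

        glfwPollEvents();
    }

    glDeleteQueries(1, &query);
    glfwTerminate();
    return 0;
}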


Thanks!

 

Yes, I think you are right. The Sleep(6) after SwapBuffers is a bad way of faking high CPU load, but it does produce a ~22 ms frame. I'll modify the test to make it more realistic, as you suggest.
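
What I have in mind for the modified test is something like the function below, called instead of the Sleep(6), so the frame becomes genuinely GPU-bound. Only a sketch: immediate mode is used to keep it short, and overdrawPasses is just a made-up knob to tune.

// Sketch of the planned replacement for Sleep(6): real GPU work instead of a
// CPU delay, so the command queue stays ahead of the CPU. Immediate mode is
// used only to keep the sketch short; overdrawPasses is an invented knob.
#include <windows.h>
#include <GL/gl.h>

void drawHeavyFrame(int overdrawPasses)
{
    glClear(GL_COLOR_BUFFER_BIT);
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

    // Blended, screen-covering triangles make the frame fill-rate (GPU) bound.
    for (int i = 0; i < overdrawPasses; ++i)
    {
        glBegin(GL_TRIANGLES);
        glColor4f(0.2f, 0.4f, 0.6f, 0.05f);
        glVertex2f(-1.0f, -1.0f);
        glVertex2f( 3.0f, -1.0f);
        glVertex2f(-1.0f,  3.0f);
        glEnd();
    }
    glDisable(GL_BLEND);
}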


Yes, theagentd, you are right! In the test above, Threaded Optimization was set to Auto, and the driver apparently detects the bad performance caused by implicit synchronization and turns Threaded Optimization off. My guess is that the latency between CPU and GPU is one of the parameters used to decide when to fall back to stricter CPU synchronization.

 

Here is some more testing using a simple rendering loop that draws a single untextured triangle. See the results in the figures below.

[attachment=30450:Figure1a.png]

Figure 1, threaded optimization off

[attachment=30451:Figure2a.png]

Figure 2, threaded optimization on
