# OpenGL Display Loop with Win32

Hello all,

I've been working on a little OpenGL test recently, and I was trying to set up smooth animation.  I am programming on Windows, using win32 for my application setup, and GLEW for my OpenGL calls.  I apologize in advance if this question is more win32 than OpenGL, but here goes:

My window is set up to support OpenGL, but is not full screen.  My main loop is the standard while loop, with a PeekMessage() once per iteration.  If no messages exist, I display and update my scene.  The display function runs a bunch of OpenGL state changes, a few draw functions, and then calls glFlush().  Now, my expectation was that this loop would be running as fast as it can (i.e. no waits).  But I noticed that my animations appear to be running at around 60 FPS; I profiled it with QueryPerformanceCounter(), and that does indeed seem to be the case.

Initially, I had thought that perhaps one of my win32 functions was blocking in order to fulfill some sort of requirement, but upon closer inspection it looks like it is actually my display function that is taking around 16.6 ms on average.  In that function I only call glFlush() at the end, I do not call glFinish().

Does anybody know why this might be happening?  It's actually kind of what I *want* to happen, but I'd like to know why my loop is mysteriously locked to 60 FPS.  My understanding was that getting VSYNC with OpenGL and win32 requires special use of extensions to configure the swap interval, so I was pretty surprised with this behavior.

##### Share on other sites
Don’t call glFlush() unless you have a specific reason.
Call wglSwapIntervalEXT( 0 ) (from the WGL_EXT_swap_control extension) to disable v-sync on Windows. You may also have to check your graphics card’s control panel to ensure no overrides are active.

L. Spiro

##### Share on other sites

Check your video card's control panel to see if vsync is forced there.

Your use of glFlush but no mention of a SwapBuffers call indicates that you may have created a single-buffered context.  I'm not sure how single-buffered contexts are handled under WDDM but if you're using one I'd suggest that you stop doing so - create a double-buffered context instead and use SwapBuffers at the end of your display function.  In other words, get things set up the way they should be in your own code before going looking for problems elsewhere.

##### Share on other sites

Thanks for the responses.  An update:

- I do have a double buffered Render Context, I apologize for not mentioning that initially.

- I tried removing the glFlush() call, which seemed to have no effect at all.  This surprised me, as I'm used to flushing my graphics pipe before any accumulated commands are sent.  Is this not required in OpenGL?

- I tried calling wglSwapIntervalEXT( 0 ) as you suggested, Spiro, and that definitely behaves more like I was expecting.  Now I get far more than 60 iterations per second, with frame averages of around 0.15 ms.

So this makes sense, assuming that some process is waiting for the next VSYNC to actually swap the display buffers ( with the interval set to 1 ).  But what I am wondering is: where exactly is my thread getting blocked?  From my profiling it seems like it is happening during my display function ( I could narrow that down further ... ), which is entirely OpenGL calls ( the SwapBuffers() call comes after ).  Does OpenGL just wait on the next call if a buffer swap has been requested that hasn't completed yet?

Just for clarity: the reason I'd like to know is so that I can control the VSYNC wait myself.  I actually want to be limited to 60 fps, but I'd like to be able to manage where that wait occurs in my render loop.

##### Share on other sites

glFlush() tells the driver to start executing the commands submitted so far without waiting for them to finish; glFinish() is the call that blocks until all submitted rendering is complete. And you're correct, as far as I know the wait occurs on SwapBuffers() and there's nothing you can do about it. Why precisely would you want to, though? The best you can do is probably to measure frame time and, if your frames are consistently shorter than your desired frame time, do some extra work. Being able to control VSYNC would also kind of fight its nature: the display refreshes at a fixed rate that programs cannot alter.
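As an illustration of the "measure frame time" idea, here is a minimal, portable sketch. It is not something proposed in the thread: `FramePacer`, `spare()` and `wait()` are hypothetical names, and it uses `std::chrono::steady_clock` in place of QueryPerformanceCounter(). It assumes the swap interval has been set to 0, so the loop itself decides where the wait happens. It does not synchronize with the display's actual refresh, so tearing is possible.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: tracks a fixed frame period and lets the caller
// decide where in the loop the remaining frame time is spent.
class FramePacer {
public:
    using Clock = std::chrono::steady_clock;

    explicit FramePacer(double target_hz)
        : period_(period_for(target_hz)),
          deadline_(Clock::now() + period_) {}

    // Time left in the current frame; zero if the frame is already late.
    Clock::duration spare() const {
        const auto now = Clock::now();
        return now < deadline_ ? deadline_ - now : Clock::duration::zero();
    }

    // Block until the frame deadline, then schedule the next one.
    void wait() {
        std::this_thread::sleep_until(deadline_);
        deadline_ += period_;
        // If the loop fell behind, restart from "now" instead of
        // catching up with a burst of short frames.
        const auto now = Clock::now();
        if (deadline_ < now) deadline_ = now + period_;
    }

private:
    static Clock::duration period_for(double hz) {
        assert(hz > 0.0 && "target rate must be positive");
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / hz));
    }

    Clock::duration period_;
    Clock::time_point deadline_;
};
```

In a render loop, `spare()` could be queried after SwapBuffers() to decide how much optional work fits in the frame, with `wait()` called wherever the loop should block.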

##### Share on other sites

Well, it's not that I want to control the VSYNC itself, it's that I want to control when my thread is blocked by it ( or at least know exactly when it is going to occur ).  And that's the funny thing; I assumed it would occur on SwapBuffers, but it seems like it's actually happening somewhere in the OpenGL calls based on my profiling.  Really I just need to know what function is going to block as that will affect how I synchronize with other systems in the simulation.

##### Share on other sites

Alright, dug a little deeper:

I did some more timing to see where the block occurs in my display function.  It's not in glClear(), glUseProgram(), or any glUniform...() calls.  It appears that the wait happens when I call glBindBuffer() to activate my vertex data.

This to me sounds like the buffer resource is being locked internally while the GPU is using it to render.  This is actually undesirable for me, as I definitely want to be able to start writing draw commands for my current frame while the GPU is working on the last frame.  If I'm right about this, is there any way to avoid the stall?  Since I don't need to modify that data, it seems like there ought to be a way to use it for rendering without trying to lock it...

##### Share on other sites

> - I tried removing the glFlush() call, which seemed to have no effect at all.  This surprised me, as I'm used to flushing my graphics pipe before any accumulated commands are sent.  Is this not required in OpenGL?

There are already built-in flushes in the OpenGL pipeline, such as swapping render targets etc. Every standard game can simply rely on these built-in flushes; there are very few reasons to call it manually, and unless you actually have a reason (such as multi-threaded rendering) there will indeed be no change except in performance, which will be decreased by glFlush() (and even more so by glFinish()).

V-sync waits only happen inside SwapBuffers().

glBindBuffer() does not cause such a stall. Stalls related to buffers only occur when you try to overwrite parts of the buffer that have previously been sent to the GPU for rendering but have yet to be flushed. Double-buffering these buffers solves this problem, as does updating only the parts of the buffer that were not sent to the GPU on the previous frame.

L. Spiro
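The double-buffering idea for dynamic vertex data can be sketched roughly as follows. This is an illustrative sketch only: `BufferRing`, `write_target()` and `advance()` are hypothetical names, and plain `unsigned` values stand in for real buffer object names (which would normally come from glGenBuffers) so that the example compiles without an OpenGL context.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: rotates between several buffer handles so the CPU
// writes into one buffer while the GPU may still be reading the buffer
// that was used for the previous frame.
class BufferRing {
public:
    explicit BufferRing(std::vector<unsigned> handles)
        : handles_(std::move(handles)) {
        assert(handles_.size() >= 2 && "need at least two buffers to alternate");
    }

    // Buffer that should be overwritten and drawn from this frame.
    unsigned write_target() const { return handles_[index_]; }

    // Call once per frame, after submitting the draw calls that read from
    // write_target(); the next frame then writes to a different buffer.
    void advance() { index_ = (index_ + 1) % handles_.size(); }

private:
    std::vector<unsigned> handles_;
    std::size_t index_ = 0;
};
```

Each frame would bind and update `write_target()`, issue its draw calls, and then call `advance()`. Static data that is never modified, as in the original question, does not need this treatment.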

##### Share on other sites

Thank you Spiro, that clarification helps a lot.

The only thing that doesn't make sense then is the profiling I am doing.  I built a little profiling class using QueryPerformanceCounter(), and I am using it to record time taken for blocks of code.  My loop looks like this:

```cpp
while (!done)
{
    if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
    {
        if (msg.message == WM_QUIT)
        {
            done = TRUE;
        }
        else
        {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
    }
    else
    {
        display();

        SwapBuffers(hDC);
    }
}
```


Now I have a start/stop timer block around the display() function and the SwapBuffers() function.  If the main thread wait occurs inside SwapBuffers(), I would expect that the SwapBuffers() timer would take around 16 ms ( as my scene is very light, just two quads being drawn ).  But from my numbers, the SwapBuffers() call takes almost no time and the display() call takes around 16.6 ms.  That's why it seemed like something in my display() function was blocking, which led me to glBindBuffer().

But maybe I have an error in my timer class, or I am making an invalid assumption here?
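The poster's timer class is not shown, so the following is only a guess at what such a per-section profiler might look like. It is a minimal sketch with hypothetical names (`SectionProfiler`, `Scope`, `record()`, `average_ms()`) and uses the portable `std::chrono::steady_clock` rather than QueryPerformanceCounter().

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical profiler: accumulates wall-clock time per named section.
class SectionProfiler {
public:
    using Clock = std::chrono::steady_clock;

    // RAII guard: starts timing on construction, records on destruction.
    class Scope {
    public:
        Scope(SectionProfiler& owner, std::string name)
            : owner_(owner), name_(std::move(name)), start_(Clock::now()) {}
        ~Scope() { owner_.record(name_, Clock::now() - start_); }
        Scope(const Scope&) = delete;
        Scope& operator=(const Scope&) = delete;

    private:
        SectionProfiler& owner_;
        std::string name_;
        Clock::time_point start_;
    };

    // Add one measurement to the named section.
    void record(const std::string& name, Clock::duration elapsed) {
        Entry& e = totals_[name];
        e.total += elapsed;
        ++e.calls;
    }

    // Average time per call in milliseconds; 0 if the section never ran.
    double average_ms(const std::string& name) const {
        const auto it = totals_.find(name);
        if (it == totals_.end() || it->second.calls == 0) return 0.0;
        const double total_ms =
            std::chrono::duration<double, std::milli>(it->second.total).count();
        return total_ms / static_cast<double>(it->second.calls);
    }

private:
    struct Entry {
        Clock::duration total{};
        long calls = 0;
    };
    std::map<std::string, Entry> totals_;
};
```

Wrapping `display();` and `SwapBuffers(hDC);` each in their own braces with a `SectionProfiler::Scope` gives separate averages for the two calls. Timing smaller regions inside display() the same way is how a blocking call can be narrowed down.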

##### Share on other sites

Think I found my problem!

In my display() function, I was calling glGetUniformLocation() for the projection, world and texture locations.  That seems to be what blocks, not glBindBuffer() ( I had been overlooking that in the profile region ).

If I instead cache the uniform locations on init after linking the shader, I no longer stall in my display() function.  Now SwapBuffers() is what takes the 16.6 ms as expected.

I am a little surprised that getting the uniform location would block like that, but maybe it's documented somewhere...?
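The caching fix described above can be sketched like this. It is an illustration under assumptions, not the poster's code: `UniformCache` and `location()` are hypothetical names, and the actual query is injected as a callback so the example compiles without an OpenGL context. In a real program the callback would wrap glGetUniformLocation() for a program that has already been linked.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache: resolves each uniform name once and reuses the
// result, so the per-frame path never queries the driver.
class UniformCache {
public:
    using Lookup = std::function<int(const std::string&)>;

    explicit UniformCache(Lookup lookup) : lookup_(std::move(lookup)) {
        assert(lookup_ && "a lookup function is required");
    }

    // Returns the cached location, querying the driver only on first use.
    int location(const std::string& name) {
        const auto it = cache_.find(name);
        if (it != cache_.end()) return it->second;
        const int loc = lookup_(name);
        cache_.emplace(name, loc);
        return loc;
    }

private:
    Lookup lookup_;
    std::unordered_map<std::string, int> cache_;
};
```

Calling `location()` for every uniform once during initialization, right after linking, keeps all lookups out of display(). The cache would need to be rebuilt if the program is relinked.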
