
VSync messing with QPC() timing?


I apologize if this is the wrong subforum, but I couldn't quite figure out where to post this.
Anyway, I'm currently trying to make my physics calculations independent of rendering. What I've done is make every physical object store its transformation matrices, which are then modified during the physics calculations or used for rendering.

Calculations and rendering are initiated in the message loop as follows:



timer.Update(); //this essentially does the QueryPerformanceCounter(&tickCount)
if((timer.tickCount.QuadPart-timer.memory1.QuadPart)>=(timer.tickFreq.QuadPart/GlobalParams.PhysicsFreq))
{
	ProcessPhysics(ObjectList,GlobalParams.PhysicsFreq);
	timer.memory1=timer.tickCount;
	phystick++;
}
if((timer.tickCount.QuadPart-timer.memory2.QuadPart)>=(timer.tickFreq.QuadPart/GlobalParams.MaxFPS))
{
	FramePass();
	timer.memory2=timer.tickCount;
	phystick=0;
}

Notes: timer is a custom object;

tickFreq is initialized via QueryPerformanceFrequency(&tickFreq) in the constructor;

memory1, memory2 and phystick are pre-initialized to zero;

phystick is displayed during rendering using ID3DXFont interface.

Also, let's assume MaxFPS to be 60 and PhysicsFreq to be 1000.

And I'm using a laptop.

 

Now, I've decided to do the whole "count the ticks" thing to check that the "physics-to-frames" ratio stays the same. And here's the problem. Without VSync (D3DPRESENT_INTERVAL_IMMEDIATE), it's fine. With VSync (D3DPRESENT_INTERVAL_ONE), it stays level for around 5 seconds and then gradually drops to roughly a third of its original value. All movements slow down as well.

 

I've tried commenting out the physics call and updating tickFreq on every Update(), but nothing helps. FramePass() is executed at stable intervals, I've checked. What in the world could I be missing?

(Also, feel free to point out if my concept is a waste of time.)

I assume you've got a while(true){...} around the above code, or similar, so it's:
while(true)
{
  if( 1ms has passed since last physics update )
    Advance physics by 1ms
  if( 16ms has passed since last draw )
    Draw
}
If you're vsync'ing to 60Hz, then the Draw function will take 16.6667ms. Then in the next iteration of the loop:
* you check if 1ms has passed, and it has, so you advance the physics by 1ms.
* You check if 16.6667ms has passed, and it has, so you draw another frame (which will take 16.6667ms of time again).

So, for each 16.6667ms of time that passes, you only advance the physics simulation by 1ms!
If your loop ever takes more than 1ms, then your physics will go into slow motion.
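To make the arithmetic above concrete, here is a minimal sketch that runs the same loop structure on a virtual clock. It assumes the simplest model, in which Draw blocks for a fixed time; `simulateNaiveLoop`, `NaiveLoopResult` and the cost parameters are hypothetical names for illustration, not part of the original code.

```cpp
// Result of simulating the loop for a given span of virtual time.
struct NaiveLoopResult {
    long physicsSteps;   // number of 1 ms physics updates performed
    long framesDrawn;    // number of Draw calls performed
};

// Simulates the "at most one physics step per loop iteration" structure on a
// virtual clock measured in microseconds. drawCostUs models how long Draw
// blocks (about 16667 us if Present waits for a 60 Hz vsync); idleCostUs
// models the cost of one pass through the loop itself. Both are assumptions.
NaiveLoopResult simulateNaiveLoop(long totalUs, long drawCostUs, long idleCostUs) {
    const long physicsPeriodUs = 1000;   // 1000 Hz physics
    const long framePeriodUs = 16000;    // the "16ms" check from the pseudocode
    long now = 0, lastPhysics = 0, lastDraw = 0;
    NaiveLoopResult result{0, 0};
    while (now < totalUs) {
        now += idleCostUs;               // time spent getting around the loop
        if (now - lastPhysics >= physicsPeriodUs) {
            ++result.physicsSteps;       // advances the world by only 1 ms,
            lastPhysics = now;           // however much time really passed
        }
        if (now - lastDraw >= framePeriodUs) {
            lastDraw = now;              // timestamp sampled before drawing
            now += drawCostUs;           // Draw blocks for this long
            ++result.framesDrawn;
        }
    }
    return result;
}
```

Calling `simulateNaiveLoop(1000000, 100, 10)` (a cheap Draw) yields roughly one physics step per millisecond, while `simulateNaiveLoop(1000000, 16667, 10)` (a Draw that blocks for a full refresh) yields about one physics step per frame, which is the slow-motion case.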

You need something like:
double physicsAccumulator = 0;
while(true)
{
  physicsAccumulator += actual time passed since last iteration
  while(physicsAccumulator >= 1ms)
  {
    Advance physics by 1ms
    physicsAccumulator -= 1ms
  }
  if( frame-limiter is enabled && 16ms has passed since last draw )//frame limiter shouldn't be enabled if vsync is enabled
    Draw
}
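A minimal C++ sketch of the accumulator idea, assuming time is measured in microseconds via `std::chrono`; the class name `FixedStepAccumulator` and its methods are made up for illustration. It only does the bookkeeping: the caller measures the elapsed time (for example from QueryPerformanceCounter) and runs the returned number of fixed steps.

```cpp
#include <chrono>

using Micros = std::chrono::microseconds;

// Hypothetical helper: collects real elapsed time and hands it back out
// in whole fixed-size steps, carrying the remainder to the next call.
class FixedStepAccumulator {
public:
    explicit FixedStepAccumulator(Micros step) : step_(step), accumulated_(0) {}

    // Adds the time that passed since the previous call and returns how many
    // fixed steps the simulation should advance right now.
    int consume(Micros elapsed) {
        accumulated_ += elapsed;
        int steps = 0;
        while (accumulated_ >= step_) {
            accumulated_ -= step_;
            ++steps;
        }
        return steps;
    }

    // Time that has been measured but not yet simulated.
    Micros leftover() const { return accumulated_; }

private:
    Micros step_;
    Micros accumulated_;
};
```

With a 1ms step, a loop iteration that took 16667 microseconds returns 16 steps and keeps 667 microseconds for the next iteration, so no time is lost even when Draw blocks.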
Edited by Hodgman


Um, I didn't quite get it. Are you saying VSync stalls the rendering function until the display has finished drawing the previously presented frame? So the "while(true)" only fires once every 16.(6) ms?

 

Regarding the "physicsAccumulator": I've considered passing the time since the last physics iteration to ProcessPhysics(), so that it would advance the game world by that amount, but then, if a game were to freeze momentarily, there would be a skip. I thought it would make quite a mess of things, so I've decided to use a fixed step size.


Are you saying VSync stalls the rendering function until the display has finished drawing the previously presented frame? So the "while(true)" only fires once every 16.(6) ms?

Yes, that is what Hodgman is saying; vsync blocks until the display is ready for another frame.

I thought it would make quite a mess of things, so I've decided to use a fixed step size.

Then you might find Fix Your Timestep an interesting read.


Yes, that is what Hodgman is saying; vsync blocks until the display is ready for another frame.

But why does the slowdown only occur after several seconds, then? To clarify, at first everything is just like without VSync, with phystick showing 16-17 every frame (that's got to mean the first "if" fires 16.(6) times as often as the second one... right?). Then it slowly drops. And even so, it stays higher than 1, so "while(true)" must be firing at more than 60Hz.

Edited by Pygmyowl


 


 

Traditionally, enabling vsync would mean that the "Present" function would block until a vblank period was reached, which would be once every 16.6ms for a 60Hz display (note that some displays may be 59Hz, or 72Hz, etc... so your MaxFPS parameter needs to be configurable).
However, this isn't necessarily true any more. Vsync means that the driver will implement a frame-limiter on the GPU, so that it only displays images in sync with the display's refresh rate. If you feed it images faster than that (calling present == giving the GPU an image to display), then instead of blocking, the driver can choose to queue up those images.
e.g. Say the driver can queue up 10 images at once -- then if you call Present 11 times in a row really quickly, the first 10 will complete immediately, and the 11th will block.

Basically, when you call any D3D function, you're just putting commands into a queue, which the GPU will execute later. If this queue gets too full, then Present will block until the GPU has emptied it enough. If vsync is on, then at some point Present is likely to block for a large amount of time, such as 16ms, so your loop has to be able to deal with jumps in time that are at least about that big.

Anyway, regardless of whether you're using vsync or not, your current game-loop/timer code is broken. For your code to work, you're assuming that the amount of work done per frame will always be less than 1ms. If drawing a frame and/or processing the physics ever takes more than 1ms, your physics will go into slow motion.
You cannot draw or simulate very much stuff in 1ms, so your game loop will only work for very simple games, or on very high performance computers. If you add more "stuff" to your game, it will go into slow motion due to each iteration of your loop taking more than 1ms...

On a side note: If your physics update ever takes more CPU time than your fixed timestep (e.g. if at the moment, the ProcessPhysics took more than 1ms to complete), you're in bigger trouble though - because there'd be no way to fix that. If 3ms has passed since the last update, you need to update your physics 3 times in order to not go into slow motion... but if each update takes the CPU 2ms to complete, then now 6ms has passed, so you'll have to update physics 6 times (which will take 12ms, etc, etc). To mitigate this, I'd suggest using a much lower physics update rate than 1000Hz (i.e. a larger fixed timestep).
1000Hz is really quite extreme for physics - you probably don't need it to be that precise, usually even 30Hz is enough.
I'm making a racing simulation game, which requires a lot of accuracy, so I update most of my physics at 60Hz, and just the car wheels (which is the most important part) are updated at 600Hz.
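The runaway catch-up described in the side note can be checked with a toy model. This is only a sketch under a simplifying assumption: wall-clock time advances only while physics updates are running, and `catchUpBacklog` is a hypothetical helper name.

```cpp
#include <vector>

// Toy model of catching up on unsimulated time. Each round runs as many fixed
// steps as the backlog requires; the time those steps take to execute becomes
// new backlog. Returns the backlog (in ms) after each round.
std::vector<long> catchUpBacklog(long initialBacklogMs, long stepMs,
                                 long costPerStepMs, int rounds) {
    std::vector<long> history;
    long backlog = initialBacklogMs;
    for (int i = 0; i < rounds; ++i) {
        long steps = backlog / stepMs;        // updates needed to catch up
        long spent = steps * costPerStepMs;   // wall-clock time they consume
        backlog = backlog - steps * stepMs + spent;
        history.push_back(backlog);
    }
    return history;
}
```

With a 1ms step that costs 2ms to compute, `catchUpBacklog(3, 1, 2, 3)` gives 6, 12, 24: the backlog doubles every round, matching the 3ms, 6ms, 12ms progression above. With a 10ms step that costs 2ms, `catchUpBacklog(30, 10, 2, 3)` settles at 6, 6, 6, which is below one step, so the loop has caught up.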
 

Regarding the "physicsAccumulator": I've considered passing the time since the last physics iteration to ProcessPhysics(), so that it would advance the game world by that amount, but then, if a game were to freeze momentarily, there would be a skip. I thought it would make quite a mess of things, so I've decided to use a fixed step size.

As above, your fixed step size has the other draw-back of causing your game to run in slow motion. You can create a hybrid of the two, which uses an accumulator but also has a maximum-updates-per-render setting to avoid big jumps.

uint64 physicsAccumulator = 0;
while(true)
{
  physicsAccumulator += ticks passed since last iteration
  int physicsSteps = physicsAccumulator / ticksPerPhysicsUpdate
  physicsAccumulator -= physicsSteps * ticksPerPhysicsUpdate // i.e. physicsAccumulator = the remainder

  if( physicsSteps > max updates per render )
      physicsSteps = max updates per render;//avoid large jumps if the game has been stalled
  for(int i=0; i!=physicsSteps; ++i)
  {
    Advance physics by fixed time step
  }
  Draw
}
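The hybrid pseudocode above could be written roughly as follows. This is a sketch, not a drop-in implementation: `TickAccumulator` and its members are hypothetical names, and it assumes `ticksPerStep` is computed the same way as in the original post (tickFreq.QuadPart / GlobalParams.PhysicsFreq).

```cpp
#include <cstdint>

// Hypothetical helper working directly in QueryPerformanceCounter ticks.
class TickAccumulator {
public:
    TickAccumulator(std::uint64_t ticksPerStep, std::uint64_t maxStepsPerRender)
        : ticksPerStep_(ticksPerStep),
          maxStepsPerRender_(maxStepsPerRender),
          accumulated_(0) {}

    // Adds the ticks elapsed since the last call and returns the number of
    // fixed physics steps to run before the next Draw.
    std::uint64_t advance(std::uint64_t elapsedTicks) {
        accumulated_ += elapsedTicks;
        std::uint64_t steps = accumulated_ / ticksPerStep_;
        accumulated_ -= steps * ticksPerStep_;   // keep only the remainder
        if (steps > maxStepsPerRender_) {
            steps = maxStepsPerRender_;          // drop the excess after a stall
        }
        return steps;
    }

    std::uint64_t remainder() const { return accumulated_; }

private:
    std::uint64_t ticksPerStep_;
    std::uint64_t maxStepsPerRender_;
    std::uint64_t accumulated_;
};
```

Note the design choice inherited from the pseudocode: when the clamp kicks in, the excess time is discarded, so after a long stall the simulation falls behind real time by that amount instead of jumping.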


Never looked in detail, but I am sure I saw a non-blocking mode somewhere, e.g. so you can do:

drawFrame();
while(!Present())
{
    //Present was not ready, e.g. due to vsync having too many queued frames
    //or the gpu just generally being too far behind (well I assume that blocks,
    //rather than it just in effect discarding a frame and never rendering it)
    //Do some other non-rendering task here and call present again
    tryUpdateLogic();//let the fixed step logic/physics run a step if its time
}


In response to Hodgman:

Thanks for the information on VSync and GPU queueing, it does indeed seem that frames were being fed to the display faster than it could draw them. I've slightly lowered the render function call frequency and the slowdowns seem to have ceased. I don't suppose it's too much of a crutch to render at (screen update rate)-1Hz if VSync is enabled?

 

As for my physical calculations, I've only just started figuring them out (hence the extreme update rate, for instance - I've decided to play it safe to avoid collision skips and such, but looks like I've overdone it). Thanks for everyone's suggestions, I'll definitely make use of them.


If the refresh rate is 60Hz and you're rendering consistently at 59Hz, it is going to duplicate frames. If you can't use a Present like I mentioned (I just looked at IDXGISwapChain and it seems to have flags to do so for Present), then I think you need to render at slightly above the refresh rate, so that it still gets all the frames it needs, with minimum time spent waiting on a vsync.


Have you got D3DCREATE_FPU_PRESERVE in your CreateDevice call?  If not I'd suggest that you put it in there as a temporary workaround, and doing so should resolve your problem, which sounds an awful lot like you're accumulating time deltas and suffering from floating point precision loss in the accumulation.
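To illustrate the kind of precision loss meant here: without D3DCREATE_FPU_PRESERVE, Direct3D 9 switches the calling thread's FPU to single precision, so a running time total can behave like a float. The sketch below compares float and double accumulation; `accumulateDeltas` and the numbers used are illustrative assumptions, not taken from the original code.

```cpp
// Adds the same small delta n times to a running total of type T.
template <typename T>
T accumulateDeltas(T start, T delta, int n) {
    T total = start;
    for (int i = 0; i < n; ++i) {
        total += delta;   // each addition is rounded to T's precision
    }
    return total;
}
```

Near 100000 the spacing between adjacent float values is about 0.0078, so `accumulateDeltas<float>(100000.0f, 0.001f, 1000)` never moves off 100000, while the double version ends up close to 100001. The original post keeps its timestamps as integer QuadPart values, which are not affected by this.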


If the refresh rate is 60Hz and you're rendering consistently at 59Hz, it is going to duplicate frames. If you can't use a Present like I mentioned (I just looked at IDXGISwapChain and it seems to have flags to do so for Present), then I think you need to render at slightly above the refresh rate, so that it still gets all the frames it needs, with minimum time spent waiting on a vsync.

Since you've mentioned IDXGISwapChain, does your method require DirectX 10 or higher?

 

Have you got D3DCREATE_FPU_PRESERVE in your CreateDevice call?  If not I'd suggest that you put it in there as a temporary workaround, and doing so should resolve your problem, which sounds an awful lot like you're accumulating time deltas and suffering from floating point precision loss in the accumulation.

Sadly, adding the flag did nothing. I've mentioned before that, as per Hodgman's suggestion, this is probably a frame queueing problem.



Since you've mentioned IDXGISwapChain, does your method require DirectX 10 or higher?
On Direct3D9, on Vista/Win7, you can create an IDirect3DDevice9Ex device instead of an IDirect3DDevice9 device, which then lets you use the PresentEx function with the D3DPRESENT_DONOTWAIT flag.
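A rough sketch of how such a non-blocking present could be wrapped, kept free of Direct3D headers so it stands alone. `PresentResult`, `presentWithoutBlocking` and the two callbacks are hypothetical; the assumption is that `tryPresent` would call PresentEx with D3DPRESENT_DONOTWAIT and report `StillDrawing` when that call returns D3DERR_WASSTILLDRAWING.

```cpp
#include <functional>

enum class PresentResult { Presented, StillDrawing };

// Keeps trying to present without blocking. While the present is refused,
// runs updateLogic so fixed-step physics can continue in the meantime.
// Returns the attempt number that succeeded, or -1 if maxAttempts ran out.
int presentWithoutBlocking(const std::function<PresentResult()>& tryPresent,
                           const std::function<void()>& updateLogic,
                           int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (tryPresent() == PresentResult::Presented) {
            return attempt;
        }
        updateLogic();   // do useful non-rendering work, then retry
    }
    return -1;
}
```

The attempt limit is there so a driver that never accepts the frame cannot trap the loop; what to do in that case (fall back to a blocking present, skip the frame) is left to the caller.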


Speaking of which, I tried obtaining the swap chain pointer from the device (using GetSwapChain) and calling Present from that (it has a flag parameter, like PresentEx), but D3DPRESENT_DONOTWAIT didn't have any effect. As it happens, my laptop has Intel and NVidia GPUs.

 

That said, is PresentEx any different from what I've tried? Should I give it a go nonetheless?

 

Update: I've come across an article describing the use of GetRasterStatus to determine if the frame is still being drawn. Adding this to my second timer check (so the render function isn't called until the display has finished drawing) seems to have fixed the problem.

Just in case, the second check now looks like this:

if((timer.tickCount.QuadPart-timer.memory1.QuadPart)>=(timer.tickFreq.QuadPart/GlobalParams.MaxFPS))
{
	d3ddev->GetRasterStatus(0,&d3drst);
	if(d3drst.InVBlank==true)
	{
		FramePass();
		phystick=0;
		timer.memory1.QuadPart=timer.tickCount.QuadPart;
	}
}

Are there any potential problems I'm not seeing?

Edited by Pygmyowl
