
# the 'perfect' game loop, fix your time step (by step)


## Recommended Posts

Hey all,

I've arrived at the chapter in "Game Engine Architecture", covering timers.
Not coincidentally, I also want to implement a correct and flexible main/game loop.

So, I've started with the things I need, things I know and things I don't know yet.

What are the requirements?
- game logic/physics will run at a constant speed dT (for example 30fps)
-- aka, the game logic should produce the same result, independent of the machine/hardware (note: this doesn't mean rendering)
- enabling/ disabling vsync should not affect gameplay/ speed (both should be supported)
-- aka game logics should be updated in the same speed when rendering is done at 100fps, 60fps or 25fps etc.
- game shouldn't break when debugging (and a specific frame time hits the fan)

What do I know?
- there are quite some solutions to figure out a 'good' dT (DeltaT), a few that I've read about:
-- static, delta T is always 1/60th of a second for example (CPU dependent)
-- use last frame time and make this dT for next frame
-- average the last x frames and make this dT for next frame (catch 'spikes')
-- guarantee that a frame takes x time, i.e. 1/60 or 1/30 second
(sleep if the frame is shorter than ideal; sleep ideal time minus the excess if the frame time exceeds the ideal frame time)
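
For the last item in the list, a minimal sketch of a frame limiter could look like the following. The names `remainingFrameTime` and `limitFrame` are made up for illustration, and the sketch assumes `std::chrono::steady_clock` is an acceptable time source; `sleep_for` may sleep longer than requested, so this only guarantees a minimum frame duration.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// How long to sleep so that the frame which began at `frameStart` lasts at
// least `target`. Returns zero when the frame already overran its budget.
Clock::duration remainingFrameTime(Clock::time_point frameStart,
                                   Clock::time_point now,
                                   Clock::duration target) {
    const Clock::duration elapsed = now - frameStart;
    return elapsed < target ? target - elapsed : Clock::duration::zero();
}

// Blocks until the frame budget is used up (hypothetical helper).
void limitFrame(Clock::time_point frameStart, Clock::duration target) {
    std::this_thread::sleep_for(
        remainingFrameTime(frameStart, Clock::now(), target));
}
```

Calling `limitFrame(frameStart, std::chrono::milliseconds(16))` at the end of each loop iteration would hold the loop to roughly 60 iterations per second when the work itself is faster than that.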

Question:
- is the following example/ statement correct when V-sync is enabled, with 60hz:
-- game logics/ non rendering take 5ms
-- rendering takes 5ms
-- 16 - 5 - 5 -> vsync makes the API/ present call wait 6ms before presenting
(basically guaranteeing the full loop takes 16ms, 1/60)
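
The arithmetic in this question can be written down as a small helper. This is only a sketch under a stated assumption: that the present call blocks until the next vertical blank and that the driver does not queue frames ahead. `vsyncWaitMs` is a hypothetical name.

```cpp
#include <cmath>

// Milliseconds a vsynced present call would wait after `workMs` of frame
// work, assuming it blocks until the next vertical blank at `refreshHz`.
double vsyncWaitMs(double workMs, double refreshHz) {
    const double periodMs = 1000.0 / refreshHz;           // 16.667 ms at 60 Hz
    const double periodsNeeded = std::ceil(workMs / periodMs);
    return periodsNeeded * periodMs - workMs;
}
```

With 5 ms of logic plus 5 ms of rendering at 60 Hz, `vsyncWaitMs(10.0, 60.0)` comes out at about 6.67 ms, since the refresh period is 16.667 ms rather than 16 ms. If the work took 20 ms, the frame would miss one vblank and wait for the next one.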

Let's say these basics are clear, then let's take a look at Gaffers 'fix my timestep' article.
I had to read it quite a few times to get the terminology right, until I read the first article on physics :)

From what I understand, it will deliver everything defined under the requirements above.
And also that it will decouple physics from rendering, which sounds like a good thing to do.

Approach:
I've tried to 'redo' the final solution from the article, in my own words/ syntax.
With the difference that I don't pass a "State" (position + velocity) to my render function, since all updating is done in the update call/function.
This means in the approach below, I currently don't do anything with the 'alpha'/ blending factor, because the state isn't passed to render.

My questions
- are my assumptions correct/ is this the way to go?
- what exactly is 'dt', here 0.01? (I took it from the article code sample)
- and to which update frequency is the 'physics' (update logics) part set to in this example?
- should/ can I somehow integrate the alpha/ blending factor in the while loop for the gamelogics updating (/ physics)?
(instead of passing the result to render(), which I don't do)

Below you'll find the (pseudo) code.
Any input is appreciated as always (including feedback/thoughts beyond the explicit questions).

    double dt = 0.01
    double accumulator = 0.0

    double currTime = GetTime()    // initialized so the first frameTime is valid
    double totalTime = 0.0
    double lastTime

    while(!quit)
    {
        // here time starts running for non-rendering stuff
        while(peekmessage)
        {
            handle windows messages
        }

        lastTime = currTime
        currTime = GetTime()
        double frameTime = currTime - lastTime
        if(frameTime > 0.25) frameTime = 0.25    // seconds?

        accumulator += frameTime

        // is this while loop the decoupled physics part?
        while(accumulator >= dt)    // dt being the set/ wanted frametime??
        {
            mGame->UpdateLogics(totalTime, dt)
            totalTime += dt
            accumulator -= dt
        }

        // here the non-rendering time has passed, rendering time starts
        mGame->Render()
    }


##### Share on other sites

'dt' means delta time, i.e. the measured difference in time between two points, but you're using frameTime for that. 'dt' is a bad variable name for a fixed time value.

So, instead of accumulator >= dt you want something like accumulator >= UPDATE_PERIOD where UPDATE_PERIOD is 1/30, or whatever you want your game update speed to be.

Not sure what you mean with the blending factor - it's not uncommon to pass something like accumulator / UPDATE_PERIOD through to the renderer, which will tell it how far through the current frame you are, and make rendering adjustments accordingly, which can be important if the rendering rate and update rate aren't strict multiples or factors of each other. There's no need to do any blending in your physics. The point of a fixed timestep is that it's an exact, fixed amount of time, every time.
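
As a rough sketch of that idea: the accumulator decides how many fixed updates to run, and whatever is left over becomes the fraction handed to the renderer. `FixedStepClock` and its members are hypothetical names, and `UPDATE_PERIOD` is set to 1/30 only as an example value.

```cpp
// Fixed simulation step in seconds (example value).
constexpr double UPDATE_PERIOD = 1.0 / 30.0;

struct FixedStepClock {
    double accumulator = 0.0;

    // Feed in the measured frame time; returns how many fixed updates
    // should be run before rendering this frame.
    int advance(double frameTime) {
        accumulator += frameTime;
        int steps = 0;
        while (accumulator >= UPDATE_PERIOD) {
            accumulator -= UPDATE_PERIOD;
            ++steps;
        }
        return steps;
    }

    // How far the leftover time reaches into the next update, in [0, 1).
    double alpha() const { return accumulator / UPDATE_PERIOD; }
};
```

After `advance(0.05)` one update is due and `alpha()` is about 0.5, meaning rendering happens roughly halfway between two updates.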

##### Share on other sites

What are the requirements?

I think it's important to remember that requirements are going to be different depending on the project. Sometimes it might make sense to use fixed time steps for each update, sometimes it doesn't. Most games I've made, instead of fixing the update time, clamp delta time to something reasonable and just scale time-based updates by dt. It shouldn't matter whether you have 10 fps or 1000 fps.
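
A minimal sketch of that variable-timestep approach, assuming simple constant-velocity movement. `Mover`, `MAX_DT` and `update` are illustrative names, and the 0.1 second clamp is an arbitrary example value.

```cpp
#include <algorithm>

// Largest delta time accepted per update, in seconds (example value).
constexpr double MAX_DT = 0.1;

struct Mover {
    double x = 0.0;      // position
    double speed = 2.0;  // units per second
};

// Variable-timestep update: clamp dt, then scale the movement by it.
void update(Mover& m, double dt) {
    dt = std::max(0.0, std::min(dt, MAX_DT));
    m.x += m.speed * dt;
}
```

Ten updates with `dt = 0.1` and a hundred updates with `dt = 0.01` both move the object by about 2 units, which is the frame-rate independence described above. A single 5 second hitch is clamped to 0.1 seconds, so the object moves 0.2 units instead of 10.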

is this while loop the decoupled physics part?

I'm not sure where you saw the "decoupled" line, but what they probably meant by that is that you shouldn't let your rendering code do any of your simulation/logic.  Your rendering should be kept separate ("decoupled") from the state of the game- it just receives or observes that state and renders it, otherwise not touching it.

##### Share on other sites
What are the requirements?

As TedEh says, it's really up to you. for caveman 3.0, i'll probably go with a user defined framerate limiter, and scaling update accordingly. less work than fix-your-timestep - no blending required. no temporal aliasing. no spiral of death.

let's take a look at Gaffers 'fix my timestep' article. I had to read it quite a few times to get the terminology right, until I read the first article on physics. From what I understand from it, is that it will deliver me everything defined under the requirements above.

- game shouldn't break when debugging (and a specific frame time hits the fan)

this is the spiral of death. can't recall if the gaffer article mentions the ET cap used to fix this.

And also that it will decouple physics from rendering, which sounds like a good thing to do.

// is this while loop the decoupled physics part?

it makes them no longer in lockstep.  so render is "decoupled from update - allowing render to run as fast as possible".  the article says something like that.

- are my assumptions correct/ is this the way to go?

fix-your-timestep is an algo designed to maximize FPS. all the stuff like DT and tweening is stuff required to make everything work while maximizing FPS. if you follow the evolution of the algo through the course of the article, you see he addresses each problem in turn, resulting in the final algo with DT, tween, etc.

- what exactly is 'dt', here 0.01? (I took it from the article code sample)

DT is the update timestep size. if you run update at 60Hz, DT = 0.016666666 seconds.   ET is the elapsed time for render or input+update+render (IE ET since start of last update). then:

    accumulator += ET
    while accumulator >= DT
    {
        do_an_update()
        accumulator -= DT
    }

in update(), you save your current state (position, rotation, maybe animation frame too) in previous_state before updating current_state.

when you render, you tween (blend) between previous_state and current_state by the factor accumulator/DT.
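
The update side of this can be sketched in one dimension as follows. `State` and `Body` are hypothetical types; the important detail is that `previous` is saved before `current` is advanced.

```cpp
struct State {
    double position = 0.0;
    double velocity = 0.0;
};

struct Body {
    State previous;
    State current;

    // One fixed update: remember the old state, then advance the current one.
    void update(double dt) {
        previous = current;
        current.position += current.velocity * dt;
    }

    // Position to draw, tweened from previous to current by
    // alpha = accumulator / DT.
    double renderPosition(double alpha) const {
        return previous.position
             + (current.position - previous.position) * alpha;
    }
};
```

With a body at position 2.0 moving at velocity 1.0, one update with `dt = 0.016` leaves `previous.position` at 2.0 and `current.position` at 2.016, and `renderPosition(0.5)` gives about 2.008.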

- and to which update frequency is the 'physics' (update logics) part set to in this example?

DT = 0.01 seconds, so update runs at 100 Hertz.

- should/ can I somehow integrate the alpha/ blending factor in the while loop for the gamelogics updating (/ physics)? (instead of passing the result to render(), which I don't do)

BTW, that line:

if(frameTime > 0.25) frameTime = 0.25 // seconds?

is the ET cap required to avoid the spiral of death.  It limits ET to 250ms -  assuming all your units are in seconds - what units does GetTime() return ?
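
To see why the cap matters, here is a toy model of the spiral of death. It assumes every update costs a fixed amount of real time plus a small fixed per-frame cost; the function name and all numbers are made up for illustration.

```cpp
#include <algorithm>

// Simulates `frames` iterations of the accumulator loop and returns how many
// updates the last frame had to run. Each update is assumed to cost
// `costPerUpdate` seconds of real time, plus 5 ms of other work per frame.
int updatesInLastFrame(double dt, double costPerUpdate, double cap, int frames) {
    double accumulator = 0.0;
    double frameTime = dt;  // pretend the first frame took exactly one step
    int steps = 0;
    for (int f = 0; f < frames; ++f) {
        accumulator += std::min(frameTime, cap);  // the ET cap
        steps = 0;
        while (accumulator >= dt) {
            accumulator -= dt;
            ++steps;
        }
        // The time spent updating becomes the next frame's elapsed time.
        frameTime = 0.005 + steps * costPerUpdate;
    }
    return steps;
}
```

With `dt = 0.01` and updates costing 0.02 seconds each, every frame needs about twice as many updates as the one before when the cap is effectively disabled. With a 0.25 second cap, a frame never runs more than about 25 updates; the simulation falls behind real time instead of locking up.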

Edited by Norman Barrows

##### Share on other sites

Thanks all.

Below a new version. For clarity, let's assume all units are in seconds (0.25 is 0.25 seconds, which might be a bit high, but let's forget that).

Depending on the scenario I might go for the other version below (option 2), taking the average of the last 3 frames and using that as DeltaT.

In this 2nd scenario I think I won't be 'covered' when it comes to supporting both V-sync enabled and disabled. To achieve this, should I add a fixedFrameTime, and sleep if the actual time is lower (or sleep the full fixedFrameTime minus the exceeded time, to guarantee the fixedFrameTime)? I think in this scenario I'll only cover the total frame time, and not a fixed frame time for the update logic/physics (which might be OK, depending on the use case).

There's something I don't understand yet (option 1). I currently do all updating within the 'UpdateLogics', applying velocity etc. to move, rotate etc. objects.

And render simply does what it says, it renders.

Say I decide to pass the accumulator/fixedFrameTime to the Render() function, does that mean the following:

- UpdateLogics only stores the requested new state, besides the current state

- Render function will blend the current state with new state and applies the actual changes in position, rotations etc.

If so, that sounds a bit strange, because I'd expect 'Render' to render the scene (not update objects' positions, rotations etc.)

    // Version 1: fixed timestep with accumulator
    double fixedFrameTime = 1.0 / 60.0
    double accumulator = 0.0

    double currTime = GetTime()
    double totalTime = 0.0
    double lastTime

    while(!quit)
    {
        // non-rendering time of frame starts
        while(peekmessage)
        {
            handle windows messages
        }
        mGame->HandleUserInput()

        lastTime = currTime
        currTime = GetTime()
        double lastFrameTime = currTime - lastTime
        if(lastFrameTime > 0.25) lastFrameTime = 0.25    // maximize at 0.25 seconds

        accumulator += lastFrameTime

        while(accumulator >= fixedFrameTime)
        {
            mGame->UpdateLogics(totalTime, fixedFrameTime)
            totalTime += fixedFrameTime
            accumulator -= fixedFrameTime
        }

        // non-rendering time done, rendering time starts
        // (use one of the two Render calls below)

        // OPTION 1
        mGame->Render()
        // OPTION 2
        mGame->Render(accumulator/fixedFrameTime)
    }

    // Version 2: average of the last 3 frame times as DeltaT
    double currTime = GetTime()
    double lastTime
    double totalTime = 0.0
    double lastFrameTimes[3]
    int frameCounter = 0

    double deltaT = 1.0 / 60.0

    while(!quit)
    {
        // non-rendering time of frame starts
        while(peekmessage)
        {
            handle windows messages
        }
        mGame->HandleUserInput()

        lastTime = currTime
        currTime = GetTime()
        lastFrameTimes[frameCounter] = currTime - lastTime
        ++frameCounter

        if(frameCounter == 3)
        {
            deltaT = average(lastFrameTimes[0], lastFrameTimes[1], lastFrameTimes[2])
            frameCounter = 0
        }

        totalTime += currTime - lastTime

        mGame->UpdateLogics(totalTime, deltaT)

        // non-rendering time done, rendering time starts
        mGame->Render()
    }

Edited by cozzie

##### Share on other sites

I think it's important to remember that requirements are going to be different depending on the project.

^^ Yep, there isn't one loop to rule them all. Fixed timestep is good when you need some level of determinism, or when you need to run your updates very fast for accuracy (e.g. a precise 1000Hz simulation) or very slowly for performance (e.g. a very complex 10Hz simulation). In a lot of other situations, variable timestep works just fine, and can be faster/simpler.
IMHO a robust general purpose engine must allow the game itself to dictate this policy :)

aka, the game logics should result in the same, independent of the machine/ hardware

aka the game logic is deterministic.
This assumption isn't true. A fixed timestep is an important first step to creating a deterministic simulation, but by itself it is not enough to ensure determinism.
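
One reason the step size matters at all: with a simple integrator, the result depends on how large each step is. The sketch below uses semi-implicit Euler on a body under constant acceleration; `simulate` is an illustrative helper and does not address the other sources of non-determinism.

```cpp
// Semi-implicit Euler integration of a body that starts at rest under
// constant acceleration. Returns the position after `steps` steps of `dt`.
double simulate(double accel, double dt, int steps) {
    double velocity = 0.0;
    double position = 0.0;
    for (int i = 0; i < steps; ++i) {
        velocity += accel * dt;
        position += velocity * dt;
    }
    return position;
}
```

Simulating one second with an acceleration of 10 gives about 5.5 when taken as 10 steps of 0.1 and about 5.05 when taken as 100 steps of 0.01. Two machines that step with different delta times therefore end up in different states, whereas a fixed step at least removes that particular difference.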

What do I know? - there are quite some solutions to figure out a 'good' dT (DeltaT), a few that I've read about:
-- static, delta T is always 1/60th of a second for example (CPU dependent)
-- use last frame time and make this dT for next frame
-- average the last x frames and make this dT for next frame (catch 'spikes')
-- guarantee that a frame takes x time, i.e. 1/60 or 1/30 second (sleep if the frame is shorter than ideal; sleep ideal time minus the excess if the frame time exceeds the ideal frame time)

First is not feasible except on fixed hardware (e.g. a NES :lol: ).
The last is called a framerate limiter, and you may want to optionally add one to your game as well as method 2/3 for measuring DT. e.g. without vsync on, your game might run at 1000FPS -- it's nice to give the user an option to limit this just for the sake of power efficiency.
In my experience the second choice is the most common, and simplest.
I often see the third choice used in certain situations, such as a FPS counter on screen, even when the second option is being used for the game loop.
If you have a lot of spikes in your DT graph, then yes, the third option could be better for your game... but IMHO you should also fix those spikes from occurring in the first place!!!

It's true that it's impossible to use the correct DT value for a frame, because you don't know what it will be until after the frame is finished. Options two and three are both taking guesses at the right DT value by looking at past history (with and without a filter).
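
A filtered guess of the kind described in option three could be sketched as a sliding-window mean. `FrameTimeAverage` is a hypothetical helper; unlike the pseudocode earlier in the thread it refreshes the estimate every frame rather than every third frame.

```cpp
#include <array>
#include <cstddef>

// Keeps the last N measured frame times and reports their mean as the
// DT estimate for the next frame.
template <std::size_t N>
class FrameTimeAverage {
public:
    // Pre-fill the window so the first few estimates are sensible.
    explicit FrameTimeAverage(double initial) { samples_.fill(initial); }

    // Record the newest frame time and return the mean of the window.
    double push(double frameTime) {
        samples_[next_] = frameTime;
        next_ = (next_ + 1) % N;
        double sum = 0.0;
        for (double s : samples_) sum += s;
        return sum / static_cast<double>(N);
    }

private:
    std::array<double, N> samples_{};
    std::size_t next_ = 0;
};
```

With a window of three, frame times of 16 ms, 16 ms and then a 64 ms spike produce an estimate of 32 ms, so the spike is softened but also smeared over the following frames.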

When vsync is enabled, you can improve upon the second method because vsync introduces a constraint which you can use to improve your predictions. If the refresh rate is 60Hz, then you know that the next frame is always going to appear at some multiple of 1/60 seconds after the previous. Any predicted DT that does not fit into n/60 where n is an integer, is obviously an invalid guess.
In my engine, I quantize the estimated DT to the refresh rate, in order to apply the vsync constraint to the estimated DT. This basically requires creating a second time accumulator (besides the normal fixed time step accumulator) and looks like:

    // add any leftover time from the accumulator to the previous frame's DT / DT estimate to avoid time drift
    double adjustedDelta = deltaTime + m_vsyncLockBuffer;
    // quantize the estimate to a whole number of vblanks
    float frameCount = (float)floor(adjustedDelta * m_vsyncHz + 0.5f);
    frameCount = frameCount >= 0 ? frameCount : 0;
    // compute a new quantized estimate
    deltaTime = frameCount / m_vsyncHz;
    // store any borrowed/leftover time in the accumulator
    m_vsyncLockBuffer = adjustedDelta - deltaTime;

For example, if your DT estimate is 16ms, we know that this is an under-estimate because at 60Hz, each frame will take a minimum of 16.667ms... So this technique will borrow 0.667ms from the future in order to bump our DT guess up to one full frame. Adding this feature did actually reduce some micro-stuttering in my animations.
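
To make that example concrete, the quoted snippet can be wrapped in a small struct and run with a 16 ms estimate. `VsyncQuantizer` is a hypothetical wrapper written for this illustration, not the engine's actual class; the arithmetic follows the snippet above.

```cpp
#include <cmath>

struct VsyncQuantizer {
    double vsyncHz;           // display refresh rate, e.g. 60.0
    double lockBuffer = 0.0;  // borrowed (negative) or leftover (positive) time

    // Snap a DT estimate to a whole number of refresh periods.
    double quantize(double deltaTime) {
        const double adjusted = deltaTime + lockBuffer;
        double frames = std::floor(adjusted * vsyncHz + 0.5);
        if (frames < 0.0) frames = 0.0;
        const double quantized = frames / vsyncHz;
        lockBuffer = adjusted - quantized;
        return quantized;
    }
};
```

Starting from `VsyncQuantizer q{60.0}`, the call `q.quantize(0.016)` returns about 0.016667 and leaves `lockBuffer` at about -0.000667, which is the 0.667 ms borrowed from the future.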

Say I decide to pass the accumulator/fixedFrameTime to the Render() function, does that mean the following: - UpdateLogics only stores the requested new state, besides the current state - Render function will blend the current state with new state and applies the actual changes in position, rotations etc.

Update outputs a whole game-state. You keep around the most recent two game-states. Render blends two game-states together to produce a temporary third game-state, which it renders from.

Edited by Hodgman

##### Share on other sites

Thanks Hodgman.

Regarding the blending; to make this explicit, for the sake of simplicity, in 2D:

-- object X has position (2,3) and velocity/speed 1, DT = 0.016

-- in game logics update a potential new state is calculated, using DT 0.016. Result for position: (2.016, 3.016)

(for example with a new state struct containing a delta for S, R and T)

-- object position is unchanged (current state, pos stays (2,3)) and new state is stored

-- in render, the blend factor is passed, let's say this is 0.5 (fictive)

-- then render will update the object, by applying the new state (delta SRT) * 0.5

Is this a correct monkeyproof explanation of what's going on?

For flexibility and practice I'll now create 3 options and test with them:

1. Straightforward: use the last frameTime as DT for the new frame, with a max DT added (done, currently implemented :))

(with potentially the addition of using the refresh rate when vsync is enabled, hodgman's suggestion)

2. The 'Gaffer fix my timestep' method (with/without the blending of current and new state in Render)

3. Governed framerate with static frameTime (with sleep till next frametime is passed, basically manual vsync :))

I'm still a bit in doubt if adding the extra step (store newState in updatelogics and update in render) is needed for the use-cases I currently have/know. For practice I can add it, it probably doesn't harm either way. Besides the fact that a render function updates positions, rotations and scales for dynamic objects (and triggers updating matrices etc.). Simplified I'd say the update function should do that, but then the idea of the blending doesn't apply :)

When I have the options implemented I'll paste the results (for future reference and perhaps feedback).

One last question; I believe that enabling/disabling v-sync shouldn't affect the updating of the logics/ applying of states in all 3 options. Is this correct? (where option 1 might have some specific code when vsync is enabled)

##### Share on other sites

Your second example there starts off with a 1/60 frame time and then stomps right over that with whatever your rendering rate is. So that's back to a completely variable frame rate based on rendering speed.

I don't think you need to test different methods, especially since the main point of these more complex loops is for them to work properly across lots of different hardware, which you won't have access to. I think you need to choose one based on your game. Is it physics-heavy? Then fix the update timestep. Otherwise, you will be fine with a variable time-step, averaged or not.

In my experience, every pro game I've worked on has had a fixed-step update rate. It just makes things easier to reason about, as well as making your physics more reliable.

##### Share on other sites

-- then render will update the object, by applying the new state (delta SRT) * 0.5

It's an interpolation of the two states, so something like:
alpha = accumulator / timeStep;
renderState = lerp( oldState, newState, alpha );

If so, that sounds a bit strange, because I'd expect 'Render' to render the scene (not update objects' positions, rotations etc.)
It doesn't update any object positions. It generates a temporary data set used to visualize those objects.

e.g. in order to draw an object in a scene, you have to copy its position into a GPU buffer. The value to copy into that buffer is calculated by interpolating two game-states. There's no need to actually update the game-logic's representation of the object to this blended location.
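
A sketch of what "a temporary data set" might look like in code: the game objects are passed as `const`, and the blended positions go into a separate buffer that stands in for the data uploaded to the GPU. `Vec3`, `Object` and `buildRenderPositions` are hypothetical names.

```cpp
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Game-side object: both states are written by the fixed update only.
struct Object {
    Vec3 previous;
    Vec3 current;
};

// Render-side: builds a temporary list of blended positions. The objects
// are const here, so the game-logic positions cannot be changed.
std::vector<Vec3> buildRenderPositions(const std::vector<Object>& objects,
                                       float alpha) {
    std::vector<Vec3> out;
    out.reserve(objects.size());
    for (const Object& o : objects) {
        out.push_back(lerp(o.previous, o.current, alpha));
    }
    return out;
}
```

Using the numbers from earlier in the thread, an object with previous position (2, 3) and current position (2.016, 3.016) drawn at `alpha = 0.5` ends up in the buffer at about (2.008, 3.008), while its game-side position remains (2.016, 3.016).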

Edited by Hodgman

##### Share on other sites

Thanks.

@Hodgman: I'm not following.. Currently I would do something like this:

-- class mMeshInstance (linked to a game object/entity or not)

-- member XMFLOAT3 mPosition

-- member function: Move(const XMFLOAT3 pPosition)

-- this member function overwrites/ updates mPosition

(there are a few variants of this function, taking 3 floats, increasing x y z etc., but that's out of scope)

-- in 'update', taking DeltaT I call 'Move' on the objects

-- when I render, I render based on the current values within the object, those are used for the matrices (and eventually uploaded to GPU buffers)

In the other approach, I think I need to add a struct for a delta SRT to, in this case, the MeshInstance class.

Then some function will calculate these values (called a 'state' in the referenced article on fix my timestep).

Afterwards in the Render function, I would need to update the position using the blending factor, current S R T and the newState struct.

Then I upload the result to the GPU buffers.

Are you suggesting to only update the S R T (scale, rotation, translation) in the 'state struct' (and create a few of them) and always keep mPosition unchanged?

That doesn't match up in my head :) I always want to have similar data on the CPU side as on the GPU side (i.e. a world matrix based on position XYZ will have value XYZ for the mPosition member variable of my mesh instance class object).