responsiveness of main game loop designs

Started by
38 comments, last by Norman Barrows 7 years, 11 months ago

>> I'm pretty sure he was joking....

see, things are so whacked in game development i couldn't even tell! i've seen some really crazy and stupid sh*t come down the pike over the years. i used to think it was only regular software development that was full of idiots. time has proven otherwise.

So I'm an idiot, eh? Oh well, I don't really suppose I could prove otherwise.

i later realized what hodgman meant. they poll at 5 hz and update for 200ms, so the player still gets a fair-share timeslice. physics runs at faster rates, similar to stepped movement for projectiles, for greater accuracy, but the overall timeslice length is the same (i assume).

He's saying HUMANS act at 5 Hz. He's not suggesting that means a game should poll input devices at 5 Hz.


if you poll at 5 Hz, your max polling lag is 200ms. if you poll right after you present, your other lag is zero. if it takes 33ms to poll, update, and render the results of update, your max response lag is 33ms. for a total of 233ms max lag. for minimum lag, you lose the poll lag (IE assume they pressed the button on the very last poll), which would give you a minimum lag of 33ms (minus polling time) from button press after present through update, render, and present again.
i think i'd take a lag range from 33 to 66 ms vs one of 33 to 233 ms any day.

No one is suggesting that you poll at 5Hz...

total max lag = max time between polls (polling lag) + time between present and input doing other things (other lag) + time from end of present till end of next present that shows results of input polled after the first present (response lag).
i poll at 15hz, my max polling lag is 66ms. i poll right after i present. my other lag is zero. it takes 66ms to poll, update, render, and present, so my max response lag is 66ms, for a total of 132ms max lag. and that's on a single core at 1.3ghz and onboard graphics. the typical game-ready PC is more like 4 cores at 3+ ghz with a gtx 700 or 900 series card. i could easily go to 30 fps on such a machine, putting max lag at about 66ms, and min lag at about 33ms minus time to poll (which is negligible compared to render and update).

Whenever you press a key, there's between 0-67ms until the next poll. If the framerate is also 15Hz, then there's 67ms until the frame that consumes that polled data is finished. If the GPU framerate is also 15Hz, then there's 67ms until the GPU renders the commands from the previous frame (The present function does not present the current frame, it usually presents the previous one). If you're using a 60Hz LCD, there's probably about 17ms of buffering of the video signal before it starts emitting photons.
So your game has 150 to 217ms of input->photon latency.

Or if your GPU rate is actually 60Hz, then there's 0-67ms of input queue time, 67ms of CPU time, 17ms of GPU time, and 17ms of LCD buffer, for 100 to 167ms of input->photon latency.

By simply ensuring that your game runs at 60Hz, that goes down to 50 to 67ms.
If you increase the simulation rate even higher than 60Hz, you can reduce input->photon latency below 50ms.
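This latency arithmetic can be expressed as a small model. A sketch only: the function name and the one-poll-period queue assumption are mine, not from the thread.

```python
def input_to_photon_ms(poll_hz, cpu_hz, gpu_hz, lcd_hz=60):
    """Rough (min, max) input->photon latency in milliseconds.

    Model of the pipeline described above: an event waits 0..1 poll
    period in the OS queue, then one full CPU frame to be consumed,
    one full GPU frame (drivers buffer a frame for throughput), and
    one LCD refresh of display buffering.
    """
    fixed = 1000.0 / cpu_hz + 1000.0 / gpu_hz + 1000.0 / lcd_hz
    return fixed, fixed + 1000.0 / poll_hz

# 15 Hz poll/CPU/GPU on a 60 Hz LCD: roughly 150 to 217 ms
lo, hi = input_to_photon_ms(poll_hz=15, cpu_hz=15, gpu_hz=15)
```

Plugging in 60Hz across the board reproduces the 50 to 67ms figure above.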

Call people idiots all you like; your game is still far less responsive than a typical 60Hz game, so your insults are baseless.

As for the original topic of the thread: If you decouple the sim/draw rate, it won't necessarily change this latency (it can remain the same, increase, or even decrease, depending on your choices).

>> i later realized what hodgman meant. they poll at 5 hz and update for 200ms, so the player still gets a fair-share timeslice. physics runs at faster rates, similar to stepped movement for projectiles, for greater accuracy, but the overall timeslice length is the same (i assume).

I only ever mentioned 5Hz once, as an offhand comparison metric - the speed of cognition. Human perception is about 200Hz, and cognition is about 5Hz, so that's the ballpark that we're playing in. And I mentioned that pro-gamers can manage a continuous input rate of about 7Hz. This was all in response to your strange arguments about high framerates making games unfair.

I currently run physics at 600Hz, display at a variable rate (usually 60Hz - driven by the LCD refresh clock), and poll at the display rate -- which gives 0-17ms input queue, 17ms CPU, 17ms GPU, and 17ms LCD buffer for 50 to 67ms input->photon latency as above.
However, I also choose to buffer one frame of simulation data so that I can interpolate movement in case the variable framerate causes a draw to fall out of phase with the sim, which adds 1.7ms of extra latency (one sim frame), so it's more like 52 to 68ms. If I was simulating at 60Hz, this interpolation buffer would be adding 17ms of lag, but due to the high simulation rate, it's not much of an issue.
In VR mode, I display at 90Hz as that's what the current HMDs can handle, which brings my keyboard/gamepad input->photon latency down to 0-11ms input queue + 11ms CPU + 1.7ms buffer + 11ms GPU + 11ms LCD buffer = 35-46ms.
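A decoupled fixed-timestep loop with a one-frame interpolation buffer like the one described above can be sketched as follows (names are hypothetical; `step` stands in for whatever advances the simulation):

```python
SIM_DT = 1.0 / 600.0  # 600 Hz fixed simulation step

def advance(accumulator, frame_dt, prev_state, curr_state, step):
    """Run as many fixed sim steps as fit into frame_dt, keeping the
    previous state so the renderer can interpolate between the two
    newest sim frames when draw and sim fall out of phase."""
    accumulator += frame_dt
    while accumulator >= SIM_DT:
        prev_state, curr_state = curr_state, step(curr_state)
        accumulator -= SIM_DT
    alpha = accumulator / SIM_DT  # 0..1 blend factor for rendering
    return accumulator, prev_state, curr_state, alpha
```

Drawing at `prev + (curr - prev) * alpha` trades one sim frame of latency (1.7ms at 600Hz) for smooth motion, which is why the buffer cost above is so small.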

Although, I'm considering polling at the simulation rate to reduce the input->photon latency below 50ms.
My CPU costs are about 0.5ms per 600Hz sim frame and 3ms per 60Hz draw frame. So a 60Hz CPU frame would actually be 10x sim updates taking 5ms, and 1x draw taking 3ms, for 8ms used time and 9ms waiting in Present.
So, the input->photon path would be 0-1.7ms of input queue, 0.5ms sim CPU cost (only the last sim update in the 60Hz frame matters), 1.7ms of my sim buffering, 3ms of draw CPU, 17ms of GPU and 17ms of LCD buffering for ~38 to 40ms of input->photon latency on a regular monitor (or 27 to 29ms in a VR HMD).
So -- for me, rendering less often than I update could actually cut 40% off my input->photon latency (i.e. doing so would increase responsiveness, which was the point of your thread).
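As a sanity check on the arithmetic above, the same component values in milliseconds (variable names are mine):

```python
# Latency components of the decoupled estimate above, assuming a
# 600 Hz sim polled directly and a 60 Hz GPU + LCD:
queue_max = 1000 / 600   # event waits at most one sim step in the queue
sim_cpu   = 0.5          # only the last sim update in the frame matters
interp    = 1000 / 600   # one sim frame of interpolation buffering
draw_cpu  = 3.0
gpu       = 1000 / 60
lcd       = 1000 / 60

lo = sim_cpu + interp + draw_cpu + gpu + lcd   # ~38.5 ms
hi = lo + queue_max                            # ~40.2 ms
```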

Going off on a tangent; the HMD SDKs actually re-poll the headset about 3ms before the video out event and warp the GPU results using the delta between my polled values and their late-polled values, which brings head-look->photon latency down to 3ms GPU + 11ms LCD buffer = 14ms, which is incredibly low. It's pretty much impossible to get figures that low on any PC hardware anywhere, besides VR applications.

>> So I'm an idiot, eh? Oh well, I don't really suppose I could prove otherwise.

that observation was not meant to include those who frequent this site. this is one of the few places where one can get straight answers from folks who have been there and done that.

>> He's saying HUMANS act at 5 Hz. He's not suggesting that means a game should poll input devices at 5 Hz.

yes, his follow-up response cleared up that point. at first i took it to mean he was polling at 5hz, presenting at ~60Hz, and updating at something > 60Hz.

at the very first i took that to mean the player was getting updated both less often and for a smaller total time slice, which obviously couldn't be the case, or the player would move slower than the rest of the world. it must have been a long day.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

>> input->photon latency

yes, this is the thing to be measured - and reduced, if necessary. which leads to the question: how much is too much? the testing that gave me a 15Hz minimum for a standard loop did nothing to determine whether render rate, poll rate, or input->photon latency was the core cause of that minimum. so it might be the 15Hz refresh rate, the 15Hz polling rate, or the input->photon latency of a 15Hz std loop - whatever that crunches out to be.

>> By simply ensuring that your game runs at 60Hz, that goes down to 50 to 67ms.

you mean with a std loop at 60Hz - correct? those are the correct numbers for a std loop at 60Hz with a 60Hz screen refresh. or are you referring to a decoupled multi-thread design? or both?

>> If you increase the simulation rate even higher than 60Hz, you can reduce input->photon latency below 50ms.

without changing poll rate, poll lag stays the same, update is faster, render is unchanged, so yes it would be more responsive.

>> Call people idiots all you like

that comment was not directed at anyone on this site.

>> Although, I'm considering polling at the simulation rate to reduce the input->photon latency below 50ms.

why wouldn't you? polling tends to be so much faster than update, and polling less often just introduces lag - right? what caused you to poll less often in the first place? surely at some point in the past you polled as often as you updated. when did you stop, and more importantly, why? i'm most curious what would cause such a decision, as it seems an unusual one. i mean, if polling less often introduces lag, and you poll less often, then it would seem you made a conscious decision to introduce lag, and i'm wondering what on earth could cause you to do that.

>> So -- for me, rendering less often than I update could actually cut 40% off my input->photon latency

that makes sense. your drawing overhead compared to input and update is so high that drawing less often reduces the input->photon time, by reducing the poll+update lag. you poll and update more often and render less often, so the time from poll to photons goes down on average.

so the real questions are: what are the upper and lower limits for these (if there are any)? IE polling Hz, update Hz, render Hz, and the resulting poll-to-photon lag time. they all have minimum rates below which things become unplayable. one or more of them has a 15Hz lower limit based on my test results. higher rates in any of them get you lower lag times. higher rates for render get you smoother animation. i suspect polling rates need be no faster than update rates - can't think of anything faster polling might get you. higher update rates might have some small effect on physics accuracy.

the less often you do something (poll, update, render), the more time you have to do things in general. that's why i've used 15Hz in the past. it gives me 66ms to poll, update, and render - and it all goes to render, of course.

so the real question is: what rates are fast enough? there's no sense doing things more often than necessary. at some point poll-to-photon lag will be "fast enough". it seems render will be limited by the monitor refresh rate. you may render to vidram twice, but if the hardware (i was going to say DAC, but they don't use a DAC anymore, do they?) only sends it off to the monitor once, what's the point? so it would seem that render need run no faster than refresh, putting you at like 60Hz or 90Hz for render. if render is say 60Hz, what's the Hz for poll and update that's "fast enough" when it comes to poll-to-photon lag? once that's determined, you need go no faster, and you can concentrate on doing more in the same amount of time, instead of the same amount in less time.


To do this optimally you need a buffer of input events with hardware time-stamps.

I was hoping that DirectInput would evolve towards that but it was abandoned and we're back to sucking messages from the pump.

That at least provides an ordered list of events but without the time-stamps you cannot even implement something as simple as pulling back the plunger for a pinball game accurately.

You get quantized to your polling rate and experience jitter corresponding to your input-stack and thread stability.

You could compensate for this by introducing the uncertainty of your input into your hit detection.

When you receive an event you know it happened between 'just now' and one polling period ago, which gives you a delta-time.
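A minimal sketch of such a timestamped event buffer (names are mine; `time.monotonic()` at enqueue time stands in for the hardware timestamps the post wishes for):

```python
import time
from collections import deque

class InputBuffer:
    """Ordered buffer of input events, each stamped on arrival, so
    the sim can replay them in order and bound when a press happened."""
    def __init__(self):
        self._events = deque()

    def push(self, event, timestamp=None):
        # A hardware/driver timestamp would be ideal; stamping at
        # enqueue time is the best a message-pump design can do.
        t = timestamp if timestamp is not None else time.monotonic()
        self._events.append((t, event))

    def drain_until(self, t):
        """Pop all events stamped at or before t, oldest first."""
        out = []
        while self._events and self._events[0][0] <= t:
            out.append(self._events.popleft())
        return out
```

Each drained event carries its stamp, so the sim can spread a burst of events across the poll period instead of quantizing them all to the poll instant.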

Humans do not act "at 5Hz".

I can push a button for less than 10 ms and routinely demonstrate how HMI's cannot handle button presses that quick (and we show on an oscilloscope that the button was indeed pressed for 7~12 ms).

60 Hz vs. 120 Hz makes a notable difference for FPS games and it's probably due to the triple-buffering.

3 x 1/60 -> 50 ms. That's an eternity.

- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara

actually the time to be measured is from the end of present, followed by a poll, an update, a render, and a present.

that is only the same as poll to photon time if you do nothing between present and polling

so it's really the time from one present to the next that includes the results of the polling after the first present. and that is the same as input -> photon only if you poll immediately after present


>> You get quantized to your polling rate and experience jitter corresponding to your input-stack and thread stability.

sounds like you're forgetting about the Nyquist rate...

https://en.wikipedia.org/wiki/Nyquist_rate


well, from the responses it seems that splitting things in parallel will increase throughput. hodgman's timelines in post #9 show this well.

but the interesting thing is that while its split over multiple processors, the order does not change. input and update on one thread simply overlap the render of the previous frame on the other thread.

so it would seem that everyone is still using the same order, perhaps split over multiple threads.

of course not being able to come up with a negative case doesn't prove anything, but it would seem to indicate that the order must be maintained. which is what my original question was about, order and latency, not reducing latency by multi-threading - although the info RE: latency reduction via multi-threading is most welcome.


>> actually the time to be measured is from the end of present, followed by a poll, an update, a render, and a present.

>> that is only the same as poll to photon time if you do nothing between present and polling

>> so it's really the time from one present to the next that includes the results of the polling after the first present. and that is the same as input -> photon only if you poll immediately after present

Nah that's too conservative to cover input->photon time

Given two frames (which poll, update, draw, present), and four keyboard inputs, A, B, C, and D:


| Frame 1          | Frame 2          |
| A       B        |CD                |
| Pol,Up,Drw,Prsnt | Pol,Up,Drw,Prsnt |

The user pressed C immediately before a Poll, so it spends close enough to zero time in the hardware and OS processing queues before being picked up by the poll. So C has no extra delay.
B spends about half a frame waiting in an OS queue.
A spends about a whole frame waiting in an OS queue.
D occurred just momentarily after C, but also just missed the Poll, so it will have to wait around for a whole frame (like A did).

So input->photon latency must also include a variable factor of between zero and one frames, or from 0 to 1000/PollingHz milliseconds.

If you just count present->present, you're also not including any GPU processing time whatsoever. The GPU and CPU do not run in close synchronization - and usually have at least one frame of latency between them. Graphics drivers deliberately introduce one frame of latency to ensure that no pipeline stalls can occur and throughput is maintained.
LCDs also buffer inputs for at least one frame.
So assuming a decent graphics driver (and rendering code), and a decent LCD, your timeline looks like:


| CPU Frame 0       | CPU Frame 1       | CPU Frame 2       |
| Pol,Up,Drw,Prsnt0 | Pol,Up,Drw,Prsnt1 | Pol,Up,Drw,Prsnt2 |
+-------------------+-------------------+-------------------+
|                   | GPU Frame 0       | GPU Frame 1       |
|                   | Render,    Prsnt0 | Render,    Prsnt1 |
+-------------------+-------------------+-------------------+
|                   |                   | LCD Frame 0       |
|                   |                   | Buffer,    Prsnt0 |

^^ Just to be clear, this is what the timeline of your game basically looks like right now ^^ three different processors, handling the frame in a serial pipeline

So just measuring the CPU's present->present timeframe will give you a value that's potentially 3x smaller than the real value.
When you add the effect of input polling causing events to linger in a buffer, your actual input->photon latency is between 3x and 4x the numbers you're calculating.

The exception to this "at least three frames" rule is when the CPU/GPU/LCD update rates are all very different.

e.g. if your CPU framerate is 15Hz, GPU framerate is 30Hz, and LCD framerate is 60Hz, then you get:

Max time an event can linger in a queue before being picked up by a Poll: 15Hz / up to 66.7ms

CPU present->present time: 15Hz / 66.7ms

GPU present->present time: 30Hz / 33.3ms

LCD buffering time: 60Hz / 16.7ms

Total: from 116.7 to 183.3ms, or from 1.75x to 2.75x a CPU frame (instead of the 3x to 4x for the general rule of thumb).
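The mixed-rate total above, spelled out as arithmetic (variable names are mine):

```python
# Mixed-rate pipeline from the example above, in milliseconds:
queue_max = 1000 / 15   # worst-case wait before a 15 Hz poll
cpu       = 1000 / 15   # one CPU present->present frame
gpu       = 1000 / 30   # one GPU present->present frame
lcd       = 1000 / 60   # one refresh of LCD buffering

lo = cpu + gpu + lcd    # ~116.7 ms, i.e. 1.75 CPU frames
hi = lo + queue_max     # ~183.3 ms, i.e. 2.75 CPU frames
```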

You shouldn't just calculate these values and trust the theory though; get a 240Hz camera and film your keyboard+screen while you strike a key, and count the 240Hz frames that tick by between your finger first touching the keyboard and the LCD showing a response.
On a regular 60Hz game, it should be at least 11 frames in the 240Hz footage (somewhere around 50ms).
On a 15Hz game, it should be at least 35 frames in the 240Hz footage (somewhere around 150ms).

If you fix the caveman download links, I can do some empirical tests with a 240Hz camera for you.

>> Humans do not act "at 5Hz".
>> I can push a button for less than 10 ms and routinely demonstrate how HMI's cannot handle button presses that quick (and we show on an oscilloscope that the button was indeed pressed for 7~12 ms).

Again, 5Hz only got brought up as the rate of high level cognition / conscious experience, and is also approximately the human conscious reaction rate. If you had no idea how far away the button was, forcing you to actually think about whether you've touched it yet and should now release (instead of using muscle memory), you'd end up with much longer press/hold times due to that large thinking/reaction delay.
This is off topic, but striking a button can be controlled by a conscious decision-making process at 1Hz for all it matters, and still achieve a 10ms contact time, as long as you don't have to think too hard about the process itself once it's begun :P

>> To do this optimally you need a buffer of input events with hardware time-stamps.
>> I was hoping that DirectInput would evolve towards that but it was abandoned and we're back to sucking messages from the pump.
>> That at least provides an ordered list of events but without the time-stamps you cannot even implement something as simple as pulling back the plunger for a pinball game accurately.
>> You get quantized to your polling rate and experience jitter corresponding to your input-stack and thread stability.

>> You could compensate for this by introducing the uncertainty of your input into your hit detection.
>> When you receive an event you know it happened between 'just now' and one polling period ago, which gives you a delta-time.

>> Humans do not act "at 5Hz".
>> I can push a button for less than 10 ms and routinely demonstrate how HMI's cannot handle button presses that quick (and we show on an oscilloscope that the button was indeed pressed for 7~12 ms).

>> 60 Hz vs. 120 Hz makes a notable difference for FPS games and it's probably due to the triple-buffering.
>> 3 x 1/60 -> 50 ms. That's an eternity.


This is basically what I did when I implemented my own input handling.

All inputs go to a buffer where each entry is timestamped. The game then reads a set number of inputs per tick, so that someone with a more sensitive device doesn't get non-deterministic play.

In my puzzle game this was important for debugging and testing; otherwise someone with a hyper-sensitive gaming mouse could have a different game experience than someone with a bog-standard device, and that would be impossible to debug.
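The capped per-tick consumption described here can be sketched as follows (the cap value and names are hypothetical):

```python
from collections import deque

MAX_EVENTS_PER_TICK = 4  # hypothetical cap; tuned per game

def consume_inputs(queue, handle):
    """Apply at most MAX_EVENTS_PER_TICK buffered events this tick,
    so a high-report-rate device can't feed the sim more events per
    tick than a standard one -- keeping behaviour deterministic."""
    for _ in range(MAX_EVENTS_PER_TICK):
        if not queue:
            break
        handle(queue.popleft())
```

Leftover events simply stay queued for the next tick, so nothing is dropped; the device's extra sensitivity is smoothed out over time rather than changing gameplay.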

This is just what I found, however.

I wrote about it in a thread here where possible solutions were discussed in detail, hope this helps! - http://www.gamedev.net/topic/664831-handling-input-via-windows-messages-feedback-requested/

