Timing Pitfalls and Solutions: new developments
I've previously written an article about timing, but unfortunately the situation has gotten even worse since then: RDTSC is more broken than before, and QPC now uses it instead of the merely marginally broken alternatives, which leaves WinXP users on dual-core systems without a single decent timer.
The newly updated article (PDF) again describes the timing hardware, APIs, their problems, and workarounds. An important new development is a driver for the HPET timer that is a fairly clean solution to this entire mess.
The modalities of publication/distribution are not yet clear, but I'd love to hear your comments on the draft.
Hopefully this text will help people avoid all-too-frequent timing glitches :)
So if I understand this correctly, the only way to use reliable high-resolution timers on XP is to write a driver that needs to be deployed with the application?
Your proposal seems useful when developing profilers, although profilers based on Monte Carlo sampling should sidestep this issue entirely, because they work at time ranges far greater than the resolution of even GTC.
I don't really see the issue of using (a wrapper around) GTC for games, though. With a 10 ms resolution, you can easily run a 60 FPS game with reasonable video precision (the simulation precision is independent of timer precision anyway). I feel this point deserves more justification in the draft than a single sentence.
Pretty much, unless you're able to get all users to edit their boot.ini or install a hotfix (that's not pushed out via Windows Update, grr).
Note that the driver can be loaded at runtime, so it doesn't need an installer or anything. It's distributed in the EXE directory as a .sys file, much like a .dll .
I haven't read it in depth so I won't make many comments yet, but I noticed two things when just doing a quick read through:
- You say that an ideal timer should be a monotonically increasing source of UTC timestamps, but wouldn't a strictly increasing source of UTC timestamps be preferable? It would avoid cases where the change in time is perceived as 0 seconds, which could result in instability, division by zero, and inefficiency if the programmer isn't careful to handle that case.
- In the references you list:
[Drepper 2006] Drepper, Ulrich: POSIX Option Groups. August 20 2006.
– URL http://www.steampowered.com/status/survey.html
The URL seems to be incorrect though.
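On the strictly-increasing point: a wrapper can guarantee non-zero deltas on top of a merely monotonic source. A minimal sketch (the `raw_time_ns` reading and the nanosecond unit are assumptions for illustration, not from the draft; the underlying source could be QPC ticks converted to nanoseconds):

```c
#include <stdint.h>

/* Wrap a monotonic (but not strictly increasing) timer so that
 * consecutive reads never return the same value. The caller passes
 * in the raw reading; if it has not advanced (or went backwards),
 * we nudge the result forward by one tick so dt is never zero. */
static uint64_t last_ns = 0;

uint64_t strictly_increasing_ns(uint64_t raw_ns)
{
    if (raw_ns <= last_ns)
        raw_ns = last_ns + 1;
    last_ns = raw_ns;
    return raw_ns;
}
```

The nudge is tiny compared to real frame times, so it won't distort measurements, but it removes the divide-by-zero hazard entirely.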
@Toorvyk: not sure I understand the point about Monte Carlo profilers. They work by periodically sampling the current instruction pointer, so there is no time measurement involved at all.
hm. I'd suppose that most games (our RTS included) do per-frame interpolation that's separate from the (possibly time-independent) simulation. If we presuppose a 60 FPS frame rate and 10 ms resolution, calls to determine the elapsed time since the last frame will alternate between 10 ms and 20 ms. Surely that will cause non-smooth updates, or are you filtering these delta values?
@CTar:
> but wouldn't a strictly increasing source of UTC timestamps be preferable?
Indeed :D Since we're dreaming about the perfect timer, might as well have it be strictly monotonic.
> The URL seems to be incorrect though.
Oops, well-spotted; fixed now. Thanks for pointing that out!
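For what it's worth, the 10/20 ms alternation mentioned above can indeed be filtered. One possible sketch (the helper name and the history length of 4 are my own choices, not anything from the draft) averages the last few raw deltas; with a true frame time of ~16.7 ms quantized to 10 ms steps, the running average converges near the truth:

```c
/* Smooth the quantized deltas of a coarse timer by averaging the
 * last HISTORY raw readings. With raw deltas alternating 10/20 ms,
 * the average settles at 15 ms instead of jumping back and forth. */
#define HISTORY 4
static double history[HISTORY];
static int history_pos = 0, history_len = 0;

double smooth_delta(double raw_delta_ms)
{
    history[history_pos] = raw_delta_ms;
    history_pos = (history_pos + 1) % HISTORY;
    if (history_len < HISTORY)
        history_len++;
    double sum = 0.0;
    for (int i = 0; i < history_len; i++)
        sum += history[i];
    return sum / history_len;
}
```

The tradeoff is a few frames of lag in responding to genuine frame-rate changes, which is usually acceptable for interpolation.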
Quote:Original post by Jan Wassenberg
Pretty much, unless you're able to get all users to edit their boot.ini or install a hotfix (that's not pushed out via Windows Update, grr).
Note that the driver can be loaded at runtime, so it doesn't need an installer or anything. It's distributed in the EXE directory as a .sys file, much like a .dll .
I don't suppose such implementations are available from somewhere? The thought of providing user-level software that requires poking around in kernel mode doesn't really appeal to me.
Also, somewhat unrelated, how well does boost::posix_time handle these anomalies?
I have a question regarding clock_gettime.
Can I compare CLOCK_REALTIME or CLOCK_MONOTONIC between two different machines?
What if the CPU slows down or speeds up?
You seem to have some understanding: so what would be an appropriate timer for profiling on clusters (highest possible resolution + very correct/comparable values)?
@Antheus:
Yep, I'd release GPL code for the whole lot; remains to see how it can be packaged.
Writing a WDM driver indeed requires caution. This one is kept very simple, though; it's only 350 lines (all of the real logic runs in user mode).
> Also, somewhat unrelated, how well does boost::posix_time handle these anomalies?
boost::posix_time appears to be focused on dates and formatting - I see no high-resolution time logic. Boost does have an "xtime" wrapper on top of GSTAFT / clock_gettime, but that's apparently intended for time-of-day, not high-resolution timestamps.
@hydroo:
clock_gettime should shield you against changes in CPU freq (that's the operating system's job). However, I wouldn't use CLOCK_REALTIME for profiling - it can change due to time-of-day adjustments (e.g. NTP).
Not sure about how CLOCK_MONOTONIC clocks behave in a cluster, never tried that. I wouldn't know of a better way than using it as the timesource, having nodes agree on the start time and hoping timer drift doesn't become a problem (the HPET guarantees < 500 ppm, which isn't great, but hey).
The resolution of 2000/XP's standard tick can be increased (err, decreased?) from 100 Hz (10 ms) to 1000 Hz (1 ms).
Besides requiring a different code path for 98/ME, are there any issues that I am unaware of?
Anyways, this doesn't truly seem to be a game-related problem, since games don't seem to have issues with it (aside from hotfixing dual AMDs with their otherwise backwards-moving clock).
It does seem to be an issue for serious profiling, but the Monte Carlo method has already been mentioned, and it's superior in so many ways that there really isn't a point discussing it (both the AMD and Intel profilers use it extensively).