Jan Wassenberg

Timing Pitfalls and Solutions: new developments


I've previously written an article about timing, but unfortunately the situation has gotten even worse since then. RDTSC is more broken than before and QPC now uses it instead of only marginally broken alternatives, so that leaves WinXP users on dual-core systems with a total lack of a decent timer. The newly updated article (PDF) again describes the timing hardware, APIs, their problems, and workarounds. An important new development is a driver for the HPET timer that is a fairly clean solution to this entire mess. The modalities of publication/distribution are not yet clear, but I'd love to hear your comments on the draft. Hopefully this text will help people avoid all-too-frequent timing glitches :)
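To give a flavor of the problem and one user-level workaround the article discusses, here is a heavily simplified sketch (illustration only, not the article's actual code): a QueryPerformanceCounter wrapper that clamps the backwards jumps that broken RDTSC-based implementations can produce.

#include <windows.h>

// Simplified illustration (not the article's implementation, and not
// thread-safe): never report negative elapsed time, even if QPC jumps
// backwards (e.g. after a switch to another core).
static double MonotonicSeconds()
{
    static LARGE_INTEGER freq;   // zero-initialized
    static LONGLONG last = 0;
    if (freq.QuadPart == 0)
        QueryPerformanceFrequency(&freq);

    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    if (now.QuadPart < last)      // timer jumped backwards
        now.QuadPart = last;      // clamp: report no progress instead
    last = now.QuadPart;
    return (double)now.QuadPart / (double)freq.QuadPart;
}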

So if I understand this correctly, the only way to use reliable high-resolution timers on XP is to write a driver that needs to be deployed with the application?

Your proposal seems useful when developing profilers, although Monte Carlo-based profilers should sidestep this issue entirely, because they operate over time ranges far greater than the resolution of even GTC.

I don't really see the issue with using (a wrapper around) GTC for games, though. With 10 ms resolution, you can easily run a 60 FPS game with reasonable video precision (the simulation precision is independent of timer precision anyway). I feel this point deserves more justification in the draft than a single sentence.

Pretty much, unless you're able to get all users to edit their boot.ini or install a hotfix (that's not pushed out via Windows Update, grr).

Note that the driver can be loaded at runtime, so it doesn't need an installer or anything. It's distributed in the EXE directory as a .sys file, much like a .dll.
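For the curious, "loaded at runtime" means going through the Service Control Manager. A rough sketch (the service name and .sys filename here are placeholders, not the actual driver's):

#include <windows.h>

// Hedged sketch of runtime driver loading via the Service Control Manager.
// "HpetTimer" and sysPath are placeholder names, not the real driver's.
static bool LoadDriver(const wchar_t* sysPath)  // e.g. full path to the .sys
{
    SC_HANDLE scm = OpenSCManagerW(0, 0, SC_MANAGER_CREATE_SERVICE);
    if (!scm)
        return false;

    SC_HANDLE svc = CreateServiceW(scm, L"HpetTimer", L"HpetTimer",
        SERVICE_START, SERVICE_KERNEL_DRIVER, SERVICE_DEMAND_START,
        SERVICE_ERROR_NORMAL, sysPath, 0, 0, 0, 0, 0);
    if (!svc && GetLastError() == ERROR_SERVICE_EXISTS)
        svc = OpenServiceW(scm, L"HpetTimer", SERVICE_START);

    const bool ok = svc && StartServiceW(svc, 0, 0);
    if (svc)
        CloseServiceHandle(svc);
    CloseServiceHandle(scm);
    return ok;
}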

I haven't read it in depth, so I won't make many comments yet, but I noticed two things during a quick read-through:
- You say that an ideal timer should be a monotonically increasing source of UTC timestamps, but wouldn't a strictly increasing source of UTC timestamps be preferable? It would avoid cases where the elapsed time is perceived as 0 seconds, which can result in instability, division by zero, and inefficiency if the programmer isn't careful to consider that case.

- In the references you list:
[Drepper 2006] Drepper, Ulrich: POSIX Option Groups. August 20 2006.
URL http://www.steampowered.com/status/survey.html
The URL seems to be incorrect though.

@Toorvyk: not sure I understand the point about Monte Carlo profilers. They work by periodically sampling the current instruction pointer, so there is no time measurement involved at all.

Quote:
I don't really see the issue with using (a wrapper around) GTC for games, though. With 10 ms resolution, you can easily run a 60 FPS game with reasonable video precision (the simulation precision is independent of timer precision anyway). I feel this point deserves more justification in the draft than a single sentence.

Hm. I'd suppose that most games (our RTS included) do per-frame interpolation that's separate from the (possibly time-independent) simulation. If we assume a 60 FPS framerate and 10 ms resolution, calls to determine the elapsed time since the last frame will alternate between 10 ms and 20 ms. Surely that will cause non-smooth updates, or are you filtering these delta values?
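By "filtering" I mean something along these lines (hypothetical sketch, not code from anyone's engine):

// Smooth the 10/20 ms jitter from a coarse timer with a small moving average.
class DeltaFilter
{
    static const int N = 8;
    double history[N];
    int next, filled;
public:
    DeltaFilter() : next(0), filled(0) {}
    double Smooth(double rawDelta)   // raw GTC delta, e.g. 0.010 or 0.020
    {
        history[next] = rawDelta;
        next = (next + 1) % N;
        if (filled < N)
            ++filled;
        double sum = 0.0;
        for (int i = 0; i < filled; ++i)
            sum += history[i];
        return sum / filled;         // converges to the true ~0.0167 s
    }
};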

@CTar:
> but wouldn't a strictly increasing source of UTC timestamps be preferable?
Indeed :D Since we're dreaming about the perfect timer, might as well have it be strictly monotonic.
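For what it's worth, user code can enforce strictness on top of a merely monotonic source; an illustrative sketch (not from the draft):

// Force strict monotonicity, so dt can never be exactly zero even if the
// underlying timer repeats a value. (Not thread-safe; illustration only.)
double StrictlyIncreasing(double t)   // t: reading from a monotonic timer
{
    static double last = 0.0;
    if (t <= last)
        t = last + 1e-9;   // nudge forward by one nanosecond
    last = t;
    return t;
}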

> The URL seems to be incorrect though.
Oops, well-spotted; fixed now. Thanks for pointing that out!

Quote:
Original post by Jan Wassenberg
Pretty much, unless you're able to get all users to edit their boot.ini or install a hotfix (that's not pushed out via Windows Update, grr).

Note that the driver can be loaded at runtime, so it doesn't need an installer or anything. It's distributed in the EXE directory as a .sys file, much like a .dll.


I don't suppose such implementations are available from somewhere? The thought of providing user-level software that requires poking around in kernel mode doesn't really appeal to me.

Also, somewhat unrelated, how well does boost::posix_time handle these anomalies?

I've a question regarding "clock_gettime".

Can I compare CLOCK_REALTIME or CLOCK_MONOTONIC between two different machines?
What if the CPU slows down or speeds up?

You seem to have some understanding of this, so what would be an appropriate timer for profiling on clusters (highest possible resolution + correct/comparable values)?

@Antheus:
Quote:
I don't suppose such implementations are available from somewhere? The thought of providing user-level software that requires poking around in kernel mode doesn't really appeal to me.

Yep, I'd release GPL code for the whole lot; it remains to be seen how it can be packaged.
Writing a WDM driver indeed requires caution. This one is kept very simple, though; it's only 350 lines (all of the real logic runs in user mode).

> Also, somewhat unrelated, how well does boost::posix_time handle these anomalies?
boost::posix_time appears to be focused on dates and formatting - I see no high-resolution time logic. Boost does have an "xtime" wrapper on top of GSTAFT (GetSystemTimeAsFileTime) / clock_gettime, but that's apparently intended for time-of-day, not high-resolution timestamps.

@hydroo:
Quote:
Can I compare CLOCK_REALTIME or CLOCK_MONOTONIC between two different machines?
What if the CPU slows down or speeds up?

You seem to have some understanding of this, so what would be an appropriate timer for profiling on clusters (highest possible resolution + correct/comparable values)?

clock_gettime should shield you against changes in CPU freq (that's the operating system's job). However, I wouldn't use CLOCK_REALTIME for profiling - it can change due to time-of-day adjustments (e.g. NTP).
Not sure how CLOCK_MONOTONIC clocks behave in a cluster; never tried that. I wouldn't know of a better way than using it as the time source, having nodes agree on the start time, and hoping timer drift doesn't become a problem (the HPET guarantees < 500 ppm, which isn't great, but hey).
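A minimal example of that recommendation (standard POSIX; older glibc may require linking with -lrt):

#include <time.h>

// CLOCK_MONOTONIC for measuring intervals (immune to NTP and time-of-day
// adjustments); CLOCK_REALTIME only when a wall-clock timestamp is needed.
double MonotonicSeconds()
{
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + ts.tv_nsec * 1e-9;
}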

The resolution of 2000/XP's standard tick can be increased (err, decreased?) from 100 Hz (10 ms) to 1000 Hz (1 ms).
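For concreteness, one way to request this is the multimedia timer API (a minimal sketch):

#include <windows.h>
#include <mmsystem.h>   // timeBeginPeriod/timeEndPeriod; link with winmm.lib

void RunWithFastTick()
{
    timeBeginPeriod(1);   // request 1 ms tick granularity (system-wide!)
    // ... time-sensitive code ...
    timeEndPeriod(1);     // always pair with timeEndPeriod to restore
}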

Besides requiring a different code path for 98/ME, are there any issues that I am unaware of?

Anyways... this doesn't truly seem to be a game-related problem, since games don't seem to have issues with it (aside from hotfixing dual-core AMDs with their otherwise backwards-moving clock)...

... this does seem to be an issue for serious profiling, but the Monte Carlo method has already been mentioned, and it's superior in so many ways that there really isn't a point in discussing it (both the AMD and Intel profilers use it extensively).

*sigh*
Quote:
The resolution of 2000/XP's standard tick can be increased (err, decreased?) from 100 Hz (10 ms) to 1000 Hz (1 ms).

Yes. This is mentioned on page 4, along with its steep price.

Quote:
Besides requiring a different code path for 98/ME, are there any issues that I am unaware of?
Not sure why a different code path would be needed, but page 5 points out one serious issue with clock-interrupt-based timekeeping.

Quote:
Anyways... this doesn't truly seem to be a game-related problem, since games don't seem to have issues with it (aside from hotfixing dual-core AMDs with their otherwise backwards-moving clock)...

Requiring all your users to install the hotfix would be a partial solution... for that specific problem. Unfortunately, there are others as well.
And broad statements involving "games" are very dangerous, because all it takes is one counterexample (the above-mentioned story on page 5).

Quote:
... this does seem to be an issue for serious profiling, but the Monte Carlo method has already been mentioned, and it's superior in so many ways that there really isn't a point in discussing it (both the AMD and Intel profilers use it extensively).

Do not be so quick to dismiss real-time hierarchical profiling. I've implemented both methods, and Monte Carlo is no help at spotting changes in elapsed time. Example: which part of the renderer is slower when moving over unit-heavy portions of the map? (This is very valuable insight.)
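For illustration, the approach boils down to something like this sketch (simplified, not our actual profiler):

#include <chrono>
#include <cstdio>

// Scoped timers attribute elapsed wall time to named sections, so a
// per-frame slowdown in one subsystem is visible immediately.
struct ScopedTimer
{
    const char* name;
    std::chrono::steady_clock::time_point start;
    explicit ScopedTimer(const char* n)
        : name(n), start(std::chrono::steady_clock::now()) {}
    ~ScopedTimer()
    {
        const double ms = std::chrono::duration<double, std::milli>(
            std::chrono::steady_clock::now() - start).count();
        std::printf("%s: %.3f ms\n", name, ms);
    }
};

void RenderFrame()
{
    ScopedTimer frame("RenderFrame");
    {
        ScopedTimer terrain("Terrain");   // nesting yields the hierarchy
        // ... draw terrain ...
    }
    // ... draw units, UI, etc. ...
}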

That and the sweeping generalizations aside: you didn't bother to read the text before commenting, did you?

Thanks for the release. ++ to you kind sir. I've not looked it over but this seems quite interesting so I saved it for a rainy day... We have a few of those in Portland here ;-)

Glad to; I hope it proves useful.
The distribution contains some 22 KLOC in total, most of which are modules for debugging, memory allocation/tracking, and CPU/OS-specific facilities. I carry them over from project to project and would hate to write in C++ without them :)

