Ademan555

Best, most accurate clock?


Hmm, OK. I'm still not convinced about your magical ability to be confident in measurements that yield 0.5, and not (say) 0.33.
In this case, the uncertainty about where in the tick you start and stop kills you (the error may actually have been +/- 2 ticks), so no statement can be made about the times: the resolution is insufficient. If we increase the number of samples to avoid this problem and get 3000 calls in 1000 ticks, I'll be damned if I'll call that 0.5 ticks per call.

I think what he's trying to show is that 3000 calls / 1000 ms is closer to 0.5 ticks per call than to 1 tick. Why on earth you'd want sub-clock-tick timing, when it's hard to find a timer that's even remotely that accurate, is beyond me though :D
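
For what it's worth, the arithmetic being argued over can be sketched in C. This is only an illustration: the function names are hypothetical, and so is the assumption that the whole measured interval is off by at most a fixed number of ticks because the start and stop points fall somewhere inside a tick.

```c
#include <assert.h>

/* Average cost per call, in ticks, when `calls` calls took `ticks` timer ticks. */
double ticks_per_call(unsigned long calls, unsigned long ticks)
{
    assert(calls > 0);
    return (double)ticks / (double)calls;
}

/* Worst-case error of that average, assuming (hypothetically) that the whole
   interval is off by at most `edge_ticks` ticks because we do not know where
   inside a tick the measurement started and stopped. */
double per_call_uncertainty(unsigned long calls, double edge_ticks)
{
    assert(calls > 0);
    return edge_ticks / (double)calls;
}
```

With 3000 calls in 1000 ticks the average is about 0.33 ticks per call, and a +/- 2 tick edge error shrinks to less than 0.001 ticks per call. With, for example, 2 calls in 1 tick the average is 0.5, but the same edge error is a full tick per call, which is larger than the figure being reported.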



Something like this can be tested easily

rdtsctimer();
x<y;
x<y;
x<y;
x<y;
rdtsctimer();
outputresult();

rdtsctimer();
x^y;
x^y;
x^y;
x^y;
rdtsctimer();
outputresult();




Big difference: 4 clock ticks. In one of those cases you can assume that x^y takes less than half a clock tick (even four of them take less than half a clock tick).

As I said, if you haven't spent a few years studying it, ignore it. If you have studied it you will know why you use it.

For short intervals RDTSC is dead accurate; for long time measurements it becomes less accurate.
QueryPerformanceCounter is the other way around.

edit: < to &lt;

[edited by - Dredge-Master on December 17, 2003 9:09:27 PM]

quote:
Original post by Dredge-Master
rdtsctimer();
x<y;
x<y;
x<y;
x<y;
rdtsctimer();
outputresult();

rdtsctimer();
x^y;
x^y;
x^y;
x^y;
rdtsctimer();
outputresult();



If those are built-in types, wouldn't they get optimised away, since they have no side effects and the return value isn't used?

quote:
As I said, if you haven't spent a few years studying it, ignore it. If you have studied it you will know why you use it


With your experience, wouldn't you normally profile something a large number of times, rather than just once?
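
Both objections (dead-code elimination and single-shot measurement) can be handled in a small harness. What follows is only a sketch, not the code Dredge-Master used: `read_counter`, `four_compares` and `min_elapsed` are hypothetical names, `__rdtsc()` is the GCC/Clang intrinsic from `<x86intrin.h>`, and the non-x86 fallback to `clock_gettime` is an assumption added so the example builds elsewhere.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc() on GCC and Clang */
uint64_t read_counter(void) { return __rdtsc(); }
#else
/* Assumed fallback for non-x86 targets: nanoseconds from a monotonic clock. */
uint64_t read_counter(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Operands and result are volatile so the compiler cannot delete the
   comparisons as dead code. */
static volatile int x = 5, y = 1023, sink;

/* The operation under test: four comparisons, as in the post. */
void four_compares(void)
{
    sink = x < y;
    sink = x < y;
    sink = x < y;
    sink = x < y;
}

/* Time `fn` `runs` times and return the smallest counter delta seen.
   The minimum is the run least disturbed by preemption or cache misses. */
uint64_t min_elapsed(void (*fn)(void), int runs)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; ++i) {
        uint64_t start = read_counter();
        fn();
        uint64_t stop = read_counter();
        if (stop - start < best)
            best = stop - start;
    }
    return best;
}
```

Calling `min_elapsed(four_compares, 200)` returns the smallest delta over 200 runs. That figure still includes the cost of reading the counter and of the function call, and RDTSC is not a serializing instruction, so differences of only a few ticks should be treated with suspicion.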

quote:
I think what he's trying to show is that 3000 calls / 1000 ms is closer to 0.5 ticks per call than to 1 tick. Why on earth you'd want sub-clock-tick timing, when it's hard to find a timer that's even remotely that accurate, is beyond me though :D

If that's all he's saying, I agree.
CPU clock accuracy is not the issue: it may be bad (cheap crystals have a frequency tolerance of something like 200 ppm), but the 'benchmark' is still counting clocks, no matter how long they are.

A more important question: how is an instruction going to take less than a clock?!
> even four of them [xor] is less than half a clock tick.
Oh, come on.

quote:
For short intervals RDTSC is dead accurate; for long time measurements it becomes less accurate.
QueryPerformanceCounter is the other way around.

What is your definition of accurate?

> With your experience, wouldn't you normally profile something a large number of times, rather than just once?
I guess in this case it's alright: with such a short piece of code, you will quickly notice if you got preempted (the time difference will no longer be 4 clocks). The only other change would be warming the cache, and that's not an issue here either.

Try it.

One cycle can contain more than one instruction (on newer CPUs anyway; I've never used high-performance timers on pre-Pentium-class chips, or on the Solaris machine). It just depends which ones.

That's why you use x^1023 instead of x<1023. C compilers will not make this optimisation for you because in a case like

for (i=0;i<1023;a?i+=2:++i)func(a,i);

for (i=0;i^1023;a?i+=2:++i)func(a,i);

the second will become unpredictable (if i ever steps over 1023, the loop keeps running).

It's those tiny little optimisations that can make small but sometimes useful changes (try it when comparing very large quantities of small fixed-point data: a lot faster).

It's a bitwise comparison, not a numeric one. Hence the manual optimisation.

It's like a/(x|1).
Figure that one out. Compilers don't optimise that into their code either.
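
To see why a compiler cannot swap the two exit tests, here is a small sketch. The function names and the `cap` safety limit are hypothetical, and `step` stands in for the post's `a ? i += 2 : ++i`; it makes no claim about which version is faster.

```c
#include <assert.h>

/* Count iterations of the loop with a `<` exit test.
   `step` is 1 or 2, standing in for the post's `a ? i += 2 : ++i`. */
int iterations_less_than(int step)
{
    assert(step == 1 || step == 2);
    int n = 0;
    for (int i = 0; i < 1023; i += step)
        ++n;
    return n;
}

/* Same loop with the `^` exit test, which is true whenever i != 1023.
   `cap` is a hypothetical safety limit so the demo always terminates;
   returns -1 if the cap was hit, i.e. the loop stepped over 1023. */
int iterations_xor(int step, int cap)
{
    assert(step == 1 || step == 2);
    int n = 0;
    for (int i = 0; i ^ 1023; i += step) {
        if (n == cap)
            return -1;
        ++n;
    }
    return n;
}

/* x | 1 sets the lowest bit, so the divisor is never zero. */
int div_nonzero(int a, int x)
{
    return a / (x | 1);
}
```

With a step of 1 both loops run 1023 times. With a step of 2 the `<` loop stops after 512 iterations, while the `^` loop never sees i equal to 1023 and only ends because of the cap. As for `a/(x|1)`, it avoids a zero divisor, but it also changes the result for every even x (the divisor becomes x+1), so it is only a substitute where that is acceptable.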



As for the accuracy, it's because of interference from other processes: even in the highest thread states they still share resources very occasionally. That, and the damned caching of some variables, so certain small variables won't be timed correctly.

I thought it would be accurate all the time, but the Intel documentation for RDTSC said otherwise. I tried it and it was right, and I was wrong. It shows that sometimes documentation can be helpful.

BTW, regarding profiling multiple times:
When testing, I profile each function either 200 times (for smaller code) or 100 times (for slower code).

When not testing, each timed function is logged every time it is used.

For code measurement I use RDTSC. For frame and cycle frequency I use QueryPerformanceCounter.

> one cycle can contain more than one instruction (on newer CPUs anyway ..
Right, but 4 instructions in half a clock is a bit unrealistic. Max issue rate is 3 instructions/clock on my Athlon.
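
As a rough worked bound, assuming a peak issue rate of 3 instructions per clock as quoted above and ignoring every other pipeline limit, four instructions cannot finish in less than:

```latex
\[
t_{\min} = \frac{4\ \text{instructions}}{3\ \text{instructions/clock}}
         \approx 1.33\ \text{clocks} > 0.5\ \text{clocks}
\]
```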

quote:

> What is your definition of accurate?
As for the accuracy, it's because of interference from other processes: even in the highest thread states they still share resources very occasionally.

You mean your thread is preempted more often when calling Windows APIs, i.e. there's a Reschedule call in there somewhere? Hmm, that could be. Supporting evidence: I've noticed ReadFileEx callbacks are sometimes delivered from within an API call.

quote:
I thought it would be accurate all the time, but the Intel documentation for RDTSC said otherwise. I tried it and it was right, and I was wrong. It shows that sometimes documentation can be helpful.

?

(at home, high latency)

Guest Anonymous Poster
The whole "measure 0.5 ticks" thing is kind of correct. In science and engineering you normally have a scale to read from, and you say a reading is either at a mark on the scale or between two marks, which lets you judge to, at best, half of the scale division used (as long as there are no other factors). This does not really apply in this case, because for one call you get a whole number of ticks.

It is right to say that with a larger set of measurements you can state the result more accurately, but things start to get complicated when you have to estimate the error in the readings. It also depends on whether you measure the whole lot in one lump or measure the calls one by one and add the results together.
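
The lump versus one-by-one distinction can be made concrete with a rough worst-case estimate. This is a sketch under assumptions that are not in the posts: each start/stop pair is off by at most \delta ticks, T is the single lump reading for N calls, and t_i are the individual readings.

```latex
\begin{align*}
\text{one lump:}   \quad \bar{t} &= \frac{T}{N},
    & |\Delta\bar{t}| &\le \frac{\delta}{N} \\
\text{one by one:} \quad \bar{t} &= \frac{1}{N}\sum_{i=1}^{N} t_i,
    & |\Delta\bar{t}| &\le \frac{N\,\delta}{N} = \delta
\end{align*}
```

If the individual errors are independent and random rather than worst case, the one-by-one figure improves to roughly \delta/\sqrt{N}, which is still worse than the lump measurement for N > 1.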

Someone else has already done most of the research here:

http://www.geisswerks.com/ryan/FAQS/timing.html

Okee?
