ClockTick Code Compatibility

Prozak    898
Hi. I came across this code snippet that reads the clock ticks on Pentium machines:
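// Read the time-stamp counter (MSVC, 32-bit x86 inline assembly)

__int64 readTSC (void)
{
	__int64 t;
	unsigned int a,b;
	unsigned int *c = (unsigned int *)&t;
	__asm {
		_emit 0x0f		// 0x0F 0x31 is the opcode of rdtsc
		_emit 0x31
		mov a,eax		// low 32 bits of the counter
		mov b,edx		// high 32 bits of the counter
	}
	c[0]=a;c[1]=b;
	return t;
}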
I would credit the author of this snippet, but all I remember is that it is free for all use; I can't remember where I copied it from.

My main problem is that I need to detect whether the machine supports this or not. Can someone point me to some code that checks if I can use it?

Is this more precise than QueryPerformanceCounter, or are the two the same thing?

Out of curiosity, are clock ticks actually clock cycles, as in 20 CTs = 20 nanoseconds for example, or is a clock tick equal to a performed instruction, as in 20 clock ticks means I just did 20 instructions?

Thanx

[Hugo Ferreira][Positronic Dreams][Colibri 3D Engine][Entropy HL2 MOD][Yann L.][Enginuity]
The most irrefutable evidence that there is intelligent life in the Universe is that they haven't contacted us!

IFooBar    906
You can use this function to get the same result as the one you posted:


__int64 readTSC()
{
__asm rdtsc
}


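The compiler will complain about "no return value", but you can safely ignore or disable that warning: rdtsc leaves its result in edx:eax, which is exactly where a 32-bit MSVC build returns an __int64. Also, _emit 0x0f followed by _emit 0x31 is simply the opcode of the rdtsc instruction.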

As for detection code, well...

First you need to call the cpuid instruction with eax = 1 so that it gets you the CPU feature bits in the edx register. Then copy the edx register into a DWORD or something and check the bits to see if a feature is supported or not. Bit number 4 (counting from 0) tells you if rdtsc is supported or not.
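
The check described above could look roughly like this. It is only a sketch, not the code from the System Information class: `hasTscBit` and `cpuSupportsTsc` are made-up names, and it assumes a compiler that provides `__cpuid` (MSVC, `<intrin.h>`) or `__get_cpuid` (GCC/Clang, `<cpuid.h>`). On any other target it simply reports no support.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>
  #define TSC_SKETCH_MSVC 1
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
  #include <cpuid.h>
  #define TSC_SKETCH_GNU 1
#endif

// CPUID leaf 1 returns the feature flags in EDX; bit 4 (counting from 0) is TSC.
constexpr bool hasTscBit(std::uint32_t edx)
{
    return (edx & (1u << 4)) != 0;
}

// Hypothetical helper: true if the CPU reports support for rdtsc.
inline bool cpuSupportsTsc()
{
#if defined(TSC_SKETCH_MSVC)
    int regs[4] = {0, 0, 0, 0};   // EAX, EBX, ECX, EDX
    __cpuid(regs, 0);             // leaf 0: EAX = highest supported leaf
    if (regs[0] < 1) return false;
    __cpuid(regs, 1);             // leaf 1: feature flags
    return hasTscBit(static_cast<std::uint32_t>(regs[3]));
#elif defined(TSC_SKETCH_GNU)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;  // leaf 1 not available
    return hasTscBit(edx);
#else
    return false;                 // not an x86 build, so no rdtsc
#endif
}
```

Call `cpuSupportsTsc()` once at startup and fall back to QueryPerformanceCounter if it returns false. Note that the sketch takes the cpuid instruction itself for granted, which holds for anything from the Pentium onwards.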

For a complete application with source that actually does all this just click the System Information linky in my sig. It's all there.

quote:
Out of curiosity, are clock ticks actually clock cycles, as in 20 CTs = 20 nanoseconds for example, or is a clock tick = to a performed instruction, as in 20 clock ticks, i just did 20 instructions?



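Yeah, clock ticks are in fact clock cycles. A clock tick is not on the instruction level, because some instructions can take many cycles to complete. How long a tick lasts depends on the clock frequency: 20 ticks are 20 nanoseconds only on a 1 GHz CPU.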
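
To put numbers on it: one tick lasts 1 / (clock frequency), so the same tick count means different times on different CPUs. A tiny sketch of the conversion (the name `cyclesToNanoseconds` is made up for illustration, and the frequency has to come from somewhere else):

```cpp
// Convert a cycle count to nanoseconds for a given clock frequency in MHz.
// One cycle lasts 1000 / MHz nanoseconds.
constexpr double cyclesToNanoseconds(unsigned long long cycles, double cpuMHz)
{
    return static_cast<double>(cycles) * 1000.0 / cpuMHz;
}
```

For example, `cyclesToNanoseconds(20, 1000.0)` is 20 ns, while `cyclesToNanoseconds(20, 2000.0)` is 10 ns.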

| C++ Debug Kit :: GameDev Kit :: DirectX Tutorials :: 2D DX Engine | TripleBuffer Software |
| Plug-in Manager :: System Information Class :: D3D9 Hardware Enum | DevMaster :: FlipCode |

Prozak    898
IFooBar, thanx for the reply.

So, is this method more precise than QueryPerformanceCounter? Or is QueryPerformanceCounter just a WinAPI wrapper around the readTSC method?


IFooBar    906
Yeah, it's not really more precise, but it is way faster to execute. The thing is that with the query function you can easily get the frequency of the ticks (via QueryPerformanceFrequency) and turn the ticks into seconds. With rdtsc it's hard to get the frequency of the CPU accurately, so turning ticks into seconds is tricky.

CPU frequency calculation is always a bit wonky. You don't get an exact result, and on top of that a laptop will run at different speeds at different times, since it can change processor speed in real time.

billybob    134
I thought I'd comment on this: I originally used rdtsc for my timing code. Eventually I switched to QPF/QPC (QueryPerformanceFrequency/QueryPerformanceCounter), for the same reasons IFooBar mentioned. IMO rdtsc is not worth it unless you actually want clock cycles, and I wouldn't base the time in your game on it.

Jan Wassenberg    999
QPC implementations:
WinXP - PMT (3.57 MHz, 0.7 µs read overhead, several hardware bugs)
Win2k - PIT (1.19 MHz, ~3 µs read overhead)
MP HAL - TSC (CPU freq, several dozen clocks read overhead, docs say it's unreliable - *sigh*)

I'm not sure about the TSC being more precise than the PMT (it's a tossup which crystal is worse), but the TSC has much higher resolution - it's definitely worth it. You have to be a bit careful: don't use it on SpeedStep or SMP systems (in both cases, compensating is a whole lot of work, and unreliable), and continually measure the frequency (=> more accurate, accounts for temperature drift).
Or has someone found a workaround for the TSCs-aren't-synced-between-CPUs problem? Maintaining separate per-CPU counters, and choosing between them via GetCurrentProcessorNumber would be fine, but that API is only available on Win 2003 (even though it's declared in my WinXP headers - grr). There's a corresponding field in KTHREAD and the KPCR, but I don't see how to get at it without writing a driver, and I'm not quite willing to go that far.
BTW, CPU freq isn't hard to measure. Count clocks in one QPC tick, take about 20 samples, run them through a median filter, and that gets within 0.05 MHz of the true value on my system. CallNtPowerInformation() gives you the freq as well, but I'm not sure about its accuracy (it reports 1403 MHz, as does DxDiag; the true value is 1403.19).
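
A rough sketch of that measurement, with some liberties taken: `std::chrono::steady_clock` stands in for QueryPerformanceCounter to keep it portable, each sample spans about one millisecond instead of a single QPC tick, and `medianOf` and `estimateTscMHz` are made-up names. It assumes the `__rdtsc` intrinsic from `<intrin.h>` (MSVC) or `<x86intrin.h>` (GCC/Clang) and returns 0 on non-x86 builds.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>
  #define TSC_SKETCH_HAVE_RDTSC 1
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
  #include <x86intrin.h>
  #define TSC_SKETCH_HAVE_RDTSC 1
#endif

// Median of the samples; a few samples disturbed by interrupts do not move it.
inline double medianOf(std::vector<double> v)
{
    if (v.empty()) return 0.0;
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    return (v.size() % 2) ? v[mid] : 0.5 * (v[mid - 1] + v[mid]);
}

// Hypothetical helper: estimate the TSC frequency in MHz.
inline double estimateTscMHz(int sampleCount = 20)
{
#if defined(TSC_SKETCH_HAVE_RDTSC)
    using clock = std::chrono::steady_clock;
    std::vector<double> mhz;
    for (int i = 0; i < sampleCount; ++i) {
        const auto t0 = clock::now();
        const std::uint64_t c0 = __rdtsc();
        // Busy-wait until about one millisecond of reference time has passed.
        auto t1 = clock::now();
        while (t1 - t0 < std::chrono::milliseconds(1)) t1 = clock::now();
        const std::uint64_t c1 = __rdtsc();
        const double micros =
            std::chrono::duration<double, std::micro>(t1 - t0).count();
        mhz.push_back(static_cast<double>(c1 - c0) / micros);  // cycles per µs = MHz
    }
    return medianOf(mhz);
#else
    (void)sampleCount;
    return 0.0;   // no rdtsc on this target
#endif
}
```

Calling `estimateTscMHz()` every so often, rather than once at startup, is the "continually measure the frequency" part. The sketch does nothing about SpeedStep or unsynchronized TSCs on SMP systems.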

Prozak    898
I don't even think frequency is a necessity if all you want to know is how much of each frame a certain task occupies the processor.

If the total returned by readTSC is 50,000 in this frame, and my AI code took 4,000, then I know my AI is taking 8% of each frame's time.

This even works on laptops whose processor can change its internal speed, because what you're getting is always a percentage per frame.

And if it's more accurate, even better.
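
As a sketch, the per-frame share boils down to one division of two cycle counts, each being the difference between two readTSC() values. The name `percentOfFrame` is made up for illustration:

```cpp
#include <cstdint>

// Share of the frame spent in one task, in percent. Both arguments are cycle
// counts measured with the same counter, so no frequency is needed.
constexpr double percentOfFrame(std::uint64_t taskCycles, std::uint64_t frameCycles)
{
    return frameCycles == 0
        ? 0.0
        : 100.0 * static_cast<double>(taskCycles) / static_cast<double>(frameCycles);
}
```

With the numbers above, `percentOfFrame(4000, 50000)` evaluates to 8.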


Jan Wassenberg    999
Right, if all you're doing is profiling, you don't have most of these problems. The TSCs may differ (slightly) between CPUs, but that doesn't matter to you much, especially if measuring large blocks of code.
