Wixner

[Win32] QueryPerformance Anomalies

For the past two days, I've seen pink-spotted elephingos flying around throwing pumpkins at kids because of this weird behaviour of QueryPerformanceCounter and QueryPerformanceFrequency.
[source lang = "cpp"]

//////
// my object's constructor
//////
if( !QueryPerformanceFrequency( &m_frequency ) )
    throw "QueryPerformanceFrequency failed";

//////
// my object's update function
//////
LARGE_INTEGER start;
if( !QueryPerformanceCounter( &start ) )
    throw "QueryPerformanceCounter@start failed";

// do some computations, time-space-continuum disruptions and coffee...

LARGE_INTEGER end;
if( !QueryPerformanceCounter( &end ) )
    throw "QueryPerformanceCounter@end failed";

// calculate the time between start and end
double time = static_cast < double >( ( end.QuadPart - start.QuadPart ) / static_cast < double >( m_frequency.QuadPart ) );

// keep track of time
static double counter = 0;
counter += time;

// average computations per second
double computations = 1.0 / time;

// run the app for approximately 1 second
if( counter >= 1.0 )
    return computations;
[/source]

Unless my keen (yeah, right) eye got this wrong, the code seems valid, but the results are not. When my function returns after what's supposed to be 1 second, the application has in fact been active for 10 seconds. It seems that my "time" is a tenth of what it should be. I've tried this on my two computers, one with an AMD64 and one with an AMD AthlonXP CPU, and the same behaviour exists on both.

Edit: I forgot to typecast the frequency to double, but the problem is still there.

[Edited by - Wixner on October 22, 2006 5:01:55 PM]

Consider yourself lucky... I occasionally get a negative number (like 1 out of 20 times). I don't think it's very reliable?
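For the occasional negative reading mentioned here, one defensive option is to clamp the interval at zero before converting it to seconds. The sketch below is only an illustration: `clamped_seconds` is a made-up helper, and on Windows its inputs would be the `QuadPart` values from `QueryPerformanceCounter` and `QueryPerformanceFrequency`. Clamping hides the symptom; it does not address whatever makes the counter appear to run backwards.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): converts a tick interval to
// seconds and never returns a negative duration.
// start/end are raw counter readings, frequency is ticks per second.
constexpr double clamped_seconds(std::int64_t start, std::int64_t end,
                                 std::int64_t frequency)
{
    return (frequency <= 0 || end <= start)
        ? 0.0
        : static_cast<double>(end - start) / static_cast<double>(frequency);
}
```

The subtraction is done on the integers first and only the difference is converted to `double`, which keeps full precision even when the raw counter values are very large.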

btw, you have a static_cast<double>(LONGLONG / double) above... I'm not sure what the result of a LONGLONG / double would be, but maybe that's the source of your problem?
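On the question of what `LONGLONG / double` produces: in C++ the integer operand is converted to `double` before the division, so the result is already a `double` and the outer cast is redundant but harmless. Truncation only happens when both operands are integers. A small sketch, using `long long` as a stand-in for `LONGLONG` (an assumption, so that it compiles without `<windows.h>`); the function names are invented for illustration:

```cpp
#include <type_traits>

// long long stands in for the Win32 LONGLONG type here (an assumption made
// so the snippet compiles without <windows.h>).
using ticks_t = long long;

// Integer / double: the integer operand is converted to double first.
static_assert(std::is_same<decltype(ticks_t{} / double{}), double>::value,
              "integer / double yields double");

// Integer / integer: the fraction is discarded before any later cast.
constexpr double truncated(ticks_t delta, ticks_t freq)
{
    return static_cast<double>(delta / freq);
}

// Casting one operand first keeps the fractional part.
constexpr double exact(ticks_t delta, ticks_t freq)
{
    return static_cast<double>(delta) / static_cast<double>(freq);
}
```

So the cast in the original post is not a likely source of a factor-of-ten error, as long as the frequency really is cast to `double` before dividing.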

Hey, this link seems to have some more info...

I looked through the link and that problem currently only occurs on dual-core AMD64 CPUs. The most annoying thing is that I know I've used this code earlier (a month or two ago).

I've just tested the code on a new fresh project, and the anomaly is still there:

[source lang = "cpp"]
#include <windows.h>
#include <iostream>

int main( int argc, char** argv )
{
    LARGE_INTEGER frequency;
    QueryPerformanceFrequency( &frequency );
    double _frequency = static_cast < double >( frequency.QuadPart );

    double counter = 0;
    while( counter < 10 )
    {
        LARGE_INTEGER start = { 0 };
        QueryPerformanceCounter( &start );
        double _start = static_cast < double >( start.QuadPart );

        // seems correct
        // Sleep( 10 );

        // seems incorrect
        // Sleep( 1 );

        // the more work we throw to our cpu, the more it seems to differ
        // for( int i = 0; i < 100000; ++i )
        //     float a = i * 3.1415192f;

        LARGE_INTEGER end = { 0 };
        QueryPerformanceCounter( &end );
        double _end = static_cast < double >( end.QuadPart );

        double time = ( _end - _start ) / _frequency;
        counter += time;
        std::cout << counter;

        // my knowledge in console programming is.. limited ;P
        for( int i = 0; i < 1000; ++i )
            std::cout << "\b";
    }
}
[/source]

If you guys have the time, could you please check this out and see if the counter (supposed to count in seconds) seems correct?
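One thing that may be worth ruling out in the test program above (the thread itself never settles on a cause): `counter` only accumulates the interval between `start` and `end`, so time spent outside that window, such as the `std::cout` call and the backspace loop, is never counted. If that unmeasured work dominates, the counter will run slow compared to a wall clock. A minimal sketch that measures from a single starting point instead, using `std::chrono::steady_clock` (C++11) as a portable stand-in for `QueryPerformanceCounter`; the class name `Stopwatch` is invented here:

```cpp
#include <chrono>

// Hypothetical stopwatch that reports time since construction, so work done
// between readings (console output, for example) is included automatically.
class Stopwatch
{
public:
    Stopwatch() : origin_(std::chrono::steady_clock::now()) {}

    // Seconds elapsed since the stopwatch was created.
    double elapsed_seconds() const
    {
        const auto now = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(now - origin_).count();
    }

private:
    std::chrono::steady_clock::time_point origin_;
};
```

In the loop above this would replace the running sum: create one `Stopwatch` before the `while` and loop until `elapsed_seconds()` reaches 10.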

Yeah, dual core screws it up. You will get negative times in some cases, but don't worry, I got the solution right here for ya [smile] QueryPerformanceCounter is as accurate as you are gonna get on Windows btw!

Anyways, I'm gonna just throw code at you, please excuse my language in my comments. To make it work on a dual core system the easiest way is to set the thread affinity - basically lock it to run on one core:


[source lang = "cpp"]
/**************************************************************************
*
* File: cTime.cpp
* Author: Neil Richardson
* Ver/Date:
* Description:
*
**************************************************************************/

#include "cDebug.h"
#include "cTime.h"

using namespace core;

/////////////////////////////////////////////////
// Ctor & Dtor
/////////////////////////////////////////////////
cTime::cTime():
    MilliSeconds_ ( 0.0f ),
    Seconds_ ( 0.0f )
{
#ifdef WIN32_DUAL_CORE_TIMING
    // Set our threads affinity to first CPU to stop jittering
    if ( SetThreadAffinityMask( GetCurrentThread(), 1 ) == 0 )
    {
        DBG_FAIL( "Invalid affinity! Do you have a CPU at all?\n" );
    }
#endif

#ifdef WIN32_PERFORMANCE_TIMER
    // Get the frequency
    if ( QueryPerformanceFrequency(&Freq_) == 0 )
    {
        DBG_FAIL( "QueryPerformanceFrequency is shit\n" );
    }

    // Get first tick
    QueryPerformanceCounter(&FirstTick_);

#else // GetShitCount()

    // Get first tick
    FirstTick_ = GetTickCount();

#endif
}

cTime::~cTime()
{

}

/////////////////////////////////////////
// update
/////////////////////////////////////////
void cTime::update()
{
#ifdef WIN32_PERFORMANCE_TIMER
    // Get the tick
    QueryPerformanceCounter(&Tick_);

    // Compensate
    Seconds_ = (cReal)(Tick_.QuadPart - FirstTick_.QuadPart) / (cReal)Freq_.QuadPart;
    MilliSeconds_ = Seconds_ * 1000.0f;

#else // GetShitCount()
    MilliSeconds_ = (cReal)(GetTickCount() - FirstTick_);
    Seconds_ = MilliSeconds_ * 0.001f;
#endif
}
[/source]

Odd, I will say I've never experienced that problem before. Try locking the thread affinity and see if that works. Do any other games give you any trouble? Another timer you could use is the multimedia timer, timeGetTime.

No other application depending on QueryPerformanceCounter gives me any problems (not that I know of, of course)

Setting the thread's affinity does not change anything - it would be strange if it did in the first place though :)

timeGetTime() seems to work with my initial tests, but we all know we should avoid timeGetTime() like the plague ;)

Ah well, I suppose I need to wait for my fever to go down before I wrestle this beast any more :(

Edit:
Added the information about thread affinity

Quote:

// seems correct
// Sleep( 10 );

// seems incorrect
// Sleep( 1 );


This is a BAD way to test timer code. Sleep(1) tells the OS that you need a minimum of 1ms downtime. So your app probably won't be reactivated for 10-15ms. That would explain the difference in timing there.
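A quick way to see this effect is to time the sleep itself. The sketch below uses `std::this_thread::sleep_for` and `std::chrono::steady_clock` (C++11) as portable stand-ins for `Sleep` and `QueryPerformanceCounter`; the function name is invented for illustration. A sleep request is only a lower bound, so the measured value should be at least the requested time and may be considerably more, depending on the scheduler.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: requests a sleep of `requested_ms` milliseconds and
// returns how many milliseconds actually passed.
inline double measured_sleep_ms(int requested_ms)
{
    assert(requested_ms >= 0);
    using clock = std::chrono::steady_clock;
    const clock::time_point before = clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(requested_ms));
    const clock::time_point after = clock::now();
    return std::chrono::duration<double, std::milli>(after - before).count();
}
```

Printing `measured_sleep_ms(1)` a few times on the target machine shows how far the real delay is from 1 ms there; the exact numbers depend on the OS and its timer settings.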

Quote:

// for( int i = 0; i < 100000; ++i )
// float a = i * 3.1415192f;


If you are going to use loops like this to test your code, make sure that you have optimizations turned completely off, or that loop will probably get removed from the output code.
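If turning optimizations off is not an option, another common approach is to make the loop's result observable so the compiler has to keep the work. The sketch below writes each product to a `volatile` variable; the function name is invented, and the constant is the one from the quoted loop.

```cpp
#include <cassert>

// Hypothetical busy-work loop. Each store to the volatile variable is an
// observable side effect, so the stores cannot be removed by the optimizer
// (it is still free to simplify how each value is computed).
inline float busy_work(int iterations)
{
    assert(iterations >= 0);
    volatile float sink = 0.0f;
    for (int i = 0; i < iterations; ++i)
        sink = static_cast<float>(i) * 3.1415192f;
    return sink;  // last value written
}
```

Returning the last value and using it somewhere (printing it, for instance) gives the compiler one more reason not to treat the loop as dead code.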


Thread affinity shouldn't solve anything on a single core processor, since even in separate threads, the internal timer itself is always the same. There is a chance of race condition problems if something is reading and writing to that block of memory at the same time. The thread affinity fix is recommended for dual core processors though (and is recommended by Microsoft).

It fixed the problem with my dual core machine though - thanks!

Is it a good idea to SetThreadAffinityMask back to 0 after you've done your timing? Richy2k's sample didn't do that.
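On restoring the mask: a mask of 0 selects no processor at all, so it is not expected to restore anything. `SetThreadAffinityMask` returns the thread's previous mask on success (and 0 on failure), so that return value is what would be passed back to undo the change. A minimal sketch of that idea as a scope guard follows; the class name `ScopedAffinity` is hypothetical, and the Win32 calls are compiled only on Windows so the snippet still builds elsewhere as a no-op.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical RAII guard: pins the calling thread to the first CPU and
// puts the previous mask back when it goes out of scope.
class ScopedAffinity
{
public:
    ScopedAffinity()
    {
#ifdef _WIN32
        // Returns the previous mask on success, 0 on failure.
        previous_ = SetThreadAffinityMask(GetCurrentThread(), 1);
#endif
    }

    ~ScopedAffinity()
    {
#ifdef _WIN32
        if (previous_ != 0)
            SetThreadAffinityMask(GetCurrentThread(), previous_);
#endif
    }

    // True if the thread was pinned (always false off Windows).
    bool pinned() const { return previous_ != 0; }

    ScopedAffinity(const ScopedAffinity&) = delete;
    ScopedAffinity& operator=(const ScopedAffinity&) = delete;

private:
    std::uintptr_t previous_ = 0;  // same width as the Win32 DWORD_PTR
};
```

Whether restoring is worthwhile depends on how often the timer is read; wrapping every single reading in such a guard means two extra system calls per reading.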

Guest Anonymous Poster
We don't in our engine since it takes time to set it. We poll at least once a frame and sometimes more often than that, depending. To be nice we do set it back at the end of the program, though it's not really needed.

Using Sleep(0) to free up spare cycles is fine but using Sleep for anything timing related is pointless unless you want 1~30ms difference in times (I've had Sleep(1) take 31ms, no joke).

Also, don't forget about timeBeginPeriod(1)/timeEndPeriod(1) to make timeGetTime() more accurate (to 1ms but that's not enough for game timing really).

Quote:
Original post by Anonymous Poster
Also, don't forget about timeBeginPeriod(1)/timeEndPeriod(1) to make timeGetTime() more accurate (to 1ms but that's not enough for game timing really).


Really? I would think a resolution of 1-6 ms would be more than suitable, and hasn't caused me any problems.

Quote:
Original post by Mastaba
Really? I would think a resolution of 1-6 ms would be more than suitable, and hasn't caused me any problems.


Running at 60FPS, 1-6ms is a full 6-37% of your entire frame time. When you consider that 1-6 is really +/- 1-6ms, then you're talking about a resolution only good to 12-74% of your frame time. That's pretty likely to introduce bugs in very time sensitive systems (like physics) and will make any custom built optimization tool worthless.

-me
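The percentages in this post follow from dividing the timer error by the frame duration. A small sketch of that arithmetic (the function name is made up for illustration):

```cpp
// Fraction of one frame (at `fps` frames per second) taken up by a timing
// error of `error_ms` milliseconds, expressed as a percentage.
constexpr double percent_of_frame(double error_ms, double fps)
{
    return error_ms / (1000.0 / fps) * 100.0;
}
```

At 60 FPS a frame lasts about 16.67 ms, so an error of 1 ms comes out at about 6% of the frame and 6 ms at about 36%, which is close to the figures quoted above.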

Guest Anonymous Poster
Why on earth do you need ~1ms accuracy for a game? Even 10ms would be extravagant.

Physics time is independent of render time (or rather weakly dependent). If the game physics needs only a first order approximation, this resolution is more than suitable. Heck, Quake3 limited the server frame rate to 20 or 30 Hz (one of those) by default! Granted, Quake3 had very little physics to bother with.

Guest Anonymous Poster
It was Doom3 and 60hz. Quake 3 could run at 400fps. Physics updates were I believe based on frame rate. Playing on servers limited you to whatever the server allowed.

10ms? You're joking right? You must be VERY new to game programming. If it's 1ms+ I cannot use it for physics at all, it's too inaccurate. You really need to read a lot about accurate timing before jumping into game dev. With 10ms resolution you could get a time that was older than your last physics update's time, meaning you're applying physics to something that you think was 2 frames ago. Talk about jumping animation and collision detection! That makes for a realistic game ;-)

Guest Anonymous Poster
Oh and your last sentence shows your total lack of knowledge. Quake3 had TONS of physics to take care of. Every moving object had to have collision detection done (time needed for this) against the entire world (each bullet). Every moving entity had to have the same thing. You think it's so simple but yet there is hardware to solve physics problems now and try to offload that from the CPU. There is a reason for this and they didn't start developing this hardware a year ago either.

Please, before posting and trying to sound like you are knowledgeable in the area, make sure you are. You will provide false information to those on here that are trying to learn.

Hmmmm, I guess my degree in physics was wasted then. Also, I wouldn't exactly call clipping a ray against the bsp, physics (which is what it did for instant hit projectiles). And, maybe you should read up or sober up before posting anonymously. Do some research bub, Quake 3 limited the server frame rate (not render frame rate) to 20 or 30 Hz by default. DOOM 3 did limit the server rate to 60 Hz, but I wasn't talking about DOOM 3.

[Edited by - Mastaba on October 23, 2006 7:20:35 PM]

Guest Anonymous Poster
Apparently it was wasted. Doom3 limited the refresh rate to 60hz, period; you could override this, mind you. If you think 10ms is good enough resolution for physics then you likely will want to stay with paper physics. Having something report 101ms one frame and 98ms the next, followed by 120ms, is not the best system. The QueryPerformance*() functions are made to return accurate times in the ns range so you don't get the leaps in time forwards and backwards you can get with other time functions (including timeGetTime(), which at best is 1ms).

Last time I checked the Q3A source bullets were not "instant hit" items. They were updated per frame to find collisions. Also, most "physics" libraries for games include vector and matrix maths in them. So updating locations is "physics" if only by loose association. So checking a hit against a BSP would still count. It's still processing cycles and time does matter (for non-static geometry). Even at 30hz (I've never, EVER seen a Q3A server that was below 60hz, but you may play on different servers than me, who knows) that's still 3+ frames without accurate physics where time could go forward or backwards because of inaccuracies.

Anyways if you want one time update for every 6+ frames that's your deal. I find players like very accurate physical simulations without things jumping. To each their own I guess. You're right and the hundreds of game programmers who use highly accurate timers are morons. I'll concede to your superiority. *bows*
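The uneven frame-to-frame readings described in this post can be illustrated with a little arithmetic. The sketch below models a timer whose value only advances in fixed steps and computes the delta a game loop would see between two frames. All names and the 10 ms figure are illustrative assumptions, not measurements of any particular Windows timer.

```cpp
#include <cstdint>

// Reading a timer whose value only advances in steps of `resolution_us`:
// the true time is rounded down to the most recent tick.
constexpr std::int64_t coarse_read(std::int64_t true_time_us,
                                   std::int64_t resolution_us)
{
    return (true_time_us / resolution_us) * resolution_us;
}

// Delta the game loop would compute between two consecutive frames.
constexpr std::int64_t coarse_delta(std::int64_t prev_us, std::int64_t now_us,
                                    std::int64_t resolution_us)
{
    return coarse_read(now_us, resolution_us)
         - coarse_read(prev_us, resolution_us);
}
```

With a 10 ms step, frames that are really 16.667 ms apart are reported as 10 ms or 20 ms apart; with a 1 microsecond step the same frames come out as 16667 microseconds. Note that this simple model never produces a negative delta; it only shows the jitter.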

You do know that you don't need to calculate physics for every frame you render right? You calculate the physics at the reduced server rate, and interpolate or extrapolate the 'physics' on the client. And ya, check your Q3 source again, most of the projectiles were instant hit other than the obvious ones, rocket, grenade, etc. Again, you are the one playing on exceptional Q3 servers, as the default server frame rate is 20 Hz. The OSP mod upped that to 30 Hz by default. Also, I never said anything about 10 ms, that was someone else. I also never said anything about using a coarse time resolution to profile code. I also never said anybody was wrong for using QueryPerformance*, I've used it myself in games. But stop and think critically about it for a second, if the server is calculating the frames at 20 Hz such as Quake 3 did by default (google sv_fps if you don't believe me), i.e. 50 ms per frame, then having microsecond resolution for the frame is not useful as those digits are insignificant compared to the server frame rate.
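The scheme described here, physics at a fixed rate with the renderer interpolating in between, is often written as an accumulator loop. The following is a minimal sketch of that idea, not code from Quake 3 or any other engine; the class name and the use of integer microseconds are choices made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-step scheduler: physics runs at a constant rate while
// rendering runs as fast as it likes. Times are in integer microseconds.
class FixedStepper
{
public:
    explicit FixedStepper(std::int64_t step_us) : step_us_(step_us)
    {
        assert(step_us > 0);  // a zero step would never drain the accumulator
    }

    // Feed the duration of the last rendered frame; returns how many
    // physics steps should be simulated to catch up.
    int advance(std::int64_t frame_us)
    {
        accumulator_us_ += frame_us;
        int steps = 0;
        while (accumulator_us_ >= step_us_)
        {
            accumulator_us_ -= step_us_;
            ++steps;
        }
        return steps;
    }

    // Position between the last two physics states, in [0, 1), for
    // interpolating what gets drawn.
    double alpha() const
    {
        return static_cast<double>(accumulator_us_)
             / static_cast<double>(step_us_);
    }

private:
    std::int64_t step_us_;
    std::int64_t accumulator_us_ = 0;
};
```

With a 50000 microsecond step (20 Hz) and frames of about 16667 microseconds, `advance` returns 0, 0, 1 over three frames, and `alpha()` tells the renderer how far to blend between the previous and current physics state. The frame durations fed in still come from some timer, so this changes how timer error affects the simulation rather than removing the need for a timer.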

Guest Anonymous Poster
Two things, you didn't say 10ms another AP did, my bad you said 1-6ms. Still that is a huge difference. Secondly, we're talking different things. You are assuming this is for multiplayer games while I see none of the original posters mentioning this. I do my physics based on single player only. Also, most games now use much higher refresh rates to get more accurate hits. 20hz is HIGHLY inaccurate and why a lot of servers set it higher. The idea a rocket hits behind someone it should have hit because of a low rate annoys a LOT of people (including me). CS servers (and CS:S) seem to all be 60hz now for this very reason. 60hz should be the norm honestly and thankfully seems most games agree.

Also high frequency timers are used for a lot more than physics. Animation, AI and sound are also users of them. Though in my case sound only because it's the only timer available, since I refuse to code up something less accurate when a more accurate and just as cycle heavy option is already done.

I was trying to find a very good article that argued the point of very high resolution timers a lot better than I have. But in the end, if you really care, you'll find it (swear it was on here).

Anyways, if I was an ass sorry but we're quite far off topic and I'm now on my own time.

Quote:
Original post by Anonymous Poster
We don't in our engine since it takes time to set it. We poll at least once a frame and sometimes more often then that depending. To be nice we do set it back at the end of the program, though its not needed really.


FWIW, I found that with the thread affinity set, the UI is less responsive (I'm using the timing stuff on a background thread that is doing some rendering work to prepare bitmaps for the UI thread)

Guest Anonymous Poster
Twas me who mentioned 10ms. I still don't understand the purpose of a high resolution timer. This would indicate, as Mastaba noted, that you are measuring the time per frame and using that in your calculations. This is reasonable if you expect wildly varying (and low) frame rates; however, in a game situation you would want to aim for a very consistent (and preferably high) frame/update rate, which of course doesn't need accurate timing at all.
That is, I hope, if I understood correctly.
