QueryPerformanceCounter Win32 timer slower than normal


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

22 replies to this topic

#1 Solid_Spy   Members   -  Reputation: 426


Posted 12 November 2013 - 02:40 AM

Hello, windows high res timer QueryPerformanceCounter always seems to act quirky whenever I use it.

 

I remember to set the thread affinity to CPU 0, and the timer runs perfectly at 1 second per frame, but I want it to run at 60 fps. I tried dividing the timer frequency by 60, but it still runs too slow; even dividing it by 10000 keeps it slower than 60 fps. I do not know what I am doing wrong.

 

Here is my code:

while(destroyed != 2)
	{
		QueryPerformanceCounter(&gameTimer->currentTime);
		if(PeekMessage(&msg, 0, NULL, NULL, PM_REMOVE))
		{
			TranslateMessage(&msg);
			DispatchMessage(&msg);
		}
		if(gameTimer->deltaTime >= 1.0f / 60.0f)
		{
			for(int i = 0; i < gameObjectList.size(); i++)
			{
				gameObjectList[i]->Update();
			}
			graphicsEngine->Render();
			gameTimer->deltaTime = 0.0f;
		}
		Sleep(1);
		QueryPerformanceCounter(&gameTimer->lastTime);
		gameTimer->CalculateDelta();
	}

This is how I'm calculating delta:

deltaTime += ((float)lastTime.QuadPart - (float)currentTime.QuadPart) / frequency.QuadPart;


#2 N.I.B.   Members   -  Reputation: 1195


Posted 12 November 2013 - 03:22 AM

How fast does it render when you are not limiting the FPS?

You should use vsync to limit your FPS to 60, otherwise you'll probably see screen tearing.


Edited by N.I.B., 12 November 2013 - 03:25 AM.


#3 Bacterius   Crossbones+   -  Reputation: 9068


Posted 12 November 2013 - 03:37 AM

EDIT: some users have noted that the approach I suggested previously is not optimal. On further reflection I agree and have retracted it (I have left the explanation of why the OP's code is not behaving as expected).

 

---

 

Your problem is that Sleep(1). That's defeating the whole point of using an accurate timer, because Sleep() is less accurate than QPC. So by using it you have lowered the accuracy of your game loop to the accuracy of the Sleep() function, which is equivalent to not having QPC at all.
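
To put a number on that granularity, here is a minimal portable sketch that measures how long a requested sleep really takes. It is an illustration under assumptions, not the OP's code: `std::chrono::steady_clock` stands in for QueryPerformanceCounter, `std::this_thread::sleep_for` stands in for Sleep(), and the function name `measure_sleep_us` is made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Returns how long a requested sleep actually took, in microseconds.
// steady_clock plays the role of QueryPerformanceCounter and sleep_for
// plays the role of Sleep(); both are portable stand-ins (assumption).
long long measure_sleep_us(int requested_ms)
{
    assert(requested_ms >= 0);
    using clock = std::chrono::steady_clock;

    const auto before = clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(requested_ms));
    const auto after = clock::now();

    return std::chrono::duration_cast<std::chrono::microseconds>(after - before).count();
}
```

Calling `measure_sleep_us(1)` a few dozen times and printing the results shows how far each sleep overshoots the request. The sleep never returns early, only late, and how late depends on the system timer resolution and the scheduler.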


The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#4 N.I.B.   Members   -  Reputation: 1195


Posted 12 November 2013 - 04:04 AM


And for people complaining that "it takes all the CPU", relax - first, the CPU is there to be used, and secondly, the CPU is not doing very much work at all when just waiting for the delta time to elapse to draw the next frame (so it won't overheat or anything - it's not as if it were running Linpack in the meantime)

Actually, the CPU is not 'just waiting'. It runs instructions, and so it works, and consumes cycles and power (a lot - it doesn't have any cache misses to stall on).

Now imagine every application on the system running busy-wait loops - the CPU will be fully utilized all the time, causing performance degradation of the entire system.



#5 Bacterius   Crossbones+   -  Reputation: 9068


Posted 12 November 2013 - 04:11 AM

 


And for people complaining that "it takes all the CPU", relax - first, the CPU is there to be used, and secondly, the CPU is not doing very much work at all when just waiting for the delta time to elapse to draw the next frame (so it won't overheat or anything - it's not as if it were running Linpack in the meantime)

Actually, the CPU is not 'just waiting'. It runs instructions, and so it works, and consumes cycles and power (a lot - it doesn't have any cache misses to stall on).

Now imagine every application on the system running busy-wait loops - the CPU will be fully utilized all the time, causing performance degradation of the entire system.

 

 

Then perhaps a WaitableTimer would do the trick, by getting the operating system to wake up the thread at regular (but not necessarily accurate enough) intervals. But then again, Windows is not a real-time operating system, and does not give particularly strong guarantees regarding when it decides to schedule threads for execution.




#6 N.I.B.   Members   -  Reputation: 1195


Posted 12 November 2013 - 04:16 AM

vsync is the simplest solution to limit the FPS to 60.



#7 Flimflam   Members   -  Reputation: 657


Posted 12 November 2013 - 04:17 AM

And for people complaining that "it takes all the CPU", relax - first, the CPU is there to be used, and secondly, the CPU is not doing very much work at all when just waiting for the delta time to elapse to draw the next frame (so it won't overheat or anything - it's not as if it were running Linpack in the meantime). Context switches are expensive, let the operating system handle these details for you and write code without worrying about them - it knows what it's doing better than you do. If you really cannot deal with this, then you can always try and change the timer resolution with timeBeginPeriod(), to, for instance, 1 ms, and then use Sleep(), which will give some better results, but be warned that this may still cause your framerate to jitter to up to +- 1 ms if your thread happens to be sleeping as you cross the 16.67ms threshold.

 

Actually, a busy-wait loop like that will run the processor ragged, and on a laptop it is going to eat the battery as fast as it can. If they're committed to a tight loop, however, they don't need to call Sleep every frame. Every 10-15 frames should be enough to give the CPU a breather, maybe even more, and over such a period the time eaten by Sleep will be barely noticeable.

 

However I really don't think I'd advise a loop like that. I second the notion that limiting your main rendering framerate should be handled by vsync because there is almost no reason to do it yourself and lots of reasons why you wouldn't want to do it yourself.


Edited by Flimflam, 12 November 2013 - 04:23 AM.


#8 tonemgub   Members   -  Reputation: 1143


Posted 12 November 2013 - 04:39 AM


Windows is not a real-time operating system, and does not give particularly strong guarantees regarding when it decides to schedule threads for execution

http://msdn.microsoft.com/en-us/library/windows/desktop/ms685096%28v=vs.85%29.aspx

 

IMO, it's not that Windows doesn't make guarantees. It makes quite a lot of them, actually, but it cannot guarantee what the hardware does. Most laptops and tablet PCs have hardware power-saving features that the OS can't control. This is also why it appears as though the OS "knows what it's doing better than you": the hardware does that, not the OS.

 

This is interesting: http://msdn.microsoft.com/en-us/library/windows/desktop/ms684247%28v=vs.85%29.aspx


Edited by tonemgub, 12 November 2013 - 04:52 AM.


#9 L. Spiro   Crossbones+   -  Reputation: 14026


Posted 12 November 2013 - 04:48 AM

Hello, windows high res timer QueryPerformanceCounter always seems to act quirky whenever I use it.

You use it in a quirky way.
First-off you are always dropping some time with your current implementation, and I don’t mean the intentional way.

You should never call QueryPerformanceCounter() more than once per loop. If you need to know the “last time”, make a copy of the current time before updating it.
 
	while(destroyed != 2)
	{
		gameTimer->lastTime = gameTimer->currentTime;
		QueryPerformanceCounter(&gameTimer->currentTime);
		if(PeekMessage(&msg, 0, NULL, NULL, PM_REMOVE))
		{
			TranslateMessage(&msg);
			DispatchMessage(&msg);
		}
		if(gameTimer->deltaTime >= 1.0f / 60.0f)
		{
			for(int i = 0; i < gameObjectList.size(); i++)
			{
				gameObjectList[i]->Update();
			}
			graphicsEngine->Render();
			gameTimer->deltaTime = 0.0f;
		}
		Sleep(1);
		gameTimer->CalculateDelta();
	}

Second-off you are converting the values to “float” while still in expanded form, meaning you are converting values such as 351,357,654,184 etc., which you should know better than to do. A float cannot represent counter values that large exactly, so subtract first and convert only the small difference. (Also, use static_cast for Pete’s sake.)

deltaTime += static_cast<float>(currentTime.QuadPart - lastTime.QuadPart) / frequency.QuadPart;

Third-off, don’t accumulate time as a float. Accumulate time in unsigned long long integers (I’m looking at gameTimer->deltaTime).
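
The precision problem can be checked without a real timer. The sketch below is a minimal illustration, assuming a sample counter reading (the 351,357,654,184 from the post) and a hypothetical 10 MHz counter frequency; the function names are invented for the example. It compares converting each reading to float before subtracting with subtracting the 64-bit integers first, and adds an integer-only frame test.

```cpp
#include <cassert>
#include <cstdint>

// Converts each raw counter reading to float first, then subtracts.
// A float has a 24-bit significand, so readings around 3.5e11 are
// rounded to a multiple of 32768 ticks and small differences vanish.
float delta_seconds_lossy(std::int64_t last, std::int64_t current, std::int64_t frequency)
{
    assert(frequency > 0);
    const float last_f = static_cast<float>(last);
    const float current_f = static_cast<float>(current);
    return (current_f - last_f) / static_cast<float>(frequency);
}

// Subtracts in 64-bit integers first; only the small difference is
// converted to floating point.
float delta_seconds_exact(std::int64_t last, std::int64_t current, std::int64_t frequency)
{
    assert(frequency > 0);
    return static_cast<float>(current - last) / static_cast<float>(frequency);
}

// Integer-only frame test: has at least 1/60 s worth of ticks accumulated?
bool frame_due(std::int64_t accumulated_ticks, std::int64_t frequency)
{
    assert(frequency > 0);
    return accumulated_ticks >= frequency / 60;
}
```

With readings of 351357654184 and 351357655184 (1000 ticks apart) at the assumed 10 MHz, the lossy version reports zero elapsed time, while the exact version reports about 0.0001 s. In a real loop the two readings and the frequency would come from QueryPerformanceCounter and QueryPerformanceFrequency.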


L. Spiro

Edited by L. Spiro, 12 November 2013 - 07:11 AM.

It is amazing how often people try to be unique, and yet they are always trying to make others be like them. - L. Spiro 2011
I spent most of my life learning the courage it takes to go out and get what I want. Now that I have it, I am not sure exactly what it is that I want. - L. Spiro 2013
I went to my local Subway once to find some guy yelling at the staff. When someone finally came to take my order and asked, “May I help you?”, I replied, “Yeah, I’ll have one asshole to go.”
L. Spiro Engine: http://lspiroengine.com
L. Spiro Engine Forums: http://lspiroengine.com/forums

#10 samoth   Crossbones+   -  Reputation: 4936


Posted 12 November 2013 - 06:57 AM

In addition to what Bacterius and L. Spiro already said, the time taken by TranslateMessage+DispatchMessage is not deterministic -- wait, that's the wrong wording. It is deterministic, but it is neither obvious nor constant. Not only your window proc, but also the default window proc may take vastly different amounts of time to process messages, and new messages may be generated (as a consequence of messages being received) that you don't even know about yet. In an extreme case, these two harmless lines of code could take 20 times longer than they usually do.

 

Most likely this will not be the cause of your problem, but still it is unwise to wrap a high resolution timer around such a thing (and around Sleep on top of that) and expect something meaningful to happen.

 

About Sleep, this is a valid objection. There is nothing more likely to break your timings (or your frame rate, for that matter) than calling Sleep. On the other hand, power consumption is also a valid reason not to let the CPU busy-wait all the time. However, this is not the case where it matters most anyway, and where it matters less you might even want that exact thing to happen.

 

One usually wants to save power on mobile devices where busy waiting will drain your batteries much faster (assuming you're not plugged in). However, on those devices, vertical sync is usually forced on by the drivers, so there is some built-in rate limit anyway. The game renders, so inevitably, it will be throttled. Of course, in between it will still run busy. But, even on a mobile device this may be a benefit.

 

On desktop computers, being busy is more often a benefit rather than a disadvantage, even if it burns a little more power. Unluckily, operating systems apply the power saving craze just as often when it doesn't make sense as when it makes sense. If you search the web for "game stuttering", you'll come up with "fix" instructions like these (which will more likely than not cause unwary users to break something!) for that exact reason.

 

When a CPU is not utilized, the OS will usually reduce its frequency or even "park" a core altogether, even if you're on a desktop computer with a thick plug in the wall. This is OK when you're half-asleep in front of Microsoft Word at your boring office job. It is, however, not a big win while a game is running. Not at all. Effectively, the CPU runs at one half or one third of its speed when the only thing you really need is... speed.

 

This power saving craze is so extreme that for example on my ASUS notebook, I cannot even get the CPU to run at maximum clock speed. I paid for a CPU that can do 2 GHz, but it will never run any higher than 1.7 GHz, even if I start a program that has 2 threads busy running. On "idle" (i.e. if no program is busy spinning and you don't press a key for about 2 seconds), it runs at 0.58 GHz using a single core. The result is that when you start your web browser or mail program, it takes literally seconds before it's ready (happens "instantly" on desktop). Because hey, we must save power. They don't account for the fact that if everything takes longer, this consumes power too, and your lifetime.

 

Power throttling is exactly the kind of thing that you normally don't want to happen in a game. So while the argument against busy waiting is valid, it is not valid without restrictions. In any case, however, there are better options than Sleep to prevent busy waiting (Bacterius mentioned waitable timers, to name one example).

 

Unfortunately I can't remember the URL now, but there was a funny (well, funny if it doesn't happen to you) site a while ago about a guy who spent big $$$ on a faster server, only to discover that overzealous power throttling made his new server run at less than 50% of its speed because he was only using two out of four cores (those 100% loaded, though), making it effectively slower than the old one.


Edited by samoth, 12 November 2013 - 06:58 AM.


#11 Solid_Spy   Members   -  Reputation: 426


Posted 12 November 2013 - 08:42 PM

How fast does it render when you are not limiting the FPS?

You should use vsync to limit your FPS to 60, otherwise you'll probably see screen tearing.

It renders pretty fast actually, even faster when I remove Sleep(1);


Edited by Solid_Spy, 12 November 2013 - 08:45 PM.


#12 tonemgub   Members   -  Reputation: 1143


Posted 13 November 2013 - 03:31 AM


Then perhaps a WaitableTimer would do the trick

Waitable-timer resolution is the same as  the system timer resolution.


In addition to what Bacterius and L. Spiro already said...

DispatchMessage may take an infinite amount of time if there's a modal window, a window displays a menu, or when processing some of the non-client events for any window.

 

Your concerns about forced power saving are all about laptops/notebooks - they are not founded when it comes to desktops. So why should a simple maze game (for example) need to throttle up my 3.4 GHz quad-core CPU to 100% all the time? There are even AAA games that don't need that much performance. Why should I always have to use the power saving features in Windows 7 to limit the CPU when I'm playing these games (just because I want my CPU to stay as quiet as possible)?

 

Can you take your Asus to 2 GHz by using prime95? A busy-wait loop that does nothing is a good reason for the laptop's CPU to do power saving.



#13 N.I.B.   Members   -  Reputation: 1195


Posted 13 November 2013 - 03:39 AM

 


Then perhaps a WaitableTimer would do the trick

Waitable-timer resolution is the same as  the system timer resolution.

 

 


In addition to what Bacterius and L. Spiro already said...

DispatchMessage may take an infinite amount of time if there's a modal window, a window displays a menu, or when processing some of the non-client events for any window.

 

Your concerns about forced power saving are all about laptops/notebooks - they are not founded when it comes to desktops. So why should a simple maze game (for example) need to throttle up my 3.4 GHz quad-core CPU to 100% all the time? There are even AAA games that don't need that much performance. Why should I always have to use the power saving features in Windows 7 to limit the CPU when I'm playing these games (just because I want my CPU to stay as quiet as possible)?

 

Can you take your Asus to 2 GHz by using prime95? A busy-wait loop that does nothing is a good reason for the laptop's CPU to do power saving.

 

Except, you already have a built-in mechanism to limit your FPS, so there's really no point in manual-limit methods. Plus, you'll never be able to exactly get to 60FPS, and you'll start seeing tearing effects.

 

Yes, busy-wait can be useful at times, but most people tend to use it as a brute-force solution when it isn't needed. And unless you are writing an AAA game that requires a high-end GPU, you can't assume that people will not run your games on laptops.



#14 L. Spiro   Crossbones+   -  Reputation: 14026


Posted 13 November 2013 - 04:18 AM


Waitable-timer resolution is the same as  the system timer resolution.

Do the he said what??

Citation needed.

 

A waitable timer is as accurate as any other event, which is nearly down to the microsecond. Though of course no guarantees.

 

 

L. Spiro


Edited by L. Spiro, 13 November 2013 - 01:43 PM.


#15 wintertime   Members   -  Reputation: 1800


Posted 13 November 2013 - 05:31 AM

I would just dynamically switch to using WaitMessage, GetMessage or similar at times when the game is in pause mode, only showing a menu or has only need for updates in reaction to user input and no animations are running; that should reduce useless burning of cpu time and electricity already.

Maybe also use those timers that send a message at times of low activity. But using them constantly or adding sleeps could, I would think, cause problems with timing, and vsync + measuring time + fixed timesteps for updates independent from drawing may be good enough? Though it could depend on the type of game, which I did not see mentioned by the OP.
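
The fixed-timestep idea mentioned here can be sketched with an integer tick accumulator. This is a minimal sketch and not the approach of any particular engine: `FixedStepClock`, its members, and the catch-up cap of 5 steps are assumptions made for illustration, and in a real Win32 loop the tick values would come from QueryPerformanceCounter and QueryPerformanceFrequency.

```cpp
#include <cassert>
#include <cstdint>

// Accumulates elapsed timer ticks as integers and reports how many
// fixed-size update steps are due. Rendering can then run once per
// loop iteration (paced by vsync) regardless of the step count.
class FixedStepClock
{
public:
    // ticks_per_second: counter frequency; steps_per_second: update rate.
    FixedStepClock(std::int64_t ticks_per_second, std::int64_t steps_per_second)
        : step_ticks_(0), accumulated_(0), last_(0), started_(false)
    {
        assert(ticks_per_second > 0 && steps_per_second > 0);
        step_ticks_ = ticks_per_second / steps_per_second;
        assert(step_ticks_ > 0);
    }

    // Feed the current counter reading; returns the number of update
    // steps to run now. Leftover ticks stay in the accumulator.
    int advance(std::int64_t now, int max_steps = 5)
    {
        if (!started_) {
            last_ = now;
            started_ = true;
            return 0;
        }
        accumulated_ += now - last_;
        last_ = now;

        int steps = 0;
        while (accumulated_ >= step_ticks_ && steps < max_steps) {
            accumulated_ -= step_ticks_;
            ++steps;
        }
        // Drop any backlog beyond the cap so a long stall (for example
        // a modal dialog) does not trigger a burst of updates later.
        if (accumulated_ >= step_ticks_) {
            accumulated_ %= step_ticks_;
        }
        return steps;
    }

private:
    std::int64_t step_ticks_;
    std::int64_t accumulated_;
    std::int64_t last_;
    bool started_;
};
```

A loop would call `advance()` once per iteration with the latest counter reading, run `Update()` that many times, and then render once. Because the remainder is carried over as whole ticks, no time is lost to rounding between frames.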



#16 samoth   Crossbones+   -  Reputation: 4936


Posted 13 November 2013 - 06:42 AM

 


Waitable-timer resolution is the same as  the system timer resolution.

Do the he said what??

Citation needed.

 

A waitable timer is as accurate as any other event, which is nearly down to the microsecond. Though of course no guarantees.

Plus, a waitable timer will cause a waiting thread to be scheduled over its siblings by giving it a priority boost. Sleep simply makes a thread "ready" after the rounded-up time is over. Which means it may get to run some time later. On non-server editions of Windows, this usually means 2 quantums, since that is the default scheduler unit.

 

A waitable timer makes the waiting thread ready and higher priority at the exact time the time is up. Due to the way the Windows scheduler works (serving top-down by priority), this makes a huge difference.

It still does not guarantee that the thread will run immediately, but it guarantees that it will be the first one in its group of similar-base-priority peer threads at the next opportunity. It also means that it may interrupt a peer thread before the assigned number of quantums are over.


Edited by samoth, 13 November 2013 - 06:46 AM.


#17 tonemgub   Members   -  Reputation: 1143


Posted 13 November 2013 - 12:53 PM

 


Waitable-timer resolution is the same as  the system timer resolution.

Do the he said what??

Citation needed.

 

A waitable timer is as accurate as any other event, which is nearly down to the microsecond. Though of course no guarantees.

 

 

L. Spiro

 

Well, I don't think Microsoft broke its terse power-saving rules just to make this specific type of timer more responsive... Then again, I never used it, but I have used and tested (against QPC) most of the other thread-sync events, and they all ran at approximately the system-timer resolution (15ms by default). I then also found some articles that mentioned this (can't remember where), so I don't think I am mistaken about waitable timers, really. But I'd be more than happy to be proved wrong. :)

 

 


A waitable timer makes the waiting thread ready and higher priority at the exact time the time is up.

That doesn't help much, for two reasons:

1) Most games already use the highest process and thread priority available

2) The thread becomes active between intervals of the system timer - you can only hope that it becomes active sooner than later.


Edited by tonemgub, 13 November 2013 - 12:58 PM.


#18 L. Spiro   Crossbones+   -  Reputation: 14026


Posted 13 November 2013 - 01:56 PM


Well, I don't think Microsoft broke its terse power-saving rules just to make this specific type of timer more responsive.

Actually they did:


High-frequency periodic timers keep the processor continually busy, which prevents the system from remaining in a lower power state for any meaningful amount of time. This can have a negative impact on portable computer battery life and scenarios that depend on effective power management, such as large datacenters.

 

 


Then again, I never used it, but I have used and tested (against QPC) most of the other thread-sync events, and they all ran at approximately the system-timer resolution (15ms by default).

That’s basically the point.  Windows would be slow as hell if there was no alternative.  When you call WaitMessage() it internally performs the same kind of waiting, and if it was only as low as the system resolution your mouse would be jittery as hell.  Of course you don’t have to use WaitMessage() in games, but in standard apps it is…standard.

 

 


But I'd be more than happy to be proved wrong.

We use it at work.  It works on a very high resolution.  Try it for yourself.

 

 


1) Most games already use the highest process and thread priority available

They use multiple threads and cores, not necessarily increasing thread priority, and in fact rarely doing so.

It’s one thing to distribute work across threads, it’s another to increase priority and overload a single thread.

 

 

L. Spiro



#19 tonemgub   Members   -  Reputation: 1143


Posted 13 November 2013 - 02:52 PM


MSDN said
High-frequency periodic timers keep the processor continually busy, which prevents the system from remaining in a lower power state for any meaningful amount of time. This can have a negative impact on portable computer battery life and scenarios that depend on effective power management, such as large datacenters.

Exactly my point - so why would they make only the waitable timers high-performance, then? Everyone (mostly game developers) would just start using them, and that would defeat the purpose of that MSDN warning. Not that you can't use timeBeginPeriod(1), of course, but that's what the warning is for in the first place.

 

 


Windows would be slow as hell if there was no alternative. When you call WaitMessage() it internally performs the same kind of waiting, and if it was only as low as the system resolution your mouse would be jittery as hell.

The system timer has nothing to do with the mouse, or any other bus-connected device. Devices have their own interrupt that signal the CPU when a hardware event happens, and it all happens separately from the system timer's programmable interrupt, which only signals the CPU periodically. So, WaitMessage will return when either the timeout expires (as timed by the system-timer), or when a device-driver responds to a hardware interrupt event, and creates an input event from it (and so on)...

 

I still say that the waitable timers are just software timers backed by the system timer - in which case, the only way to increase their resolution would be to increase the system timer resolution.


Edited by tonemgub, 14 November 2013 - 03:42 AM.


#20 Hodgman   Moderators   -  Reputation: 31121


Posted 13 November 2013 - 03:47 PM

I would've guessed the same, so I did some tests :P

On my current PC (Win8), using SetWaitableTimer with an absolute time in the future had a resolution of about 1/10th of a millisecond (way below the scheduling quantum!), but using it with a relative time had a resolution of about 1ms (Chrome is running, so that's probably the scheduling quantum).

 

Periodic/non-periodic and manual-reset/synchronisation made no difference for me -- only whether the due time was absolute or relative.
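
The absolute-versus-relative distinction also matters for frame pacing in general: waiting for a relative interval every frame lets each late wake-up add to the total, while advancing an absolute deadline does not. Below is a minimal portable sketch of the absolute-deadline pattern. It is a stand-in for the idea only and does not call SetWaitableTimer; it uses `std::chrono::steady_clock` and `std::this_thread::sleep_until` instead, and `run_paced` and its parameters are names made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Calls `frame` `count` times, one call per `period`, waiting on an
// absolute deadline that advances by exactly one period per frame.
// Late wake-ups therefore do not accumulate into long-term drift.
// Returns the total elapsed time in microseconds.
long long run_paced(int count, std::chrono::microseconds period,
                    const std::function<void()>& frame)
{
    assert(count >= 0);
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    auto deadline = start;
    for (int i = 0; i < count; ++i) {
        frame();
        deadline += period;                       // absolute target, not "now + period"
        std::this_thread::sleep_until(deadline);  // returns at once if already late
    }
    return std::chrono::duration_cast<std::chrono::microseconds>(clock::now() - start).count();
}
```

With a period of 16667 microseconds this approximates 60 calls per second on average. How precisely each individual wake-up lands still depends on the OS timer and scheduler, which is the point being debated in this thread.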





