Shannon Barber

Why is Win32 Sleep still such garbage?


It's actually gotten worse over the years: you can't even sleep for anything close to 1 ms anymore (even with the calls to timeBeginPeriod(1)/timeEndPeriod(1)). You get 2 ms. And if you Sleep(2)... you get 3 ms. Rant aside, I'm curious: is there any genuine reason this API call sucks so hard? With today's multicore supercomputers, reliably getting service every 1 ms ought to be pretty easy to accomplish ><.
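The overshoot is easy to measure. Here's a minimal harness sketch (Python rather than Win32, purely to illustrate the methodology — the numbers on Windows would come from QueryPerformanceCounter around Sleep instead): request a short sleep repeatedly and record how long it actually takes.

```python
import time

def measure_sleep(request_s=0.001, iterations=200):
    """Measure the actual wall-clock duration of short sleep requests."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        time.sleep(request_s)                       # ask for at least request_s
        samples.append(time.perf_counter() - start)  # what we actually got
    return min(samples), sum(samples) / len(samples), max(samples)

lo, avg, hi = measure_sleep()
print(f"min {lo*1000:.3f} ms, avg {avg*1000:.3f} ms, max {hi*1000:.3f} ms")
```

On any general-purpose OS the minimum will be at or above the request, and the average lands somewhere above it depending on the scheduler's tick granularity.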

A comment here suggests to me that the Sleep time can be affected by the time taken to process messages in your message loop (which I never realised previously).

Quote:
Be careful when using Sleep and code that directly or indirectly creates windows. If a thread creates any windows, it must process messages. Message broadcasts are sent to all windows in the system. If you have a thread that uses Sleep with infinite delay, the system will deadlock. Two examples of code that indirectly creates windows are DDE and COM CoInitialize. Therefore, if you have a thread that creates windows, use MsgWaitForMultipleObjects or MsgWaitForMultipleObjectsEx, rather than Sleep.

It's not a matter of the hardware's power; it's a matter of the hardware design (in part) and the OS design (mostly). Windows isn't a real-time OS. Your app has to share the processor with god-knows what other apps written by people you don't even know, let alone control. So rules are in place to prevent monopolization of the processor. The same rules directly imply that you can't always get the processor right now.

Yes, it is very braindead; the same goes for date/time queries etc. (granted, that's a bit OS-specific).
But these sorts of things should have been done correctly right from the start; after all, they're not exactly complicated.

Sleep has never been, and was never designed to be, an accurate method of pausing a thread or process. You have told the system that you wish to nap for AT LEAST n milliseconds. If you need accurate timing then use an accurate timing method (such as thread timers), not a documented inaccurate one.

Quote:
Original post by Shannon Barber
Rant aside, I'm curious: is there any genuine reason this API call sucks so hard?
With today's multicore supercomputers, reliably getting service every 1 ms ought to be pretty easy to accomplish ><.
The sleep function is old. It is older than the Windows OS. It is older than the Windows name. It dates back to the pre-1985 days, along with "yield" and a few other timing functions that most people have forgotten. (Yes, I started writing software as a kid back in 1981.)


The function was designed for when overclocking was flipping the "turbo" switch that bumped the PC up to an amazing 8 MHz, at the cost of possible hardware glitches.

That function was designed for when memory speed was measured in microseconds and kilobyte capacities.

The purpose of the call has not changed over the years. It tells the system that you want to be idle for at least a certain number of milliseconds.

That's all. It doesn't have requirements of "no more than". It doesn't have requirements of "exactly". It basically says 'just ignore me for a while, and get back to me eventually'.





If you want something different, then you simply need to use a different function. There are many of them to choose from.

Quote:
Original post by zedz
Yes, it is very braindead; the same goes for date/time queries etc. (granted, that's a bit OS-specific).
But these sorts of things should have been done correctly right from the start; after all, they're not exactly complicated.


Correct me if I'm wrong, but aren't real-time systems the only OS types that do this "correctly"?

An RTOS has a fairly significant performance disadvantage: normally, you want processes to run uninterrupted for as long as possible before being swapped out.

The only thing timeBeginPeriod(1) does is tell the OS to tell the CPU to stop what it's doing once every millisecond to run a specific piece of code instead; that piece of code is, of course, what hands control of the CPU back to the OS, allows its scheduler to run, etc.

This basically means that when you call

timeBeginPeriod(1);
Sleep(1);
timeEndPeriod(1);

what really happens is that you tell the OS to start forcibly grabbing control at 1 ms intervals, then you tell the OS that you won't need the CPU for AT LEAST one millisecond.

The 1ms grabbing cycle doesn't start when you call sleep, it started before that which means that the first interrupt that happens after you've called sleep will occur before the 1ms sleep timer has expired.

Your process will never be assigned the first full timeslice after the sleep call; thus, when you call Sleep(1) you give up:
1) the remainder of your current timeslice
2) The timeslice following that one. (you can theoretically get assigned the later part of that timeslice if another process gives it up after your sleep timer has expired)

And there is no guarantee you even get the timeslice following the ones you explicitly gave up: the scheduler still has to keep other applications on the system reasonably responsive. (Properly written low-priority background services should be able to handle themselves during the part you explicitly gave up, but other normal applications may take a full slice.)
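The quantization described above can be sketched with a toy simulation (illustrative Python, not Win32): a sleep request issued mid-interval can only be honoured at a tick-interrupt boundary, so a 1 ms request against a 1 ms tick actually sleeps between 1 and 2 ms depending on the phase at which it was issued.

```python
def wake_delay(request_phase_ms, sleep_ms, tick_ms=1.0):
    """Earliest wake time for a sleep issued `request_phase_ms` after a tick,
    if the scheduler can only act on tick-interrupt boundaries."""
    deadline = request_phase_ms + sleep_ms    # when the sleep timer expires
    ticks = -(-deadline // tick_ms)           # first tick boundary at/after it
    return ticks * tick_ms - request_phase_ms # actual time spent asleep

# Sleep(1) with a 1 ms tick: the actual delay depends on the request phase.
delays = [wake_delay(p / 100.0, 1.0) for p in range(1, 100)]
print(min(delays), max(delays))  # ranges from just over 1 ms up to nearly 2 ms
```

This is only the interrupt-granularity floor; as the post explains, contention from other ready threads can then add whole timeslices on top of it.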

As a side note: if you Sleep(0), you essentially give up the current thread's spot in the OS's timeshare for threads, but do not specify a wait time. This should get you the closest possible approximation when used together with timeBeginPeriod()/timeEndPeriod(). You'd have to do your own timing in that case, though, to guarantee minimum granularity.

I found that the best and easiest clocking method is QueryPerformanceCounter().
If you are in a GUI thread and need to handle messages, just insert a
while (PeekMessage...) somewhere.
Yes, it's processor intensive. But if you don't poll, don't expect accuracy.
There is a reason why games use QueryPerformanceCounter().

As far as Sleep() granularity goes, I'd say the performance counter (QPC) is mildly irrelevant, since it doesn't really provide any additional advantage over timeGetTime() in 99% of real-world applications. While it's good for precise timing, yes, one should take a step back and evaluate whether that kind of precision is really needed in the first place. Assuming a granularity of 1 ms with an error of 10 ms (which you might expect from Sleep()), that's 1/1000 to 1/100 of a second, well within a "100 FPS" precision range. Any fluctuation is bound to be unnoticeable in practice. The performance counter is most useful for profiling, for which in many cases it makes more sense and is less cumbersome to use rdtsc directly.

So, really, ask yourself, are you absolutely sure you need that kind of precision? I know Sleep() sucks, but then again - we're only human (so at the end of the day that's what you should be focusing your attention on :) ).

Quote:
Original post by Washu
Sleep has never been, and was never designed to be, an accurate method of pausing a thread or process. You have told the system that you wish to nap for AT LEAST n milliseconds.
Quoted for truth.

Having said that, Sleep works perfectly well with timeBeginPeriod(1) on my system (I tried it a while ago because I wondered myself how well it works). Doing 10,000 Sleep(1) calls takes a little over 10 seconds (4-6 ms more), which I guess is pretty much as good as you can get. To me, it's good enough, anyway.

Quote:
As a side note: if you Sleep(0), you essentially give up the current thread's spot in the OS's timeshare for threads, but do not specify a wait time.
SwitchToThread is the preferred method here, as it has much better behaviour. It will run the first ready thread scheduled for the same core if there is one, and do nothing otherwise (i.e. the next thread starts with warm caches, and the original thread will stay on the same core unless an urgent reason comes up to move it).

Sleep(0) on the other hand just gives up the time slice and keeps the thread ready, but does not provide any other guarantees. It might run any time later, and in theory on any core that happens to be free at that time.

Quote:
Original post by Washu
Sleep has never been, and was never designed to be, an accurate method of pausing a thread or process. You have told the system that you wish to nap for AT LEAST n milliseconds. If you need accurate timing then use an accurate timing method (such as thread timers), not a documented inaccurate one.


Thread timers use a thread pool, correct? I recall them re-entrantly calling the code if it was late.
Is there any other method?
I suppose you could have that code just set an event for the actual work thread... hmm, how reliable are they?

So I tried it...
Thread timers are even worse than sleep!
(Right now we burn up a core spinning hard on QueryPerformanceCounter...)


Here are the numbers trying for a 1000ms timer:

0.001
1.008 1.007
2.022 1.014
3.036 1.014
4.050 1.014
5.064 1.014
6.078 1.014
7.092 1.014
8.106 1.014
9.120 1.014
10.134 1.014

My nekkid eye is about that good.


using System;
using System.Runtime.InteropServices;

namespace TestTimers
{
    class Program
    {
        [DllImport("winmm.dll", EntryPoint = "timeBeginPeriod", SetLastError = true)]
        private static extern uint TimeBeginPeriod(uint uMilliseconds);

        [DllImport("winmm.dll", EntryPoint = "timeEndPeriod", SetLastError = true)]
        private static extern uint TimeEndPeriod(uint uMilliseconds);

        static System.Diagnostics.Stopwatch stopWatch = new System.Diagnostics.Stopwatch();
        static System.Threading.AutoResetEvent signal = new System.Threading.AutoResetEvent(false);

        static void Main(string[] args)
        {
            TimeBeginPeriod(1);
            try
            {
                stopWatch.Start();
                // Fire the callback immediately, then every 1000 ms, for ~10 s.
                using (System.Threading.Timer timer = new System.Threading.Timer(Program.TimerCheck, null, 0, 1000))
                {
                    System.Threading.Thread.Sleep(10200);
                }
            }
            finally
            {
                TimeEndPeriod(1);
            }
        }

        static void TimerCheck(object state)
        {
            decimal time_sec = (decimal)stopWatch.ElapsedTicks / (decimal)System.Diagnostics.Stopwatch.Frequency;
            Console.WriteLine(time_sec.ToString());
        }
    }
}




Anything I'm doing wrong?

My gripe with Sleep and all the other various Windows timing mechanisms is that while they certainly could have gotten much better over the years, they have actually gotten worse... and that just doesn't make sense.
A ten-year-old Pentium II running Windows 2000 hits the times better than an i5 running Windows 7.

Quote:
Original post by Shannon Barber
[...]A ten-year-old Pentium II running Windows 2000 hits the times better than an i5 running Windows 7.
No, it doesn't. If, when given a delay of "at least 1 ms", you delay 1 ms or more, you're hitting the delay perfectly. Thus, if a Pentium II delays 1.0001 ms when given "at least 1 ms" and a modern Core i7 delays 37 minutes when given "at least 1 ms", both machines are functioning perfectly to the specification.

If you want to specify a delay of something other than "at least X", you'll have to use a different mechanism that works how you desire. The most accurate such mechanism would be a real-time OS on purpose-designed hardware, but you can do much better than Sleep on an average PC running Windows. Jan Wassenberg has written several articles on the topic of timing that can be found here on GameDev and elsewhere.

Because people can't wrap their tiny brains around the idea that altering priority levels doesn't make for good multithreaded code. MS takes that to the extreme: by now you can't ever really set priorities in a sane manner, and nothing you do will work like you'd obviously expect it to.

It makes some sense to have signalled threads get a priority boost, but it takes away your ability to see just what on earth is going on, and any minuscule hope of reasoning about your multithreaded code.

I guess it's just typical Microsoft destructive 'helping'.

Quote:
Original post by Extrarius
Quote:
Original post by Shannon Barber
[...]A ten-year-old Pentium II running Windows 2000 hits the times better than an i5 running Windows 7.
No, it doesn't. If, when given a delay of "at least 1 ms", you delay 1 ms or more, you're hitting the delay perfectly. Thus, if a Pentium II delays 1.0001 ms when given "at least 1 ms" and a modern Core i7 delays 37 minutes when given "at least 1 ms", both machines are functioning perfectly to the specification.

If you want to specify a delay of something other than "at least X", you'll have to use a different mechanism that works how you desire. The most accurate such mechanism would be a real-time OS on purpose-designed hardware, but you can do much better than Sleep on an average PC running Windows. Jan Wassenberg has written several articles on the topic of timing that can be found here on GameDev and elsewhere.


It's always real-time, and always a question of performance.
If you called Sleep(10) in your app and it slept for two minutes every time, no one would accept that "it works to spec".
I have a computer that is 40x more powerful yet is 90% less precise.

Quote:
Jan Wassenberg
I've previously written an article about timing, but unfortunately the situation has gotten even worse since then. RDTSC is more broken than before and QPC now uses it instead of only marginally broken alternatives, so that leaves WinXP users on dual-core systems with a total lack of a decent timer.

So we rely on QPC to make the best decision for the hardware and blow a core just to figure out "time to make the donuts".


More to the "meat" of the matter though: there is a new feature in Vista and 7 called "timer coalescing", designed to minimize power consumption, which will intentionally delay timers by up to a default of 32 ms so that many things happen at once, maximizing the time spent in low-power idle.
Presumably this is why the .NET-based timer code is off by 14 ms (because it is using the default of +32 ms).
When I have a chance I'll try the newer SetWaitableTimerEx API and see how low I can set the coalescing parameter.

What do you mean by QPC "burning up a core"?

I have a similar problem: I want a timer that fires once per second, and until then the program sleeps (so the CPU won't burn at 100%). But I want to do it precisely: it's the time display in a minesweeper game, which is event-based (driven by Windows messages), so it only does stuff when it gets a window message (okay, it's obvious, I'm over-explaining it because I don't know the terms).

So how could I achieve this?
Edit: Maybe with Sleep() combined with QPC to correct Sleep's error... Nope.
How precise is the SetTimer/WM_TIMER stuff? That seems better.

I've looked at Jan Wassenberg's article; maybe I'll just leave it as imprecise as it is.

[Edited by - szecs on February 15, 2010 2:56:31 PM]

Quote:

If you called Sleep(10) in your app and it slept for two minutes every time, no one would accept that "it works to spec".

Sure they would! Suppose I'm running a really low-priority service, and I need to do stuff, then go idle. Sleep(10) could take minutes if there is a high-priority task eating up 100% of the CPU. I already said I'm low priority, and I said I don't want to work more than once every 10 ms. So the wait time is "to spec", because the user sees that their high-priority task (like video encoding or something) isn't being interrupted by all the hundreds of random threads on the machine. Waiting on a device is even better: my service will go to sleep until some random time in the future when there is data for it.

Windows only guarantees that I will eventually get time, and will eventually get my data if it exists. It isn't a real-time OS, so it doesn't guarantee it will do any of that in a timely manner.

Quote:

I have a computer that is 40x more powerful yet is 90% less precise.

Your computer is also 40x as powerful, but running 600x as many random tasks. Look back at DOS: what did it have running? Nothing. command.com, that was it, and it was waiting for you to type things. You ran another app? Yeah, it got 100% of the CPU to itself, so it could be 100% accurate. You have to remember you are sharing the computer. If you don't have a very good computer (like my old dual core), try running a high-def 1920x1200 movie and doing anything else. The movie player will be high priority, and everything else takes a back seat. Opening something like Firefox (usually a second or less) can take 15 seconds to a minute, but your video keeps playing along nicely.

Quote:

I have a similar problem: I want to have a timer that sleeps for one second, and until that, the program sleeps (So the CPU won't burn on 100%). But I want to do it precisely: it's a time display in a minesweeper game,

A second is a long time. You can use QPC to get a better guess at the time spent between your 1second event timers, but a person is going to have a hard time noticing a 15-30ms delay in a 1sec timer when they are focusing on clicking on mines.

Quote:

14ms

IIRC the default timeslice in Windows XP/Vista is 15 ms. That might also explain that number.

Quote:
A second is a long time. You can use QPC to get a better guess at the time spent between your 1second event timers, but a person is going to have a hard time noticing a 15-30ms delay in a 1sec timer when they are focusing on clicking on mines.
I want to add best times listing feature, so precision is essential :P

Quote:
Original post by szecs
Quote:
A second is a long time. You can use QPC to get a better guess at the time spent between your 1second event timers, but a person is going to have a hard time noticing a 15-30ms delay in a 1sec timer when they are focusing on clicking on mines.
I want to add best times listing feature
Don't use the accumulated event timer callbacks to determine the overall time - accumulation always magnifies errors, so in this case your error will increase every second.

Instead, take a begin and end time and subtract, so that you only have the error inherent in two samples.
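A toy calculation (illustrative Python, not Win32) makes the point concrete: if each nominal 1000 ms tick actually takes 1014 ms (as in the measurements above), an elapsed-time estimate built by accumulating ticks drifts further from the truth every second, while subtracting two endpoint samples carries only the fixed error of two reads.

```python
NOMINAL_MS = 1000  # what we ask the timer for
ACTUAL_MS = 1014   # what each tick really takes (cf. the measurements above)

def accumulated_error_ms(ticks):
    """Error if elapsed time is estimated as ticks * NOMINAL_MS."""
    true_elapsed = ticks * ACTUAL_MS
    estimate = ticks * NOMINAL_MS
    return true_elapsed - estimate

# The error grows linearly with every tick...
print([accumulated_error_ms(n) for n in (1, 10, 100)])  # [14, 140, 1400]
# ...whereas (end_sample - begin_sample) is off only by the jitter of the two
# reads themselves, no matter how long the run was.
```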
Quote:
so precision is essential :P
How precise, exactly? I can't detect a difference in my winning click of less than about 1/4 second, so it doesn't seem reasonable to apply greater precision to the best times list.

Quote:
Original post by Shannon Barber
[...]I have a computer that is 40x more powerful yet is 90% less precise.[...]
Your more powerful computer has more powerful hardware that can better perform many tasks, including more precise timers. If you want to take advantage of that increased functionality, you can't use functions created long ago with specifications based on the hardware of the time.

I'd think it very reasonable for a win32 Sleep() call to put the sleeping thread at the end of the scheduling queue (or, 'worse', put it at the end of the queue after the sleep duration has elapsed) since it indicated it wanted to stop processing for at least X amount of time. That could lead to any amount of delay if there were enough other equal-priority threads ready to run.

Is that quote from Jan Wassenberg something you found on the forums, or was it a reply to a question you asked him directly? You might want to message/email him if you haven't; last time I talked to him, he had an extended, updated, and refined version of his article that had not yet been posted publicly on the internet.

[Edited by - Extrarius on February 15, 2010 4:07:39 PM]

Quote:
so precision is essential :P
I'm not serious (at least I tried not to be).

Anyway, I will go the WM_TIMER way. Since it's based on milliseconds and I can measure absolute elapsed time, I can adjust the timer value every time to keep it as precise as possible.

I mean, how else would you solve it in an event-based application (please tell me the correct term for it)?
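The adjustment described can be sketched generically (illustrative Python, not WM_TIMER itself): anchor each tick's deadline to the absolute start time rather than to the previous tick, so per-tick lateness is corrected on the next request instead of accumulating.

```python
def next_delay_ms(start_ms, now_ms, period_ms=1000):
    """Delay to request for the next tick, anchored to the absolute start time.

    Each deadline is start + n * period, so even if an individual tick fires
    late, the lateness is subtracted from the next request rather than piling up.
    """
    elapsed = now_ms - start_ms
    ticks_done = elapsed // period_ms
    next_deadline = start_ms + (ticks_done + 1) * period_ms
    return next_deadline - now_ms

# If the last tick fired 14 ms late (now = 1014 ms after start), we request only
# 986 ms so the next tick lands back on the 2000 ms boundary.
print(next_delay_ms(0, 1014))  # 986
```

In the WM_TIMER case, the equivalent would be killing and re-setting the timer with this corrected interval on each tick, using an absolute clock for `now`.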

Quote:
Original post by Shannon Barber
It's always real-time and always a question of performance.


I don't think you understand what they meant by real-time. Real-time, in computer-programming terms, means that you can make guarantees about how frequently, and for how long, your processes are scheduled. There is such a thing as a real-time OS, where if you programmatically say "I want to wait for 50 ms before running again," the OS can guarantee that. Obligatory Wikipedia link.

Windows is not a real-time OS, and it's not designed to be.

Quote:
Original post by Rycross
Quote:
Original post by Shannon Barber
It's always real-time and always a question of performance.


I don't think you understand what they meant by real-time. Real-time, in computer-programming terms, means that you can make guarantees about how frequently, and for how long, your processes are scheduled. There is such a thing as a real-time OS, where if you programmatically say "I want to wait for 50 ms before running again," the OS can guarantee that. Obligatory Wikipedia link.

Windows is not a real-time OS, and it's not designed to be.


I understand your sentiment, but I'm arguing a higher point: >everything< is a real-time task, because everything has a deadline; it is just a question of the performance needed.

If you clicked on 'Print' and it took 4 weeks for it to come out of the printer no one would accept "Well, Windows is not a real-time operating system" as an 'answer'.

When you dig into RTOSes, it's all about their ability to minimize jitter, and few jobs are truly "hard real-time".
Most tasks can tolerate /some/ jitter; these are the so-called "soft real-time" tasks.
There is no good excuse for Windows not being able to achieve soft real-time. You literally cannot play music or watch videos without this capability, and to that end Windows provides a multimedia API that includes functions to change how quickly it ticks tasks (down to a supposed 1 ms; *in the past* you could sleep for ~900 us, which worked "good enough", but now you get ~1500 us, which no longer is).

[Edited by - Shannon Barber on February 17, 2010 3:45:49 PM]

Quote:
Original post by KulSeran
Quote:

If you called Sleep(10) in your app and it slept for two minutes every time, no one would accept that "it works to spec".

Sure they would! Suppose I'm running a really low-priority service, and I need to do stuff, then go idle. Sleep(10) could take minutes if there is a high-priority task eating up 100% of the CPU. I already said I'm low priority, and I said I don't want to work more than once every 10 ms. So the wait time is "to spec", because the user sees that their high-priority task (like video encoding or something) isn't being interrupted by all the hundreds of random threads on the machine. Waiting on a device is even better: my service will go to sleep until some random time in the future when there is data for it.

Windows only guarantees that I will eventually get time, and will eventually get my data if it exists. It isn't a real-time OS, so it doesn't guarantee it will do any of that in a timely manner.

Quote:

I have a computer that is 40x more powerful yet is 90% less precise.

Your computer is also 40x as powerful, but running 600x as many random tasks. Look back at DOS: what did it have running? Nothing. command.com, that was it, and it was waiting for you to type things. You ran another app? Yeah, it got 100% of the CPU to itself, so it could be 100% accurate. You have to remember you are sharing the computer. If you don't have a very good computer (like my old dual core), try running a high-def 1920x1200 movie and doing anything else. The movie player will be high priority, and everything else takes a back seat. Opening something like Firefox (usually a second or less) can take 15 seconds to a minute, but your video keeps playing along nicely.

It's not running 600x more tasks; I doubt it's even 10x more. We strip out all the unneeded services, the ones left are all idle and sleeping most of the time, and we have 2/4/8 cores now, so it's less likely that all the CPUs are 'too busy'; and even if they are, we are the highest-priority task on the system.
All of our processes and threads (where this timing matters) are set to real-time process priority and real-time thread priority where they are trying for 1 ms updates.
The computer hardware can handle the task we are asking of it; it's the OS that is in the way.

I did the 1-second timer just to drive home how crappy they are: off by 14 ms (1.4%) with a virtual eternity between ticks.

