
How wait on condition variable works fundamentally


Hey Guys,

 

In a multi-threaded scenario, like a thread pool, if one thread is waiting to be woken up, how does that get implemented? I want an answer deeper than "it's just a wait on a condition variable with wake_one/wake_all calls".

 

In my mind, there must be a 'bool flag' this thread is periodically checking; otherwise, how does this 'sleeping' thread know it's time to wake up? And if it is periodically evaluating a memory address (or CPU register), does that mean using a condition variable is just another 'sleep(time)' call under the hood, with a much finer-grained delta time?

 

Thanks


It's a bit more than a bool flag. A simple bool still carries the risk that while you're checking it, the other thread changes it and you miss that change (this is known as a race condition and is very, very bad).

 

At the lowest level it can be implemented as something called a "spinlock", "semaphore" or "mutex".

 

These are built on what are known as atomic operations, i.e. operations that always complete in their entirety without interruption.

 

Think of threads, and how they switch periodically outside of your control: they're pre-empted by the OS. Well, the core of a spinlock, mutex or semaphore can't be pre-empted mid-operation; the atomic instruction completes as a unit, and the kernel-side parts can temporarily disable pre-emption so they run without being switched away.

 

You can read up more about spinlocks at the lowest level here: http://wiki.osdev.org/Spinlock and here http://wiki.osdev.org/Synchronization_Primitives

 

At their most basic level, beneath the C veneer, the locking primitives in Windows work this way.

 

Note that while the spinning part can live in userland, truly blocking (putting a thread to sleep and waking it later) requires OS support for a mutex or semaphore, and that part is always kernel code. You wouldn't write it yourself unless you're writing an OS, so this is all for curiosity's sake.

 

If you need any more info please let me know!

Edited by Brain


The scheduler can wake it, depending on the features of the scheduler.

Perhaps, but depending on the nature of the locking primitive, waking it might not be the wisest course of action :)

Note that while the spinning part can live in userland, truly blocking (putting a thread to sleep and waking it later) requires OS support for a mutex or semaphore, and that part is always kernel code. You wouldn't write it yourself unless you're writing an OS, so this is all for curiosity's sake.

 

Thanks Brain. I have implemented a spinlock (a kind of busy-waiting lock) with atomics, but what really impressed me about condition_variable is its ability to give up CPU time while being blocked. So from your reply it seems this has to make a kernel call to achieve that, right?

 

The one reason I always try to avoid using a mutex is that it causes a user/kernel mode switch, which is expensive, and I think that's why in some cases spinlock acquire/release can be much, much faster than using a mutex.

 

So when I wanted to implement a thread pool myself, I was thinking of using a spinlock to protect the job queue. But I soon realized that if my job queue is empty, I will saturate the CPU. Then I found that a condition variable can avoid this busy-waiting pretty well; however, how the cv achieves it is a little disappointing to me: first it requires a lock (I guess I can use a spinlock too, right?)... and now it seems its thread wakeup mechanism involves another kernel call...

 

I was thinking of using a hybrid spinlock to get both fast lock acquire/release and no busy-waiting, by calling sleep after around 4000 spins on the spinlock. However, my threads may not be super responsive then (since an incoming job does not wake up a sleeping thread; instead it has to wait for the thread to wake up on its own...).

 

 

However, if I understand your reply correctly, the primitives for a condition variable do need kernel-mode code, right? There is no way to avoid it?

 

 

Even if it is kernel mode, how does the OS achieve it? Does the scheduler maintain something like map<condition_variable, list<threadContext*>>, and once a condition_variable gets modified, the OS finds its corresponding threadContext list and makes context switches accordingly?

 

Thanks

Edited by Mr_Fox


waking it might not be the wisest course of action

 
If its wake condition becomes true then it should probably get a wake, at least in the near future?


 

waking it might not be the wisest course of action

 
If its wake condition becomes true then it should probably get a wake at least in the near future?

 

Yes, but isn't waking it the kernel's job, not the application's? I think I got crossed wires with what you were saying. :)

 


However, if I understand your reply correctly, the primitives for a condition variable do need kernel-mode code, right? There is no way to avoid it?

 

 

Even if it is kernel mode, how does the OS achieve it? Does the scheduler maintain something like map<condition_variable, list<threadContext*>>, and once a condition_variable gets modified, the OS finds its corresponding threadContext list and makes context switches accordingly?

 

The implementation details of how the scheduler switches thread contexts and wakes sleeping threads are OS-dependent. I couldn't say for sure how Windows does it, as I've never delved that deep. I can say I'd probably do something like what you said if I were doing it myself, though.

 

It is recommended to have a sleep in an idle thread rather than busy-waiting. Busy-waiting is more responsive, but it will consume a lot more CPU cycles, starving other threads on the system, and will consume more energy (annoying the greenies) :)


Yes, however isnt waking it the kernel's job, not the application?


If the kernel is doing the scheduling then yes. If it has "waitfor" features then it can check conditions as it goes so the waiters don't have to spin.


You can make a condition variable as perversely convoluted (pthreads) or as simple as you like (homebrew), but basically you need two things:

  • a value, or a kind of flag that shows that some other value is valid
  • something that blocks (futex, NtKeyedEvent, Win32 Event, whatever... usually wrapped with an atomic flag as optimization)

The idea is, one thread wants to know whether another thread is done doing something, so it can access the result.
 
 
The pthreads version of condition variables is outright perverse. You need an hour of study to even be able to follow their logic correctly (that's the case for me at least!). This is for two reasons. First, pthreads still work correctly in designs which are outright unwise, and second it covers many architectures, among them some 25-year old mainframes which had broken libraries and which would behave incorrectly under some conditions.
It however gives guarantees that you arguably don't need, too. For example, with pthreads the writer can be certain that the reader is not reading the data that it is overwriting, and two writers don't get in each other's way, nor do two readers or "readers" that modify the variable. Now... that's all nice and well, but in my opinion, if these things are a concern, then your design is fucked. The normal situation for a cond var is there's exactly one producer and one consumer.
 
The C++ standard library version is middle ground. You are to use a mutex to access the value, which strikes me as somewhat nonsensical, but otherwise it's pretty straightforward. It does cover some exotic edge cases that the simplest solution does not cover, but for "normal" usage, the simplest solution is just good enough.
 
The simplest solution is some value and an "event", i.e. something that lets one thread wait, and another thread release the waiting thread. futex or keyed events, or W32 events, anything. You can make the value a flag and you can access it atomically if that makes you feel better, but it really doesn't improve anything. As soon as you go through signalling the event this is somewhat superfluous, memory order is well-defined, writes are guaranteed to be visible.
 
For performance reasons, the "event" internally usually has an atomic flag that it can check quickly and cheaply, and if that didn't work, it calls an OS-specific blocking function. Usually, they don't spin since that doesn't make a lot of sense here. If a result is not already ready at the time a thread asks for it, you cannot predict how long it will take, but since tasks are known to be lengthy, chances are that it will take considerable time, longer than you want to spin. The CPU time is spent better in computations. (This is much different from a spinlock where you actually want to spin a couple of hundred/thousand times, knowing that lock times are usually very short and in all likelihood spinning will succeed without needing a mode switch.)
 
The workflow for the producer is:

  • (collect socks)
  • Perform work. Write to value, do whatever (you are the sole owner of the value, no concurrency)
  • Signal the event. This is a release barrier.
  • Done.

 

The workflow for the consumer is:

  • Wait on the event, this is an acquire barrier
    • internally, if the flag says "set", this returns immediately
    • otherwise call system specific wait function, block until the OS wakes you
    • check for spurious wake, and either return or wait again
  • Read value (nobody is writing to it any more, no concurrency)
  • Do something with it.
Edited by samoth

The pthreads version of condition variables is outright perverse. You need an hour of study to even be able to follow their logic correctly (that's the case for me at least!). This is for two reasons. First, pthreads still work correctly in designs which are outright unwise, and second it covers many architectures, among them some 25-year old mainframes which had broken libraries and which would behave incorrectly under some conditions. It however gives guarantees that you arguably don't need, too. For example, with pthreads the writer can be certain that the reader is not reading the data that it is overwriting, and two writers don't get in each other's way, nor do two readers or "readers" that modify the variable. Now... that's all nice and well, but in my opinion, if these things are a concern, then your design is fucked. The normal situation for a cond var is there's exactly one producer and one consumer.

 

... somewhere, in an alternate dimension, philosophers are stabbing themselves in the eye with forks... :wacko:

Edited by Brain
