Using a variable in multithreaded environment - volatile, mutex or what?

18 comments, last by maxest 5 years, 6 months ago
14 hours ago, Shaarigan said:

I use that behind the scenes as a platform- and architecture-independent std::atomic replacement for something like spin locking in our professional engine



#if defined(__GNUC__)
#define SpinLock(__lock) { while (sync_lock_test_and_set(&(__lock), 1)) while (lock) {} }
#define SpinUnlock(__lock) { __sync_lock_release(&(__lock)); }
#elif defined(WINDOWS)
#define SpinLock(__lock) { while (InterlockedExchange(&(__lock), 1)) while (__lock) {} }
#define SpinUnlock(__lock) { InterlockedExchange(&(__lock), 0); }
#endif

But everybody should stay aware of exotic platforms that implement their own interlocked functions, like the PSSDK or Switch SDK do

While we're on the topic, Intel recommends that you put a _mm_pause or YieldProcessor (MSVC) call inside the body of any spin loop, which emits a special kind of NOP instruction. The CPU will recognize this pattern, de-pipeline the decoded loop instructions, switch to the HW hyper-thread if the loop keeps spinning, and reduce power consumption until the loop exits :) 

For everyone else: using spin-loops is in general a pretty bad idea for performance. Your compiler's default locks (e.g. a critical section on Win32/MSVC) will use a tried, tested and well tuned spin-lock and then progressively fall back to more heavyweight algorithms if it spins for "too long".
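
A minimal sketch of the spin-loop advice above, written against std::atomic rather than the macros from the quoted post. The names `SpinLock`, `CPU_RELAX` and `spin_lock_demo` are made up for illustration, and the `yield()` fallback for non-x86 targets is an assumption, not something Intel recommends. It shows the same "exchange, then spin on a plain read" shape, with the pause hint inside the inner loop.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  #include <immintrin.h>
  #define CPU_RELAX() _mm_pause()                 // x86 PAUSE hint for spin loops
#else
  #define CPU_RELAX() std::this_thread::yield()   // assumption: generic fallback
#endif

// Hypothetical spin lock type, for illustration only.
class SpinLock {
public:
    void lock() {
        // Try to grab the lock; if that fails, spin on a cheap read
        // until the lock looks free, then try the exchange again.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            while (locked_.load(std::memory_order_relaxed)) {
                CPU_RELAX();
            }
        }
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed) && "unlock without lock");
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Several threads increment a plain int under the lock and the total is returned.
int spin_lock_demo(int threads, int iterations) {
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling `spin_lock_demo(4, 10000)` should give every increment back. For real code, a std::mutex or a critical section is usually the better default, for the reasons given above.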


First of all thanks for all your answers.

Please correct me if I'm drawing bad conclusions from it but I think that in a case where thread 1 is only writing and thread 2 is only reading then the only purpose of mutex lock/unlock is to prevent compiler from reordering instructions and making wrong optimizations. Is that correct? Because precisely what I'm doing with my bool variable is to mark the moment when some work has been done. So I wouldn't really want my compiler to set that variable to true *before* the actual work is done.

If the above is correct then it doesn't really matter if thread 2 reads memory while thread 1 is halfway writing to it, or does it? Assuming thread 2 is waiting for the variable to be true, then if thread 1 is writing (changing variable from false to true) the thread 2 doesn't really care when it reads true eventually, right?

@Shaarigan Could you elaborate a bit more on "If the operation is considered to longer runs then using Mutex is absolutely ok but keep in mind that Mutexes in Windows are protected from OS to prevent self locking and you need to use Semaphore instead"? In what case would semaphore work whereas mutex would not (on Windows)?

 

5 minutes ago, maxest said:

Please correct me if I'm drawing bad conclusions from it but I think that in a case where thread 1 is only writing and thread 2 is only reading then the only purpose of mutex lock/unlock is to prevent compiler from reordering instructions and making wrong optimizations. Is that correct? Because precisely what I'm doing with my bool variable is to mark the moment when some work has been done. So I wouldn't really want my compiler to set that variable to true *before* the actual work is done.

Yep. But not JUST the compiler. Some CPUs will reorder your instructions at runtime (e.g. Intel), some will reorder memory reads and writes (Intel in certain situations), some will do both! Low level ASM/binary instructions are actually basically a high level byte code these days, and modern CPUs will take that instruction stream and dynamically compile it into another set of internal instructions... Optimising on the fly :o

So, you do some work, make sure that the work is actually completed and visible to other CPU cores, then write the boolean. The lock will take care of that platform-specific CPU ordering, cache flushing, RAM visibility nonsense, with compile-time hints, and runtime instructions, if necessary on your current platform. 

In this particular situation, you could also just use a std::atomic instead of a lock+boolean -- they have functions that let you write a value after also performing a memory-fence operation. 

5 minutes ago, maxest said:

If the above is correct then it doesn't really matter if thread 2 reads memory while thread 1 is halfway writing to it, or does it? Assuming thread 2 is waiting for the variable to be true, then if thread 1 is writing (changing variable from false to true) the thread 2 doesn't really care when it reads true eventually, right?

Due to CPUs reordering things, thread 2 might read the work and then the boolean, or thread 1 might write the boolean and then write the work. Either of those situations would cause thread 2 to process old/uninitialized data instead of thread 1's actual work output. Memory fences (internally handled by locking primitives and std::atomic) make sure that this kind of reordering won't occur (both at compile time and at execution time within the CPU). 
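
A small sketch of the "do the work, then set the flag" pattern using std::atomic with release/acquire ordering. The names `Channel`, `payload`, `done`, `producer`, `consumer` and `publish_demo` are invented for this example; `payload` stands in for whatever work thread 1 produces.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared state: plain data plus an atomic completion flag.
struct Channel {
    int payload = 0;                 // the "work", a plain non-atomic value
    std::atomic<bool> done{false};   // the flag thread 2 waits on
};

// Thread 1: write the work first, then publish the flag with release ordering.
// The release store keeps the payload write from moving after the flag write.
void producer(Channel& c) {
    c.payload = 42;
    c.done.store(true, std::memory_order_release);
}

// Thread 2: wait for the flag with acquire ordering, then read the work.
// The acquire load keeps the payload read from moving before the flag read.
int consumer(Channel& c) {
    while (!c.done.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return c.payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int publish_demo() {
    Channel c;
    int seen = -1;
    std::thread reader([&] { seen = consumer(c); });
    std::thread writer([&] { producer(c); });
    writer.join();
    reader.join();
    assert(seen == 42);
    return seen;
}
```

With a plain bool instead of the atomic, nothing would stop the compiler or the CPU from reordering the two writes or the two reads, which is exactly the failure described above.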

@maxest

I had a concrete use case for a self-blocking lock in my ThreadPool/TaskScheduler. Idle threads should put themselves to sleep so they won't burn any CPU time, and I first implemented this with a self-locking mutex. The Debug build worked well, but I got failures in Release.

In the end, mutexes on Windows are designed with some kind of self-locking detection, while semaphores aren't, so that was my remark on this topic

Thanks Hodgman.

@Shaarigan Yeah, I also once used a semaphore instead of a mutex, because on a semaphore Acquire operation the thread yields the CPU. So by saying "you need to use Semaphore instead" you meant it was more optimal this way, not that using a semaphore was correct in that case whereas using a mutex was not?

To achieve what I described above you need(!) to use a semaphore, at least on Windows, because as I wrote, a mutex on Windows is protected against self-locking. So a thread that already holds the lock can't rely on being put to sleep when it acquires the same lock again.

That was my case because each thread holds its own lock in the locked state, and when it becomes idle, instead of burning cycles it acquires the same lock again to deadlock itself and sleep indefinitely. Another process will then check the lock state and fire a release so the idle thread is notified that there is work to do again.

This was my personal usecase for that :)
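
For readers who want to see the "idle thread sleeps until someone releases it" idea in portable form, here is a rough sketch. It is not the ThreadPool code described above: `Semaphore` is a hypothetical helper built from std::mutex and std::condition_variable, and `idle_worker_demo` is an invented test harness. A second `acquire()` on an empty semaphore really does block the calling thread, which is the behaviour a Win32 mutex won't give you when the same thread already owns it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical minimal counting semaphore built from standard parts.
class Semaphore {
public:
    // Adds one permit and wakes one sleeping waiter, if any.
    void release() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Sleeps until a permit is available, then consumes it.
    void acquire() {
        std::unique_lock<std::mutex> guard(mutex_);
        cv_.wait(guard, [this] { return count_ > 0; });
        --count_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_ = 0;
};

// One worker sleeps on the semaphore and handles one "job" per release.
int idle_worker_demo(int jobs) {
    Semaphore work_available;
    int processed = 0;
    std::thread worker([&] {
        for (int i = 0; i < jobs; ++i) {
            work_available.acquire();   // sleeps here while there is no work
            ++processed;
        }
    });
    for (int i = 0; i < jobs; ++i) {
        work_available.release();       // signal: one more job is ready
    }
    worker.join();
    return processed;                   // safe to read after join()
}
```

The permit count doubles as the number of pending jobs, so releases that arrive before the worker goes to sleep are not lost.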

On 10/2/2018 at 4:02 AM, Shaarigan said:

#define SpinLock(__lock) { while (sync_lock_test_and_set(&(__lock), 1)) while (lock) {} } // ERROR: "lock" undefined.



L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Keep the bug, it's a giveaway

@Shaarigan I understand what you mean. I wasn't even aware that you can lock a mutex twice on Windows and that it's basically non-blocking when it happens on the same thread. To me it would be logical for a thread to go to sleep upon the second lock. But I think it would be bad practice to implement self-locking this way. Using a semaphore is way more elegant to me. I also used a semaphore for keeping track of the number of jobs to be processed by a job system. If anyone is interested in taking a look at how to make (a simple) one, or in pinpointing some issues :), check here:
https://github.com/maxest/MaxestFramework/blob/master/src/common/jobs.h
https://github.com/maxest/MaxestFramework/blob/master/src/common/jobs.cpp
 
