Followers 0

# Are mutexes really fool-proof?

## 21 posts in this topic

Take a look at the following code. Is it theoreticzally possible for two CPU cores to read the value of Mutex::Locked at the exact same time, both assuming the mutex is unlocked and proceeding to lock the mutex at the same time? If so, what's a good way to work around this problem?

class Mutex
{
bool Locked;
public:
Mutex() { Locked = false; }
void lock() { while (Locked); Locked = true; }
void unlock() { Locked = false; }
};

mutex.lock()
SharedData->operationAlpha();
mutex.unlock();

mutex.lock()
SharedData->operationBeta();
mutex.unlock();

Edited by Nüb?ek Pænus
0

##### Share on other sites

Dont use a bool to verify if the mutex is locked or not, that defeat the purpose of the mutex in the first place, you need to use WaitForSingleObject(), not to mention that this while loop will consume 100% of 1 cpu for nothing instead of sleeping.

I dont think what you are talking about is possible, since that's exactly what mutex are for, avoid data to be accessed at the same time by different threads or processes.

You might also consider using a critical section instead if you don't need to lock between multiples process since they are more lightweight.

Edited by Vortez
2

##### Share on other sites

Dont use a bool to verify if the mutex is locked or not, that defeat the purpose of the mutex in the first place, you need to use WaitForSingleObject(), not to mention that this while loop will consume 100% of 1 cpu for nothing instead of sleeping.

I dont think what you are talking about is possible, since that's exactly what mutex are for, avoid data to be accessed at the same time by different threads or processes.

You might also consider using a critical section instead if you don't need to lock between multiples process since they are more lightweight.

I'm guessing that's what SDL semaphores are for. Silly me. I never understood why I needed them when I could just make my own Mutex class. Silly, silly me.

Guess I'll be looking into that then. Thank you!

0

##### Share on other sites

You might also consider using a critical section instead if you don't need to lock between multiples process since they are more lightweight.

Just a heads-up, talking about critical sections is Windows-only. In Boost::Thread for example a mutex is equivalent to a Windows critical section (it used to be implemented as one but nowadays they use something different that is also extremely lightweight). The more complex mutex is called recursive_mutex (this has nothing to do with interprocess communication, that's another layer of heavyweightedness to add). This kind of mutex can be locked multiple times from the same thread without deadlocking itself.

Edited by BitMaster
0

##### Share on other sites

I'm guessing that's what SDL semaphores are for. Silly me. I never understood why I needed them when I could just make my own Mutex class. Silly, silly me.

No, that's what SDL_mutex is for.

SDL_semaphore is for something else.

1

##### Share on other sites

I see. I've read about critical sections before. Would either of you happen to know what (if any) are the advantages of using the Win32 API with critical sections as opposed to SDL_mutex or Boost? I'm already using SDL for pretty much everything else but I'll try the other options if there's potential for speed.

0

##### Share on other sites

Dont use a bool to verify if the mutex is locked or not, that defeat the purpose of the mutex in the first place, you need to use WaitForSingleObject(), not to mention that this while loop will consume 100% of 1 cpu for nothing instead of sleeping.

This article explains what is going on in the background when you are acquiring a lock: http://en.wikipedia.org/wiki/Test-and-set

The test-and-set-operation has to be done in one operation so that there is no chance for another process to come in between.  When you are testing a bool afterwards, you split the operation in more than one part so that another thread can come in between and disturb, and wreck havok.

That is why you don't use a bool to verify, as Vortez is saying.

1

##### Share on other sites

Multithreading is far more complex than first realized. In the For Beginners forum many people talk about starting out with multithreading, and it is highly discouraged.

Before venturing into the land of locks and mutexes you need to have knowledge not just of the high level programming language, but also a good knowledge of every step of the process, including how the optimizers work, how the operating systems work, how the CPU works, how the processors work, how the hardware caches work, and how the processors maintain synchronization. Any issue at any of those levels can cause subtle but extremely painful bugs in your multithreaded code.

Concepts like memory barriers, memory tearing, memory synchronization and prefetch, atomic operations, and many others all come into play. Failure to understand these can also lead to subtle and extremely painful bugs.

For example, you may run into bugs that only manifest themselves very rarely, and even then only on 6-core processors, or only on machines with specific memory speeds, or in seemingly random times and places that have no discernible patterns.

It is a good thing to learn and it is a valuable skill to have.

Just be aware that the transition from single processing to multiprocessing is one of the largest transitions in most programmers' careers. It is not something you can just try to pick up and suddenly have working correctly; it takes years of careful study and practice to reach a minimal level of competence. Even after you are competent the bugs you inevitably create are some of the most insidious flaws in the software world. Some of the most expensive software rewrites I have seen in the industry were due to multithreading bugs that could not be solved.

Most seasoned veterans (including those who are skilled at multithreading) will generally avoid writing their own multithreaded code, relying on asynchronous calls and external systems rather than go through the pains of writing their own new, untested, and undebugged code. It is a skill that is useful and if you want to pick it up, that is a good thing. Just be prepared for a VERY long and bumpy road ahead.

I don't understand how it can be so complicated. I just need to share an array of data between two threads. It's not working as it should so you're probably right, but it's just an array, wrapped around a class with a mutex that locks before and unlocks after every read or write operation. Why would that not work? I understand what memory tearing is, but how could that happen if the data is protected by a mutex?

I wrote my code so that it works both in threads and in a single thread, except either thread freezes after several seconds for no apparent reason, as you've predicted, with no discernible pattern..

Edited by Nüb?ek Pænus
0

##### Share on other sites

Because your mutex is too simplistic to actually work.

Here are some of the various possible failures it could run into:

Both threads could try locking at the same time, both read it as unlocked, and both write it locked and enter.

The compiler could be clever, and cache locked in a register, resulting in an infinite loop if one tries locking an already locked mutex.

The compiler could inline the lock, and shuffle code around such that part of the block you're trying to protect happens before locking the mutex.

The CPU could speculatively execute past the lock.

etc.

1

##### Share on other sites

Because your mutex is too simplistic to actually work.

Here are some of the various possible failures it could run into:

Both threads could try locking at the same time, both read it as unlocked, and both write it locked and enter.

The compiler could be clever, and cache locked in a register, resulting in an infinite loop if one tries locking an already locked mutex.

The compiler could inline the lock, and shuffle code around such that part of the block you're trying to protect happens before locking the mutex.

The CPU could speculatively execute past the lock.

etc.

Yeah, I got my mutex's broken. I've since replaced it with an SDL mutex. When that didn't work I used Win32 mutexes. Aren't those supposed to be safe and work as expected across threads?

// SDL version
class Mutex
{
SDL_mutex *Lock;

public:
Mutex() { Lock = SDL_CreateMutex(); }
void lock() { SDL_mutexP(Lock); }
void unlock() { SDL_mutexV(Lock); }
};

// Win32 version (betting \$5 SDL wraps around this as well)
class Mutex
{
CRITICAL_SECTION Lock;

public:
Mutex() { InitializeCriticalSectionAndSpinCount(&Lock, 500); }
void lock() { EnterCriticalSection(&Lock); }
void unlock() { LeaveCriticalSection(&Lock); }
};

Edited by Nüb?ek Pænus
0

##### Share on other sites

I've since replaced it with an SDL mutex. When that didn't work I used Win32 mutexes

In what way did it not work?

1

##### Share on other sites

I've since replaced it with an SDL mutex. When that didn't work I used Win32 mutexes

In what way did it not work?

The exact same thing happens with or without a mutex, and with or without compiler optimizations. I'd post some code but it's a large application with tons of classes. That's why I asked a very simple question here. If SDL mutexes are supposed to work as I thought they did, and apparently they do, then my problem's somewhere else.

Edited by Nüb?ek Pænus
0

##### Share on other sites

I'm not familiar with SDL but I suppose their codebase is not the problem here. The problem must be within your code. Which parts access the mutex and the shared memory?

1

##### Share on other sites

I'm not familiar with SDL but I suppose their codebase is not the problem here. The problem must be within your code. Which parts access the mutex and the shared memory?

I never thought it was an issue with SDL, I just wanted to make sure I understood mutexes correctly. I've learned SDL uses a different mechanism than just storing the lock as a bool, like I tried to do.

It's hard to explain without going into much boring detail, so let's assume this is a program that generates video and then displays it.

• Thread A generates the video content.
• Thread B plays it back to the user.
• The shared memory is a large array containing every change that was made to the video, which allows for rewinding.
• Technically there should also be a thread C that takes care of swapping parts of the data with the filesystem since the video can get really big, but let's assume we have infinite memory.

Thread A always runs at full speed, but thread B plays back the video at 10FPS, therefore thread A will be ahead most of the time unless the user hits fast forward and reaches the end.

The reason I need multithreading here should be obvious; there's no reason to hamper thread A with playback/GUI code. Thread A is always pushing data at the END of the array, while thread B is always reading data from the middle of the array. For the sake of testing I've never let thread B get anywhere near the end of the array.

Edited by Nüb?ek Pænus
0

##### Share on other sites

SDL's mutexen definitely work. However, correct thread synchronisation is a tricky thing, it is not hard to make a mistake. One example might be locking/unlocking mutex A in thread A but locking/unlocking mutex B in thread B. Unless you're locking and unlocking the same SDL mutex, you cannot guarantee the behaviour. Another example is accidentally performing an unsynchronised read or write somewhere - again you'll still be at risk of strange behaviour. A more subtle example is performing a synchronised read, releasing the mutex, doing some more work using the value read, then performing a synchronised write. For some algorithms, this isn't problematic, but for others this sort of non-transactional behaviour can introduce incredibly subtle bugs.

The fact that the relative timings of the operations seems to affect the behaviour is a strong indicator that this is some kind of race condition.

Can you not break into the program when it is frozen using your debugger? You mentioned in your [url="http://www.gamedev.net/topic/651500-sdl-threads-very-slow-is-it-my-code-or-is-this-normal/#entry5124795"]other thread[/url] that you're using some kind of updated Dev-C++. While this might be maintained better than the original, my recollections of using that software when I was starting out was that the debugger was barely functional - it might at best give me the correct stack trace when dealing with a segfault.

2

##### Share on other sites

SDL's mutexen definitely work. However, correct thread synchronisation is a tricky thing, it is not hard to make a mistake. One example might be locking/unlocking mutex A in thread A but locking/unlocking mutex B in thread B. Unless you're locking and unlocking the same SDL mutex, you cannot guarantee the behaviour. Another example is accidentally performing an unsynchronised read or write somewhere - again you'll still be at risk of strange behaviour. A more subtle example is performing a synchronised read, releasing the mutex, doing some more work using the value read, then performing a synchronised write. For some algorithms, this isn't problematic, but for others this sort of non-transactional behaviour can introduce incredibly subtle bugs.

The fact that the relative timings of the operations seems to affect the behaviour is a strong indicator that this is some kind of race condition.

Can you not break into the program when it is frozen using your debugger? You mentioned in your other thread that you're using some kind of updated Dev-C++. While this might be maintained better than the original, my recollections of using that software when I was starting out was that the debugger was barely functional - it might at best give me the correct stack trace when dealing with a segfault.

You're right, Dev-C++ has very minimal debugging features. It does catch segfaults and allows proper examination of the stack trace, but you can't stop the program. I'll go read up on race conditions and try to find anything like that in my code, thanks for the tip.

I've ran GDB from a command line and this is what I get at the start of the program, even when running in single thread mode (which by the way works without a hitch, but very slowly):

By the way here's a pretty diagram of how my application should work:

Edited by Nüb?ek Pænus
0

##### Share on other sites

Can you not break into the program when it is frozen using your debugger?

This is by far the easiest way to diagnose hangs in my experience. Surely there's a way to do this in gdb. If not, you can always make a build in Visual Studio to debug it in there. Visual Studio can definitely suspend a deadlocked process to find out where it's stuck.

Hangs in my experience are most often caused when you have two or more mutexes which you acquire in inconsistent orders in different threads. This is sometimes called a "lock hierarchy violation": http://software.intel.com/sites/products/documentation/doclib/iss/2013/inspector/lin/ug_docs/GUID-38B68BDA-257C-4D4E-9AC9-B0B2698AD950.htm

LHVs are similar to but not exactly the same as deadlocks: http://software.intel.com/sites/products/documentation/doclib/iss/2013/inspector/lin/ug_docs/GUID-07E441C6-BBB2-47EC-8C62-81FC0F6FF78E.htm Edited by Nypyren
1

##### Share on other sites

Could you show us the relevant code where you are using the lock in both thread? That would help a lot.

1

##### Share on other sites
can't type anything ;-;

press ctrl-c to send interrupt signal, that will interrupt your program and let you look around. Then you will get a prompt "(gdb)", and you can type in "thread <thread number>" to look at different threads. When you are ready to continue the program, type in "c".

Edited by ultramailman
1

##### Share on other sites

I finally solved this problem. Well, more like worked around it. I made both threads communicate through a socket instead of a shared array. At first I thought this would waste memory but then I realized that thread A doesn't really need to access the array, it simply adds data to the end of it.

Thanks everybody who helped! :D

0

## Create an account

Register a new account