Threading question: Why to lock?

davetyler    168
Hopefully a simple question for someone: if I have a multithreaded program written in C, using some pthread implementation, do I need to lock in the following case?
void threaded_function(int shared_variable)
{
  while(1)
  {
    shared_variable++;
    thread_sleep(100); //Implementation defined sleep function taking ms param.
  }
}

void main_function(int shared_variable)
{
  int temp_var;
  
  while(1)
  {
    temp_var = shared_variable;
    thread_sleep(100);
  }
}
In essence, I know that one problem with multithreaded programs is that execution order between threads is undefined. In the above case, however, assume I don't care whether shared_variable has been updated by the time I read it in the main thread, so execution order is not important. Should I lock it? If so, why?

A related question is whether I need to lock in order to read shared data on both threads (rather than one thread reading and one writing, imagine both are reading). The situation might be that a pointer to a structure has been passed in to a child thread: does the structure need a global lock before it is accessed, or does it only need a lock for each individual member of the structure that will actually be shared?

I have a very clear idea in my head of what I am trying to explain, but it may not be 100% obvious in text, so please ask questions if the above doesn't make sense!

Thanks
David

69mij    132
Yes, you should lock.

shared_variable++ isn't a single instruction. It involves fetching the value from memory into a register, applying the operation to the register, and then storing the result back to memory.

Now consider this: shared_variable starts at 0 and you have two threads running the increment.

Imagine thread 1 starts first. After it has fetched the value from memory into its register, its execution is paused, so thread 1's register holds 0. Thread 2 now runs and, let's say, completes the whole cycle, so shared_variable = 1. Thread 1 then resumes, increments its own register by 1 and stores it back, so shared_variable = 1 again!

Result: after both threads have run, shared_variable is 1, not 2. I know it's an extreme case, but it does show that you can never predict what happens behind the scenes or how the scheduler will interleave the threads. Lock critical sections to be safe.
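
A minimal pthread sketch of this lost update (not from the original post; the iteration count is arbitrary). With the mutex the program always prints 2000000; delete the lock/unlock calls and the total routinely comes out short on a multi-core machine:

#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 1000000   /* arbitrary count, just for the sketch */

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread performs ITERATIONS increments. The mutex turns the
   load/add/store of counter++ into one indivisible critical section. */
static void *increment_thread(void *unused)
{
  (void)unused;
  for( int i = 0; i < ITERATIONS; ++i )
  {
    pthread_mutex_lock( &counter_lock );
    counter++;
    pthread_mutex_unlock( &counter_lock );
  }
  return NULL;
}

int main(void)
{
  pthread_t t1, t2;
  pthread_create( &t1, NULL, increment_thread, NULL );
  pthread_create( &t2, NULL, increment_thread, NULL );
  pthread_join( t1, NULL );
  pthread_join( t2, NULL );

  /* Always 2000000 with the mutex; remove the lock/unlock calls and lost
     updates make the total come out short on a multi-core machine. */
  printf( "counter = %ld\n", counter );
  return 0;
}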

davetyler    168
OK, that isn't quite the situation I was attempting to describe in my original post. You're talking about two threads that can both change the same data, and in that case I agree that ++ not being atomic means I would have to lock.

In the original problem I have a single thread that may change the data (thread 1) and one (or more) threads that only read that data.

What I'm curious about is whether not locking can cause any problems in that case.

A game programming example: you have one thread for putting things on the screen and a separate thread for the physics of a ball. The physics thread updates the ball's x,y position and the graphics thread draws it; the graphics thread never needs to modify the x,y position itself.

Evil Steve    2017
So long as access to shared_variable is atomic, you don't need to lock. If the CPU takes more than one instruction to fetch shared_variable, then you could get the situation where the CPU fetches two parts of it across an update (E.g. for a point with X and Y coordinates), which is usually bad.

If shared_variable is a class with multiple members, then you're likely to need locking for the same reason. If it's a plain int, then the worst case is that two threads will see slightly different versions of the variable.
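
A minimal sketch of that multi-member case (not from the original posts; the names and loop counts are illustrative): the writer keeps the invariant x == y, and the reader snapshots both members under the same lock. Remove the lock/unlock calls and the reader can observe a pair assembled from two different updates.

#include <pthread.h>
#include <stdio.h>

struct point { int x, y; };

static struct point ball = { 0, 0 };
static pthread_mutex_t ball_lock = PTHREAD_MUTEX_INITIALIZER;

/* The writer keeps the invariant ball.x == ball.y across every update. */
static void *physics_thread(void *unused)
{
  (void)unused;
  for( int i = 0; i < 1000000; ++i )
  {
    pthread_mutex_lock( &ball_lock );
    ball.x++;
    ball.y++;   /* without the lock, a reader could run between these stores */
    pthread_mutex_unlock( &ball_lock );
  }
  return NULL;
}

/* The reader snapshots both members under the same lock, so it always sees
   a pair produced by a single update. */
static void *render_thread(void *unused)
{
  (void)unused;
  for( int i = 0; i < 1000000; ++i )
  {
    pthread_mutex_lock( &ball_lock );
    struct point snapshot = ball;
    pthread_mutex_unlock( &ball_lock );

    if( snapshot.x != snapshot.y )   /* never fires while the lock is used */
      printf( "torn pair: (%d, %d)\n", snapshot.x, snapshot.y );
  }
  return NULL;
}

int main(void)
{
  pthread_t writer, reader;
  pthread_create( &writer, NULL, physics_thread, NULL );
  pthread_create( &reader, NULL, render_thread, NULL );
  pthread_join( writer, NULL );
  pthread_join( reader, NULL );
  return 0;
}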

69mij    132
Well, I guess in theory if one thread writes and many only read there shouldn't be a need to lock at all. But in your example, with one thread computing the physics of an item while the other threads do other work, wouldn't that cause a kind of "latency" if the physics thread isn't scheduled very regularly (meaning you would see the ball standing still), or something like that?

davetyler    168
Quote:
Original post by Evil Steve
So long as access to shared_variable is atomic, you don't need to lock. If the CPU takes more than one instruction to fetch shared_variable, then you could get the situation where the CPU fetches two parts of it across an update (E.g. for a point with X and Y coordinates), which is usually bad.

If shared_variable is a class with multiple members, then you're likely to need locking for the same reason. If it's a plain int, then the worst case is that two threads will see slightly different versions of the variable.


Ah, I think this is exactly what I was wondering. It seems like a relatively risky proposition to assume that a read will be done in a single instruction, and it sounds like a wonderful way to generate obscure bugs when you decide to change the type of a variable :p.

Thanks

p.s. 69mij - It probably would, but I was just pulling an example out of thin air and that was what came to mind :).

Hodgman    51229
On x86, I believe all (edit: properly aligned?) 32bit (int sized) reads and writes are atomic (they fetch the full 4 bytes in one go).

To be safe, you can (edit: should!) use the InterlockedExchange functions on Windows (or the equivalent on other operating systems).

Quote:
Original post by davetyler
A game programming example would be that you have a single thread for putting things on the screen. Then a separate thread for the physics of a ball. That thread updates the x,y positions of the ball and the graphics thread puts it on the screen. There is no need to update the x,y positions in the graphics thread.
In this case though, obviously (x,y) isn't a single 32bit value so it's not going to be atomic ;) Locking would be the safe bet.



Another problem -- your optimiser!
Your compiler could easily look at your function, and decide to optimise it to this:
void main_function(int& shared_variable)
{
  int temp_var = shared_variable;
  while(1)
  {
    thread_sleep(100);
  }
}



You can hack around this problem by making shared_variable volatile, which tells the compiler not to optimise away redundant memory accesses.

You should be aware though that generally volatile does not make your code thread safe (contrary to popular belief! Look up "volatile considered harmful" for more info). Many CPUs have an in-built optimiser that can do this kind of thing at runtime! Normally you would also be required to include special "memory barrier" instructions to stop this behaviour.
Luckily in your case though, you only need volatile (not memory barriers too), because you don't care about reading outdated data. If you cared about race conditions, you would need a memory barrier (the easiest way to create a memory barrier is to lock/unlock a mutex, but the Windows Interlocked* functions also create a full barrier).
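
Applied to the original polling loop, the volatile-only version looks roughly like this (a sketch, not from the posts; usleep stands in for the implementation-defined thread_sleep, and it relies on there being exactly one writer and on an aligned int being read and written atomically by the hardware, as discussed above):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>   /* usleep(), standing in for the post's thread_sleep() */

/* Assumed sketch: one writer, any number of readers. 'volatile' only stops
   the compiler from caching the load in a register; atomicity of each
   individual read/write is left to the hardware, as discussed above. */
static volatile int shared_variable = 0;

static void *writer_thread(void *unused)
{
  (void)unused;
  while(1)
  {
    shared_variable++;        /* safe in practice only because there is a single writer */
    usleep(100 * 1000);
  }
  return NULL;
}

static void *reader_thread(void *unused)
{
  (void)unused;
  while(1)
  {
    int temp_var = shared_variable;   /* re-read from memory every iteration */
    printf("saw %d\n", temp_var);     /* may lag behind the writer; that's accepted here */
    usleep(100 * 1000);
  }
  return NULL;
}

int main(void)
{
  pthread_t w, r;
  pthread_create( &w, NULL, writer_thread, NULL );
  pthread_create( &r, NULL, reader_thread, NULL );
  pthread_join( w, NULL );   /* never returns; the loops above run forever */
  pthread_join( r, NULL );
  return 0;
}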



Back to the (x,y) example, this is an implementation of a read-write lock using the low-level CPU hardware (instead of high-level locks):
#include <windows.h>   //InterlockedCompareExchange / InterlockedExchange
#include <iostream>

struct Data
{
  volatile LONG owner;//0=CanWrite, 1=Writing, 2=CanRead, 3=Reading (LONG to match the Interlocked* signatures)
  int x, y;
};

void UpdateThread( Data& data )
{
  //if data.owner is 0, set it to 1 (atomically)
  if( InterlockedCompareExchange( &data.owner, 1, 0 ) == 0 )
  {
    //Memory barrier inserted above
    data.x++;
    data.y++;
    //Memory barrier inserted below
    InterlockedExchange( &data.owner, 2 );//set data.owner to 2 (atomically)
  }
}

void RenderThread( Data& data )
{
  //if data.owner is 2, set it to 3 (atomically)
  if( InterlockedCompareExchange( &data.owner, 3, 2 ) == 2 )
  {
    //Memory barrier inserted above
    int drawX = data.x;
    int drawY = data.y;
    //Memory barrier inserted below
    InterlockedExchange( &data.owner, 0 );//set data.owner to 0 (atomically)

    std::cout << drawX << " " << drawY;
  }
}
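
For a pthread-style program on a non-Windows platform, a rough equivalent of the same hand-off can be written with the GCC/Clang __sync builtins, which also perform the compare-and-swap atomically and issue a full memory barrier (a sketch under that compiler assumption, not part of the original post):

#include <stdio.h>

/* Sketch only: GCC/Clang __sync builtins assumed available. Each builtin
   does the compare-and-swap atomically and acts as a full memory barrier. */
struct Data
{
  volatile int owner; /* 0=CanWrite, 1=Writing, 2=CanRead, 3=Reading */
  int x, y;
};

void UpdateThread( struct Data *data )
{
  /* if data->owner is 0, set it to 1 (atomically, with a full barrier) */
  if( __sync_val_compare_and_swap( &data->owner, 0, 1 ) == 0 )
  {
    data->x++;
    data->y++;
    /* publish the new pair: owner 1 -> 2 (atomically, with a full barrier) */
    __sync_val_compare_and_swap( &data->owner, 1, 2 );
  }
}

void RenderThread( struct Data *data )
{
  /* if data->owner is 2, set it to 3 (atomically, with a full barrier) */
  if( __sync_val_compare_and_swap( &data->owner, 2, 3 ) == 2 )
  {
    int drawX = data->x;
    int drawY = data->y;
    /* hand the slot back to the writer: owner 3 -> 0 */
    __sync_val_compare_and_swap( &data->owner, 3, 0 );

    printf( "%d %d\n", drawX, drawY );
  }
}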




the_edd    2109
Quote:
Original post by davetyler

void threaded_function(int shared_variable)
{
  while(1)
  {
    shared_variable++;
    thread_sleep(100); //Implementation defined sleep function taking ms param.
  }
}

void main_function(int shared_variable)
{
  int temp_var;

  while(1)
  {
    temp_var = shared_variable;
    thread_sleep(100);
  }
}



shared_variable isn't shared. Each function gets its own copy. I assume this is a thinko when entering the code?

outRider    852
Quote:
Original post by MaulingMonkey
Quote:
Original post by Hodgman
On x86, I believe all 32bit (int sized) reads and writes are atomic (they fetch the full 4 bytes in one go).

Nope. Consider an unaligned int straddling page boundaries.


A few more cases: straddling cache lines, straddling bus-width boundaries (a 64-bit value on a system with a 32-bit bus, etc.). Although not guaranteed to be atomic, some of these might still be in practice, depending on the memory controller, since it can simply stall the CPU for however many bus cycles are needed to return the result atomically.
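
As a concrete picture of the cache-line case (an illustration only; 64 bytes is an assumed line size and real sizes vary by CPU), placing a 4-byte value at offset 62 of a 64-byte-aligned buffer splits it across two lines:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
  /* Force 64-byte alignment (GCC extension) so a line boundary falls at buf+64. */
  unsigned char buf[128] __attribute__((aligned(64)));
  unsigned char *p = buf + 62;   /* a 4-byte value occupying bytes 62..65 */

  for( int i = 0; i < 4; ++i )
    printf( "byte %d of the value sits in cache line %lu of the buffer\n",
            i, (unsigned long)(((uintptr_t)(p + i) - (uintptr_t)buf) / 64) );

  /* Two of the four bytes are in one line and two in the next, so a single
     4-byte load needs two line fills and loses the atomicity an aligned
     32-bit access would have had. */
  return 0;
}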

davetyler    168
Quote:
Original post by the_edd
Quote:
Original post by davetyler

void threaded_function(int shared_variable)
{
  while(1)
  {
    shared_variable++;
    thread_sleep(100); //Implementation defined sleep function taking ms param.
  }
}

void main_function(int shared_variable)
{
  int temp_var;

  while(1)
  {
    temp_var = shared_variable;
    thread_sleep(100);
  }
}



shared_variable isn't shared. Each function gets its own copy. I assume this is a thinko when entering the code?


Yes it was; I was at work, just typing as I thought. Fortunately I got an answer despite the sloppiness.

the_edd    2109
Quote:
Original post by davetyler
Yes it was; I was at work, just typing as I thought. Fortunately I got an answer despite the sloppiness.


Sure, just making sure there wasn't some deeper misunderstanding :)

MaulingMonkey    1728
Quote:
Original post by outRider
Quote:
Original post by MaulingMonkey
Quote:
Original post by Hodgman
On x86, I believe all 32bit (int sized) reads and writes are atomic (they fetch the full 4 bytes in one go).

Nope. Consider an unaligned int straddling page boundaries.


A few more cases: straddling cache lines, straddling bus-width boundaries (a 64-bit value on a system with a 32-bit bus, etc.). Although not guaranteed to be atomic, some of these might still be in practice, depending on the memory controller, since it can simply stall the CPU for however many bus cycles are needed to return the result atomically.


Aye. Which is another way of saying that the resulting bugs will be very, very difficult to reproduce consistently and track down :/

