This synchronization relies on a shared state (LockState). If you access it from both threads, you have to synchronize those accesses. One way is to place hardware memory barriers before and after each access to LockState. This stops the compiler and the processor from reordering instructions and memory accesses across the barrier, and it makes the writes visible to the other thread. The other way is to use an atomic read/write, which locks the memory bus (or, on most modern processors, just the affected cache line) for the duration of the memory access. (This is what a CAS cycle does.) As the code stands, though, the two threads are synchronized only by luck.
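To make the second option concrete, here is a minimal sketch in C++11, assuming LockState can be represented as an integer with two values. The names `SpinLock`, `Unlocked`, `Locked`, and `run_counter_demo` are hypothetical and not taken from the original code. The lock is acquired with a CAS that has acquire ordering and is released with a store that has release ordering, so the atomic operations themselves supply the barriers described above.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared LockState discussed above.
enum LockState : int { Unlocked = 0, Locked = 1 };

class SpinLock {
public:
    void lock() {
        int expected = Unlocked;
        // CAS: atomically swap Unlocked -> Locked. Acquire ordering keeps the
        // critical section from being reordered before the lock is taken.
        while (!state_.compare_exchange_weak(expected, Locked,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed)) {
            expected = Unlocked;        // a failed CAS overwrote `expected`
            std::this_thread::yield();  // let the current owner make progress
        }
    }

    void unlock() {
        // Release ordering publishes every write made inside the critical
        // section before the lock is seen as free by another thread.
        state_.store(Unlocked, std::memory_order_release);
    }

private:
    std::atomic<int> state_{Unlocked};
};

// Starts `threads` workers that each increment a plain (non-atomic) counter
// `iterations` times while holding the lock, then returns the final count.
long run_counter_demo(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling `run_counter_demo(4, 10000)` should return exactly 40000, because every increment happens under the lock. If `state_` were a plain `int` instead, the result would depend on timing, which is the "only with luck" situation described above.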