Is POD assignment atomic?

25 comments, last by MaulingMonkey 16 years, 10 months ago
I've really enjoyed the discussion so far. However, I found this nice passage in the Platform SDK docs (under "Interlocked Variable Access"):

Quote:
Simple reads and writes to properly-aligned 32-bit variables are atomic. In other words, when one thread is updating a 32-bit variable, you will not end up with only one portion of the variable updated; all 32 bits are updated in an atomic fashion. However, access is not guaranteed to be synchronized. If two threads are reading and writing from the same variable, you cannot determine if one thread will perform its read operation before the other performs its write operation.


I'm no stranger to synchronization methods and implementation, but low-level CPU behavior is not my expertise. However, this paragraph from the Platform SDK docs seems to contradict some of what I've been reading here, and maybe someone who knows more can enlighten us. In the case of a simple bool to flag a thread termination, the above paragraph seems to indicate that you can never get an "undefined" value. I believe the bool data type is implemented as a 32-bit quantity in MSVC, and assuming it is properly aligned, changing it from true to false or vice versa should be atomic without any special care taken.

As stated above, simple reads and writes, although atomic, do not guarantee sequence. A critical section wouldn't guarantee sequence either, just that only one thread could read or write the value at a time. Given the above paragraph, it seems that on 32-bit Intel chips, a critical section around a simple bool for flagging thread termination is redundant and unnecessary.

Anyway, just curious what you experts think about the above quote and how it relates to the current discussion.
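
For comparison, here is a minimal sketch of the thread-termination flag discussed above, assuming a C++11 or later compiler. It uses std::atomic<bool> so that atomicity and visibility come from the language rather than from a platform document. The names stop_requested and run_worker_until_stopped are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical flag name. std::atomic<bool> makes the read and the write
// atomic and visible across threads without a critical section.
std::atomic<bool> stop_requested{false};

// Starts a worker that spins until the flag is set, then sets the flag
// from the calling thread and waits for the worker to finish.
bool run_worker_until_stopped() {
    stop_requested.store(false);

    std::thread worker([] {
        while (!stop_requested.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // stand-in for real work
        }
    });

    stop_requested.store(true, std::memory_order_release);
    worker.join();

    // The worker can only have exited after observing the flag.
    assert(stop_requested.load());
    return true;
}
```

Calling run_worker_until_stopped() from main returns once the worker has seen the flag. No ordering between the worker's iterations and the store is implied, which matches the "atomic but not synchronized" wording in the quote.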
Quote:Original post by strtok
However, this paragraph from the Platform SDK docs seems to contradict some of what I've been reading here, and maybe someone who knows more can enlighten us. In the case of a simple bool to flag a thread termination, the above paragraph seems to indicate that you can never get an "undefined" value.


Never is a little bit extreme. As I've said above, the C and C++ standards themselves provide no atomicity guarantees, so you must rely on external guarantees provided by external software (compiler, library) or hardware (processor). The Platform SDK provides one such guarantee (on x86 Windows using Microsoft Visual C++) but it does not apply to other hardware, software and compiler combinations unless explicitly stated. For instance, g++ on x86 Windows might make that guarantee, or it might not and perform some optimization that breaks it.

The bottom line is, as always, to seek external guarantees, because there are no internal ones, and then:
  • Thoroughly document and code-assert the existence of these guarantees.
  • Provide an alternate code path to be used when those guarantees are not available.
Assume the worst and hope for the best. That's what it comes down to.
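
One way the two bullet points above could look in code, as a rough sketch assuming C++17 (for std::atomic<T>::is_always_lock_free). The class name SharedValue and the type WidePod are hypothetical. The compile-time constant plays the role of the code-asserted guarantee, and the mutex-based specialization is the alternate code path.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <type_traits>

// Primary template: chosen when the platform guarantees lock-free
// atomic access for T.
template <typename T, bool LockFree = std::atomic<T>::is_always_lock_free>
class SharedValue {
    static_assert(std::is_trivially_copyable<T>::value,
                  "SharedValue is only meant for POD-like types");
public:
    static constexpr bool uses_lock = false;
    void set(T v) { value_.store(v, std::memory_order_release); }
    T get() const { return value_.load(std::memory_order_acquire); }
private:
    std::atomic<T> value_{};
};

// Alternate code path: chosen when the guarantee is absent.
template <typename T>
class SharedValue<T, false> {
public:
    static constexpr bool uses_lock = true;
    void set(T v) { std::lock_guard<std::mutex> g(m_); value_ = v; }
    T get() const { std::lock_guard<std::mutex> g(m_); return value_; }
private:
    mutable std::mutex m_;
    T value_{};
};

// Hypothetical 64-byte POD, too wide for a lock-free atomic on
// mainstream hardware, so it exercises the fallback.
struct WidePod { char bytes[64]; };

// Both paths expose the same interface to the caller.
bool shared_value_round_trip() {
    SharedValue<int> counter;
    counter.set(42);
    assert(counter.get() == 42);

    SharedValue<WidePod> blob;
    WidePod p{};
    p.bytes[0] = 7;
    blob.set(p);
    assert(blob.get().bytes[0] == 7);

    return counter.get() == 42 && blob.get().bytes[0] == 7;
}
```

The caller never needs to know which path was selected. SharedValue<T>::uses_lock is there only so the choice can be logged or asserted.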
Best not to try to skimp on thread safety.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Quote:Original post by strtok
Anyway, just curious what you experts think about the above quote and how it relates to the current discussion.


I think the quote was written before the days when multicore CPUs were the norm.

Sure, in a simple multithreaded environment on a single CPU such operations are atomic but with undefined sequencing. On a modern machine, each CPU has its own cache, and if a thread running on one CPU writes to memory it doesn't mean another thread with its own cache will ever read that value from memory. Not without appropriate cache invalidation. That's where all the memory barriers and interlocking instructions and stuff come into play.
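
As an illustration of where barriers enter the picture, here is a small sketch assuming C++11 atomics. A plain variable is published through an atomic flag with release/acquire ordering. The names payload, ready and publish_and_consume are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain data, published through the flag
std::atomic<bool> ready{false};  // carries the ordering guarantee

// The writer stores the payload, then sets the flag with release ordering.
// The reader waits on the flag with acquire ordering, so once it sees
// ready == true it is also guaranteed to see the payload write.
int publish_and_consume() {
    payload = 0;
    ready.store(false);

    int seen = -1;
    std::thread consumer([&seen] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the writer's store to payload
    });

    payload = 123;
    ready.store(true, std::memory_order_release);
    consumer.join();

    assert(seen == 123);
    return seen;
}
```

Without the release/acquire pair on ready, the compiler or the hardware would be free to reorder the two stores, and the reader could see the flag before the data.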

If you're writing software now, by the time it gets released into the wild you'll only be able to find single-CPU systems in museums and garage sales. Future-proof your software.

Stephen M. Webb
Professional Free Software Developer

bool is typically a byte (definitely on MSVC), and I'm not aware of any CPUs which allow a read halfway through a write to a byte - it's generally impossible since the write happens to all bits simultaneously. On most if not all desktop CPUs the same goes for correctly aligned words too. So you can't get an undefined value unless you do unaligned access (where supported).

The real issue is synchronisation as the quote from MSDN indicates. If you write a byte on one CPU it may not flush from the cache for a significant amount of time. That's typically not what you want when trying to sync up threads.
Quote:Original post by ToohrVyk
The Platform SDK provides one such guarantee (on x86 Windows using Microsoft Visual C++) but it does not apply to other hardware, software and compiler combinations unless explicitly stated. For instance, g++ on x86 Windows might make that guarantee, or it might not and perform some optimization that breaks it.


The platform SDK doesn't assume MSVC. Also the property that write operations are atomic is guaranteed by the x86 processor (as long as it's a 486 or newer), so it works with any compiler and OS combination.

A partial answer to the OP's 64-bit question: Pentium and newer x86 processors guarantee atomic writes to aligned 64-bit memory locations.
Quote:Original post by Jerax
The platform SDK doesn't assume MSVC. Also the property that write operations are atomic is guaranteed by the x86 processor (as long as it's a 486 or newer), so it works with any compiler and OS combination.

How do you know that the optimizer won't alter the code such that it isn't simply a write instruction? As long as there are no guarantees in the actual standard, we can't be sure that it generates the assembly we expect it to. These kinds of bugs will be rare, but also very hard to spot and fix.
Quote:Original post by Jerax
The platform SDK doesn't assume MSVC. Also the property that write operations are atomic is guaranteed by the x86 processor (as long as it's a 486 or newer), so it works with any compiler and OS combination.


I can write a perfectly Standard-abiding C++ compiler which performs non-atomic write operations on an x86 processor (if only by doing two partial atomic writes).

The Platform SDK only discusses the interaction between MSVC and the x86 processor. It has no control or knowledge over what machine code other C++ compilers generate.
I agree that a guarantee should always be used, and that a variable should always be kept from being accessed simultaneously by, for example, two different processors. However, this cache some people have talked about seems strange.

Firstly, I think that in a loop while(var == 0), where var is initialized to 0, none of this matters: the only time var == 0 will be false is if var is changed by some other code, and if that only ever happens when the loop is supposed to exit, nothing can go wrong unless there is the cache problem mentioned.
I still agree this shouldn't be counted on, and a critical section should be used.

However, if there is a problem with cache I don't see how this can be solved with critical sections?
That a critical section is entered doesn't necessarily have to mean that cache is flushed, does it?
If a non-local variable is changed, it should never be kept in the cache. If it is, things will break no matter how much you protect it with critical sections, won't they?
If one thread is constantly doing:
while (!exit) { EnterCriticalSection(); if (globalVar != 0) exit = true; LeaveCriticalSection(); }

and another does
EnterCriticalSection(); globalVar = 1; LeaveCriticalSection();

This will always work, because the variable will never be accessed by both at the same time, but only because globalVar is 'volatile' or whatever it's called. The critical section can't change the 'volatility', can it?
So any cache miss in
while (!exit) { if (globalVar != 0) exit = true; }

with another thread doing
globalVar = 1;

should also be present in the example with critical sections?

Cache can't possibly work that way for volatile variables; they must be guaranteed to be written and read non-cached.

/Erik
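
On the question of whether a lock helps with visibility, here is a minimal sketch assuming C++11, with std::mutex standing in for the Win32 critical section in the pseudocode above. In the C++11 memory model, unlocking a mutex is a release operation and locking it is an acquire operation, so a plain variable that is neither volatile nor atomic is still seen correctly when every access happens under the same lock. The names gate, global_var and wait_for_global_var are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex gate;     // stand-in for the critical section
int global_var = 0;  // deliberately neither volatile nor atomic

// One thread sets global_var under the lock while the caller polls it
// under the same lock. Each lock/unlock pair orders the accesses, so
// the poller is guaranteed to see the new value eventually.
bool wait_for_global_var() {
    {
        std::lock_guard<std::mutex> lock(gate);
        global_var = 0;
    }

    std::thread setter([] {
        std::lock_guard<std::mutex> lock(gate);
        global_var = 1;
    });

    bool exit_loop = false;
    while (!exit_loop) {
        {
            std::lock_guard<std::mutex> lock(gate);
            if (global_var != 0) exit_loop = true;
        }
        std::this_thread::yield();  // give the setter a chance to run
    }

    setter.join();
    assert(global_var == 1);
    return exit_loop;
}
```

The unlocked variant of the loop has no such guarantee in standard C++: it is a data race, and the compiler may hoist the read of global_var out of the loop entirely.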
Quote:Original post by ToohrVyk
Quote:Original post by Jerax
The platform SDK doesn't assume MSVC. Also the property that write operations are atomic is guaranteed by the x86 processor (as long as it's a 486 or newer), so it works with any compiler and OS combination.


I can write a perfectly Standard-abiding C++ compiler which performs non-atomic write operations on an x86 processor (if only by doing two partial atomic writes).

The Platform SDK only discusses the interaction between MSVC and the x86 processor. It has no control or knowledge over what machine code other C++ compilers generate.


You could, but why would you? Besides, with caching, your two writes would likely be combined into a single atomic write in any event.

