Is POD assignment atomic?

25 comments, last by MaulingMonkey 16 years, 9 months ago
Hi. Is assignment of POD types an atomic operation? If one thread writes a 32-bit integer and another thread reads it, could the reader get garbage because the writer had updated only one or two of the four bytes before a thread switch happened? What about 64-bit types such as doubles and long long integers on 32-bit CPUs compared to 64-bit CPUs? Is their assignment atomic on 64-bit architectures?
There are no threads in the C or C++ language. Because of this, the languages themselves provide no guarantee (or even concept) of atomicity for any of their semantics. So, no, POD assignment is not atomic. Not even one-byte assignment is atomic.

However, you can perform atomic assignment using:
  • Your hardware. Your assembly language quite possibly describes which operations are guaranteed to be atomic, and which aren't. For instance, ld.global on a GeForce 8800 GTX is atomic and works for up to 128 bits. Use these. On a specific compiler and hardware C or C++ code may compile to atomic operations, but this is not portable behaviour.
  • Your software. Your threading library certainly has primitives for atomic operations, along with synchronization and mutex primitives to ensure atomicity of sets of operations.
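As a hedged illustration of the "your software" option: the sketch below assumes a C++11 or newer compiler and uses `std::atomic`, which is not mentioned in this thread. The function name `count_torn_reads` and the two bit patterns are hypothetical, chosen only for the demonstration. It shows a writer and a reader sharing a 64-bit value without the reader ever observing a half-written result.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Two bit patterns chosen so that a torn read (half of one value, half of
// the other) would match neither of them.
constexpr std::uint64_t kPatternA = 0x0000000000000000ULL;
constexpr std::uint64_t kPatternB = 0xFFFFFFFFFFFFFFFFULL;

// Hypothetical demo: returns how many torn values the reader observed.
// Because every access goes through std::atomic, the result is 0 on both
// 32-bit and 64-bit targets.
int count_torn_reads(int iterations) {
    assert(iterations >= 0);
    std::atomic<std::uint64_t> shared{kPatternA};
    std::atomic<bool> done{false};
    int torn = 0;  // written only by the reader, read after join()

    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i) {
            shared.store((i % 2 == 0) ? kPatternB : kPatternA);
        }
        done.store(true);
    });

    std::thread reader([&] {
        while (!done.load()) {
            std::uint64_t v = shared.load();
            if (v != kPatternA && v != kPatternB) ++torn;
        }
    });

    writer.join();
    reader.join();
    return torn;
}
```

On a 32-bit target the library may implement the 64-bit atomic with an internal lock rather than a single instruction; `shared.is_lock_free()` reports which case applies. Either way the load and store remain atomic.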
Thanks ToohrVyk.
ToohrVyk is right; however, in practice, assignment of primitive types is atomic on common CPU architectures and compilers, assuming the right conditions are met. The tricky bit is of course those conditions. The standard one is that the data has to be properly aligned.

There's another gotcha for playing fast & loose like this on multi-processor machines. If you don't play the game by the right rules, you can write some memory but the other processor won't detect that you've done so, and thus will continue to return the old value from its cache back to the program. This may or may not be a problem depending on how your program is designed.

When in doubt, use a synchronization primitive. On Windows, critical sections are pretty lightweight. For single values, the Interlocked operations are very lightweight.
-Mike
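A minimal sketch of the "when in doubt, use a synchronization primitive" advice. It assumes C++11 and uses `std::mutex` as a portable stand-in for a Windows critical section; that substitution is mine, not the poster's. The `Pair` struct and `count_inconsistent_reads` are hypothetical names. The point is that a lock makes a group of assignments appear atomic, which no single-word guarantee can do.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical two-field value. Invariant: low == high whenever the lock
// is not held.
struct Pair {
    int low = 0;
    int high = 0;
};

// Returns how many times the reader saw the invariant broken. With both
// sides taking the same mutex, the answer is 0.
int count_inconsistent_reads(int iterations) {
    assert(iterations >= 0);
    Pair shared;
    std::mutex guard;
    int inconsistent = 0;  // written only by the reader, read after join()

    std::thread writer([&] {
        for (int i = 1; i <= iterations; ++i) {
            std::lock_guard<std::mutex> lock(guard);
            shared.low = i;   // the two stores form one critical section
            shared.high = i;
        }
    });

    std::thread reader([&] {
        for (int i = 0; i < iterations; ++i) {
            std::lock_guard<std::mutex> lock(guard);
            if (shared.low != shared.high) ++inconsistent;
        }
    });

    writer.join();
    reader.join();
    return inconsistent;
}
```

For a single integer, an atomic read-modify-write such as `std::atomic<int>::fetch_add` plays roughly the role that the Interlocked functions play on Windows.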
Quote: ToohrVyk is right; however, in practice, assignment of primitive types is atomic on common CPU architectures and compilers, assuming the right conditions are met. The tricky bit is of course those conditions. The standard one is that the data has to be properly aligned.

There's another gotcha for playing fast & loose like this on multi-processor machines. If you don't play the game by the right rules, you can write some memory but the other processor won't detect that you've done so, and thus will continue to return the old value from its cache back to the program. This may or may not be a problem depending on how your program is designed.


Assumption is the mother of all screw-ups.

After all, it's 1970, nobody will be using our air traffic control software in 30 years...

Concurrent programming is annoying and frustrating to debug. When everything goes right. Unless you have a guarantee, either from the language (Java has such guarantees, yet still had certain flaws with concurrency) or from some platform API call, that a particular operation is re-entrant or, in this case, atomic, assume it is not and treat it as such.

As always, synchronization isn't as expensive as it seems at first. If it is a major bottleneck, then that's usually a design fault. But in concurrent applications, you rarely have the luxury of your application crashing or faulting. It'll simply emit an invalid value once every 20 hours, possibly giving you weeks of headaches tracking the issue down.

Just assume the worst from the start.

Also, before forgoing various safety checks, make sure you understand the platform you're working on in detail. Various basic operations will indeed be atomic in practice, but can still cause problems on multi-core systems as mentioned above. It's the same as C++'s undefined behaviour. Some cases will behave the same across several compilers and platforms. But there are no guarantees that they won't break horribly tomorrow.
Quote:Original post by Anon Mike
ToohrVyk is right; however, in practice, assignment of primitive types is atomic on common CPU architectures and compilers

No it isn't.
An assignment might consist of three operations (load into register, update register value, store to memory). The load part might not be necessary if you're *only* doing an assignment, but that still leaves two potential operations. The store part is, on common single-core systems, atomic, yes, but changing the value and then storing it to memory (which is what you're usually doing in an assignment) may not be.

True, the CPU might be able to combine multiple operations into one instruction, and then it *might* again be atomic, but... you don't know.
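The load / update / store split described above can be sketched as follows. This assumes C++11 `std::atomic`, which the thread does not mention, and both function names are hypothetical. The first version performs the load and the store as two separate atomic steps, so another thread can interleave between them and increments can be lost. The second uses one indivisible read-modify-write.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increment as two separate steps: load, then store. Each step is atomic on
// its own, but the pair is not, so updates from other threads can be
// overwritten.
long count_with_load_then_store(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                long seen = counter.load();  // step 1: load
                counter.store(seen + 1);     // step 2: store (may clobber)
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}

// Increment as one atomic read-modify-write: no update can be lost.
long count_with_fetch_add(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                counter.fetch_add(1);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

The `fetch_add` version always returns `threads * per_thread`. The load-then-store version can return anything up to that total, and how far it falls short varies from run to run.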
Quote:Original post by Spoonbender
Quote:Original post by Anon Mike
ToohrVyk is right; however, in practice, assignment of primitive types is atomic on common CPU architectures and compilers

No it isn't.
An assignment might consist of three operations (load into register, update register value, store to memory). The load part might not be necessary if you're *only* doing an assignment, but that still leaves two potential operations. The store part is, on common single-core systems, atomic, yes, but changing the value and then storing it to memory (which is what you're usually doing in an assignment) may not be.

True, the CPU might be able to combine multiple operations into one instruction, and then it *might* again be atomic, but... you don't know.
Worse than that, no significant processor takes memory barriers on loads and stores unless explicitly requested. That means that the multiprocessor caching issue that Anon Mike mentioned comes into play. So any code that assumes that assigning processor words will be thread safe is completely broken.
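To make the barrier point concrete, here is a hedged sketch of explicitly requested ordering, assuming C++11 `std::atomic` with release/acquire memory orders (not something the thread itself uses). `publish_and_consume` is a hypothetical name. The plain `payload` write is only guaranteed to be visible to the consumer because the flag is stored with release semantics and loaded with acquire semantics.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: passes `value` from one thread to another through a
// plain int, using an atomic flag to order the accesses.
int publish_and_consume(int value) {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int received = -1;               // written by the consumer only

    std::thread producer([&] {
        payload = value;                               // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread consumer([&] {
        // The acquire load pairs with the release store above: once it
        // returns true, the write to payload is guaranteed to be visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = payload;
    });

    producer.join();
    consumer.join();
    assert(ready.load());
    return received;
}
```

If `ready` were a plain `bool`, the program would contain a data race, and neither the compiler nor the hardware would be obliged to make the `payload` write visible before the flag.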
Assuming POD structures will be atomic will even break on single proc boxes. OP probably meant "intrinsic" or "primitive". POD has a separate, broader (and occasionally useful) definition -- one which one would do well not to confuse with such terms.
A small expansion to the question:
Say I have a bool that indicates whether a worker thread should stop. The worker thread periodically checks the bool, which is initialized to false at the beginning. Then, at some point, the main thread assigns true to that bool.

Now that wouldn't have to use locks (critical sections), right?
Quote:Original post by DaBono
Now that wouldn't have to use locks (critical sections), right?


In C and C++, the worker thread might read the boolean halfway through the write, resulting in undefined behaviour (because the value is undefined). This kind of approach is unreliable.

Other practical problems (even when you're lucky enough to have hardware with atomic assignment of booleans, and a compiler which uses this assignment model) include caching policies preventing the modification from reaching the worker thread for a few seconds or even minutes.

Lock-free concurrent programming requires atomicity guarantees, typically in the form of an atomic CAS operation. If your concurrent code uses neither locks nor an atomic CAS, the probability is extremely high that it will break in a subtle yet annoying manner.
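A hedged sketch of both ideas, assuming C++11 `std::atomic` (not part of the original discussion) and hypothetical function names. The first function is the stop-flag scenario from the question, with the flag made atomic so the write is both indivisible and guaranteed to become visible to the worker. The second is a small CAS loop, the building block referred to above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Stop-flag sketch: runs a worker until the calling thread asks it to stop.
// Returns the number of loop iterations the worker completed.
long run_worker_until_stopped() {
    std::atomic<bool> stop_requested{false};
    std::atomic<long> iterations{0};

    std::thread worker([&] {
        while (!stop_requested.load()) {
            iterations.fetch_add(1);  // stand-in for real work
        }
    });

    // Wait until the worker has demonstrably started, then request the stop.
    while (iterations.load() == 0) std::this_thread::yield();
    stop_requested.store(true);
    worker.join();

    long total = iterations.load();
    assert(total > 0);
    return total;
}

// CAS sketch: atomically raise `target` to `candidate` if candidate is
// larger, without taking a lock.
void store_max(std::atomic<int>& target, int candidate) {
    int current = target.load();
    // On failure, compare_exchange_weak refreshes `current` with the value
    // actually stored, so the loop re-checks against up-to-date data.
    while (current < candidate &&
           !target.compare_exchange_weak(current, candidate)) {
    }
}
```

For example, starting from `std::atomic<int> m{3}`, calling `store_max(m, 7)` leaves 7 in `m`, and a later `store_max(m, 5)` changes nothing.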

