The compiler is absolutely 100% free to change the order of reads and writes as long as the final result is the same, and I can promise you this happens on every single compiler I have encountered, for every platform.
This is the relationship between Microsoft®’s extensions and volatile. It guarantees that in the compiled binary the code will be executed in the order specified by your code. [<-- and that's enough]
It does absolutely nothing to prevent the above situation between MOV, CMP, and JMP. It adds no fences at all. If you do not believe me, make your own program and check its disassembly.
x86 CPUs don't reorder memory ops though, and they don't have fence instructions.
Edit: Actually, that's not quite right: "Loads may be reordered with older stores to different locations", and there are the rarely used LFENCE/SFENCE/MFENCE instructions.
Any instruction with the lock prefix acts like a fence, but that prefix is only required when you need to perform an atomic operation on memory that could potentially be modified by two threads simultaneously.
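To make the two-writer case concrete, here is a minimal sketch, assuming a hypothetical shared counter (g_counter, add_n and run_two_writers are made-up names, not from anyone's code in this thread). On x86, compilers typically turn the fetch_add into a lock-prefixed instruction; check the disassembly on your own toolchain to confirm.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared counter that two threads write to at the same time.
std::atomic<int> g_counter{0};

// Performs n atomic increments. Each fetch_add is an indivisible
// read-modify-write, so concurrent increments are never lost.
void add_n(int n) {
    for (int i = 0; i < n; ++i) {
        g_counter.fetch_add(1, std::memory_order_seq_cst);
    }
}

// Runs two writers concurrently and returns the final count.
int run_two_writers(int n) {
    g_counter.store(0);
    std::thread a(add_n, n);
    std::thread b(add_n, n);
    a.join();
    b.join();
    return g_counter.load();
}
```

With a plain int instead of std::atomic<int>, this would be a data race and the total could come up short.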
Slicer's code works on mutual exclusion. Assuming that there's no reordering, there are no variables that are going to be written to by two threads at once -- every variable always has only 1 writer. So, it will work on x86 with just compile-time fences.
It won't be portable to other CPUs unless you add fences, so he really should still be using a std::atomic.
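For reference, a single-writer handoff with std::atomic might look roughly like this. This is only a sketch under assumptions: g_payload, g_ready, producer, consumer and handoff are hypothetical stand-ins, not Slicer's actual code.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-ins for the shared data; each has exactly one writer.
int g_payload = 0;               // plain data, written only by the producer
std::atomic<int> g_ready{0};     // 0 = nothing published, 1 = payload ready

void producer(int value) {
    g_payload = value;                            // ordinary write
    g_ready.store(1, std::memory_order_release);  // publish: the write above cannot sink below this
}

int consumer() {
    // Spin until the producer publishes. The acquire load pairs with the
    // release store, so the payload write is visible once we see 1.
    while (g_ready.load(std::memory_order_acquire) != 1) {
        std::this_thread::yield();
    }
    return g_payload;
}

// Runs one producer thread and consumes on the calling thread.
int handoff(int value) {
    g_payload = 0;
    g_ready.store(0);
    std::thread t(producer, value);
    int seen = consumer();
    t.join();
    return seen;
}
```

The release/acquire pair constrains the compiler everywhere and also makes it emit whatever hardware fences the target CPU needs, which on x86 is usually nothing extra for these two operations.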
Yeah, if his code was more complex and the state variable could have two potential writers, he would need to use CAS instead of an if and assignment.
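As a rough illustration of the CAS version, assuming a hypothetical state variable where 0 means free and 1 means claimed (try_claim is a made-up name):

```cpp
#include <atomic>

// Tries to move `state` from 0 (free) to 1 (claimed) in one atomic step.
// Returns true only for the single thread that wins the race.
bool try_claim(std::atomic<int>& state) {
    int expected = 0;
    // If state == expected, writes 1 and returns true.
    // Otherwise leaves state alone, copies its current value into
    // `expected`, and returns false.
    return state.compare_exchange_strong(expected, 1,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire);
}
```

A separate "if (state == 0) state = 1;" lets two threads both pass the check before either one writes; the compare-exchange closes that window.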
The standard C/C++ volatile is not a proper compile-time fence; it only acts as a fence with regard to other volatiles -- reads/writes to volatiles aren't reordered past each other, but reads/writes to non-volatiles can be reordered past volatile reads/writes. The MSVC extension goes further than this and turns volatile reads/writes into full compile-time fences.
A standard C++ compiler would be able to remove the =1 and =2 lines from this code, but MSVC keeps them, because there's a volatile write between them all.
volatile int g_state_v = 0;
extern int g_state;
...
g_state = 1;
g_state_v++;
g_state = 2;
g_state_v++;
g_state = 3;
printf( "%d %d", g_state, g_state_v );