[C++] Volatile variables

27 comments, last by Shannon Barber 15 years, 10 months ago
Quote:Original post by Red Ant
However, except for the most trivial cases, I'd recommend using proper synchronization constructs instead (CRITICAL_SECTION, boost::mutex or whatever), in which case volatile isn't necessary any more anyway.

Using actual synchronization constructs doesn't free you from declaring shared variables volatile. If you try to reference shared variables without volatile from multiple threads, the compiler may still try to use cached values.
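
As a minimal sketch of the kind of caching I mean (my own example; sharedFlag is hypothetical):

    extern int sharedFlag;  // written by another thread; defined elsewhere

    bool changedMeanwhile() {
        int before = sharedFlag;
        // ... work that never writes sharedFlag ...
        int after = sharedFlag;   // the compiler may reuse 'before' here
        return before != after;   // so this can legally fold to 'return false'
    }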
Quote:Original post by SiCrane
Quote:Original post by Red Ant
However, except for the most trivial cases, I'd recommend using proper synchronization constructs instead (CRITICAL_SECTION, boost::mutex or whatever), in which case volatile isn't necessary any more anyway.

Using actual synchronization constructs doesn't free you from declaring shared variables volatile. If you try to reference shared variables without volatile from multiple threads, the compiler may still try to use cached values.


Does this mean one should indiscriminately mark all shared variables volatile?
Quote:Original post by Red Ant
As I've recently found out in another discussion about volatile (also on this forum), in Visual C++ 2003 or newer, making an integral variable volatile means you can safely access it from several threads without having to worry that its value might get corrupted. Check this Visual C++ 2005 description of the volatile keyword: http://msdn.microsoft.com/en-us/library/12a04hfd(VS.80).aspx

Quote:
[...] This allows volatile objects to be used for memory locks and releases in multithreaded applications.



What they are describing has always been the case - x86 processors guarantee that a single read or write of a word-sized variable or smaller will always be atomic. Note: this applies regardless of the compiler.

What they are not saying is that you don't need to synchronize access. You can still easily get into situations where thread A reads the value, then thread B writes it, then thread A acts on the previously read value without seeing the change, thus effectively corrupting program state.
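
A minimal illustration of that lost-update pattern (my own sketch, using C++11's std::thread purely for brevity; it postdates this discussion):

    #include <iostream>
    #include <thread>

    int counter = 0;  // word-sized, so each individual load or store is atomic

    void incrementMany() {
        for (int i = 0; i < 100000; ++i)
            ++counter;  // read, modify, write: three steps, not atomic as a whole
    }

    int main() {
        std::thread a(incrementMany);
        std::thread b(incrementMany);
        a.join();
        b.join();
        // Typically prints less than 200000: each thread sometimes acts on a
        // value it read before the other thread's write landed.
        std::cout << counter << '\n';
    }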
Quote:Original post by Jerax
What they are describing has always been the case - x86 processors guarantee that a single read or write of a word-sized variable or smaller will always be atomic. Note: this applies regardless of the compiler.

What they are not saying is that you don't need to synchronize access. You can still easily get into situations where thread A reads the value, then thread B writes it, then thread A acts on the previously read value without seeing the change, thus effectively corrupting program state.

This is exactly what I tried to explain above. Just because a single read or write CPU instruction is atomic doesn't mean the entire block of source code is atomic.

This particular block of code:

    while (lockVariable);
    lockVariable = true;

Let's consider the simple case of this being the only program running on the PC. No operating system, no threads, no nothing.

The variable is first accessed and loaded into the CPU. Many cycles pass while the request is sent over the bus, serviced by memory, and the value finally loaded into the CPU (processed as an atomic operation, so the integer value doesn't tear). That value is used in a comparison, which also takes a cycle or two. Next is a conditional jump instruction, which takes a moment. Finally we write to the variable. After the write, we must wait around for the value to exit the pipeline and be written out to main memory through the very slow bus (again, written as an atomic operation, so the integer value doesn't tear).

That's a very big round trip. Sure, there is no word tearing, because the integer operations themselves are atomic, but there is plenty of time for somebody else to grab the value, because the code as a whole is not atomic.

When you add in the detail of an operating system that can interrupt processing at any point, and the detail that concurrent processors might be running the code at precisely the same instant, and the detail that even when running on a single processor the instructions between concurrent threads can be intermixed, you have a recipe for a hard-to-find bug.


If you need any values transmitted between threads, you must either lock access to them through a mutex or use a method of IPC such as semaphores, pipes, sockets, a quality existing library written by experts, or even the OS's message queue. All of these methods can guarantee that your data is properly shared between threads.

Anything less than this kind of true IPC is not reliable, and will be a source of nightmarish bugs later on.
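
For instance, a minimal sketch of guarding a shared value with a mutex (my example; std::mutex from C++11 is used for brevity, but the boost::mutex or CRITICAL_SECTION mentioned earlier work the same way):

    #include <mutex>

    std::mutex stateMutex;  // protects sharedValue
    int sharedValue = 0;

    void setShared(int v) {
        std::lock_guard<std::mutex> lock(stateMutex);  // acquire
        sharedValue = v;
    }  // released on scope exit; the write is visible to the next acquirer

    int getShared() {
        std::lock_guard<std::mutex> lock(stateMutex);  // acquire before reading
        return sharedValue;
    }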
Quote:Original post by frob

This is exactly what I tried to explain above. Just because a single read or write CPU instruction is atomic doesn't mean the entire block of source code is atomic.


If volatile is respected by the compiler, it will organize the code in such a way as to prevent this type of failure.

But!

volatile is not a concurrency primitive. It's merely support for developing those. Up until the increased popularity of multi-core chipsets, these kinds of visibility problems rarely surfaced in practice.

Quote:Anything less than this kind of true IPC is not reliable, and will be a source of nightmarish bugs later on.


volatile is there to keep the compiler from being too smart:

    bool running = true;
    while (running) {}

In this example, the compiler is perfectly free to not even allocate running. It's never changed, and all we have is an infinite loop.

Marking running volatile will tell the compiler that this variable may be modified by another thread, and that it should be pessimistic about optimization.
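
A sketch of the fixed loop (my example):

    volatile bool running = true;  // the compiler must emit a fresh load each iteration

    void workerLoop() {
        while (running) {
            // do work; another thread clearing 'running' will eventually be seen
        }
    }

    void stop() { running = false; }  // called from another thread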

So in order to stay away from obscure and inexplicable bugs (like the one above), cautious and appropriate use of volatile is perfectly warranted.

In general, the best concurrency is obtained by replicating state, avoiding any kind of locks altogether. But there are several edge cases (lock-less algorithms, shared data where strict consistency isn't important) where volatile is either required or useful. They really are edge cases, though.

I digress though, someone should run the following test:
- A thread reads a timer (1000 Hz), storing the last result into an int-sized variable
- Multiple consumers read this shared variable (1..50 kHz, at variable or fixed rates)

1) shared variable is marked volatile
2) shared variable is accessed through some synchronization primitive
3) shared variable is plain old int

Choice 3 is probably the worst, since the compiler might optimize the access away. As for 1 and 2, I have no immediate rule of thumb for which would be faster or more light-weight. On one hand, under 1) the access would be unoptimized (whatever that means in the context), whereas under 2) it would require a mutex, which is generally considered heavy-weight, especially for this type of fine-grained access.
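
A rough harness for variants 1 and 2 might look like this (my own sketch, not a rigorous benchmark; it simulates the 1000 Hz producer with a sleep and assumes C++11 <thread> and <chrono>):

    #include <chrono>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    volatile int sharedVolatile = 0;  // variant 1: volatile, no lock
    int sharedLocked = 0;             // variant 2: plain int behind a mutex
    std::mutex lockedMutex;
    volatile bool done = false;       // stop flag

    void producer() {
        for (int tick = 0; !done; ++tick) {
            sharedVolatile = tick;  // variant 1: unsynchronized store
            {
                std::lock_guard<std::mutex> g(lockedMutex);
                sharedLocked = tick;  // variant 2: synchronized store
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // ~1000 Hz
        }
    }

    void consumer() {
        long long checksum = 0;
        while (!done) {
            checksum += sharedVolatile;  // variant 1: unsynchronized load
            std::lock_guard<std::mutex> g(lockedMutex);
            checksum += sharedLocked;    // variant 2: synchronized load
        }
        std::printf("checksum %lld\n", checksum);  // keep the reads observable
    }

    int main() {
        std::thread p(producer);
        std::vector<std::thread> consumers;
        for (int i = 0; i < 4; ++i)
            consumers.emplace_back(consumer);
        std::this_thread::sleep_for(std::chrono::seconds(2));
        done = true;
        p.join();
        for (std::thread& c : consumers) c.join();
    }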
Quote:Original post by fpsgamer
Does this mean one should indiscriminately mark all shared variables volatile?


Well, if you're limiting it to shared variables, it's not indiscriminate, is it? But in any case, no, you don't need to do that. You can also access objects via volatile-qualified pointers.
Quote:Original post by Red Ant
As I've recently found out in another discussion about volatile (also on this forum), in Visual C++ 2003 or newer, making an integral variable volatile means you can safely access it from several threads without having to worry that its value might get corrupted. Check this Visual C++ 2005 description of the volatile keyword: http://msdn.microsoft.com/en-us/library/12a04hfd(VS.80).aspx

As said above, that is true only if the access in question is *already* atomic (32-bit reads and writes typically are; my homemade object which just so happens to have a size of, say, 24 bytes won't be).
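
A sketch of that tearing case (my example; Vec3d is a hypothetical 24-byte object):

    struct Vec3d {          // 3 * 8 = 24 bytes: can't be moved in one bus transaction
        double x, y, z;
    };

    volatile Vec3d shared;  // volatile does NOT make this atomic

    void writer(const Vec3d& v) {
        shared.x = v.x;     // a concurrent reader may run here and see the new x...
        shared.y = v.y;     // ...but the old y and z: a torn value
        shared.z = v.z;
    }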

Quote:bool running = true;

while (running) {}

In this example, the compiler is perfectly free to not even allocate running. It's never changed, and all we have is an infinite loop.

Marking running volatile will tell the compiler that this variable may be modified by another thread, and that it should be pessimistic about optimization.

Couldn't we just give running external linkage? Then it has to be allocated, because the linker has to be able to refer to it. But we'd still allow it to be cached on the CPU (which volatile doesn't allow?).

I'll admit I've never been too clear on *all* the subtleties of volatile. I've seen some pretty knowledgeable people claim it is *completely* useless, and others say it has some justification when used carefully in combination with "real" synchronization primitives. There's an awful lot of superstition about what it does and does not do.
Quote:Original post by SiCrane
Quote:Original post by fpsgamer
Does this mean one should indiscriminately mark all shared variables volatile?


Well, if you're limiting it to shared variables, it's not indiscriminate, is it? But in any case, no, you don't need to do that. You can also access objects via volatile-qualified pointers.


I just finished reading this article on DDJ regarding the volatile keyword, and I have to say it cleared up a lot of my confusion. (Spoonbender, you may care to read it too, it has some nifty tricks)

Among other things, it stated that inside critical sections it is not necessary to use the volatile keyword.

This means that inside critical sections values may still be cached in registers. Do native OS threading constructs automagically synchronize registers with main memory when you exit critical sections?
Quote:Original post by fpsgamer
Do native OS threading constructs automagically synchronize registers with main memory when you exit critical sections?
AFAIK they do, by use of a memory barrier, which forces all preceding memory accesses to commit before any later ones. This can have the side effect of causing pipeline stalls across multiple cores, though.
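
For a concrete picture, a sketch of that publish-with-a-barrier pattern (my example, using C++11's std::atomic_thread_fence, which wasn't standard at the time of this thread):

    #include <atomic>

    int payload;                     // ordinary shared data
    std::atomic<bool> ready(false);

    void publish(int value) {
        payload = value;                                      // 1. write the data
        std::atomic_thread_fence(std::memory_order_release);  // 2. commit prior writes
        ready.store(true, std::memory_order_relaxed);         // 3. then raise the flag
    }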
Quote:Original post by fpsgamer
This means that inside critical sections values may still be cached in registers. Do native OS threading constructs automagically synchronize registers with main memory when you exit critical sections?


The way I read it, it doesn't synchronize registers (how would that even be possible for the OS to do?). If you work with non-volatile variables inside a critical section, it's up to you to ensure they get written out before you leave it. (For example, by using a non-volatile temporary inside the critical section and copying the value to a volatile variable before you leave.)
He's just saying that as long as you stay inside the critical section, you don't need to use volatile, and it's OK for data to be kept in registers.
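
A sketch of that pattern (my example, using the Win32 CRITICAL_SECTION mentioned earlier in the thread):

    #include <windows.h>

    CRITICAL_SECTION cs;           // assume InitializeCriticalSection(&cs) ran already
    volatile LONG publishedValue;  // read by other threads

    void update() {
        EnterCriticalSection(&cs);
        LONG temp = publishedValue;  // ordinary local: free to live in a register
        temp = temp * 2 + 1;         // arbitrary work on the cached copy
        publishedValue = temp;       // copy back out before leaving the section
        LeaveCriticalSection(&cs);
    }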

Anyway, this all makes good sense from a CPU perspective. But does it also take into account compiler optimizations? If my data inside a critical section is non-volatile, what's to stop the compiler from reordering things a bit, moving some of the writes outside the critical region?

(Actually, come to think of it, I can't see why that'd happen. When you enter a critical region, the compiler sees a function call to... something defined in another compilation unit. It can't make assumptions about side effects then, of course, so it can't safely reorder writes across it... Unless I forgot something [grin])

I also ran across this blog entry a while back, which was a useful read as well.

