C++, shared memory, threads

Started by
14 comments, last by Sc4Freak 14 years, 11 months ago
The following program doesn't terminate when compiled with GCC on Linux using the default compiler flags. However, it does terminate when a "sleep(1);" is added to Thread1::operator() after the std::cout line. My guess is that the variable "run" is only read once, before the while loop, rather than on every iteration. Making "run" volatile didn't help either. How should the program below be written (and what compiler flags should be used) so that it works as intended in a platform-independent manner (i.e. thread1 shuts down once the main thread has finished executing its loop)?

#include <iostream>
#include <boost/thread.hpp>

bool run=true;
struct Thread1 {
	void operator()() const {
		while(run) {
			std::cout << "Hello from thread1\n";
		}
	}
};

int main(int argc, char *argv[]) {
	Thread1 t;
	boost::thread bt(t);

	for(int i=0; i<10; ++i) {
			std::cout << "Hello from thread2\n";
	}
	run=false;

	bt.join();

	return 0;
}


It terminates for me on the same setup. The program runs and finishes, although the spawned thread may not output anything because run gets set to false before Thread1 even enters its while() loop. You don't need protection around the run boolean (a mutex or volatile) in this case. I would, however, be concerned about std::cout: as far as I am aware it is not thread safe, and that could well explain the bad behaviour of not terminating.
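If the std::cout interleaving does turn out to be the culprit, one common workaround is to funnel all console output through a single mutex. Roughly like this (an untested sketch; io_mutex and print_line are names I just made up):

#include <iostream>
#include <boost/thread.hpp>

boost::mutex io_mutex;

void print_line(const char *msg) {
	// serialize access to std::cout so output from different threads can't interleave
	boost::lock_guard<boost::mutex> lock(io_mutex);
	std::cout << msg << "\n";
}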
Observe that the thread's operator() is not guaranteed to run at any particular time.

You construct the new boost::thread(), do some processing, set the flag, and then wait.

The act of creating a thread and preparing it to run takes a significant amount of work. It is likely that the system was still doing that initialization work when the flag was adjusted and the join() method was reached in the main thread.

Simply marking your run condition as volatile would not help, because that would not adjust program flow. It would force it to fetch the value again, but that amount of work is trivial compared with the act of creating a new thread.


In your case, you need to use one of the synchronization objects to wait until the other thread is fully running. Avoid the mutex: it is often the first concept learned in multithreaded programming, yet it is the most expensive and time-consuming to execute. It appears that a boost::condition_variable would satisfy your needs.

You would want to wait on one before the main thread's loop, so that the child thread has time to become fully initialized and begin running. Your run flag would be another condition_variable, indicating when the processing is done.
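As a rough, untested sketch of that idea (all the names here -- Worker, started, done, mtx -- are invented for the example; note that a Boost condition_variable is always paired with a mutex):

#include <iostream>
#include <boost/thread.hpp>

boost::mutex mtx;
boost::condition_variable cond;
bool started = false;
bool done = false;

struct Worker {
	void operator()() const {
		{
			boost::lock_guard<boost::mutex> lock(mtx);
			started = true;
		}
		cond.notify_one();	// tell main() the worker really is running
		for(;;) {
			{
				boost::lock_guard<boost::mutex> lock(mtx);
				if(done) break;	// the "run" flag, re-read under the lock each pass
			}
			std::cout << "Hello from thread1\n";
		}
	}
};

int main() {
	Worker w;
	boost::thread bt(w);

	{	// block until the worker has actually begun executing
		boost::unique_lock<boost::mutex> lock(mtx);
		while(!started)
			cond.wait(lock);
	}

	for(int i = 0; i < 10; ++i)
		std::cout << "Hello from thread2\n";

	{	// signal the worker to stop
		boost::lock_guard<boost::mutex> lock(mtx);
		done = true;
	}
	bt.join();
	return 0;
}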
Ok, I will try that; I'll be back soon to report the result.

But apart from that, in general, what is preventing the compiler from optimizing away the check of "run" unless it's volatile? Shouldn't it need to be volatile for this to work?
Probably nothing. Maybe the compiler saw threads in your code and decided to play it safe. Maybe if you increase the optimisation level it'll disappear.

But relying on volatile is a serious faux-pas anyway. Don't do it unless you know exactly what you're doing, which if you need to post a thread like this, you don't. (No offense.) If you want to communicate between threads, use something your threading library explicitly provides for that purpose. Shared data is the root of all concurrent-programming evil.
Does boost::thread provide anything at all for the purpose of shared memory?
Quote:Original post by Kylotan
But relying on volatile is a serious faux-pas anyway. Don't do it unless you know exactly what you're doing, which if you need to post a thread like this, you don't. (No offense.)

No offense taken. I know most of the theory about what can happen when combining compiler optimization and multi-threading, and I have a vague memory of reading an article about problems with volatile, but I can't recall the details, and this uncertainty is why I'm asking! In short, how can the above program be implemented so that it is guaranteed to work?
Quote:Original post by all_names_taken
Does boost::thread at all provide anything for the purpose of shared memory?

What do you mean by "shared memory"?

It provides locking mechanisms and signals.

One locking mechanism is the shared lock, which handles one of the safe ways to share access to memory: anybody can read while it is not locked for writing, but a thread must take the exclusive lock in order to write.

Does that answer your question? There are several other ways to transmit data between threads, and other ways to "share" memory, and boost offers a pretty comprehensive suite of the building blocks.
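As a rough, untested sketch of what that reader/writer idea looks like with Boost's shared_mutex (the identifiers are made up for the example):

#include <boost/thread.hpp>
#include <boost/thread/shared_mutex.hpp>

boost::shared_mutex rw_lock;
int shared_value = 0;

int read_value() {
	// many readers may hold a shared lock at the same time
	boost::shared_lock<boost::shared_mutex> lock(rw_lock);
	return shared_value;
}

void write_value(int v) {
	// a writer takes the lock exclusively, blocking out all readers
	boost::unique_lock<boost::shared_mutex> lock(rw_lock);
	shared_value = v;
}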
Quote:Original post by all_names_taken
No offense taken. I know most of the theory about what can happen when combining compiler optimization and multi-threading, and I have a vague memory of reading an article about problems with volatile, but I can't recall the details, and this uncertainty is why I'm asking! In short, how can the above program be implemented so that it is guaranteed to work?
You need a memory barrier to actually transfer data between cores. Without one, an updated flag value (or a stale one) can end up stuck in a processor's cache. The volatile keyword is intended to disable compiler optimizations when dealing with things like hardware registers or when communicating with signal handlers; as such it merely forces the compiler to issue read and write instructions in the appropriate places and has no direct effect on the cache.

Of course in this case you almost certainly want to use a higher-level synchronization primitive provided by your threading library, like a mutex or a condition variable. These will take care of the appropriate memory barriers for you and work with the operating system to avoid busy-waiting.
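Applied to the original program, that might look something like the following untested sketch, where the run flag is only ever touched while holding a mutex (keep_running is a name I made up); the lock/unlock pair supplies the memory barriers mentioned above:

#include <iostream>
#include <boost/thread.hpp>

boost::mutex run_mutex;
bool run = true;	// always read and written under run_mutex

bool keep_running() {
	boost::lock_guard<boost::mutex> lock(run_mutex);
	return run;
}

struct Thread1 {
	void operator()() const {
		while(keep_running()) {	// flag is re-read under the lock on every iteration
			std::cout << "Hello from thread1\n";
		}
	}
};

int main() {
	Thread1 t;
	boost::thread bt(t);

	for(int i = 0; i < 10; ++i) {
		std::cout << "Hello from thread2\n";
	}

	{	// clear the flag under the same mutex so the worker is guaranteed to see it
		boost::lock_guard<boost::mutex> lock(run_mutex);
		run = false;
	}

	bt.join();

	return 0;
}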
Quote:Original post by frob
What do you mean by "shared memory"?

It provides locking mechanisms and signals.

One locking mechanism is the shared lock, which handles one of the safe ways to share access to memory: anybody can read while it is not locked for writing, but a thread must take the exclusive lock in order to write.

Thanks, I'll take a look at that!

Quote:Original post by frob
Does that answer your question?

Well, there are two issues with a test program of this type:
- The first one is mutual exclusion: e.g. ensuring that there aren't any conflicts where thread A reads while thread B is writing, such that thread A sees something that is only halfway updated. (Since a boolean write is most likely atomic, this part shouldn't be a problem even without a mutex lock.)

- The other issue is preventing the compiler from making optimizations that only work in a single-threaded application. The while loop may, for example, compile to the following assembly output in an unoptimized build (pseudo-code assembly, hehe):
...
start:
	read variable RUN to register X
	if register X is false goto end
	do instructions for loop body
	...
	goto start
end:
...

But since "RUN" isn't written to inside the loop body, in a single-threaded setting the compiler might want to be smart and optimize away the repeated reads of X from memory, giving this:

...
	read variable RUN to register X
start:
	if register X is false goto end
	do instructions for loop body
	...
	goto start
end:
...

This is disastrous if the intention is for the other thread to be able to modify RUN while this thread's while loop is executing, as a way to signal termination of the loop.

This topic is closed to new replies.
