A plain boolean is enough. No need to make it atomic as the thread doesn't change it. It may get an inconsistent state or miss a check, but it will be picked up the next time.
Unfortunately not. Without some kind of thread-safe modifier, the compiler is allowed to assume single-threaded behaviour, if it cannot see the variable being modified in a given area of code, it can assume the valaue never changes - as such, changes in thread A may *never* appear to thread B.
Locking and unlocking also take time.
True in general, but note that atomic operations don't actually lock or unlock, though you could build such behaviour on top of them.
Edit: You should make the boolean volatile, to avoid the compiler optimizing the access away.
Volatile is quite different from thread safety. It says to the compiler "make no assumptions about the current state of this variable, it could be changed at any time (e.g. by a hardware peripheral)".
While it is true that some compiler vendors have extended the meaning of "volatile" to be used for inter-thread communication, that was due to the absence of a standard alternative. Adding the thread-safe primitives to the standard was supposed to replace such non-standard extensions.
If you're worried about the cost of atomics, volatile should be much worse as it requires the current value to be fetched from real memory, bypassing all CPU cache levels in between. Atomic operations can work within the caches, though there is a cost to whatever inter-core cache consistency messaging protocols are used by the processor.
I can't remember the exact details now, so please double check this, but I believe there is an additional issue using "volatile", which is that the compiler and processor are still allowed interleave other instructions relative to the volatile memory access which can result in thread-safety issues. However, thread safe atomic operations have very strict ordering rules which the compiler must respect, and it needs to emit whatever memory barrier instructions which may be necessary to ensure the processor doesn't re-order incorrectly either. I believe this is the kind of non-standard guarantee that certain compiler vendors would make about "volatile".