## Recommended Posts

Eleventy

cache_hit
Because you're encountering a race condition. If two threads try to access the same value at the same time, bad things can happen. Instead of doing this:

```cpp
(*pdwNumThreads) -= 1;
```

do this:

```cpp
::InterlockedDecrement(pdwNumThreads);
```

and similarly for addition, change it to InterlockedIncrement.

Eleventy
Ah, I see. I thought "incrementing" a value in memory was just that - changing the value of the variable directly in memory. I was unaware the value had to be copied locally, incremented, then copied back to the final location. That changes things quite a bit, as I thought simple things such as manipulating integers and DWORDs were multithread-safe (were a one-step process).

Thanks a lot for your help.

cache_hit
> **Original post by Eleventy:** Ah, I see. I thought "incrementing" a value in memory was just that - changing the value of the variable directly in memory. I was unaware the value had to be copied locally, incremented, then copied back to the final location. That changes things quite a bit, as I thought simple things such as manipulating integers and DWORDs were multithread-safe (were a one-step process). Thanks a lot for your help.

The problem is that the result of incrementing a value depends on what was there before. In x86 assembly it might boil down to the following:

```asm
inc DWORD PTR [ebp-0x10]
```

But you have no guarantee about which optimizations the compiler might perform. (In the case of a global variable you kind of do, but even then only sometimes - in particular, if the variable is not in an anonymous namespace and not declared file static.)

Anyway, the point is that the rules are complicated. Suppose for example you had something like this:

```cpp
DWORD ThreadFunc(LPVOID arg)
{
    Instance* pInstance = reinterpret_cast<Instance*>(arg);
    for (int i = 0; i < 10; ++i)
        ++pInstance->count;
    return 0;
}
```

This function is called from multiple threads. The compiler might see that and say, "oh hey, I can optimize that so it doesn't have to do a pointer indirection every single time. Let's transform the code like this:"

```cpp
DWORD ThreadFunc(LPVOID arg)
{
    Instance* pInstance = reinterpret_cast<Instance*>(arg);
    int temp = pInstance->count;        // count read once into a register/local
    for (int i = 0; i < 10; ++i)
        ++temp;
    pInstance->count = temp;            // other threads' updates are overwritten
    return 0;
}
```

Oops! Race condition. This is a contrived example, but the point is that there are almost always way too many gotchas, most of which are obscure, and a mere mortal has no way of anticipating all the different ways that things can screw up.

So... the moral of the story is: always use atomic operations or mutual-exclusion locks around shared data, even when you think it doesn't require it. (Especially when you *think* it doesn't require it.)