# visual studio code optimization is breaking my code.

This topic is 2144 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

hello everyone, i'm having a weird issue with my code, and i believe i've tracked it down to the compiler optimizing out/pre-fetching/doing something that is breaking my code.

it's a multithreaded locking mechanism, so it basically looks like this:

 unsigned char Flag=0; void FunctionA(void){ while(1){ if(!(Flag&amp;2)){ //if not locked... //Do stuff if(Flag&amp;1) Flag|=2; //Locked this thread... } } } void FunctionB(void){ Flag|=1; //Request lock while(!(Flag&amp;2)); //Wait for lock <-- THIS IS BROKEN? //Do stuff Flag^=3; //Unlock } 

with debugging through output, i've discovered that FunctionA does indeed lock the variable, and FunctionB is inside of the loop, waiting for locking, but even after my output clearly tells me flag has been locked, my waiting thread simple does not break, i assume this is because flag might be pre-catched or something?

note, that without optimization this works as expected.

edit: ok, so in vs, i can even pause the application, and look directly at the flag, which is reported to the value i'm waiting on, yet, for w/e reason, i'm still inside the loop?! Edited by slicer4ever

##### Share on other sites
If you want the compiler to know that a variable may be modified outside the code it can see at the time, you need to mark that variable as volatile (or reference it through a volatile reference, etc.).

##### Share on other sites
ah, i didn't know that, thanks 100 fold, fixs my problem=-)

##### Share on other sites

There's at least one trivial deadlock condition and possibly also conditions in which your code can allow multiple threads inside the lock, and that's easy to spot without spending much time analyzing the behavior.

purely in the above code?, how can functionA interact until after m_Flag^=3 is done inside functionB?

Also, if your actual code is more complex than this, be warned that it's actually probably got more problems, not fewer.

yes, my code is far more complex, but it boils down to exactly the situation as the above code.

Threading is hard. Locking in pure user-space is not possible to do safely. You need to use kernel-level locks or hardware-level locks or you will introduce very, very nasty bugs into your programs.

i don't see why user-space locks can't work similar to kernel/hardware locks, the only exception being if the os, or an outside program interferes, but in my opinion, if someone wants to monkey around with their ram to modify my program at runtime, that's their business, and my program shouldn't try to prevent this, or any exceptions caused by such behavior. Edited by slicer4ever

##### Share on other sites
Threading is hard. Locking in pure user-space is not possible to do safely. You need to use kernel-level locks or hardware-level locks or you will introduce very, very nasty bugs into your programs.
Intel would like to disagree with you there, also, CRITICAL_SECTION based locking on a windows platform first seeks to do user-mode locking before it asks the kernel for help.

I would say safe user-mode locking is not possible without a lot of research and applied knowledge, instead of plainly stating its impossible.

So in this case, all those bit op's would need to become atomic for starters, but spinning can still be done on the "naked" volatile variable, you just need to use a atomic op to ensure its free when you try lock it for the current thread. Edited by Necrolis

##### Share on other sites
This has nothing to do with someone "interfering" with your program. It's a property of pre-emptive multitasking kernels that you cannot avoid.

In your code, consider the following sequence of events:

- Flag starts at 0
- In thread 1, function B runs
- Function B loops until the second bit of Flag is cleared (which is immediately true, since Flag is 0)
- At this exact moment, after breaking from the loop but prior to the "do stuff" portion, thread 1 is suspended by the kernel
- Thread 2 begins executing inside function A
- Flag & 2 is false, because as we just stated the loop exited in Function B
- Therefore we "do stuff" in function A
- Partway through "doing stuff", thread 2 is suspended by the kernel, and thread 1 resumes

Now both Thread 1 and Thread 2 are inside the "lock" and doing stuff.

I will say I was incorrect about deadlocks, but you can still starve threads trivially. For example, you can starve the execution of "doing stuff" in Function A with the following pattern:

- Thread 1 runs Function A
- Thread 2 on a concurrent core comes in and executes Function B
- Thread 2 running Function B suspends just prior to the xor in the last line
- Function A continues to run in thread 1 and has just executed the inner if() check BUT NOT the flag change
- At this point Flag is still 1!
- Flag is clobbered to 2 by the xor
- Thread 1 resumes, thread 2 continues on its merry way doing something unrelated
- Now, remember that function A was suspended just before the inner if() check
- Flag is now set back to 3
- A loops; because bit 2 is set, it skips the outer if() check and begins to spin
- Thread 2 comes back into function B
- Flag is "updated" to 3 by function B
- Thread 2 suspends just after exiting the loop (since the 2nd bit of Flag is still set)
- Thread 1 remains spinning until whenever Thread 2 wakes up and gets around to clearing the bits

I'm sure there are more problems that I could find if I took the time to look over all the interleaving possibilities of thread executions, but the presence of a serious lock failure and a starvation condition should be enough to convince you that your lock is busted.

 And that's only considering two-thread interactions. You can get all kinds of funky bugs by introducing a third thread. I highly urge you to try (mentally) running three threads with this lock pattern and seeing if you can find some of the race conditions. Edited by ApochPiQ

##### Share on other sites

[quote name='ApochPiQ' timestamp='1341206672' post='4954779']Threading is hard. Locking in pure user-space is not possible to do safely. You need to use kernel-level locks or hardware-level locks or you will introduce very, very nasty bugs into your programs.
Intel would like to disagree with you there, also, CRITICAL_SECTION based locking on a windows platform first seeks to do user-mode locking before it asks the kernel for help.

I would say safe user-mode locking is not possible without a lot of research and applied knowledge, instead of plainly stating its impossible.

So in this case, all those bit op's would need to become atomic for starters, but spinning can still be done on the "naked" volatile variable, you just need to use a atomic op to ensure its free when you try lock it for the current thread.
[/quote]

Intel thoroughly agrees with me. If you read their source listing, they're using hardware-level locks (aka interlocked instructions) to achieve atomicity. Interlocking is still a form of hardware-level locking, especially on modern memory buses and multi-layer cache architectures.

##### Share on other sites

This has nothing to do with someone "interfering" with your program. It's a property of pre-emptive multitasking kernels that you cannot avoid.

In your code, consider the following sequence of events:

- Flag starts at 0
- In thread 1, function B runs
- Function B loops until the second bit of Flag is cleared (which is immediately true, since Flag is 0)
- At this exact moment, after breaking from the loop but prior to the "do stuff" portion, thread 1 is suspended by the kernel
- Thread 2 begins executing inside function A
- Flag & 2 is false, because as we just stated the loop exited in Function B
- Therefore we "do stuff" in function A
- Partway through "doing stuff", thread 2 is suspended by the kernel, and thread 1 resumes

Now both Thread 1 and Thread 2 are inside the "lock" and doing stuff.

the problem here is that Function B loops until the second bit is SET, not when it is cleared, it sets bit 1 of flag requesting for a lock, and then cycles until bit 2 is set, in functionA, it sees bit 1 is set, and set's bit 2. at this point it cannot access it's data until bit 2 is cleared, and bit 2 is only cleared until the end of functionA, when it xor's the flag.

- Thread 1 runs Function A ----------------------------------------------------------------------------------------------------------//FunctionA starts
- Thread 2 on a concurrent core comes in and executes Function B--------------------------------------------------------- //FunctionB starts same time as A
- Thread 2 running Function B suspends just prior to the xor in the last line-------------------------------------------------// since it has reached xor(but not done it), we can assume flag is 3
- Function A continues to run in thread 1 and has just executed the inner if() check BUT NOT the flag change------//FunctionA not sure what you mean here?, do you mean at if(Flag&2) and it has not done Flag&2?, i was under the assumption the thread couldn't be suspended at such an operation?
- At this point Flag is still 1!-------------------------------------------------------------------------------------------------------------//Flag is still 3 since we have not xor'd
- Flag is clobbered to 2 by the xor-----------------------------------------------------------------------------------------------------//xor clears flag to 0 not 2
- Thread 1 resumes, thread 2 continues on its merry way doing something unrelated
- Now, remember that function A was suspended just before the inner if() check
- Flag is now set back to 3--------------------------------------------------------------------------------------------------------------//Can a thread resume actually overwrite a variable(again didn't know), wouldn't the 3 be in a temporary stored register, at this point, and not actually write the value?
- A loops; because bit 2 is set, it skips the outer if() check and begins to spin
- Thread 2 comes back into function B
- Flag is "updated" to 3 by function B-------------------------------------------------------------------------------------------------//Again, did not know a thread resume can directly overwrite a user-space variable.
- Thread 2 suspends just after exiting the loop (since the 2nd bit of Flag is still set)
- Thread 1 remains spinning until whenever Thread 2 wakes up and gets around to clearing the bits

I'm sure there are more problems that I could find if I took the time to look over all the interleaving possibilities of thread executions, but the presence of a serious lock failure and a starvation condition should be enough to convince you that your lock is busted.

i added some comments to the steps, but i did not know that thread resumes can overwrite a value by themselves, i was under the assumption that they couldn't perserve the variable in such a way(as well, i've actually never encountered this level of multi-threading problem)
edit: my spacing wasn't persevered, my comments next to each step is after //
edit2: if i read this correctly, on thread 2's resume it would fix the flag?(if so, i suppose my above issue is a bit out of context, as functionB is inside a loop of the thread, so in theory, this issue would resolve itself as long as thread 2 properly restores itself, and thread 1 doesn't keep gettting stuck at the inner-loop.

edit3: sorry if your in the process of responding with my edits, anyway, as to your bottom statement, i have to disagree about my lock being bust, at best it prevents the data it's locking from being accessed by the thread that isn't suppose to be able to, at worse(and only in situations where thread 2 is suspended permanently after it's operation's, is that it locks the data permanently.) however, since my situation(not the above code, my actual program) means that Thread 1 will always terminate before Thread 2, i can safely assume that any starvation is only a temporary occurence, and this particular thread locking is not used in any real-time conditions.

 And that's only considering two-thread interactions. You can get all kinds of funky bugs by introducing a third thread. I highly urge you to try (mentally) running three threads with this lock pattern and seeing if you can find some of the race conditions.

yes, i realize three threads can defiantly have a huge problems with this particular locking system, i've developed another method for handling n threads in a similar manner, essentially, each thread is given a bit to represent it's locked state, and on request, each thread locks itself when able to, and the requestor moves forward only after all the threads have succesfully locked themselves. Edited by slicer4ever

##### Share on other sites

Intel thoroughly agrees with me. If you read their source listing, they're using hardware-level locks (aka interlocked instructions) to achieve atomicity. Interlocking is still a form of hardware-level locking, especially on modern memory buses and multi-layer cache architectures.
somehow over looked the part about hardware locking, guess its too early in the morning for me . In which case I agree with your statement

• 23
• 10
• 19
• 15
• 14