However, if I understand your reply correctly, primitives for condition variables do need kernel mode code, right? There is no way to avoid it?
The CPU instruction that puts a core to sleep is privileged, so only the kernel can put you to sleep. Note, though, that the kernel may decide not to put the core to sleep and do something else instead (like switch to a different thread or process). Only the kernel can do that, too.
Otherwise malware could wreak havoc on your machine. A virus could freeze your system forever (until you hit the reset button), or prevent a high-priority process from ever being scheduled. In short, lots of bad things.
Implementations do their best to avoid entering the kernel. But if putting the thread to sleep is unavoidable, the kernel must get involved (see an example of Lightweight Mutex showing how atomics can be used around a mutex to stay out of kernel code as long as threads don't access the mutex at the same time).
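To make the fast path / slow path split concrete, here is a minimal sketch of one well-known lightweight mutex pattern (often called a "benaphore"). I can't promise it matches the linked example line by line. The names LightweightMutex, SlowSemaphore and run_counter_demo are made up for this post, and SlowSemaphore is only a stand-in for whatever kernel-backed wait object a real implementation would use.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in (assumption) for a kernel-backed semaphore: wait() may put
// the calling thread to sleep, which is the part that needs the kernel.
class SlowSemaphore {
public:
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return permits_ > 0; });
        --permits_;
    }
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++permits_;
        }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int permits_ = 0;
};

// Hypothetical lightweight mutex: an atomic counter in front of the
// slow semaphore. The semaphore is only touched under contention.
class LightweightMutex {
public:
    void lock() {
        // Previous value 0: nobody holds or wants the lock, so we own it
        // after a single atomic operation (fast path).
        if (contenders_.fetch_add(1) > 0) {
            sem_.wait();  // contended: fall back to the slow path
        }
    }
    void unlock() {
        int previous = contenders_.fetch_sub(1);
        assert(previous > 0 && "unlock() without a matching lock()");
        // Previous value > 1: some other thread is waiting (or about to),
        // so hand the lock over through the semaphore.
        if (previous > 1) {
            sem_.signal();
        }
    }
private:
    std::atomic<int> contenders_{0};
    SlowSemaphore sem_;
};

// Small demo: several threads increment a plain counter under the lock.
long run_counter_demo(int threads, int iterations) {
    LightweightMutex mutex;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                mutex.lock();
                ++counter;
                mutex.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If only one thread ever touches the mutex, lock() and unlock() are a single atomic operation each and sem_ is never used. The expensive path only shows up when two threads actually collide.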
Even if it is kernel mode, how does the OS achieve it? Does the scheduler maintain something like map<condition_variable, list<*threadContext>>, and once a condition_variable gets modified, the OS finds its corresponding threadContext list and makes the context switch accordingly?
You're not too far off. It probably won't make the context switch right away, though; rather, it schedules the woken-up thread for execution. In practice that thread often ends up running immediately afterwards.
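As a toy model of that idea (not how any real kernel lays things out), you can picture one wait queue per condition plus a run queue. Everything here (ToyScheduler, CondId, ThreadId, the method names) is invented for illustration. The thing to notice is that notify_one only moves a thread to the run queue; it does not switch to it.

```cpp
#include <deque>
#include <unordered_map>

using CondId = int;    // hypothetical handle for a condition variable
using ThreadId = int;  // hypothetical handle for a thread context

struct ToyScheduler {
    // One queue of parked threads per condition variable.
    std::unordered_map<CondId, std::deque<ThreadId>> waiters;
    // Threads that are ready to run, in the order they became ready.
    std::deque<ThreadId> run_queue;

    // A thread blocks on a condition: it is parked and will not be
    // picked by pick_next() until somebody notifies the condition.
    void wait(ThreadId thread, CondId cond) {
        waiters[cond].push_back(thread);
    }

    // Wake one waiter: it becomes runnable, but no context switch
    // happens here. Returns false if nobody was waiting.
    bool notify_one(CondId cond) {
        auto it = waiters.find(cond);
        if (it == waiters.end() || it->second.empty()) {
            return false;
        }
        run_queue.push_back(it->second.front());
        it->second.pop_front();
        return true;
    }

    // The scheduler decides later which runnable thread gets the core.
    // Returns -1 when there is nothing to run (the core could idle).
    ThreadId pick_next() {
        if (run_queue.empty()) {
            return -1;
        }
        ThreadId next = run_queue.front();
        run_queue.pop_front();
        return next;
    }
};
```

For example, after wait(7, 1) and wait(8, 1), a call to notify_one(1) makes thread 7 runnable while thread 8 stays parked, and thread 7 only actually runs once pick_next() hands it out.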
The details vary a lot per OS, and the implementation tends to be quite rudimentary because kernels are often written in C and there are a lot of restrictions (e.g. the Linux kernel does not like FPU code, allocations are tricky, and kernel code size must be kept to a minimum, because more code means more chances of a bug, and a bug in kernel code can be disastrous).
If you're interested in more, go study the Linux kernel and FreeBSD's pthread_cond_init implementations. They're open source, after all.
In my mind, there must be a 'bool flag' this thread is periodically checking; otherwise, how does this 'sleeping' thread know it's time to wake up? So if it is 'periodically' evaluating a memory address (or CPU register), then does that mean using a condition variable is just another 'sleep(time)' call under the hood with a much finer-grained time delta?
The popular OSes we know are interrupt-driven. That means the OS schedules itself with the CPU: "wake me up in 16ms". The interrupt causes the CPU to execute kernel code at the scheduled point in time. When woken up, the OS goes through the list of threads it needs to run and executes them. If there's nothing to run, it puts the core back to sleep (but not without rescheduling itself first).
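Here is a deliberately oversimplified sketch of that loop, just to show who does the checking. ToyKernel, Sleeper and on_timer_interrupt are hypothetical names, a "tick" stands for one timer interrupt, and "executing" a thread is reduced to recording its id. Note that the sleeping thread itself runs no code at all; it is the kernel, on each interrupt, that looks at who is due.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

using ThreadId = int;  // hypothetical handle for a thread

struct Sleeper {
    ThreadId thread;
    std::uint64_t wake_at_tick;  // tick at which the thread becomes runnable
};

struct ToyKernel {
    std::uint64_t now = 0;           // ticks elapsed so far
    std::uint64_t idle_ticks = 0;    // ticks where the core went back to sleep
    std::vector<Sleeper> sleepers;   // threads that are not runnable yet
    std::deque<ThreadId> run_queue;  // threads ready to run
    std::vector<ThreadId> executed;  // record of what ran, in order

    void sleep_until(ThreadId thread, std::uint64_t tick) {
        sleepers.push_back({thread, tick});
    }

    // Called once per timer interrupt ("wake me up in 16ms").
    void on_timer_interrupt() {
        ++now;
        // The kernel, not the sleeping thread, checks who is due.
        for (auto it = sleepers.begin(); it != sleepers.end();) {
            if (it->wake_at_tick <= now) {
                run_queue.push_back(it->thread);
                it = sleepers.erase(it);
            } else {
                ++it;
            }
        }
        if (run_queue.empty()) {
            // Nothing to run: the core sleeps until the next interrupt.
            ++idle_ticks;
            return;
        }
        // "Run" every ready thread; here that just means recording it.
        while (!run_queue.empty()) {
            executed.push_back(run_queue.front());
            run_queue.pop_front();
        }
    }
};
```

With sleep_until(1, 3) and sleep_until(2, 2) followed by four ticks, thread 2 runs on tick 2, thread 1 runs on tick 3, and ticks 1 and 4 are idle.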
On Windows, how frequently the OS schedules itself is controlled (or rather hinted) by timeBeginPeriod (it's a dreaded system-wide value).
On Linux, it's defined at compile time.
Both Linux & Windows now support tickless kernels (which means the OS won't schedule itself periodically). But beware: that means all but one core are tickless. There must be at least one core rescheduling itself periodically.
Technically, if this were DOS, it could be fully tickless if the OS waits for an external interrupt, like keyboard input. But modern systems are hardly this simple (e.g. just setting an alarm to ring at 8pm means the OS can no longer wait on external interrupts: it has to reschedule itself at periodic intervals to check if the current date exceeds the alarm date).
How do interrupts work? Well, there's a crystal oscillator vibrating at a specific frequency which sends signals to the CPU at regular intervals... ok, enough. If I keep talking we'll end up talking about atoms and open questions in science. You have more than enough now to keep digging on your own.