A lock-free queue *is* an order of magnitude more complicated to write/debug, so you should only bother if you need that performance boost. The performance difference can be *huge* though.
I gave up on trying to get a really fast lock-free queue working though. Every 6 months I'd find another flaw that would crash once in a billion...
I'm now using wait-free queues, which turn out to be easier - just use std::vector/etc and solve thread-safety with scheduling instead of synchronisation ;)
Quote: Original post by Dmytry
    Quote: Original post by Antheus
        Quote: Original post by Dmytry
        Out of curiosity. What would be the real world usage scenario where you would actually need this thing?
    Active Objects.
    You have n objects that are handled by thread pool. At any given time, an active object will be active in at most one thread only.
    Other active objects running in separate threads are free to post messages to other objects.
Active Objects are only real world when your real world pointy headed boss tells you to implement Active Objects coz he read somewhere that those are a must... otherwise you still need a scenario that is best solved with active objects.
I'm building a thread-safe entity/component system at the moment, which works pretty much the same as Active Objects or the Actor model. It's very handy because all of the game logic is implicitly multi-threaded and thread-safe. Inter-entity function calls are converted into message objects at compile time and written into these queues instead of being instantly executed.
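As a very rough sketch of what that conversion looks like (Entity, DamageMsg and PostDamage are made-up names just for illustration - it uses the ThreadLocalQueue structure shown below):

// Sketch only: a call like other->TakeDamage(10) becomes a DamageMsg that is
// appended to the calling thread's write queue instead of running immediately.
struct Entity
{
    int health;
    void TakeDamage( int amount ) { health -= amount; }
};

struct DamageMsg
{
    Entity* target;
    int     amount;
    void Execute() const { target->TakeDamage( amount ); } // run later, after Flip()
};

void PostDamage( ThreadLocalQueue<DamageMsg>& queue, Entity* other, int amount )
{
    DamageMsg msg = { other, amount };
    queue.Write().push_back( msg ); // deferred; executed in a later cycle
}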
I use something like the following for the queue structure. One write queue is created per thread, plus a single read-only queue.
#include <vector>

// Returns a zero-based index for the calling thread, so each thread
// writes to its own queue (a few lines of thread-local-storage code,
// sketched further down).
int GetThreadId();

template<class T>
class ThreadLocalQueue
{
    typedef std::vector<T>      Vector;
    typedef std::vector<Vector> TlsVector;
    TlsVector m_Write; // one write queue per thread
    Vector    m_Read;  // single read-only queue
public:
    ThreadLocalQueue( int numThreads ) : m_Write(numThreads) {}

    Vector&       Write()       { return m_Write[GetThreadId()]; }
    const Vector& Read () const { return m_Read; }

    void Flip()
    {
        //Barrier
        //assert worker threads aren't running!
        m_Read.clear();
        for( size_t n = 0; n < m_Write.size(); ++n )
        {
            // copy contents of m_Write[n] into m_Read, then clear it
            m_Read.insert( m_Read.end(), m_Write[n].begin(), m_Write[n].end() );
            m_Write[n].clear();
        }
        //Barrier
    }
};
To distribute work, each thread is given a different range of the read queue to execute. As the threads do their work, they write results into their own write queues.
After all the threads have finished working (i.e. finished using the queue), the main thread calls Flip() and schedules another cycle of work for the threads to execute.
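As a rough sketch of one cycle (Message, WorkerRun and RunCycle are placeholder names, and the actual thread-pool launch/join code is omitted):

// Sketch of one cycle: worker threads consume disjoint ranges of the read
// queue and append new messages to their own write queues; the main thread
// flips once every worker has finished.
struct Message { void Execute() const {} }; // stand-in for the real message type

void WorkerRun( ThreadLocalQueue<Message>& queue, size_t begin, size_t end )
{
    const std::vector<Message>& read = queue.Read();
    for( size_t i = begin; i < end && i < read.size(); ++i )
    {
        read[i].Execute();
        // any messages generated here go into this thread's own write queue:
        // queue.Write().push_back( newMessage );
    }
}

void RunCycle( ThreadLocalQueue<Message>& queue, int numThreads )
{
    size_t total     = queue.Read().size();
    size_t perThread = (total + numThreads - 1) / numThreads;

    // ...launch numThreads workers, worker t calling
    //    WorkerRun( queue, t*perThread, (t+1)*perThread ), then join them all...

    queue.Flip(); // safe: no worker threads are running any more
}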
The only caveats are to use appropriate memory fences so that memory re-ordering doesn't mess up Flip(), and to align the vectors to the cache-line size. The GetThreadId() function (which makes sure each thread writes to a different queue) is only a few lines of code based on thread local storage.
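For reference, a minimal sketch of a TLS-based GetThreadId() (GCC-style __thread shown; MSVC has __declspec(thread), and each thread is assumed to be given a unique index at startup):

// Each thread stores its own index in a thread-local variable.
static __thread int t_ThreadIndex = 0; // __declspec(thread) on MSVC, thread_local in C++11

// Called once per thread at startup with a unique index in [0, numThreads).
void SetThreadId( int index ) { t_ThreadIndex = index; }

int GetThreadId() { return t_ThreadIndex; }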
[Edited by - Hodgman on September 13, 2009 5:40:52 PM]