Jump to content
  • Advertisement
Sign in to follow this  
ApochPiQ

Unity Double-check my lock-free voodoo

This topic is 3462 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I've been working on a lockless message passing system for the last couple of days, and finally came up with what seems like a pretty solid implementation. First, some important background: this is designed for a one consumer, multiple producer scenario. It will assuredly not work in a multiple-consumer situation. However, from my checks and logic, it should be fully safe as long as only one thread is consuming data. The theory is very simple. We maintain three stacks. The first stack is the "write stack." All producers append their messages to the write stack with a simple CAS-based method. The second stack is the "read stack." This stack is only visible to the consumer; all work by the consumer is done on the read stack. It is effectively not aware of the write stack. The third stack is a little bit of a hack. I want this system to have FIFO semantics, but the write stack is implemented as a LIFO stack. Therefore, the read process uses this third stack to essentially reverse messages that it reads. Consuming a message means popping an item off this third stack. Anyone with basic familiarity with lock-free structures should be familiar with the CAS LIFO stack, so I won't bother explaining it. The consumer half is a bit wonky, though, so here's the logic. (I make heavy use of a "sentinel" node which is just a node in either the read or write stack that has no attached data payload. This sentinel is basically the "end of list" marker.)
  • If the read head is the sentinel node, swap the pointers to the read and write stacks, and proceed as usual
  • Traverse the read stack (which is basically a singly linked list at this point) and push each item onto the third stack as it is visited.
  • If the third stack is empty, return NULL, indicating no pending messages
  • Pop the first item off the third stack, clean up some memory usage, and return it to the caller as the current message
I've reviewed this logic repeatedly, and stress-tested the implementation under 30 threads of load on my dual-core workstation. As near as I can tell the code is free of potential problems, primarily because of the magic constraint that we only have one consumer. So here's the code: [edit] Removed the original code; please see the updated version here. [/edit] I welcome any comments, suggestions, and/or gentle explanations that I am a moron and that this doesn't actually work [grin] [Edited by - ApochPiQ on March 20, 2009 6:22:50 PM]

Share this post


Link to post
Share on other sites
Advertisement
Is there particular reason why you don't just use lock-less FIFO queue, which is also safe for multiple-producers multiple-consumers case.

Share this post


Link to post
Share on other sites
Every lockless FIFO I've seen requires not just CAS but CAS2; since CAS2 is not widely supported (especially in 64-bit land), I've opted to try and stick to CAS for portability.

Share this post


Link to post
Share on other sites
What level of portability are you going for? volatile gives you different guarantees depending on the compiler you're using (or even the *version* of the compiler you're using). I'd try not to use it at all and instead insert the barriers explicitly.

I'm not qualified to comment much further than that on the nitty gritty. You have some exception safety issues, though, unless you're assuming new will never throw (which isn't the case on Windows, for example).

Share this post


Link to post
Share on other sites
I thought cmpxchg8b is quite widely supported on 32-bit x86 platforms. Not sure how well cmpxchg16b is supported for 64-bit platforms though. Do you have something to back up the "CAS2 is not widely supported" claim, e.g. a supported instruction table or something?

Share this post


Link to post
Share on other sites
I was looking back through my notes and intertron history and realized that all the sources I have regarding CAS2 on 64-bit processors were pretty old, and likely outdated. Seems like cmpxchg16b is pretty common now.

In any case, I've got a working algorithm (I think) and don't care to go through all the trouble of building another one [wink]


Yeah, exception safety is a bit of a problem here; I'll be tuning that up along with lots of other code that needs it.

Manual memory fencing also sounds like a good idea. I'll add that to my list and hopefully get it hammered out pretty quickly.


Thanks all for the feedback thus far; although I'm still curious if anyone can find a problem in the implementation...

Share this post


Link to post
Share on other sites
Here's a fun one: Visual C++ provides no intrinsic for cmpxchg16b (unless they added it very recently), nor will it compile 64 bit assembly. How much do you love MASM? [grin]

Share this post


Link to post
Share on other sites
Quote:
Original post by Promit
Here's a fun one: Visual C++ provides no intrinsic for cmpxchg16b (unless they added it very recently), nor will it compile 64 bit assembly. How much do you love MASM? [grin]


Your information is slightly dated. _InterlockedCompareExchange128 was added late 2007.

Share this post


Link to post
Share on other sites
Quote:
Original post by ApochPiQ
In any case, I've got a working algorithm (I think) and don't care to go through all the trouble of building another one [wink]

I haven't checked through your code, so I can't tell if there are issues with it, but one thing you might want to try to do is to reproduce the ABA issue, which is why CAS2 is often used. The problem may rise very rarely, thus it's difficult to test empirically. I implemented a lock-free FIFO queue in Spin-X Engine based on "Optimized Lock-Free FIFO Queue continued" paper though, and it's not terribly difficult.

Share this post


Link to post
Share on other sites
I've walked through every combination of partial execution/preemption and can't find a case where ABA occurs. This is primarily because of the way the single consumer works - I have a guarantee during the SwapReadAndWrite operation that the only preemption can come from additional writers (since there's only the one consumer). Similarly, because of the way the read/write stacks are swapped by the consumer, I should never see an ABA pattern in the producer's push operation.

However, I'm fairly green when it comes to lockless programming, and we all know how pernicious and subtle lockless bugs can be - so I'd really appreciate it if someone else could double-check my logic and make sure I'm not overlooking something.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!