Question regarding lock free queue implementation

Started by
1 comment, last by Acar 7 years, 5 months ago

I'm trying to understand and implement a lock free queue in C. However there're some parts of the whitepaper I'm reading which I'm confused about.

First, if you look at line D5, locally stored head and Q->Head are being compared. When remove, it's causing issues regarding dead memory but I cannot understand how this provides any protection as Q->Head can be modified after the comparison was made.

Second, if you look at lines D4 and D19, head.ptr can(and sometimes is) be a freed pointer causing an access violation exception to be thrown.

Current implementation I have suffers from the second issue I mentioned:


int dequeue(struct Queue *queue, void **data)
{
     struct Pointer QUEUE_ALIGN     head;
     struct Pointer QUEUE_ALIGN     tail;
     struct Pointer QUEUE_ALIGN     next;
     struct Pointer QUEUE_ALIGN     new_ptr;
 
     while (1)
     {
          head   = queue->head;
          tail   = queue->tail;
          next   = head.ptr->next;
 
          if (head.ptr == queue->head.ptr && head.counter == queue->head.counter)
          {
               if (head.ptr == tail.ptr)
               {
                    if (next.ptr == 0)
                    {
                         return 0;
                    }
                    new_ptr.ptr      = next.ptr;
                    new_ptr.counter  = tail.counter + 1;
                    DWCAS((void*)&queue->tail, (void*)&tail, (void*)&new_ptr);
               }
               else
               {
                    *data           = next.ptr->data;
                    new_ptr.ptr     = next.ptr;
                    new_ptr.counter = head.counter + 1;
 
                    if (DWCAS((void*)&queue->head, (void*)&head, (void*)&new_ptr))
                    {
                         break;
                    }
               }
          }
     }
     free(head.ptr);
     return 1;
}
Advertisement

if you look at line D5, locally stored head and Q->Head are being compared. When remove, it's causing issues regarding dead memory but I cannot understand how this provides any protection as Q->Head can be modified after the comparison was made.

The point of that is that the algorithm wants to atomically read three variables. After reading, they may be out of date (actual data may be modified), but the algorithm wants to ensure that no modifications occurred during the three read operations.
However, your C code is not a faithful implementation of this part of the algorithm because you're not using any memory barriers to enforce program instruction ordering.
You need to read up on memory barriers to be able to implement these kinds of algorithms correctly.

D4 shouldn't cause any issue with old pointers, because only line D12 actually tries to dereference the pointer... However, yes, D12 is an issue if the nodes are actually deallocated, because it does attempt to read the node's value before the CAS operation.
So, instead of actually deallocating any nodes, you want to just move them over to a "free list" where they can be reused later -- this will cause a "free'ed" node to read a garbage value in D12, but then safely fail the D13 CAS check and retry the entire loop.
[edit] scratch that. The same solution works though -- you can't actually deallocate "free'ed" nodes; you have to keep them around for recycling.

This is actually the biggest problem with the Michael/Scott queue -- it assumes you've already built a lock-free allocator from which to allocate the nodes from in the first place. If you haven't (e.g. are just using malloc/free), then your queue will not be lock-free whatsoever.

So before you can attempt this algorithm, you've first got to build a lock-free pool allocator...

The point of that is that the algorithm wants to atomically read three variables. After reading, they may be out of date (actual data may be modified), but the algorithm wants to ensure that no modifications occurred during the three read operations.

However, your C code is not a faithful implementation of this part of the algorithm because you're not using any memory barriers to enforce program instruction ordering.
You need to read up on memory barriers to be able to implement these kinds of algorithms correctly.

Thanks, I'll read up on these.

So, instead of actually deallocating any nodes, you want to just move them over to a "free list" where they can be reused later -- this will cause a "free'ed" node to read a garbage value in D12, but then safely fail the D13 CAS check and retry the entire loop.

[edit] scratch that. The same solution works though -- you can't actually deallocate "free'ed" nodes; you have to keep them around for recycling.

This is actually the biggest problem with the Michael/Scott queue -- it assumes you've already built a lock-free allocator from which to allocate the nodes from in the first place. If you haven't (e.g. are just using malloc/free), then your queue will not be lock-free whatsoever.

So before you can attempt this algorithm, you've first got to build a lock-free pool allocator...

You're right! After reading your answer I went back an reread the paper and it indeed mentions about a freelist for nodes. I will go ahead and read about lockfree linked lists. Thanks for the answer.

This topic is closed to new replies.

Advertisement