
cache friendly lock free concept?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

41 replies to this topic

#41 slicer4ever   Crossbones+   -  Reputation: 3220

Posted 04 November 2013 - 02:20 PM

Hey guys, thanks for the continued input so far. I was reading up on false cache sharing to better understand what's going on and what I can do (I saw 64 bytes as the most common cache-line size). I found this article by Intel: http://software.intel.com/sites/default/files/m/d/4/1/d/8/3-4-MemMgt_-_Avoiding_and_Identifying_False_Sharing_Among_Threads.pdf which gave me some interesting insights; most important was this passage:

Avoid false sharing but use these
techniques sparingly. Overuse can hinder the effective use of
the processor’s available cache. Even with multiprocessor shared cache designs, avoiding false
sharing is recommended. The small potential gain for trying to maximize cache utilization on
multi processor shared cache designs does not generally outweigh the software maintenance
costs required to support multiple code paths for different cache architectures.


Are they suggesting (as L. Spiro did before) that in the long run it's not really worth it to try to prevent false sharing? What I read from this (and I might be reading it wrong) is that optimizing for this only realistically provides a small performance gain anyway?
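For what it's worth, the usual fix the Intel paper is talking about is just padding/aligning hot per-thread data to a cache-line boundary. A minimal sketch (hypothetical names; 64 bytes assumed as the line size, as mentioned above):

```cpp
#include <atomic>

// Two counters written by two different threads. Without the alignment,
// both 8-byte atomics could land on the same 64-byte cache line and
// ping-pong between cores (false sharing).
struct PaddedCounter {
    alignas(64) std::atomic<long> value{0}; // pad/align to a full cache line
};

PaddedCounter counters[2]; // each element now starts on its own cache line
```

The trade-off the paper warns about is visible here: each counter now occupies a full 64 bytes instead of 8, so overusing this wastes cache capacity.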

@Matias Goldberg, your link actually really helped me understand more of what's going on. I think one of my biggest problems with writing well-designed multi-processor code is that I've been relying on the x86 architecture's threading mechanisms; it's clear to me now that my locks could very easily fail as-is on other processors which don't have such strict store/load ordering guarantees.

Now then, let's say I added memory fences. Would this be sufficient to prevent the issues presented by the double-checked locking problem?


volatile int LockState = UNLOCKED;
linklist<Packet*> *InPackets = new linklist<Packet*>;
linklist<Packet*> *OutPackets = new linklist<Packet*>;
linklist<Packet*> *TempList = new linklist<Packet*>;


void NetworkThread(){
    RetrievePendingPackets(InPackets);
    if(LockState==LOCK_REQUESTED){
        [read fence]
        linklist<Packet*> *Temp = TempList;
        TempList = InPackets;
        InPackets = Temp;
        [write fence]
        LockState = LOCKED;
    }
    if(LockState==UNLOCK_REQUESTED){
        [read fence]
        for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){
            SendPacket(Lnk->value);
        }
        TempList->Clear();
        [write fence]
        LockState = UNLOCKED;
    }
    return;
}

void GameThread(){
    if(LockState==LOCKED){
        [read fence]
        for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){
            ProcessPacket(Lnk->value);
        }
        TempList->Clear();
        linklist<Packet*> *Temp = TempList;
        TempList = OutPackets;
        OutPackets = Temp;
        [write fence]
        LockState = UNLOCK_REQUESTED;
    }
    if(LockState==UNLOCKED){
        [write fence]
        LockState = LOCK_REQUESTED;
    }
    UpdateGames();  // Will push packets into the OutPackets list.
}
I think I wouldn't have to put read fences in front of the accesses to LockState, as neither thread really cares exactly when it sees the correct value; all I care about is that the linklists are seen with the correct values when I begin working on them. Is my line of thinking correct here?
Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.


#42 King Mir   Members   -  Reputation: 1914


Posted 04 November 2013 - 02:59 PM

It's easier, instead of placing memory fences yourself, to use the C++11 atomic primitives with the correct ordering specified. In this case you minimally need memory_order_acquire on each read of LockState (the if-statement condition) and memory_order_release on every write to LockState. The only exception is the if(LockState==UNLOCKED)LockState = LOCK_REQUESTED; here both can be memory_order_relaxed, because there is no critical section.

Alternatively, don't bother with all that and just use the default (sequentially consistent) behavior of atomic, changing volatile int LockState = UNLOCKED; to std::atomic<int> LockState{UNLOCKED}; only.

With the memory barrier code you added, it's essentially correct, except you don't need the write fence before LockState = LOCK_REQUESTED.

EDIT: Speaking of if(LockState==UNLOCKED)LockState = LOCK_REQUESTED, you could just remove that, and reorder your code to only go back and forth once.
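A minimal sketch of the acquire/release version, using hypothetical names mirroring the thread's pseudocode (the actual list swap is elided):

```cpp
#include <atomic>

enum { UNLOCKED, LOCK_REQUESTED, LOCKED, UNLOCK_REQUESTED };

std::atomic<int> LockState{UNLOCKED};

// Network-thread side: the acquire load pairs with the game thread's
// release store, so everything the game thread wrote before requesting
// the lock is guaranteed visible before we touch the lists. The release
// store then publishes our own list changes before LOCKED becomes visible.
bool TrySwapInLists() {
    if (LockState.load(std::memory_order_acquire) == LOCK_REQUESTED) {
        // ... swap TempList / InPackets here ...
        LockState.store(LOCKED, std::memory_order_release);
        return true;
    }
    return false;
}
```

The explicit fences from the earlier post disappear entirely: the ordering constraints ride along on the loads and stores of LockState itself.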

Edited by King Mir, 04 November 2013 - 03:57 PM.




