Back to General and Gameplay Programming

cache friendly lock free concept?

Tim · 2013-11-04T20:59:18

I have suspicions that the following concept is not very cache friendly, but i'm no expert on the subject, and like some pointers. essentially, i have 2 threads, one for networking, and one for the game simulation. i want to process the packets in the game simulation thread, rather than in the network thread so that the simulation can be maintained as deterministic. here's the concept i came up with for switching in/out packets between the two threads: violatile int LockState = UNLOCKED; linklist<Packet*> *InPackets = new linklist<Packet*>; linklist<Packet*> *OutPackets = new linklist<Packet*>; linklist<Packet*> *TempList = new linklist<Packet*>; void NetworkThread(){ RetrievePendingPackets(InPackets); if(LockState==LOCK_REQUESTED){ linklist<Packet*> *Temp = TempList; TempList = InPackets; InPackets = Temp; LockState = LOCKED; } if(LockState==UNLOCK_REQUESTED){ for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){ SendPacket(Lnk->value); } TempList->Clear(); LockState = UNLOCKED; } return; } void GameThread(){ if(LockState==LOCKED){ for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){ ProcessPacket(Lnk->value); } TempList->Clear(); linklist<Packet*> *Temp = TempList; TempList = OutPackets; OutPackets = Temp; LockState=UNLOCK_REQUESTED; } if(LockState==UNLOCKED) LockState = LOCK_REQUESTED; UpdateGames(); //Will push packets into the outpackets list. } my thoughts are that if the pointers are being swapped in one thread on a different core, then it needs to update the pointer for the value of the pointer for the cache on the thread for the other core. so, thoughts on it are much appreciated.

General and Gameplay Programming Programming

Started by slicer4ever October 31, 2013 03:57 PM

40 comments, last by King Mir 10 years, 5 months ago

slicer4ever

6,773

Author

November 04, 2013 08:20 PM

hey guys, thanks for the continued input so far. i was reading up on false cache sharing to better understand what's going on, and what i can do(i saw the 64 bytes as most cache-lines). I found this article by intel: http://software.intel.com/sites/default/files/m/d/4/1/d/8/3-4-MemMgt_-_Avoiding_and_Identifying_False_Sharing_Among_Threads.pdf which gave me some interesting insights, most importantly was this passsage:

Avoid false sharing but use these
techniques sparingly. Overuse can hinder the effective use of
the processor’s available cache. Even with multiprocessor shared cache designs, avoiding false
sharing is recommended. The small potential gain for trying to maximize cache utilization on
multi processor shared cache designs does not generally outweigh the software maintenance
costs required to support multiple code paths for different cache architectures.

are they suggesting(as L. Spiro did before), that in the long run it's not really worth it to try to prevent false sharing. What i read from this(and i might be reading it wrong), as that optimizing for this only realistically provides a small performance gain anyway?

@Matias Goldberg, your link actually really helped me understand more of what's going on. I think one of my biggest problems with writing well-designed multi-processor code is that i've been relying on the x86 architecture's threading mechanisms, it's clear to me now that my locks could very easily fail as-is on other processors which don't have such strict store/load guidelines.

now then, let's say i added memory fences, would this be sufficient to prevent the issues that are presented with the double-checked locking problem:



violatile int LockState = UNLOCKED;
linklist<Packet*> *InPackets = new linklist<Packet*>;
linklist<Packet*> *OutPackets = new linklist<Packet*>;
linklist<Packet*> *TempList = new linklist<Packet*>;
 
 
void NetworkThread(){
    RetrievePendingPackets(InPackets);
    if(LockState==LOCK_REQUESTED){
       [read fence]
       linklist<Packet*> *Temp = TempList;
       TempList = InPackets;
       InPackets = Temp;
       [write fence]
       LockState = LOCKED;
   }
   if(LockState==UNLOCK_REQUESTED){
      [read fence]
      for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){
         SendPacket(Lnk->value);
      }
      TempList->Clear();
      [write fence]
      LockState = UNLOCKED;
   }
   return;
}
 
void GameThread(){
   if(LockState==LOCKED){
     [read fence]
     for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){
        ProcessPacket(Lnk->value);
     }
     TempList->Clear();
     linklist<Packet*> *Temp = TempList;
     TempList = OutPackets;
     OutPackets = Temp;
     [write fence]

     LockState=UNLOCK_REQUESTED;
   }
   if(LockState==UNLOCKED){
      [write fence]
      LockState = LOCK_REQUESTED;
   }
   UpdateGames();  //Will push packets into the outpackets list.
}

i think i woudn't have to put read fence's in front of the access to LockState, as neither thread really care's when it see's the correct value, just so long as the linklist's are seen with the correct values when i begin working on them is all i care about...is my line of thinking correct here?

Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.

King Mir

2,506

November 04, 2013 08:59 PM

It's easier to instead of using memory fences yourself, to use C++11 atomic primatives with the correct ordering specified. In this case you need minimally memory_order_aquire on each read of lock state (if statement condition), and memory_order_release on every write to lockState. The only exception is the if(LockState==UNLOCKED)LockState = LOCK_REQUESTED; Here, both can be memory_order_relaxed, because there is no critical section.

Alternatively don't bother with all that and just use the default behavior of atomic, changing violatile int LockState = UNLOCKED; to std::atomic<int> LockState = UNLOCKED; only.

With the memory barrier code you added, it's essentially correct, except you don't need the write fence before LockState = LOCK_REQUESTED.

EDIT: Speaking of if(LockState==UNLOCKED)LockState = LOCK_REQUESTED, you could just remove that, and reorder your code to only go back and forth once.

cache friendly lock free concept?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

cache friendly lock free concept?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines