Avoid false sharing but use these techniques sparingly. Overuse can hinder the effective use of the processor’s available cache. Even with multiprocessor shared cache designs, avoiding false sharing is recommended. The small potential gain for trying to maximize cache utilization on multi processor shared cache designs does not generally outweigh the software maintenance costs required to support multiple code paths for different cache architectures.
Are they suggesting (as L. Spiro did before) that in the long run it's not really worth trying to prevent false sharing? What I read from this (and I might be reading it wrong) is that optimizing for this only realistically provides a small performance gain anyway?
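For context, this is the kind of technique I understand the quote to be talking about: padding per-thread data so that each thread's variable sits on its own cache line. This is only a minimal sketch. The 64-byte line size is an assumption (it is common on x86 but depends on the CPU), and PaddedCounter, ThreadCounters and CheckLayout are names I made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is common on x86 but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// alignas forces each counter onto its own cache line, so a write to one
// counter does not invalidate the line holding the other.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

// Hypothetical per-thread statistics: each member has a single writer.
struct ThreadCounters {
    PaddedCounter network;  // written only by the network thread
    PaddedCounter game;     // written only by the game thread
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each counter should fill exactly one assumed cache line");

// Runtime check that the two counters really land on different lines.
void CheckLayout(const ThreadCounters& c) {
    auto lineA = reinterpret_cast<std::uintptr_t>(&c.network) / kCacheLine;
    auto lineB = reinterpret_cast<std::uintptr_t>(&c.game) / kCacheLine;
    assert(lineA != lineB);
}
```

The cost is visible right away: two counters that would otherwise fit in 16 bytes now take 128, which I assume is the "hinder the effective use of the processor's available cache" part of the quote.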
@Matias Goldberg, your link really helped me understand more of what's going on. I think one of my biggest problems with writing well-designed multi-processor code is that I've been relying on the x86 architecture's strict load/store ordering. It's clear to me now that my locks could very easily fail as-is on other processors that don't have such strict store/load guarantees.
Now then, let's say I added memory fences. Would this be sufficient to prevent the issues presented by the double-checked locking problem:
volatile int LockState = UNLOCKED;
linklist<Packet*> *InPackets = new linklist<Packet*>;
linklist<Packet*> *OutPackets = new linklist<Packet*>;
linklist<Packet*> *TempList = new linklist<Packet*>;
void NetworkThread(){
    RetrievePendingPackets(InPackets);
    if(LockState==LOCK_REQUESTED){
        [read fence]
        linklist<Packet*> *Temp = TempList;
        TempList = InPackets;
        InPackets = Temp;
        [write fence]
        LockState = LOCKED;
    }
    if(LockState==UNLOCK_REQUESTED){
        [read fence]
        for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){
            SendPacket(Lnk->value);
        }
        TempList->Clear();
        [write fence]
        LockState = UNLOCKED;
    }
    return;
}
void GameThread(){
    if(LockState==LOCKED){
        [read fence]
        for(link<Packet*> *Lnk = TempList->GetFirst(); Lnk!=null; Lnk = Lnk->Next){
            ProcessPacket(Lnk->value);
        }
        TempList->Clear();
        linklist<Packet*> *Temp = TempList;
        TempList = OutPackets;
        OutPackets = Temp;
        [write fence]
        LockState=UNLOCK_REQUESTED;
    }
    if(LockState==UNLOCKED){
        [write fence]
        LockState = LOCK_REQUESTED;
    }
    UpdateGames(); //Will push packets into the outpackets list.
}
I think I wouldn't have to put read fences in front of the accesses to LockState, as neither thread really cares when it sees the correct value. All I care about is that the linked lists are seen with the correct values when I begin working on them. Is my line of thinking correct here?
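For comparison, here is a minimal sketch of how I think the same hand-off would look with C++11 std::atomic instead of hand-placed fences. The assumptions are mine: std::vector stands in for linklist, the Packet struct is hypothetical, the Processed and Sent counters stand in for ProcessPacket and SendPacket, and NetworkStep/GameStep are each meant to be called in a loop by exactly one thread. As far as I understand the C++11 memory model, a release store on LockState paired with an acquire load that sees that value should make every list write done before the store visible after the load, which is the role I intended for [write fence] and [read fence].

```cpp
#include <atomic>
#include <cassert>
#include <utility>
#include <vector>

enum State : int { UNLOCKED, LOCK_REQUESTED, LOCKED, UNLOCK_REQUESTED };

struct Packet { int id; };  // hypothetical stand-in for the real packet type

std::atomic<int> LockState{UNLOCKED};
std::vector<Packet> InPackets;   // touched only by the network thread
std::vector<Packet> OutPackets;  // touched only by the game thread
std::vector<Packet> TempList;    // owned by whichever side LockState allows
int Processed = 0;               // stands in for ProcessPacket()
int Sent = 0;                    // stands in for SendPacket()

// One iteration of the network thread.
void NetworkStep() {
    // Acquire load: pairs with the game thread's release stores.
    int s = LockState.load(std::memory_order_acquire);
    if (s == LOCK_REQUESTED) {
        assert(TempList.empty());       // we cleared it before unlocking
        std::swap(TempList, InPackets); // hand received packets to the game
        LockState.store(LOCKED, std::memory_order_release);
    } else if (s == UNLOCK_REQUESTED) {
        Sent += static_cast<int>(TempList.size());
        TempList.clear();
        LockState.store(UNLOCKED, std::memory_order_release);
    }
}

// One iteration of the game thread.
void GameStep() {
    // Acquire load: pairs with the network thread's release stores.
    int s = LockState.load(std::memory_order_acquire);
    if (s == LOCKED) {
        Processed += static_cast<int>(TempList.size());
        TempList.clear();
        std::swap(TempList, OutPackets); // hand outgoing packets to the network
        LockState.store(UNLOCK_REQUESTED, std::memory_order_release);
    } else if (s == UNLOCKED) {
        LockState.store(LOCK_REQUESTED, std::memory_order_release);
    }
}
```

Unlike my original, each step reads LockState once and branches on that single value, so a state change by the other thread between the two checks cannot be picked up halfway through an iteration. The ordering is attached to the load itself here, which I think corresponds to placing the [read fence] after the check rather than in front of it.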