# Cache Coherence and Memory Barriers


## Recommended Posts

http://en.wikipedia.org/wiki/Write_barrier
http://en.wikipedia.org/wiki/Cache_coherence

I understand it this way: "cache coherence" is automatic from a programmer's point of view, and I don't have to do anything to get it. Cache coherence is managed by the hardware.

"Memory barriers", on the other hand, manage what you could call "main-memory coherence"; they only matter for main memory and are not used to get cache coherence.

This would also mean that I don't need memory barriers when my CPU is a multicore CPU with a common cache for all cores, as long as my data elements don't leave the cache?

Is anything wrong here?

Best regards,
Matthias

##### Share on other sites
I think it depends on the hardware/compiler.

I believe that all processors used in desktop computers have a fairly strict memory model, although that doesn't seem to be the case for processors like Intel's Itanium, which has a weak memory model.

It should be safe to say that cache coherency is not a concern from a developer's point of view. That doesn't mean a developer shouldn't be aware of the cache-contention performance hit.

Memory barriers are useful to prevent instruction reordering and out-of-order execution, especially in the context of lock-free programming.

Yeah, I agree.

##### Share on other sites
Quote:
 Original post by Eddycharly: It should be safe to say that cache coherency is not a concern from a developer's point of view. That doesn't mean a developer shouldn't be aware of the cache-contention performance hit.

That applies to business IT programmers. Game programmers should be more than aware of the effects of cache coherency; cache misses typically cost more in performance than anything else (except vector/float load-hit-store stalls).
It also has a profound effect on the Cell architecture of the PS3, where the SPUs have a very, very limited amount of memory, so the size of structures is important.

##### Share on other sites
Quote:
 This would also mean that I don't need memory barriers when my CPU is a multicore CPU with a common cache for all cores, as long as my data elements don't leave the cache?

No, memory barriers are useful to guarantee ordering and visibility. And usually you'll have to ensure both (and atomicity) in multithreaded programs -- shared cache or not. That's the reason a mutex, for example, generally has a built-in memory barrier. Simple example:
thread 1: read x into register
thread 2: read x, change it, write it back
thread 1: oops, still has old x in register and doesn't know it should reload it

But (from my experience) you don't need to worry about this in general, only if you're using low-level synchronization.

EDIT: OK, maybe I didn't understand the question. If you were only asking about hardware barriers, I guess you are right.

##### Share on other sites
This might help.

Making Pointer-Based Data Structures Cache Conscious, Trishul M. Chilimbi, Mark D. Hill, and James R. Larus, IEEE Computer, December 2000. ftp://ftp.cs.wisc.edu/wwt/computer00_conscious.pdf

##### Share on other sites
Hey,

But isn't the code from Macnihilist a problem of cache coherence? Thread 1 is using a cached value and doesn't know that the value should be reread. I thought the MESI protocol handles that: http://de.wikipedia.org/wiki/MESI

I ran into the same problem here:

{
    // lock (Peterson-style entry)
    int i = w->ID;                // this thread's index (0 or 1)
    int j = 1 - i;                // the other thread's index
    flag[i] = true;               // announce intent to enter
    victim = i;                   // yield priority to the other thread
    while (flag[j] && victim == i) { MemoryBarrier(); } // spin until it's our turn
    //std::cout << "in";
}

{
    // unlock
    int i = w->ID;
    flag[i] = false;              // leave the critical section
    //std::cout << "out";
}

When I don't put the MemoryBarrier into the loop, the loop spins forever, even after the other thread sets flag[j] back to false.

##### Share on other sites
Quote:
 But isn't the code from Macnihilist a problem of cache coherence? Thread 1 is using a cached value and doesn't know that the value should be reread. I thought the MESI protocol handles that: http://de.wikipedia.org/wiki/MESI

No, threads 1 and 2 can use the same cache and the problem still occurs. MESI only ensures that caches are coherent, i.e., they do not return different values for the same address. Reloading data from memory is the responsibility of the CPU. The reasoning is that the code only sees the CPU <-> memory boundary and doesn't have to care about the cache hierarchy (except for performance reasons). The other way around, the cache doesn't have to care about what the CPU has in its registers.

Regarding the code: I'm not sure I understand it correctly, but it seems to be the problem I described. One thread doesn't see the changes made by another, because it has the data stored in a register. It has nothing to do with cache coherence.
Cache coherence would be an additional concern if the threads were running on different CPUs and didn't share a cache. Then the data in one cache would be outdated, and that cache would first have to pull in the correct data from its neighbor.

To summarize, there are two problems here:
1. Forcing the CPU to reload data from memory (or to store it to memory) -- the code has to do this.
2. Ensuring the loaded data (possibly loaded from cache) is the correct one, if it has been modified by another CPU -- the hardware usually does this on SMPs, e.g. with MESI.

Disclaimer: My knowledge of this stuff is a bit rusty, but I think I got it right overall...

##### Share on other sites
Yeah, OK, I think I'm starting to understand it, too.

##### Share on other sites
Quote:
 Original post by LessBread: This might help. Making Pointer-Based Data Structures Cache Conscious, Trishul M. Chilimbi, Mark D. Hill, and James R. Larus, IEEE Computer, December 2000. ftp://ftp.cs.wisc.edu/wwt/computer00_conscious.pdf

Looks like the link only works as http now.
