

#Actual King Mir

Posted 03 November 2013 - 01:18 AM

This is mostly what I care about. Basically, I'm exchanging the pointers. Let's say the threads are running on two separate cores. If I understand what's going on at that level (which I probably don't, hence these questions), ThreadA swaps the pointers. Then, in order for ThreadB to have these new pointer locations in its cache, will ThreadA send an update message to ThreadB (doubtful, as in theory ThreadB doesn't know ThreadA also has these pointers)? Or, to get the updated values, does ThreadB have to read them back from RAM (meaning ThreadA would write to RAM), or potentially from a higher-level cache? Can someone explain to me what's going on at this level, and whether I should be doing something else here to ensure I don't have more false cache sharing?

No, that's not what's going on.

A thread on Core A writes one of the 4 variables, which puts it in its L1 cache. Then a thread on Core B reads one of the variables, but because the cache line holding that variable was modified by Core A, it must get it from the write buffer of Core A, which, depending on the details of the implementation, involves going through the L3 cache or otherwise has a comparable latency. Core B might then write to that cache line itself, so that Core A's L1 cached version is invalidated and Core A must read from the write buffer of B. Then Core A might write to the cache line, which invalidates Core B's L1 and L2 copies of that line. And back and forth.

That kind of thing is unavoidable on LockState, and to a lesser extent on TempList, but if all 4 variables are on the same cache line, they are all shared as if they were one variable, incurring the performance penalties of sharing on all of them whenever any one of them is accessed. To the cache, they are one variable. So the solution is to ensure that each of the four variables is on its own cache line.
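To make the layout point concrete, here is a rough C++ sketch of the two arrangements. It assumes a 64-byte cache line (common on current x86-64, but check your target), and the names ListA, ListB, PackedState, SeparatedState, and sameCacheLine are made up for illustration; only LockState and TempList come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Adjust for your hardware.
constexpr std::size_t kCacheLine = 64;

struct Node;  // hypothetical list node type, only used through pointers here

// Packed layout: the four variables sit next to each other, so they
// will normally end up on the same cache line.
struct PackedState {
    Node* listA;      // hypothetical name
    Node* listB;      // hypothetical name
    Node* tempList;
    int   lockState;
};

// Separated layout: alignas puts every member at the start of its own line.
struct SeparatedState {
    alignas(kCacheLine) Node* listA;
    alignas(kCacheLine) Node* listB;
    alignas(kCacheLine) Node* tempList;
    alignas(kCacheLine) int   lockState;
};

// Returns true when both addresses fall inside the same cache line.
inline bool sameCacheLine(const void* a, const void* b) {
    const auto ia = reinterpret_cast<std::uintptr_t>(a);
    const auto ib = reinterpret_cast<std::uintptr_t>(b);
    return ia / kCacheLine == ib / kCacheLine;
}

// Sanity check: in the separated layout no two members share a line.
inline void checkSeparated(const SeparatedState& s) {
    assert(!sameCacheLine(&s.listA, &s.listB));
    assert(!sameCacheLine(&s.listB, &s.tempList));
    assert(!sameCacheLine(&s.tempList, &s.lockState));
}

static_assert(sizeof(SeparatedState) == 4 * kCacheLine,
              "each member should occupy exactly one cache line");
```

The separated version costs 256 bytes instead of roughly 32, which is the usual trade: memory for less coherence traffic. It only helps for variables that really are touched by different cores.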

One way to do this is, instead of using pointers and a primitive integer, to use a class the size of a cache line. You can even make it your linked list, if you make swapping the list cheap and put enough padding at the end of it to fill a cache line.
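A minimal sketch of that idea in C++ might look like the following. Again it assumes 64-byte cache lines, and Node, PaddedList, and paddedListExample are hypothetical names, not anything from the original code. The swap is not atomic, so it still has to happen under whatever synchronization (such as LockState) already guards the lists.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Assumption: 64-byte cache lines. Adjust for your hardware.
constexpr std::size_t kCacheLine = 64;

// Hypothetical singly linked list node.
struct Node {
    int   value;
    Node* next;
};

// A list object that occupies exactly one cache line, so two lists
// owned by different threads never share a line.
struct alignas(kCacheLine) PaddedList {
    Node* head = nullptr;
    // Explicit padding fills the remainder of the cache line.
    char pad[kCacheLine - sizeof(Node*)];

    // Pushes a node at the front of the list.
    void push(Node* n) noexcept {
        n->next = head;
        head = n;
    }

    // Swapping only exchanges the head pointers, so it stays cheap
    // no matter how long the lists are. Not atomic: call it while
    // holding the lock that protects both lists.
    void swap(PaddedList& other) noexcept { std::swap(head, other.head); }
};

static_assert(sizeof(PaddedList) == kCacheLine, "fills exactly one line");
static_assert(alignof(PaddedList) == kCacheLine, "starts on a line boundary");

// Small usage example: two lists trade contents by swapping heads.
inline void paddedListExample() {
    Node a{1, nullptr};
    Node b{2, nullptr};
    PaddedList x;
    PaddedList y;
    x.push(&a);
    y.push(&b);
    x.swap(y);
    assert(x.head == &b);
    assert(y.head == &a);
}
```

Because of the alignas, an array or struct of several PaddedList objects places each one on its own line without any extra work.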

#1 King Mir

Posted 03 November 2013 - 01:16 AM

This is mostly what I care about. Basically, I'm exchanging the pointers. Let's say the threads are running on two separate cores. If I understand what's going on at that level (which I probably don't, hence these questions), ThreadA swaps the pointers. Then, in order for ThreadB to have these new pointer locations in its cache, will ThreadA send an update message to ThreadB (doubtful, as in theory ThreadB doesn't know ThreadA also has these pointers)? Or, to get the updated values, does ThreadB have to read them back from RAM (meaning ThreadA would write to RAM), or potentially from a higher-level cache? Can someone explain to me what's going on at this level, and whether I should be doing something else here to ensure I don't have more false cache sharing?

No, that's not what's going on.

A thread on Core A writes one of the 4 variables, which puts it in its L1 cache. Then a thread on Core B reads one of the variables, but because the cache line holding that variable was modified by Core A, it must get it from the write buffer of Core A, which, depending on the details of the implementation, involves going through the L3 cache or otherwise has a comparable latency. Core B might then write to that cache line itself, so that Core A's L1 cached version is invalidated. Then Core A might write to the cache line, which invalidates Core B's L1 and L2 copies of that line. And back and forth.

That kind of thing is unavoidable on LockState, and to a lesser extent on TempList, but if all 4 variables are on the same cache line, they are all shared as if they were one variable, incurring the performance penalties of sharing on all of them whenever any one of them is accessed. To the cache, they are one variable. So the solution is to ensure that each of the four variables is on its own cache line.

One way to do this is, instead of using pointers and a primitive integer, to use a class the size of a cache line. You can even make it your linked list, if you make swapping the list cheap and put enough padding at the end of it to fill a cache line.
