PrestoChung

Cache behavior

This topic is 2888 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

I used this site to learn about how caches work, and I have a question.

I have queried my system and found the cache line size to be 64 bytes.

Let's say I have a Matrix class that is 64 bytes.


If I have an array[] of these Matrices, when I request, say, the 3rd matrix (Matrices[3]) are any other adjacent matrices loaded into the cache? I think it might depend on the segment size but I haven't found any suggestions on how to see what segment size my system has.

Thanks.

Check out Gallery of Processor Cache Effects as well.
You can definitely use instructions to force it to pre-fetch data, and some access patterns may cause it to pre-fetch data predictively. But it depends on the hardware.
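
As a rough sketch of the explicit pre-fetch idea, the function below sums values through an index table and asks for a later element ahead of time. It assumes GCC or Clang, where `__builtin_prefetch` is a compiler builtin; on other compilers the hint is simply compiled out. The name `sum_indirect` and the distance `kPrefetchDistance` are made up for illustration, and the right distance (if pre-fetching helps at all) has to be measured on the target hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tuning value: how many iterations ahead to request data.
// The useful distance depends on the CPU and on the work done per element.
constexpr std::size_t kPrefetchDistance = 16;

// Sums data[indices[i]] for every i. Precondition: every index is valid.
// The access pattern is indirect, so a hardware pre-fetcher may not be able
// to predict it; this is the kind of loop where a software hint can help.
inline float sum_indirect(const std::vector<float>& data,
                          const std::vector<std::size_t>& indices) {
    float total = 0.0f;
    for (std::size_t i = 0; i < indices.size(); ++i) {
#if defined(__GNUC__) || defined(__clang__)
        if (i + kPrefetchDistance < indices.size()) {
            // Arguments: address, 0 = read access, 3 = high temporal locality.
            __builtin_prefetch(&data[indices[i + kPrefetchDistance]], 0, 3);
        }
#endif
        assert(indices[i] < data.size());
        total += data[indices[i]];
    }
    return total;
}
```

The pre-fetch is only a hint: the result is the same with or without it, and only a benchmark can tell whether it is faster.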

It depends on things like which language the array is written in (different languages structure memory differently), the exact details of the Matrix class's type (some languages arrange objects in memory differently depending on their type), and how exactly the matrix array is allocated.

Don't forget that you have to specify an alignment to make sure your matrices are aligned to cache line boundaries, so that the cache works efficiently for you. Even though you create an array of 64-byte elements, this doesn't mean it is aligned to a 64-byte boundary, for example if there are other members before the array in your class.
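
A minimal C++ sketch of that alignment point, assuming the 64-byte line size from the question and a 4x4 matrix of 4-byte floats. `Matrix`, `Scene`, `kCacheLineSize`, and `cache_lines_touched` are illustrative names, not anything from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, as reported in the question.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical 4x4 float matrix: 16 * 4 bytes = 64 bytes of payload.
// alignas makes every Matrix start on a cache line boundary.
struct alignas(kCacheLineSize) Matrix {
    float m[16];
};

static_assert(sizeof(Matrix) == kCacheLineSize, "Matrix should fill one line");
static_assert(alignof(Matrix) == kCacheLineSize, "Matrix should be line-aligned");

// Number of cache lines covered by `size` bytes starting at `address`.
inline std::size_t cache_lines_touched(std::uintptr_t address, std::size_t size) {
    assert(size > 0);
    const std::uintptr_t first = address / kCacheLineSize;
    const std::uintptr_t last = (address + size - 1) / kCacheLineSize;
    return static_cast<std::size_t>(last - first + 1);
}

// A class with a member in front of the array. Because Matrix is alignas(64),
// the compiler pads after `count` so the array still starts on a boundary.
struct Scene {
    int count;
    Matrix matrices[4];
};
```

Without the `alignas`, a 64-byte matrix that starts 32 bytes into a line straddles two lines: `cache_lines_touched(32, 64)` gives 2, while an aligned one gives 1. For heap allocations, C++17 and later should honor the over-alignment in `new` and `std::vector`; with older standards an aligned allocator may be needed.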

It's probably worth pointing out that to get good cache performance it definitely helps to understand how caches work internally, but ultimately there are so many different processors with different cache sizes, etc., that you will struggle to target each one individually. Following a couple of simple rules is normally enough:

Keep frequently accessed classes as small as possible
Allocate those classes in blocks or in a contiguous array

Variations on this would be if you have a big class that you can't get any smaller, separate the "hot" data (data accessed frequently) from the "cold" data (infrequently accessed). Allocate the hot data in one block and the cold data in another, then construct your class bringing the two together.

There is lots more to it but if you follow those main points you'll probably be making better use of the cache.
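
The hot/cold split described above might look something like this. It is only a sketch with invented names (`ParticleHot`, `ParticleCold`, `ParticleSystem`); the point is that the per-frame loop walks one small, contiguous array and never drags the rarely used fields through the cache.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical "hot" data: touched every frame, kept small (24 bytes).
struct ParticleHot {
    float x, y, z;
    float vx, vy, vz;
};

// Hypothetical "cold" data: only needed occasionally (debugging, tooling).
struct ParticleCold {
    std::string debug_name;
    int spawn_frame;
};

// Brings the two halves together: one index refers to the same particle
// in both arrays, but each array is stored in its own contiguous block.
class ParticleSystem {
public:
    std::size_t add(const ParticleHot& hot, ParticleCold cold) {
        hot_.push_back(hot);
        cold_.push_back(std::move(cold));
        return hot_.size() - 1;
    }

    // The frequently run loop reads and writes only the hot array.
    void integrate(float dt) {
        for (ParticleHot& p : hot_) {
            p.x += p.vx * dt;
            p.y += p.vy * dt;
            p.z += p.vz * dt;
        }
    }

    const ParticleHot& hot(std::size_t i) const {
        assert(i < hot_.size());
        return hot_[i];
    }

    const ParticleCold& cold(std::size_t i) const {
        assert(i < cold_.size());
        return cold_[i];
    }

private:
    std::vector<ParticleHot> hot_;
    std::vector<ParticleCold> cold_;
};
```

With this layout, two to three hot records fit in each 64-byte line, instead of each line being partly filled with strings and bookkeeping that the loop never reads.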

Another useful rule when trying to be cache friendly is to iterate forwards in memory when performing accesses.

Desktop CPUs perform reasonably aggressive pre-fetching on the assumption that if you touch this block you'll more than likely want the next; the i3/5/7 series has a very aggressive pre-fetcher, so walking forward in memory is a very good case for it.

On the flip side, if you walk backwards along cache lines the CPU may not pre-fetch the data (this depends on the pre-fetcher), in which case every 64 bytes you'll trigger a cache miss and end up waiting around while the data is fetched from main memory.
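
If you want to check how your own CPU treats the two directions, a small harness along these lines could be used. The function names and the timing helper `time_us` are made up for this sketch, and whether the backward walk is actually slower is something to measure, not assume: results vary by CPU, array size, and compiler optimizations.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Walks the array from the lowest address to the highest.
inline long long sum_forward(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Walks the same array from the highest address to the lowest.
inline long long sum_backward(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = v.size(); i-- > 0;) total += v[i];
    return total;
}

// Sanity check: direction changes the access pattern, not the answer.
inline void check_directions_agree(const std::vector<int>& v) {
    assert(sum_forward(v) == sum_backward(v));
}

// Hypothetical timing helper: runs f once, stores its result in `result`
// (so the work is not optimized away), and returns elapsed microseconds.
template <typename F>
long long time_us(F&& f, long long& result) {
    const auto start = std::chrono::steady_clock::now();
    result = f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

To get a meaningful comparison, use an array much larger than the last-level cache, for example `time_us([&] { return sum_forward(big); }, r)` against the same call with `sum_backward`, and repeat each measurement several times.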
