How to cache?

19 comments, last by Ravyne 10 years, 7 months ago

Caching on the CPU generally works like this: the entire memory space is conceptually chopped up into 64-byte chunks -- whenever you read any byte within a chunk, the entire chunk is loaded. So if you read the first byte of a chunk, all the bytes after it are loaded too; if you read the last byte, all the bytes before it are; if you read from somewhere in the middle, the bytes on both sides are. This is typical of L2 and L3 caches; I think L1 caches sometimes (still? used to?) operate with smaller cache lines of 16 or 32 bytes.

Each cache 'line' corresponds to many memory chunks (total memory / total cache), so reading from some other memory location can sometimes evict a line from the cache prematurely. A direct-mapped cache always evicts in that case, because there's only one cache line to go around for all the memory locations that map to it, but an n-way set-associative cache has n 'slots' (each an abstraction of a cache line), so nothing is prematurely ejected as long as you aren't actively reading from n+1 or more memory locations that map to the same set (most desktop systems today are 4-way set associative, which is sufficient for most access patterns).

On a GPU, caching for texture access is optimized for spatial locality (in texture coordinates) rather than linear access patterns. I'm honestly not sure what the state of automated caching is for GPU compute, but there's also a small block of memory for each cluster of SIMDs in a GPU that's under direct programmer control, which fulfills the role of a cache too.
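That programmer-controlled block is what CUDA exposes as `__shared__` memory (OpenCL calls it local memory; AMD calls it LDS). A minimal CUDA sketch of using it as a manually managed cache -- the kernel, names, and the 256-element tile size are just illustrative, not any particular engine's code:

```
__global__ void sum_tile(const float* in, float* out, int n) {
    // One small, fast block of memory per SIMD cluster, shared by the
    // thread block -- the programmer decides what lives here and for how long.
    __shared__ float tile[256];

    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Stage a tile from slow global memory into fast shared memory.
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();  // make the whole tile visible to every thread in the block

    // Every thread in the block can now reuse any element of the tile
    // without touching global memory again.
    if (threadIdx.x == 0) {
        float s = 0.0f;
        for (int j = 0; j < blockDim.x; ++j) s += tile[j];
        out[blockIdx.x] = s;
    }
}
```

Unlike a hardware cache, nothing is loaded or evicted automatically here: if you don't copy data in, it isn't there.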

throw table_exception("(╯°□°)╯︵ ┻━┻");

This topic is closed to new replies.
