How to cache?

19 comments, last by Ravyne 10 years, 7 months ago

Caching on the CPU generally works like this: the entire memory space is conceptually chopped up into 64-byte chunks -- whenever you read any byte within a chunk, the entire chunk is loaded. So if you read the first byte of a chunk, all the bytes after it are loaded too; if you read the last byte, all the bytes before it are; if you read from somewhere in the middle, the bytes on both sides are. This is typical of L2 and L3 caches; I think L1 caches sometimes (still? used to?) operate with smaller cache lines of 16 or 32 bytes.

Each cache 'line' corresponds to many memory chunks (total memory / total cache), so reading from some other memory location can sometimes evict a line from the cache prematurely. A direct-mapped cache always evicts in that case, because there's only one cache line to go around for all the memory locations that map to it, but an n-way set-associative cache has n 'slots' (each an abstraction of a cache line), so nothing is prematurely ejected as long as you aren't actively reading from n+1 or more memory locations that map to the same set (most desktop systems today are 4-way set associative, which is sufficient for most access patterns).

On a GPU, caching for texture access is optimized for spatial locality (in texture coordinates) rather than linear access patterns. I'm honestly not sure what the state of automated caching is for GPU compute, but there's also a small block of memory for each cluster of SIMDs in a GPU that's under direct programmer control, which fulfills the role of a cache too.
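That programmer-controlled block is what CUDA exposes as `__shared__` memory (OpenCL calls it local memory; AMD calls it LDS). A minimal CUDA sketch of using it as a manually managed cache -- the kernel, names, and the 256-element tile size are just illustrative, not any particular engine's code:

```
__global__ void sum_tile(const float* in, float* out, int n) {
    // One small, fast block of memory per SIMD cluster, shared by the
    // thread block -- the programmer decides what lives here and for how long.
    __shared__ float tile[256];

    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Stage a tile from slow global memory into fast shared memory.
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();  // make the whole tile visible to every thread in the block

    // Every thread in the block can now reuse any element of the tile
    // without touching global memory again.
    if (threadIdx.x == 0) {
        float s = 0.0f;
        for (int j = 0; j < blockDim.x; ++j) s += tile[j];
        out[blockIdx.x] = s;
    }
}
```

Unlike a hardware cache, nothing is loaded or evicted automatically here: if you don't copy data in, it isn't there.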

throw table_exception("(╯°□°)╯︵ ┻━┻");

This topic is closed to new replies.
