Nicholas Kong

What is bad about a cache miss, since an initial cache miss is inevitable?


Recommended Posts

If a cache miss occurs, then the cache line containing the missed instruction is loaded into the CPU cache. This means that the next time the instruction is processed by the CPU, it will be a cache hit.

So I don't see why cache misses are stressed as being bad, if one is bound to happen on the first attempt for every instruction processed by the CPU?

Instructions have locality to them.  If you don't have any branches, the CPU will load a series of instructions into the cache, so you'll only get one cache miss per cache line.  It's not "one miss per instruction", but rather "one miss per N instructions", where "N" is the cache line size divided by the instruction size.
 
In reality, it's going to be even less than that because modern CPUs prefetch instructions, meaning they're basically streaming in more as they consume them.
Edited by SeraphLance



Even so, this is not a lot of data, and it is a big part of the reason why some games run different parts of their simulation at different rates (for example, you might run simulation logic at 30 fps, or only run 1/4th of your AI entities each frame).

 

Not entirely related to the question by the OP, but this is one of those magical things I think a lot of people (including myself) don't realize when starting out, or even for many years: you don't need to do everything every frame, or even finish it within a single frame at all.

Woe is he who uses virtual dispatch on his hot loops. -- ancient Chinese proverb

Yes, certainly the cult of OOP papers over their ongoing transgressions and its leadership still encourages blind adherence to doctrine that can be harmful. But as Frob pointed out so well, at least you're getting what it is you're paying for. A wiser person would simply avoid virtual dispatch where it wasn't necessary.

Data-Oriented Design techniques, it's worth noting, ought to positively impact cache efficacy for both data (increasing spatial/temporal locality, splitting data structures along usage lines) and code (sorting polymorphic collections such that all foo subclasses are processed at once, then all bar subclasses, and so on).

There's performance to be gained for sure, but you ought to be fairly well off if you haven't done something painfully naive. I'd pretty much exhaust optimizing D-cache behavior before examining I-cache behavior (though I would design from the start with both in mind).


The cache is a buffer sitting in front of the data your program operates on, whether it is being written to or read from, organized as multiple levels of cached contiguous memory between RAM and the CPU registers themselves. Put simply, for a thread working on memory it has exclusive read/write access to: a write is not pushed out to the next level up until it has to be, and a read does not go to the next level up until it cannot be satisfied at the current level. If a cache miss occurs, the request moves up to the next level, where the data is guaranteed to be found eventually (notice that it may still not be in RAM at all, only in a higher-level cache); on a cache hit, an entire line of localized memory is filled into the lower level and the operation continues down into the registers. Once done in the registers, the next memory access walks the same path from its closest cache, going higher and higher until it hits.

 

For example, when I was young I measured my PC's speed with an unintentionally super cache-friendly loop:

 

float* fs = new float[1000000];
for (int i = 0; i < 1000000; i++)
{
    fs[i] = fs[i] * 3.7f;
}

 

You wouldn't believe how bestially fast those 4 MB of data are dealt with (read and written).

 

 


Most of the important things have been said, but... you also need to ask yourself: what does the CPU do when a cache miss happens? Short answer: most of them will just spin and wait, which results in wasted CPU cycles. The table below gives you an idea of those cycles and the average time required on most modern desktop systems:

 

[attachment=24330:cpu-cach-info.png]

* Screenshot from an article I'm working on.

Edited by Nemo Persona
