wqking

x86 PC, is it worth optimizing for cpu cache?


wqking    761
[ Please keep this topic to the CPU cache only. There is no need to say "optimize the algorithm first"; that belongs in another topic. ]

Hi cpu optimizing ninjas,

What I need to do in one of my hobby C++ projects is process small data (hundreds or thousands of bytes) over a lot of iterations each time (maybe up to 10K or 100K), doing things such as changing some bits in some bytes. That whole pass may itself happen many times, such as 100K. 10K*100K is driving me to think about optimizing...

Question 1: if you have any similar experience of optimizing away CPU cache misses on an x86 PC, can you share the results? Was the performance improvement significant?
I strongly suspect so, but some real-life experience would give me more confidence.

Question 2: can you recommend any very good guides on optimizing for the CPU cache in C++? That would help me get started quickly. I'm also googling and searching GDNet, but my CPU knowledge is still stuck in the 80386 era.
So far I have found this post quite good to read:
[url="http://www.gamedev.net/topic/542247-cache-optimisations-for-beginners/"]http://www.gamedev.net/topic/542247-cache-optimisations-for-beginners/[/url]

Hodgman    51334
1) IMHO, cache-usage optimisation is THE main low-level optimisation technique these days. CPU cycles are cheap, but memory is horribly slow in modern computers. I don't care about how many CPU cycles an algorithm takes, I only care about which parts of RAM it's accessing.

As an example:
At work I've been rewriting our renderer lately. The old renderer was not optimised for cache at all, but with the new one, I've been thinking about the cache [i]constantly[/i]. Every time I write a structure or allocate some memory, I consider how, when and why that data will be used by the CPU.

We knew that the old renderer was slow, but there were no real 'bottlenecks' in it -- when you profiled it, there wasn't an obvious part that needed to be optimised. It was just slow everywhere due to constant cache misses.

The old renderer took over 8ms of time to process a sample level, whereas the new re-written renderer takes 0.6ms to process the same level. That's more than a 10x speed-up, mostly due to caring about memory!
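
To make the "which parts of RAM it's accessing" point concrete, here is a minimal sketch (not taken from the renderer above; the function names and the flat row-major grid are assumptions for illustration). Both functions do exactly the same arithmetic and return the same sum. The only difference is the order in which memory is touched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sums a rows x cols grid stored row-major in one
// contiguous vector. The inner loop walks adjacent bytes, so every byte of
// each cache line that gets fetched is used before moving on.
std::uint64_t sum_row_major(const std::vector<std::uint8_t>& grid,
                            std::size_t rows, std::size_t cols)
{
    assert(grid.size() == rows * cols);
    std::uint64_t total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += grid[r * cols + c];
    return total;
}

// Same data, same result, but the inner loop jumps `cols` bytes per step.
// On a grid much larger than the cache, each access tends to land on a
// different cache line, so most of every fetched line goes unused.
std::uint64_t sum_column_major(const std::vector<std::uint8_t>& grid,
                               std::size_t rows, std::size_t cols)
{
    assert(grid.size() == rows * cols);
    std::uint64_t total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += grid[r * cols + c];
    return total;
}
```

How big the gap is depends on the CPU, the grid size and the compiler, so time both versions on your own machine rather than trusting a rule of thumb. For a grid of only a few hundred bytes, everything is likely to fit in L1 cache and the two may perform about the same.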

2) It's hard to take [url="http://macton.smugmug.com/gallery/8936708_T6zQX#593426709_ZX4pZ"]existing code[/url] and optimise it for good cache usage. Usually you'll have to rewrite your data structures.
[url="http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf"]http://research.scee...ing_GCAP_09.pdf[/url]
[url="http://gamesfromwithin.com/data-oriented-design"]http://gamesfromwith...oriented-design[/url]
[url="http://bitsquid.blogspot.com/2010/05/practical-examples-in-data-oriented.html"]http://bitsquid.blog...a-oriented.html[/url]
[url="http://www.slideshare.net/DICEStudio/introduction-to-data-oriented-design"]http://www.slideshar...oriented-design[/url]
[url="http://www.slideshare.net/DICEStudio/a-step-towards-data-orientation"]http://www.slideshar...ata-orientation[/url]
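
As a rough idea of what "rewrite your data structures" can mean, here is a minimal sketch under assumed names (`ItemAoS`, `ItemsSoA`, `ItemCold` and both functions are hypothetical, not from any of the linked material). It takes the bit-flipping case from the original question: the loop only needs one flag byte per item, and the two layouts differ in how much unrelated data gets dragged through the cache along with it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "one object per item" layout: the hot flag byte sits next to
// cold data the loop never reads. Each item is 64 bytes on typical compilers,
// so each flag usually lives on its own cache line.
struct ItemAoS {
    std::uint8_t flags;   // hot: touched every iteration
    char name[63];        // cold: not needed by the loop
};

// Hypothetical split layout: hot bytes packed together, cold data elsewhere.
struct ItemCold {
    char name[63];
};
struct ItemsSoA {
    std::vector<std::uint8_t> flags;  // many flags per cache line
    std::vector<ItemCold> cold;       // same index as flags, rarely touched
};

// Sets one bit in every item's flags, array-of-structures version.
void set_bit_aos(std::vector<ItemAoS>& items, unsigned bit)
{
    assert(bit < 8);
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << bit);
    for (ItemAoS& item : items)
        item.flags = static_cast<std::uint8_t>(item.flags | mask);
}

// Same operation on the split layout: the loop streams through a dense
// byte array and never touches the cold data.
void set_bit_soa(ItemsSoA& items, unsigned bit)
{
    assert(bit < 8);
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << bit);
    for (std::uint8_t& f : items.flags)
        f = static_cast<std::uint8_t>(f | mask);
}
```

The trade-off is that code which needs both the flags and the name of one item now has to index two arrays, so the split only pays off when the hot loop dominates. Whether it helps for a given workload is something to measure, not assume.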

wqking    761
[quote name='Hodgman' timestamp='1305862720' post='4813311']
The old renderer took over 8ms of time to process a sample level, whereas the new re-written renderer takes 0.6ms to process the same level. That's more than a 10x speed-up, mostly due to caring about memory!
[/quote]

8->0.6 is already a good reason for me to keep CPU cache optimization in mind.
I will try to tweak my OOP-only data structures and code to be a little more DOP (data oriented).

Thanks for the timing data!

[quote name='Hodgman' timestamp='1305862720' post='4813311']

2) It's hard to take [url="http://macton.smugmug.com/gallery/8936708_T6zQX#593426709_ZX4pZ"]existing code[/url] and optimise it for good cache usage. Usually you'll have to rewrite your data structures.
[url="http://research.scee.net/files/presentations/gcapaustralia09/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf"]http://research.scee...ing_GCAP_09.pdf[/url]
[url="http://gamesfromwithin.com/data-oriented-design"]http://gamesfromwith...oriented-design[/url]
[url="http://bitsquid.blogspot.com/2010/05/practical-examples-in-data-oriented.html"]http://bitsquid.blog...a-oriented.html[/url]
[url="http://www.slideshare.net/DICEStudio/introduction-to-data-oriented-design"]http://www.slideshar...oriented-design[/url]
[url="http://www.slideshare.net/DICEStudio/a-step-towards-data-orientation"]http://www.slideshar...ata-orientation[/url]
[/quote]

I also found this presentation quite good:
[url="http://www.research.scea.com/research/pdfs/GDC2003_Memory_Optimization_18Mar03.pdf"]http://www.research.scea.com/research/pdfs/GDC2003_Memory_Optimization_18Mar03.pdf[/url]
