Back to General and Gameplay Programming

Code optimisation

General and Gameplay Programming Programming

Started by Leo_E_49 November 03, 2005 11:30 AM

42 comments, last by Last Attacker 18 years, 5 months ago

Leo_E_49

228

Author

November 07, 2005 08:46 AM

Thanks. :D That's just what I was looking for.

One question on caching, how do I do that in C++? (I did a quick google on it and didn't come up with any good tutorials, only more complex applications)

Anonymous

November 07, 2005 09:33 AM

It depends on which kind of caching you're talking about. Essentially, the first few types of caching mentioned above involve changing many slow requests for unchanging information to a single request that stores the information somewhere faster.

The last type (processor cache) is automatic in modern machnes - it automatically loads stuff from RAM into the cache, and when it needs to load something and there isn't more room, it writes from the cache to the RAM it represents to make room for more informtaion.

The best way to take advantage of cache is to make your memory accesses be to locations in memory that are near eachother, because that increases the chance that the cache will be able to hold onto data for a long time. If you're constantly accessing memory all over the place, the cache will have to constantly load and store data in RAM so it won't be able to eliminate the delay caused by accessing RAM.

A good example is a linked list vs an array - because an array is a single chunk of memory, a lot of the array can fit into cache so looping through the array won't have to read directly from RAM very often. On the other hand, linked lists can have nodes everywhere in memory, so the next node isn't as likely to be in the cache which means the program will ending up accessing RAM quite often. Of course, you can't just say this makes arrays better, because if your problem involves lots of insertions and deletions from the middle, using an array would kill your performance because it would mean constantly shifting around elements.

A decent compromise between the two data structures that would help cache coherency ('make the cache have the data you need') by locality of reference('referencing memory close to memory you previously accessed') would be to allocate arrays of nodes (say 32 at a time) and when adding a node to the list, attempt to put it near at least one of the nodes it will be pointing to. You could also make an 'optimize' routine, that would reorder the list nodes so they're in order (ie node[0] in a block points to node[1], and node[31] points to node[0] in the next block) before you do a lot of iteration over it. You'd have to test it quite a bit to find out how many iterations are required for the 'optimization' to actually increase performance, but overall such a structure should be quite a bit faster than individually allocated nodes even if you don't reoder the list because the nodes will still probably end up near nodes they interact with if the code tries to insert them close to either their parent or child nodes.
-Extrarius

Leo_E_49

228

Author

November 07, 2005 08:11 PM

Thanks, that explains it. :)

Last Attacker

172

November 08, 2005 12:32 AM

Quote:Original post by Michalson
Um, not to burst your bubble, but array vs. linked list is a high level optimization (a structural/implimentation optimization, as I talked about in my post). Because it is a high level optimization, you can get a huge speed increase. When was the last time you needed assembly to choose between a linked list and an array? Many people attempting assembly optimization who show off their "amazing" results have often only changed their high level implimentation, but wasted a lot of time doing it at the raw assembly level (and they don't realize it because the assembly is hard to read in the first place). It's often the case that you can then reverse that "assembly optimization" by implimenting the method (i.e. changing array to linked list) they used in assembly in a higher level language, and get faster results because of the presence of compiler optimization.

I realise what you are saying but the whole linked list thing was just to state that without lower level optimization, a high level optimization CAN suffer some performance hits although the algorithm is good.
I don't want to argue or anything, but in my opinion, high level optimization is good, very good infact ... but I also like to combine that with the low level optimizations that I know (and love ;) ) to get very good performance results.

"Take delight in the Lord and He will give you your heart's desires" - Psalm 37:4My Blog

Code optimisation

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Code optimisation

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines