Hello,
I was recently watching Mike Acton's talk at the CPP Con, regarding "Data Oriented Design", and I've been thinking a lot about how I might be able to structure my game code to better take advantage of cache coherency. The ideal here would be that all major processing is done on contiguous arrays of data, directly in-order.
But I've been struggling on one particular aspect of this: the problem is that the ordering requirement between various systems is not the same. For example, suppose I have a typical flow of data: I have high-level behavior that updates positions/rotations, which then needs to be transformed by collision and collision response, and then converted into matrix transforms for draw calls. There are distinct ordering requirements for each step here: behavior will want data grouped by type, collision will want it grouped by spatial locality, and rendering will want it grouped by draw order/material/etc. Beyond that, these requirements will change as the state of my game changes ( not referring to static objects here ).
This means that I cannot simply set up fixed arrays and run my processing in the same order across the board. Somehow, I have to convert between the different orderings as I go. Specifically I am considering the efficiency of the reads from the source, the writes to the destination, and then the reads at the destination. I have thought of three possibilities to do this:
Maintain the data in linked lists per-system
- This is what I have done for many years to solve this problem. It is easy to use, and there is no major processing required to maintain ordering. But the issue is, you end up iterating ( reads at the destination ) out of order, and you have all the node values filling up your cache.
Use linked lists for indexing and copy to arrays per-system
- I have just recently tried this out. The iteration at the destination is noticeably faster, but the copy ( writes to the destination ) is now out of order and takes longer. Overall, it has been only a minor speed boost for me.
Copy to arrays directly and sort per-system
- I have not tried this out yet, but I intend to. Theoretically this could have the least amount of cache misses as everything is done in order. It's hard to believe that the cost of running the sort algorithm would not outweigh the gains, but I'll have to test that.
Does anybody else out there have solutions that I am not thinking of? I'd be greatly interested to know how people tackle this kind of problem.
Thanks for reading!