Some questions on Data Oriented Design


My favorite part about programming is the opportunity to keep learning new things, and recently one of those things has been Data Oriented Design. Having taken a class in optimization last year, I have a new appreciation for cache coherency (the final project was to optimize a program written in assembly, and 90% of the optimizations I achieved came from simply having a better data layout), so DOD is something I'm interested in pursuing.

After having read some of the commonly cited articles on it, as well as having watched Mike Acton's CppCon talk on the subject, I have a much better grasp on this topic, but I still have a few questions.

It seems that Data Oriented Design is commonly brought up as the antithesis to Object Oriented Design. To me, OOD has always been about encapsulation, and it seems having that here would only be an even greater advantage. To combine the two, couldn't you just shift your thinking from objects as the data to objects as views of the data? That way you could reorganize your internal data structures every day, and as long as your OOP interface remains stable the rest of your code will be perfectly happy.

In all of the examples I've seen, it seems that you're essentially building new data structures specifically for each function, based purely on what data that function needs and how it will be operating on it. If that's the case, couldn't you be losing a lot of performance just transforming the data from one function so that you can pass it into another?

A lot of Mike Acton's criticisms of other code bases involved how people like to think of each function working on a single piece of data at a time, even though that's often not the reality of the situation. His solution to that was to refactor functions into operating on an array of values, rather than a single value. However, why couldn't you simply run that function in a loop, rather than putting the loop into the function?

At the beginning of his CppCon video, he listed Insomniac's priorities when it comes to code. I really appreciated him doing that, because he seems to be conveying that nobody actually thinks that this model somehow makes code more "beautiful" than before; rather, that it solves the real problems that game developers face when shipping AAA games, in the least hideous way possible. My original resistance (hah) to DOD stemmed from my assumption that people were saying this makes code prettier, and I still really don't think it does.

Which brings me to my final question: as a hobbyist, my priorities are rather different than his. While I value performance, as long as things are running quickly enough I really don't care. Given that, I'd say my main metric for code quality is flexibility and beauty, which (from what I've seen) DOD doesn't necessarily promote. Can anyone here give some advice on how/where the best ways/places to apply this model would be, while still maintaining clarity and sanity in the rest of the code base?


The two are not necessarily at odds. The way many people implement their classes is a poor fit for data flows, but that does not mean object oriented programming is at odds with data oriented programming.

When it comes to programming paradigms, code can often follow many at once. Your code can be object oriented, and event driven, and data oriented, and more, all at the same time.


To combine the two, couldn't you just shift your thinking from objects as the data to objects as views of the data?

That's assuming a person thinks of objects that way to begin with. My view (when in C++) is that a class is a cluster of behavior; I don't particularly care about the internal implementation details of what the data is or how it is laid out. My biggest concern is that I have a collection of verbs or verb phrases I can apply to an item. It could be Move() or WalkTo() or RunTo() or SnapToGrid(); how it happens doesn't matter.

Data oriented design is typically more about storing collections of objects as a quickly enumerated series, or as a sequence of actions.

So combining them, if you are working with a collective set of things (lots of points, lots of particles, lots of ...) you can certainly create a cluster of behavior that operates on the collective set. You can additionally create a cluster of behavior that operates on a single item within that collective set.
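
As a quick illustration, here's a minimal sketch (hypothetical names, not from any real codebase) of that idea: one class clustering behavior over the collective set, with the layout kept as a private detail:

```cpp
#include <cstddef>
#include <vector>

// A "cluster of behavior" over a collective set. Callers see verbs;
// the contiguous struct-of-arrays layout stays a private detail that
// can be reorganized without touching calling code.
class ParticleSystem {
public:
    void Spawn(float x, float y) { xs.push_back(x); ys.push_back(y); }

    // Behavior over the whole set...
    void MoveAll(float dx, float dy) {
        for (std::size_t i = 0; i < xs.size(); ++i) { xs[i] += dx; ys[i] += dy; }
    }

    // ...and behavior over a single item within it.
    void Move(std::size_t i, float dx, float dy) { xs[i] += dx; ys[i] += dy; }

private:
    std::vector<float> xs, ys; // hot data packed for quick enumeration
};
```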


Which brings me to my final question: as a hobbyist, my priorities are rather different than his. While I value performance, as long as things are running quickly enough I really don't care. Given that, I'd say my main metric for code quality is flexibility and beauty, which (from what I've seen) DOD doesn't necessarily promote. Can anyone here give some advice on how/where the best ways/places to apply this model would be, while still maintaining clarity and sanity in the rest of the code base?

The famous Donald Knuth quote still applies. "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

Almost always we should write our code to be read and understood by humans first, and by computers second. That is what you called "flexibility and beauty". Knuth's ad-hoc value was "about 97% of the time", which is still roughly true.

Sometimes, about 3% of the time by that same ad-hoc measure, it is critical to do things in a way driven by computer performance rather than human interpretation. When those spots are found, they should be critically reviewed.

If you don't know which areas of your code it applies to, pop open a profiler and take a look. Usually they'll be quite obvious.

When code is slow and a profiler reveals a high cache miss and cache eviction rate, that is a good sign that reviewing your data access patterns will help.

Over time you learn to spot patterns that have caused issues in the past and take steps to address them. You still need to profile before, during, and after making changes for performance reasons, but if you know one pattern is slow and a different pattern is faster and you can choose between them, generally choose the faster one.
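
To make that concrete, here's a hedged sketch of the kind of access-pattern change a profiler might point you toward: an array-of-structs layout versus a struct-of-arrays layout for the same update (illustrative types, not from any particular engine):

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: updating x drags every entity's cold fields through cache.
struct Entity {
    float x, y, z;
    char cold[52]; // health, AI state, etc. - unused by this update
};
void UpdateAoS(std::vector<Entity>& es, float dt) {
    for (auto& e : es) e.x += dt; // pulls ~64 bytes per entity for 4 useful bytes
}

// Struct-of-arrays: the same update streams through a dense float array.
struct Entities {
    std::vector<float> xs, ys, zs; // cold data stored elsewhere
};
void UpdateSoA(Entities& es, float dt) {
    for (float& x : es.xs) x += dt; // 16 useful floats per 64-byte cache line
}
```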

In all of the examples I've seen, it seems that you're essentially building new data structures specifically for each function, based purely on what data that function needs and how it will be operating on it. If that's the case, couldn't you be losing a lot of performance just transforming the data from one function so that you can pass it into another?


Don't forget that a lot of the examples are either taken out of context or were conceived in isolation. I would say that building new data structures for each function is only a first step. Remember that DOD isn't (really) about dogmatically enforcing any particular kind of design, as OOP practitioners often do - it's about working out what data you have, what data you need, and how best to do that transformation between the two given the hardware you have. If performance is suffering from converting data around, and you can get a performance gain from consolidating some of the different structures, then it would be perfectly reasonable to coalesce some or all of the structures, perhaps applying similar principles to Casey Muratori's "semantic compression." If you haven't come across it, this blog post may be of interest to you.

A lot of Mike Acton's criticisms of other code bases involved how people like to think of each function working on a single piece of data at a time, even though that's often not the reality of the situation. His solution to that was to refactor functions into operating on an array of values, rather than a single value. However, why couldn't you simply run that function in a loop, rather than putting the loop into the function?


On a high level, he's arguing for thinking on the system level rather than the object level. Having each function operate on a set of objects rather than a single object forces you to think about the "general case," which often is sets of objects rather than singletons. It also leaves a window open for cases where you WILL need to operate on sets of objects rather than singletons even if that doesn't turn out to be the general case at first.

On the implementation level, sometimes there may be certain optimizations that you can apply that require knowledge of the data as an aggregate even when you're only mutating a single datum. One can do some pretty interesting things with SSE intrinsics, for instance. Or maybe there's some expensive operation that produces data that every element in the array will need and is constant across the array, meaning that it's both faster and simpler to extract from the loop.
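
A small sketch of that last point (my own hypothetical example, not Acton's): rotating a batch of points only needs the sin/cos computed once, which the function can only exploit if it owns the loop:

```cpp
#include <cmath>
#include <cstddef>

// The sin/cos are comparatively expensive but constant across the array,
// so owning the loop lets the function hoist them out. A per-point version
// would either recompute them per call or push that burden onto each caller.
void RotateAll(float* xs, float* ys, std::size_t n, float angle) {
    const float s = std::sin(angle);
    const float c = std::cos(angle);
    for (std::size_t i = 0; i < n; ++i) {
        const float x = xs[i], y = ys[i];
        xs[i] = c * x - s * y;
        ys[i] = s * x + c * y;
    }
}
```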

Can anyone here give some advice on how/where the best ways/places to apply this model would be, while still maintaining clarity and sanity in the rest of the code base?


I've found it useful to think about my code in terms of what operations I need to do, while I think about my design in terms of what data I need. Suppose I need to draw a tree on the screen; the data I'm trying to get is a bunch of pixels - or, since I'm likely to be going through the hardware, a draw call. The first question I ask is, what data do I need to generate that draw call? Well, I might need a transform and a shader, and that shader consumes a mesh, a texture, and lighting information so I'll need those as well. So my next questions are, where does that data come from, and how do I store that data efficiently given where it's coming from? Maybe the transform comes from a scene graph, and the assets come from disk. But loading things from disk is really slow, so I'll want to cache those because that's more efficient. Meanwhile, the scene graph is being populated and maintained by the gameplay code... And so on.

For maintaining clarity and "sanity" (which means different things to different people), in my own code I do somewhat as you suggest, actually. OOD is useful for enforcing design contracts (e.g. state invariants), and with DOD, those design contracts tend to move from applying to single objects to applying to systems of related data. If, for instance, I'm using a structure-of-arrays type of system, I might use OOD to enforce that entities represented by those arrays are consistent with one another. The entities themselves are not "objects" - the system is an object.
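
Roughly what I mean, as a minimal sketch (hypothetical names): the system is the object, the entities are just indices, and the class exists to keep the parallel arrays consistent:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The design contract applies to the system of related data, not to any
// single "object": positions and velocities must stay the same length and
// only ever change together. Keeping the arrays private enforces that.
class PhysicsSystem {
public:
    std::size_t Add(float pos, float vel) {
        positions.push_back(pos);
        velocities.push_back(vel);
        return positions.size() - 1; // the "entity" is just an index
    }

    void Step(float dt) {
        assert(positions.size() == velocities.size()); // the guarded invariant
        for (std::size_t i = 0; i < positions.size(); ++i)
            positions[i] += velocities[i] * dt;
    }

private:
    std::vector<float> positions;
    std::vector<float> velocities;
};
```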


To combine the two, couldn't you just shift your thinking from objects as the data to objects as views of the data?

I don't think I quite follow this, but classes are essentially the same as structs; they can just be views of data, only with methods. Ogre 2.1 does this and has seen a dramatic speedup without going to the extremes that some people take this approach to.


In all of the examples I've seen, it seems that you're essentially building new data structures specifically for each function, based purely on what data that function needs and how it will be operating on it. If that's the case, couldn't you be losing a lot of performance just transforming the data from one function so that you can pass it into another?

I... don't think I have ever seen a DOD implementation that builds a new data structure for each function. Usually a few portions of the struct/array, or the entire thing, are passed in and edited in place. But honestly, this really just depends. In some cases it could be super fast to do it that way; in other cases, maybe not.


However, why couldn't you simply run that function in a loop, rather than putting the loop into the function?

You could. But Mike is talking about performance here. If you think about what is happening at the lower level, passing items to the function one at a time through a loop goes like this:

The processor bounces through RAM to locate a piece of data it needs to process and loads it into a register.
The processor then calls the function.
It does some processing, changes data, spits out new data, etc.
When it's done, it needs to bounce through memory again (which can be costly in cases where performance matters).

If you pass the entire array to the function and let the function sort it out, you get this result instead:

The processor bounces through memory and grabs the address of the first object.
Because the function iterates through an array, the hardware prefetcher notices the pattern and does a bit of prediction, pulling several upcoming items into cache while the current one is processed.
Once it's done processing and moves on, the processor only needs to step a single stride through memory and automatically has its next piece of data, often already in cache.

That is significantly faster than constantly bouncing through memory.
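
In code, the contrast being described looks roughly like this (a simplified sketch; real gains depend on inlining, layout, and the compiler):

```cpp
#include <cstddef>

// Loop outside: one call per element, with call overhead unless inlined.
void Damp(float& v) { v *= 0.99f; }
void DampEachExternally(float* vs, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        Damp(vs[i]);
}

// Loop inside: one call total, a sequential sweep the prefetcher can follow,
// and a body the compiler is free to keep in registers or vectorize.
void DampAll(float* vs, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        vs[i] *= 0.99f;
}
```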


My original resistance (hah) to DOD stemmed from my assumption that people were saying this makes code prettier, and I still really don't think it does.

Which brings me to my final question: as a hobbyist, my priorities are rather different than his. While I value performance, as long as things are running quickly enough I really don't care. Given that, I'd say my main metric for code quality is flexibility and beauty, which (from what I've seen) DOD doesn't necessarily promote. Can anyone here give some advice on how/where the best ways/places to apply this model would be, while still maintaining clarity and sanity in the rest of the code base?

Best place to apply this model...

Honestly... low level systems that are guaranteed to process a LOT of data.

Full DOD will have little use in a GUI, but it means a LOT when you're updating the transforms of your objects, doing physics, computing matrices, culling, etc.


A lot of Mike Acton's criticisms of other code bases involved how people like to think of each function working on a single piece of data at a time, even though that's often not the reality of the situation. His solution to that was to refactor functions into operating on an array of values, rather than a single value. However, why couldn't you simply run that function in a loop, rather than putting the loop into the function?

I can think of two reasons:
1) If you run a loop just to call a function (and the call isn't inlined), whatever register your loop iterator is stored in has to be pushed to and popped from the stack at each iteration, requiring (at least) an L1 store and L1 load per iteration. If instead you run the loop inside the function, you can potentially save that load and store, which can mean as much as 2ns saved per iteration. In a tight loop with thousands of iterations, it is well worth investing the time just for that.
2) It would also increase cache locality of the data you're operating on, reducing potential for L1 and L2 cache misses or even a relatively expensive read from main memory.
quick edit:
3) Potential opportunities to unroll the loop with SIMD operations.
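
For what point 3 can look like in practice, here's a minimal SSE sketch (my own illustration, not from Acton's talk); processing four floats per iteration is only possible because the loop lives inside the function:

```cpp
#include <cstddef>
#include <xmmintrin.h> // SSE intrinsics

// Scale an array by k, four floats at a time, with a scalar tail for
// whatever doesn't fill a full SSE register.
void ScaleAll(float* vs, std::size_t n, float k) {
    const __m128 kv = _mm_set1_ps(k);        // broadcast k into all four lanes
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(vs + i);     // load 4 floats (unaligned-safe)
        _mm_storeu_ps(vs + i, _mm_mul_ps(v, kv));
    }
    for (; i < n; ++i)                       // scalar tail
        vs[i] *= k;
}
```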

BUT it can be incredibly awkward to do this every single time in practice, and it's probably not too common for such a micro-optimization to pay off big time, especially when it comes to game development.


The famous Donald Knuth quote still applies. "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

(this is not aimed at you, frob, but as part of the answer to the OP, and also kudos on including the second sentence)
Knuth actually wrote this in an article discussing the future availability and utility of GOTO statements in high level languages back in 1974 (I think).
GOTO statements make code harder to follow and look quite ugly compared to higher-level control flow constructs (for, while, do{}while, etc). At the time, a couple of popular languages were considering removing GOTO from the next version of their standards, effectively requiring the use of the more attractive control flow constructs at the expense of code that really benefits from the performance advantages of using GOTO.
He was making the case that GOTO should not be removed from high level languages, merely that programmers should avoid using it unless they know the performance benefit is worth it. In the article, he showed many loop examples in which the use of GOTO could save several instructions from being executed (for context, the 8080 was released by Intel that year, clocked at 2MHz; its predecessor, the 8008, was clocked at 500KHz. In tight code, saving a handful of instructions could improve performance dramatically).
This principle also applies to data-driven programming. Make the effort where you know it'll count. The rest of the time, if it's a hassle, don't bother.

It seems that Data Oriented Design is commonly brought up as the antithesis to Object Oriented Design. To me, OOD has always been about encapsulation, and it seems having that here would only be an even greater advantage. To combine the two, couldn't you just shift your thinking from objects as the data to objects as views of the data? That way you could reorganize your internal data structures every day, and as long as your OOP interface remains stable the rest of your code will be perfectly happy.

You're not wrong about this. In fact, I always say OOP and DOD are not necessarily opposites. One is about the relationships between constructs called "objects", while the other is about how data is laid out in memory and processed.

However, no current language separates the two (in fact, current languages make them almost antagonistic in most cases), and there are several challenges a language would have to solve to reconcile OOD and DOD; solving them would probably translate into trade-offs different from the ones we see now. Nothing is free.

Can anyone here give some advice on how/where the best ways/places to apply this model would be, while still maintaining clarity and sanity in the rest of the code base?

Others have done a good job explaining it, so I won't repeat them. Basically it boils down to hotspots where performance becomes the highest priority - and not being an idiot. I once improved the performance of a 3D visualization tool by 3x just by changing some per-vertex data that was used every frame from being stored in an std::map to an std::vector.
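
The shape of that change, sketched from memory (not the tool's actual code): the per-vertex update goes from chasing scattered map nodes to one linear sweep:

```cpp
#include <map>
#include <vector>

struct VertexData { float u, v; };

// Before (roughly): node-based std::map, one pointer chase per vertex,
// with nodes scattered across the heap.
void UpdateFromMap(std::map<int, VertexData>& verts) {
    for (auto& kv : verts) kv.second.u += 0.01f;
}

// After: contiguous std::vector indexed by vertex id - dense and prefetch-friendly.
void UpdateFromVector(std::vector<VertexData>& verts) {
    for (auto& vd : verts) vd.u += 0.01f;
}
```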

There's one more thing I will add: DOD is applied best to known problems - problems that are well understood and/or for which you've already written an implementation. To apply DOD you need to know the data and its usage patterns, and to know that, you need to have already had something working to analyze. Or to have read about it somewhere.

While the general notions can apply to OOP (i.e. keep data contiguous, don't use convoluted indirections, avoid excessive allocations), applying the fun stuff of DOD is usually a bad idea for an algorithm you know little about, have little experience with, or that is still in the prototype stage.

at its root, data oriented design is an optimization method.

you build the code from the hardware out.

i.e. you start with hardware friendly code and data organization at the lowest level, and let the design of that low level code drive the design of the entire system. so ideally there's no converting for processing, as you always keep things "data oriented" at all times.

John Carmack's alt.plan.B for quake works well: "ok, what CAN the PC do really really fast?" (1)

then write things so it does it that way - THAT'S data oriented design.

(1) when developing the quake engine, Id hit some bottleneck, asked the above question, and came up with some slick data oriented optimization like line caching or something like that. can't recall the details, but Abrash mentioned it during his presentation on the quake engine at CGDC '96.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

