Data oriented design in games

I'm learning about data-oriented design, and I've figured out a bit of it already. I decided to start a discussion about it because I've never really seen anyone talking about it. To me, component-based entity systems are easy to understand in a data-oriented way: they basically turn game entities and components into dumb containers of data that are operated on by systems. However, what about other systems in games? How can data-oriented design be applied to them? Resource loading seems like a good, easy candidate: you could group resources together in the same areas of memory based on the type of resource, the same way you group components by type.
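
For example, this is roughly what I mean by "dumb data plus systems" (just a sketch with made-up component names, not code from any real engine):

#include <cstddef>
#include <vector>

// Hypothetical components: plain data, no behaviour.
struct TransformComponent { float x, y, z; };
struct VelocityComponent  { float vx, vy, vz; };

// A "system" is just a function that walks the tightly packed arrays.
void IntegrateVelocities(std::vector<TransformComponent>& transforms,
                         const std::vector<VelocityComponent>& velocities,
                         float dt)
{
    // Both arrays are contiguous, so the iteration is cache friendly.
    for (std::size_t i = 0; i < transforms.size(); ++i)
    {
        transforms[i].x += velocities[i].vx * dt;
        transforms[i].y += velocities[i].vy * dt;
        transforms[i].z += velocities[i].vz * dt;
    }
}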

One of the things I can't figure out, though, is the event system / message pump. With message pumps you have a bunch of objects (messages) being sent around to a bunch of other polymorphic types (listeners), which seems to go against the whole idea of data-oriented design. The only idea I have would be an event system where events of the same type are all grouped into the same memory areas, with each event type being processed in a group and sent to its listeners. Does this violate data-oriented principles?
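
Roughly what I'm picturing is something like this (only a sketch, with made-up event types):

#include <vector>

// Hypothetical event types, each kept in its own contiguous array.
struct CollisionEvent { int entityA, entityB; };
struct SpawnEvent     { float x, y, z; };

struct EventQueues
{
    std::vector<CollisionEvent> collisions;
    std::vector<SpawnEvent>     spawns;
};

// Each event type is processed as one batch, then the arrays are reused.
void ProcessEvents(EventQueues& queues)
{
    for (const CollisionEvent& e : queues.collisions) { (void)e; /* resolve collision */ }
    for (const SpawnEvent& e : queues.spawns)         { (void)e; /* spawn entity */ }
    queues.collisions.clear();
    queues.spawns.clear();
}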

What other systems can you think of that present interesting challenges for data-oriented design? Networking is one I can think of off the top of my head. And what about parallelism? I've heard it's a major benefit of data-oriented design, but with objects having dependencies on each other, and systems being able to operate on multiple components (the rendering system would need to access the transform component, for example), don't you just end up with a synchronization nightmare, like with regular object-oriented design?

a.) Don't use DOD just because it is "nice". If it doesn't fit, use something else. Make the decision on a per-problem basis.

b.) Ideally you apply the same execution (code fetched once from memory and then served from the instruction cache) to sequentially organized data (a few reads from memory and a high percentage of data-cache hits). When the data is inhomogeneous, e.g. the command stream for the low-level rendering system, i-cache hits may be less frequent, but storing the commands as blobs using a stack/linear/sequential allocator still helps greatly with data-cache hits.
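
As a rough illustration of the blob-plus-linear-buffer idea (the command layout here is just an assumption, not how any particular renderer actually does it):

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical command header: a type tag plus the payload size, stored inline.
struct CommandHeader { uint16_t type; uint16_t size; };

// Append one command blob to a linear, sequentially growing buffer.
// The payload must be a trivially copyable POD blob.
template <typename T>
void PushCommand(std::vector<uint8_t>& buffer, uint16_t type, const T& payload)
{
    CommandHeader header{ type, static_cast<uint16_t>(sizeof(T)) };
    const std::size_t offset = buffer.size();
    buffer.resize(offset + sizeof(header) + sizeof(T));
    std::memcpy(buffer.data() + offset, &header, sizeof(header));
    std::memcpy(buffer.data() + offset + sizeof(header), &payload, sizeof(T));
}

// Consume the whole stream front to back: one linear pass, data-cache friendly.
void ExecuteCommands(const std::vector<uint8_t>& buffer)
{
    std::size_t offset = 0;
    while (offset < buffer.size())
    {
        CommandHeader header;
        std::memcpy(&header, buffer.data() + offset, sizeof(header));
        const uint8_t* payload = buffer.data() + offset + sizeof(header);
        (void)payload; // dispatch on header.type and interpret the payload here
        offset += sizeof(header) + header.size;
    }
}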

c.) An event-oriented design is data-driven but not data-oriented, so to speak. You cannot batch events by sorting them by type, because events usually need to (or at least should) be processed in timestamp order. However, treating them as blobs and using a ring-buffer allocator as storage still allows an approach very close to the one introduced in b.)
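
A minimal sketch of such a ring buffer, assuming fixed-size event records for simplicity (real events could be variable-sized blobs):

#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size event record.
struct Event { uint32_t type; uint32_t timestamp; uint8_t payload[24]; };

// Single-producer / single-consumer ring buffer; events come out in push
// order, which is timestamp order as long as producers push them in order.
class EventRing
{
public:
    bool Push(const Event& e)
    {
        const std::size_t next = (head_ + 1) % kCapacity;
        if (next == tail_) return false;   // full
        events_[head_] = e;
        head_ = next;
        return true;
    }

    bool Pop(Event& out)
    {
        if (tail_ == head_) return false;  // empty
        out = events_[tail_];
        tail_ = (tail_ + 1) % kCapacity;
        return true;
    }

private:
    static constexpr std::size_t kCapacity = 1024;
    std::array<Event, kCapacity> events_;
    std::size_t head_ = 0;
    std::size_t tail_ = 0;
};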

d.) Consider dropping the event system entirely if possible. An event system is a tool for dealing with asynchronous, err, events. A game is an application that can be well structured in an overall synchronous way (see e.g. the excerpt from Jason Gregory's book Game Engine Architecture about the game loop).

e.) The above has another effect, too: you can rely on certain sub-systems having completed before other sub-systems begin, weakening or removing the need for some high-level synchronization.
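
A tiny sketch of what such an explicitly ordered frame might look like (the World type and the sub-system functions are only placeholders):

// Each sub-system call has fully completed before the next one starts,
// so later systems can rely on earlier results without locks or events.
struct World;  // hypothetical container for all game state

void GatherInput(World& world);           // reads the OS input pipe
void StepPhysics(World& world, float dt);
void UpdateAnimation(World& world, float dt);
void SubmitRenderCommands(World& world);

void RunFrame(World& world, float dt)
{
    GatherInput(world);            // finished before physics starts
    StepPhysics(world, dt);        // finished before animation starts
    UpdateAnimation(world, dt);    // finished before rendering starts
    SubmitRenderCommands(world);
}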

f.) Parallelism can be achieved using vector processing (SSE, GPU) on the one hand, or multi-threading on the other. Vector processing does not need more synchronization than scalar processing, but it depends heavily on how the data is structured.
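
For example, a minimal SSE sketch, assuming positions and velocities are stored as separate, tightly packed float arrays and the count is a multiple of 4:

#include <cstddef>
#include <xmmintrin.h>

// Integrate one axis of the positions with SSE, four elements at a time.
void IntegrateX(float* posX, const float* velX, float dt, std::size_t count)
{
    const __m128 vdt = _mm_set1_ps(dt);
    for (std::size_t i = 0; i < count; i += 4)
    {
        __m128 p = _mm_loadu_ps(posX + i);
        __m128 v = _mm_loadu_ps(velX + i);
        p = _mm_add_ps(p, _mm_mul_ps(v, vdt));
        _mm_storeu_ps(posX + i, p);
    }
}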

g.) Multi-threading, on the other hand, is a totally different beast. I second the opinion that multi-threading is complicated in general. However, it can be kept under control if done, for example, like this: the main thread does nothing but collect input events from the OS / hardware and push them into a synchronized pipe (this is done mainly to guarantee that no input gets lost, and some OSs require their event loop to be processed on the main thread). Another thread runs the game loop (including reading input from said input pipe) and, as a result, fills a render command queue. Another thread reads the render command queue and feeds the graphics API. To be efficient, the render queue is double buffered, of course.
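
A rough sketch of such a double-buffered command queue between the game thread and the render thread (a real engine would add condition variables or frame fencing; the type names are made up):

#include <mutex>
#include <utility>
#include <vector>

struct RenderCommand { int sortKey; float data[4]; };  // hypothetical blob

class RenderQueue
{
public:
    // Called by the game thread only; it owns the write buffer for the frame.
    void Push(const RenderCommand& cmd) { write_->push_back(cmd); }

    // Called once per frame at the sync point between the two threads.
    void SwapBuffers()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        std::swap(read_, write_);
        write_->clear();
    }

    // Called by the render thread only; it reads last frame's commands.
    const std::vector<RenderCommand>& ReadBuffer() const { return *read_; }

private:
    std::vector<RenderCommand>  buffers_[2];
    std::vector<RenderCommand>* read_  = &buffers_[0];
    std::vector<RenderCommand>* write_ = &buffers_[1];
    std::mutex                  mutex_;
};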

What about the cache performance of more complex game systems? For example: an object with a Health component is hit by one with a Damage component. If the components are updated by type, then when we check collisions for all the Health components, we also have to check how much damage to apply (from the Damage component attached to the game object that hit the Health component's object). Is this still a predictable enough access pattern for the CPU cache? Or would jumping back and forth between accessing the Health and the Damage components cause lots of cache misses?

Is there a reason you wouldn't just have the damage component send a message to the health component telling it to subtract some amount? Or could you just get around the back-and-forth by updating the damage components first, and then the health components?

I've never worked with a component-entity system (always just used standard OOP), but it's something that I'm curious about and starting to think about incorporating.

What about the cache performance of more complex game systems? For example: an object with a Health component is hit by one with a Damage component. If the components are updated by type, then when we check collisions for all the Health components, we also have to check how much damage to apply (from the Damage component attached to the game object that hit the Health component's object). Is this still a predictable enough access pattern for the CPU cache? Or would jumping back and forth between accessing the Health and the Damage components cause lots of cache misses?

IMHO this cannot be answered definitively, because "it depends"! How much does it cost to bring the data into a well-defined order, compared with the performance that can be gained?

Let's think of an FPS. There are up to 10 characters, and damage is applied suddenly, by shooting. Your game loop runs 60 times per second. Even if each character shot twice per second, that would be 20 shots per second against 60 loop iterations per second, i.e. an average of 1 shot per 3 loop iterations. That is hardly worth any optimization effort. It doesn't mean, though, that Health can't still be stored, well organized, in a table inside a Soundness sub-system.

Now let's think of an RPG with 80+ NPCs, plus poisoning, illness, bleeding, curses, and whatever else can be applied by both the players and the NPCs. Further, there are buffs, de-buffs and such. In other words, damage may occur not (only) suddenly but also lastingly. We can think of tables for health values as well as for the different sorts of lasting damage, hosted in a damage sub-system, ordered the same way, and processed on each update() invocation. This will still probably not make much of a difference in performance, but it is relatively easy to implement that way.

Notice that the detection of sudden damage should not belong to Damage and Health directly but to the more low-level CollisionVolume stuff. Damage is then an effect caused as a collision response, so to speak. If you need to decouple collision detection from damage application, you can again use queues that are written by the collision detection and read by the aforementioned update() method.
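
A rough sketch of such a damage sub-system for the lasting-damage case, with made-up table layouts:

#include <vector>

// Hypothetical tables; the health table and the effect list use the same indexing scheme.
struct HealthTable { std::vector<float> value; };

struct LastingDamage { int healthIndex; float perSecond; float remaining; };

// Applied once per frame by the damage sub-system's update().
void UpdateLastingDamage(std::vector<LastingDamage>& effects, HealthTable& health, float dt)
{
    for (LastingDamage& e : effects)
    {
        health.value[e.healthIndex] -= e.perSecond * dt;
        e.remaining -= dt;
    }
    // expired effects (e.remaining <= 0) would be removed here, e.g. by swap-and-pop
}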

My 2 €-Cents.

Firstly, with DoD, keep in mind that it's not a formal term (yet). There's no Wikipedia page for it, and there's no consensus on what it means. Basically, a bunch of different devs started preaching something like "stop thinking about your current design methodologies for a moment, and just have a raw look at what you're actually doing with your data", or "at the base level, all programs are just input->process->output -- forget about your abstractions for a moment and think about what you're asking the computer to do, please!".
Because it's not formally defined, trying to practice DoD is necessarily a vague art.

However, what about other systems in games? How can data oriented design be applied to them?

You break down the operation into a flow of input->process->output nodes, and try to maximize the size and contiguity of those input/output blobs.

One of the ones I can't figure out though is the event system/message pump. With message pumps you have a bunch of objects (messages) being sent around to a bunch of other polymorphic types (listeners) which seems like it's against the whole idea of data oriented design.

Sometimes the simple answer might just be: you're doing it wrong, throw it out and start again because this whole design stinks.
Is a generic/polymorphic event pump really required, and/or is it the best solution for your problem?

And what about parallelism? I've heard it's a major benefit of data-oriented design, but with objects having dependencies on each other and systems being able to operate on multiple components (the rendering system would need to access the transform component) don't you just end up with a synchronization nightmare like with regular object oriented design?

The issue is that you're thinking about mutable objects. This is typical of OOP code: you make an object, then you call lots of functions on it, and its state changes over time. If that object is shared between many processes, then ordering and synchronization become extremely important.

If you'd come from a functional programming background, then this would seem very strange and alien, and you'd prefer to write systems that rely on immutable data. When you don't have mutable objects, this problem can't even arise! You don't have to make sure that the physics system has written to the transform array before the renderer reads from it, because you don't even have a mutable transform array -- instead, the physics system returns a brand new (now immutable) array, which is consumed by the renderer (and no further syncing is required because the array is read-only). You can't get the ordering wrong either, because the renderer can't possibly begin until after the physics has actually created its input data!
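
As a small sketch of that immutable flavour (toy types and a toy integration step, just to show the data flow):

#include <vector>

struct Transform { float x, y, z; };

// Physics never mutates its input; it returns a fresh array instead.
std::vector<Transform> StepPhysics(const std::vector<Transform>& previous, float dt)
{
    std::vector<Transform> next = previous;   // copy, then integrate
    for (Transform& t : next)
        t.y -= 9.81f * dt * dt;               // toy "integration" only
    return next;
}

void DrawScene(const std::vector<Transform>& transforms)
{
    (void)transforms;                         // would submit draw calls here
}

void Frame(std::vector<Transform>& transforms, float dt)
{
    const std::vector<Transform> next = StepPhysics(transforms, dt);
    DrawScene(next);       // can't start before StepPhysics has produced its output
    transforms = next;     // keep the new state for the next frame
}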

If you can arrange your program as a DAG of processes (like in the dataflow paradigm) then a lot of these problems go away. You don't have to strictly use immutable state, but the flow-graph makes the synchronisation obvious. One of the main pillars of DoD is to do bulk-processing on large amounts of data, rather than the OOP approach of working on only one object at a time.
e.g. if you write code that calls two systems like this, then you can see the bulk data coming out of one system and being sent to the other:


DoPhysics( &transforms );//write new transforms
DoRendering( &transforms );//draw stuff using new transforms

If you want to use many threads, then it just looks like this instead:


DoPhysics( &transforms );//write new transforms, using many threads
DoSomethingElse();
WaitForPhysics();//ensure all threads have finished with the physics system
DoRendering( &transforms );//draw stuff using new transforms

What about the cache performance of more complex game systems? For example: an object with a Health component is hit by one with a Damage component. If the components are updated by type, then when we check collisions for all the Health components, we also have to check how much damage to apply (from the Damage component attached to the game object that hit the Health component's object). Is this still a predictable enough access pattern for the CPU cache? Or would jumping back and forth between accessing the Health and the Damage components cause lots of cache misses?

You can have the collision system, the spell system, the bullet system, etc., all write into an array of damage events (each with a target specifying which health component will be affected, a damage type, and an amount).
After all those systems have run, you've got an array of damage events. You can then pass this to the Damage system, along with the array of Health components, the array of DamageType components, the array of Armour components, etc.
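
Roughly like this, as a sketch (the exact struct layout and function names are just an assumption):

#include <vector>

// One damage event, as described above: a target plus a type and an amount.
struct DamageEvent { int targetHealthIndex; int damageType; float amount; };

// Written by the collision/spell/bullet systems during their updates,
// consumed in one pass by the damage system afterwards.
void ApplyDamage(const std::vector<DamageEvent>& events,
                 std::vector<float>& healthValues,
                 const std::vector<float>& armourValues)
{
    for (const DamageEvent& e : events)
    {
        // damageType could select a resistance table; ignored here for brevity.
        const float mitigated = e.amount - armourValues[e.targetHealthIndex];
        if (mitigated > 0.0f)
            healthValues[e.targetHealthIndex] -= mitigated;
    }
}
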
As for performance, it depends. If all the groups of components involved in the current process fit in the L1 cache, then you're pretty good! If they fit in the L2 cache, you're still pretty OK. If they're larger than that, then you'll have to actually pay attention and try to be sensible with your access patterns.

When you're working with more than one array of components, you can often iterate over one of them in linear order, but you have to use random-access indexing into the other array(s). If the process is complex enough, you can often unroll the loop slightly and use prefetching to make these random accesses perform better, but all of that is micro-optimization that can be done later.
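
As a rough illustration of that kind of prefetching (the names are made up, and the prefetch distance of 4 is arbitrary):

#include <cstddef>
#include <vector>
#include <xmmintrin.h>   // _mm_prefetch

struct Hit { int healthIndex; float amount; };   // hypothetical event

// The hits are walked linearly, but the health array is indexed randomly,
// so we prefetch the target of a hit a few iterations ahead of using it.
void ApplyHits(const std::vector<Hit>& hits, std::vector<float>& health)
{
    const std::size_t n = hits.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        if (i + 4 < n)
            _mm_prefetch(reinterpret_cast<const char*>(&health[hits[i + 4].healthIndex]),
                         _MM_HINT_T0);

        health[hits[i].healthIndex] -= hits[i].amount;
    }
}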

IMHO the most important thing is simply that you don't end up updating each group of components in some arbitrary order and letting them magically communicate with each other -- instead, there should be a deliberate ordering of the update functions, logically determined by the data flow, as in the code snippet above.
