Component Entity and Data-Orientated Design

8 comments, last by Norman Barrows 9 years, 12 months ago
I have some spare time and have decided to play around with Component-Entity solutions. One of the main 'advantages' that is often touted is the ability to organize data in a cache friendly manner by grouping components of similar type in the same block of contiguous memory. The oft used example being processing the 'position' component of an entity.

Thing is, most usable systems will require multiple components. A rendering system will require not just the position component, but facing, perhaps size/scaling, a model component, perhaps a shader component, and a host of others. Likewise a physics system will require position, facing, velocity, momentum, and another half dozen components. When working with many entities, all with varying numbers of components, there's no way a given entity's components will fall in a nice order. I can't see how cache coherency can be maintained in any but the simplest of scenarios.

Any ideas how memory management of a CE system should be approached in a real program?

I can't see how cache coherency can be maintained in any but the simplest of scenarios.

i only found one real-world example of C-E implemented for speed reasons during my research on C-E systems. it was a AAA game (playstation i think), and it was not fast enough. they'd done everything else already. they redesigned their data structures and code to be more cache friendly where they could. it was just enough. the "C-E" system they came up with bears little resemblance to the "C-E" systems usually seen. in all other cases, C-E was used so you didn't need source code access to define an entity type in the game (IE as a level designer tool).

C-E has 2 definitions:

1. the common definition: a level designer tool that lets you define entity types without source code access.

2. a last ditch desperation optimization method involving reorganizing code and data into a more cache friendly format. "Cache friendly optimization" would be a more accurate term. the term C-E is used, because the basic C-E concept of "all components of a given type together" is similar to the approaches used in "cache friendly optimization". Note that the level designer tool aspect of C-E can be an _unintentional_ side benefit of "cache friendly optimization". but a typical level designer tool C-E implementation is, as you note, not necessarily any more cache friendly than other methods of organizing code and data. in fact it adds complexity, because you're implementing a relational database app in the game that lets users define entity types (define database record layouts), and manages all those databases of different component types, as well as the entity types and entities databases.

one fundamental weakness of C-E systems is that you still need code access to add new component types. so you're limited to existing hardcoded, predefined component types when defining a new type of entity. just like a database app, they only have ints, floats, strings, text fields, images, etc. if you want to add say, an entire file, or a complete webpage as a field, it has to be one of the types already supported by the database app. same idea with a c-e system (IE a relational database system) in a game.

so it would seem that the only real reason for C-E is as a level designer tool.

and if you're desperate enough to need it for speed, god help you! <g>.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php


so it would seem that the only real reason for C-E is as a level designer tool.

You really don't understand them if you think that's the case. There have been a lot of recent posts/articles here talking about the benefits of a component entity system, you should look them up.

I don't think either of your two reasons are the driving reasons people go for component entity systems. But, #2, having a naturally cache-optimized memory layout, is certainly a good one.


Any ideas how memory management of a CE system should be approached in a real program?

The only real solution to this that I have heard of involves duplicating the (e.g. position) data in many other components where it is needed. And I guess try to minimize sync costs by keeping track of when position has changed for a particle component, and only sync it then.
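A minimal sketch of that duplication idea, assuming a transform pool that owns the authoritative position and a particle pool that keeps its own copy. All names here (`TransformComponent`, `ParticleComponent`, `syncParticlePositions`, the `dirty` flag) are made up for illustration and do not come from any particular engine:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Authoritative position, owned by a hypothetical transform pool.
struct TransformComponent {
    Vec3 position{0.0f, 0.0f, 0.0f};
    bool dirty = false;  // set whenever position is written
};

// The particle component keeps its own copy of the position so the
// particle system can iterate its pool without touching transforms.
struct ParticleComponent {
    std::size_t transformIndex = 0;  // which transform this copy mirrors
    Vec3 cachedPosition{0.0f, 0.0f, 0.0f};
};

// All writes go through here so the dirty flag is never forgotten.
inline void setPosition(TransformComponent& t, Vec3 p) {
    t.position = p;
    t.dirty = true;
}

// Copies only the positions that changed since the last sync and
// returns how many copies were made.
inline std::size_t syncParticlePositions(
        const std::vector<TransformComponent>& transforms,
        std::vector<ParticleComponent>& particles) {
    std::size_t copied = 0;
    for (auto& p : particles) {
        assert(p.transformIndex < transforms.size());
        const TransformComponent& t = transforms[p.transformIndex];
        if (t.dirty) {
            p.cachedPosition = t.position;
            ++copied;
        }
    }
    return copied;
}

// Called once per frame after every dependent pool has synced.
inline void clearDirtyFlags(std::vector<TransformComponent>& transforms) {
    for (auto& t : transforms) t.dirty = false;
}
```

The sync pass still reads the transform pool, but only once per frame, and the particle update loop itself then runs over a single contiguous array.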


You really don't understand them if you think that's the case. There have been a lot of recent posts/articles here talking about the benefits of a component entity system, you should look them up.

i did both here and on google, spent a few days on it. may still have the links.

all references to C-E systems seemed to lead to the conclusion above.


The only real solution to this that I have heard of involves duplicating the (e.g. position) data in many other components where it is needed. And I guess try to minimize sync costs by keeping track of when position has changed for a particle component, and only sync it then.

this sounds familiar. this may have been what they did in that one case i mentioned.

let me check right quick and see if i still have the link...

hmm, didn't see any C-E links, guess i didn't save them.

Trying google right quick...

no luck.

it was a really good write up. a post mortem. timing results and everything. probably the only article on C-E i ever saw with actual timing results to back it up.

but like i said, it was really "cache friendly optimization" first, and something "C-E-ish" simply happened to be the sort of architecture that the solution evolved into.

which makes sense, if you think about how you'd reorder code and data to make it flow through the cache smoothly.


My first Google search on Entity-Component was a while ago, but it still turns up this wiki which, as before, leads me to this chap. (I think that chap is the author of that wiki, but I'm not certain.)

Both of those sources see ECS design as something that basically boils down to a relational database (and the flyweight pattern) in order to speed things up. It's not just level design: speed is the entire reason ECS exists. It's why it was pitched as "the future of MMOs" by Scott Bilas. It's weird to wrap your brain around (esp. if you're not a db programmer) because it totally discards OOP (throwing the baby out with the bathwater), seeing ECS as an entirely new paradigm for program org.

It's a lot of reading, haha. :) But the gist is: the in-memory org. is going to be a database. The blog there ("this chap" link above) explains it pretty well.

In my engine, systems work on single components. When a system traverses the pool of its component type, it sends an event to the owning object after each component update; interested components (attached to the same object) receive the event and update their data accordingly.

For example, my Transform system traverses all transform components, updates each one's final matrix, and sends a transform event to its object.

Right now I have 2 components that register for transform events: the collider and the sprite component.

The collider updates its position by taking the received transform and adding its own offset.

The sprite will update its "renderWorld" matrix. (It's more complicated than that: I actually need to see if the render and current matrices differ, and then interpolate, since it's a fixed time step loop. This happens in the sprite system loop, not when the event is received.)

As you can see, I need to update systems in order; this order is decided when I add the system to the game.

With this event technique I don't need to care whether the object has components that need to be informed. I could have pointers on components pointing to interested components, but that would be a pain in the ass to manage. The object can change in real time and it keeps working, no need to change anything.

The bad thing is that when I create a new component class, I need to be aware of all components and make the appropriate event registration in the VOnAttach method.

Say I want to create a rigidBody component: I will probably want the transform component to register for rigidbody update events, and make sure the rigidbody system updates before the transform system. The good thing is that if I want to create a component that needs to receive transform events, I just need to register, no fancy interaction.
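A rough sketch of how that event flow might look, under a few assumptions: the transform is reduced to a 2D position instead of a matrix, the game object only stores listener callbacks, and `attachTo` plays the role of the VOnAttach registration. `GameObject`, `TransformEvent`, `ColliderComponent` and `updateTransforms` are hypothetical names, not the actual engine code described above:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified event payload: a 2D position instead of a full matrix.
struct TransformEvent { float x, y; };

// Hypothetical game object: it only forwards events to whichever of
// its components registered an interest.
struct GameObject {
    std::vector<std::function<void(const TransformEvent&)>> transformListeners;

    void send(const TransformEvent& e) const {
        for (const auto& listener : transformListeners) listener(e);
    }
};

struct TransformComponent {
    GameObject* owner;
    float x, y;    // current position
    float vx, vy;  // velocity, only here to give the system work to do
};

struct ColliderComponent {
    float offsetX = 0.0f, offsetY = 0.0f;
    float worldX = 0.0f, worldY = 0.0f;

    // Registration step (the VOnAttach equivalent). The lambda captures
    // `this`, so the collider must not move in memory after attaching.
    void attachTo(GameObject& object) {
        object.transformListeners.push_back(
            [this](const TransformEvent& e) {
                worldX = e.x + offsetX;
                worldY = e.y + offsetY;
            });
    }
};

// The transform system walks its own pool and notifies each owner
// after updating the component.
inline void updateTransforms(std::vector<TransformComponent>& pool, float dt) {
    for (auto& t : pool) {
        assert(t.owner != nullptr);
        t.x += t.vx * dt;
        t.y += t.vy * dt;
        t.owner->send(TransformEvent{t.x, t.y});
    }
}
```

The transform system never needs to know which components are listening, which matches the "just register" property. The cost is that every event hops from the transform pool to wherever the listeners live, so the traversal is no longer confined to one contiguous array.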

When it comes to entity talk around Google, there's always a mention of hacking the game object by putting pointers to the most-used components in it, to speed things up. For a specific game where you know your objects will always have components x, y, z, you could just create a special-case game object that speeds up interaction between x, y and z. (You can see that in Unity: you can't take out the transform component.)

I don't believe much in the "no game object actually needed" idea, making it just an abstract concept. It just introduces lots of complications, and I've never seen anything like it other than chit chat (by people who never actually implemented it).

--

Each component has a factory, and the factories are the ones holding the pools. The systems grab a reference to the pools on construction.

A factory needs a createCompo method that can create an uninitialized compo and one that receives a "gfig" as a parameter. A gfig is something like a parsed XML file, in a format that I invented.

There's an object factory that has access to all compo factories, and it can build entire game objects from a single gfig file.
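As an illustration only, the factory/pool/system relationship could be sketched like this. `ComponentFactory`, `SpriteComponent` and `SpriteSystem` are hypothetical names, and since the gfig format isn't shown, the second `create` overload takes an already-built component value as a stand-in for the gfig-driven one:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SpriteComponent { float x; float y; bool visible; };

// One factory per component type; the factory owns the pool.
template <typename Component>
class ComponentFactory {
public:
    // Creates a blank component (zeroed here) and returns its index.
    std::size_t create() {
        pool_.emplace_back();
        return pool_.size() - 1;
    }

    // Stand-in for the gfig overload: takes a ready-made value instead
    // of a parsed config, because the gfig format is not known here.
    std::size_t create(const Component& initial) {
        pool_.push_back(initial);
        return pool_.size() - 1;
    }

    Component& at(std::size_t index) {
        assert(index < pool_.size());
        return pool_[index];
    }

    std::vector<Component>& pool() { return pool_; }

private:
    std::vector<Component> pool_;  // contiguous storage for this type
};

// A system grabs a reference to the pool once, at construction.
class SpriteSystem {
public:
    explicit SpriteSystem(ComponentFactory<SpriteComponent>& factory)
        : pool_(factory.pool()) {}

    std::size_t countVisible() const {
        std::size_t n = 0;
        for (const auto& s : pool_) {
            if (s.visible) ++n;
        }
        return n;
    }

private:
    std::vector<SpriteComponent>& pool_;
};
```

Returning indices rather than pointers is a deliberate choice in this sketch: a `std::vector` may reallocate as it grows, which would invalidate pointers into the pool, while the system's reference to the vector itself stays valid.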

1- duplicate data where required. Nothing wrong with a physics position and a render position.

2- sort your pools using a common ID (e.g. The entity ID), so that the iteration order of multiple pools will be the same.
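Point 2 could be sketched roughly as follows, assuming each component stores the ID of its entity and each entity has at most one component per pool. `Position`, `Velocity`, `sortByEntity` and `integrate` are illustrative names only:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using EntityId = std::uint32_t;

struct Position { EntityId entity; float x, y; };
struct Velocity { EntityId entity; float dx, dy; };

// Sorts a pool by the entity ID stored in each component.
template <typename C>
void sortByEntity(std::vector<C>& pool) {
    std::sort(pool.begin(), pool.end(),
              [](const C& a, const C& b) { return a.entity < b.entity; });
}

template <typename C>
bool isSortedByEntity(const std::vector<C>& pool) {
    return std::is_sorted(pool.begin(), pool.end(),
                          [](const C& a, const C& b) { return a.entity < b.entity; });
}

// Walks both sorted pools front to back, like a merge join. Entities
// missing one of the two components are skipped. Returns the number
// of entities that were updated.
inline std::size_t integrate(std::vector<Position>& positions,
                             const std::vector<Velocity>& velocities,
                             float dt) {
    assert(isSortedByEntity(positions) && isSortedByEntity(velocities));
    std::size_t p = 0, v = 0, updated = 0;
    while (p < positions.size() && v < velocities.size()) {
        if (positions[p].entity < velocities[v].entity) {
            ++p;  // this entity has no velocity
        } else if (velocities[v].entity < positions[p].entity) {
            ++v;  // this entity has no position
        } else {
            positions[p].x += velocities[v].dx * dt;
            positions[p].y += velocities[v].dy * dt;
            ++p;
            ++v;
            ++updated;
        }
    }
    return updated;
}
```

Because both indices only ever move forward, each pool is read in one linear sweep even when some entities lack one of the components.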

Some interesting viewpoints in this post. I am at the tail end of converting my entire project from traditional OO-inheritance to component-entity, and my primary reason for doing so has nothing to do with speed or performance. I ran afoul of the diamond problem and the codebase was getting ridiculously complicated. Special cases hardcoded everywhere to handle unrelated entities having similar properties... yuck

If well implemented, CE systems can be slightly faster, yes. But certainly for me, better code organisation and partitioning of data was *the* reason for the switch.

Currently working on an open world survival RPG - For info check out my Development blog: ByteWrangler

2- sort your pools using a common ID (e.g. The entity ID), so that the iteration order of multiple pools will be the same.

Ya that was my thought as well...

Thanks for the info guys.


If well implemented, CE systems can be slightly faster, yes. But certainly for me, better code organisation and partitioning of data was *the* reason for the switch.

ah yes, forgot to mention this as reason #3: to avoid class hierarchy hell. don't really do the official objects thing myself (i use ADTs), so i tend to forget about OO syntax related issues sometimes.

but yes, this was another major driving force for many cases to switch to C-E. probably right up there with level designer tool. and far ahead of "cache friendly optimization" as the primary motivating factor in a team's choice to use C-E, in all the cases i could find.

note that almost all cases tended to mention all three benefits: level designer tool, cache friendlier, and avoids class hierarchy hell. but avoids class hierarchy hell and level designer tool were the usual reasons cited for going C-E. only saw that one case where it was all about the need for speed. and as i said, their C-E was not quite the same cup of tea as your typical implementation seen. it was more along the lines of the optimizations mentioned by phil_t and hodgman.

as an alternative to class hierarchy hell, it seems to work well for many people. i personally find that a "flat file" (IE non-relational) entity database is sufficient, simpler, and fast enough.

one question in my mind about C-E systems is, what if different entity types update a given component type in different ways? that would seem to tend to thrash the code cache a bit. or would the list of components be further split up into multiple lists based on update method? so you always had a contiguous array of identical components in the data cache, and code for a single update method in the code cache. it seems that this arrangement ( a contiguous array of identical components in the data cache, and code for a single update method in the code cache ) is what a cache would want to see.
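One way the "split into multiple lists by update method" idea might look, purely as a sketch: the same component layout is stored in one contiguous array per behaviour, and each array gets its own tight loop. `Mover`, `MoverPools` and the two update rules are invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Mover { float x; float speed; };

// Same component layout, but one contiguous array per update method.
struct MoverPools {
    std::vector<Mover> linear;  // x += speed * dt
    std::vector<Mover> damped;  // speed decays first, then x += speed * dt
};

inline void updateLinear(std::vector<Mover>& pool, float dt) {
    for (auto& m : pool) m.x += m.speed * dt;
}

inline void updateDamped(std::vector<Mover>& pool, float dt, float damping) {
    assert(damping >= 0.0f && damping <= 1.0f);
    for (auto& m : pool) {
        m.speed *= damping;
        m.x += m.speed * dt;
    }
}

// Each loop runs a single code path over a single array, with no
// per-component branching on the entity type.
inline void updateAll(MoverPools& pools, float dt) {
    updateLinear(pools.linear, dt);
    updateDamped(pools.damped, dt, 0.5f);
}
```

The trade-off in a layout like this is that a component which changes behaviour has to be moved from one array to the other.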


