Was hoping for some help with entity component systems and data oriented design

7 comments, last by Dirk Gregorius 8 years, 8 months ago

Hello, I have been reading about entity component systems and through that I heard about data oriented design. It all sounds very interesting and sounds like it would be very useful to know. However, I am having some trouble implementing this and understanding certain parts of it. As I understand it:

An entity is comprised of components and can effectively just be an ID
A component is the data of an entity and also implicitly labels the entity as having some behavior
A system provides the logic and methods of the components

Based on here: http://t-machine.org/index.php/2009/10/26/entity-systems-are-the-future-of-mmos-part-5/
and here: http://www.dataorienteddesign.com/dodmain/node5.html

It seems the components should be a table/map of some sort using the entity ID as the index/key, like so:


map<int, PositionComponent> positions;
map<int, VelocityComponent> velocities;

positions[test1] = {0.0f, 5.0f};
positions[test2] = {2.0f, 2.0f};

velocities[test1] = {5.0f, 1.0f};

The two big things I'm having trouble understanding are:
What is a good way to have systems act on data from two different components? For example, if the velocity system is to update the position of an entity by adding the velocity to it.
One way I had thought to do it is to iterate through the map of velocity components and check if the entity ID being checked also exists in the map of position components. So:


for (map<int, VelocityComponent>::iterator Vel = velocities.begin(); Vel != velocities.end(); ++Vel)
{
    int key = Vel->first;  // entity ID
    if (positions.count(key) > 0)
    {
        //update position

    }
}

This doesn't seem like it would be very efficient though. Another idea I had heard was to also maintain a list of your entities and use a bitfield to keep track of what components they have. Something like:


enum Components
{
    POSITION = 1 << 0,
    VELOCITY = 1 << 1,
};

So the above would become:


for (map<int, VelocityComponent>::iterator Vel = velocities.begin(); Vel != velocities.end(); ++Vel)
{
    int key = Vel->first;  // entity ID
    if (componentList[key] & POSITION)
    {
        //update position

    }
}

I don't know how much better or worse of a solution this would be though. Any input or further suggestions here would be appreciated.
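
A minimal sketch of the bitfield idea, for illustration only: it tests several component bits at once by comparing the masked value against the required set. The names `hasAll` and `countMovable`, the `NONE` value, and the layout of `componentList` (a vector indexed by entity ID) are assumptions made for the example, not something taken from the linked articles.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical component bits, mirroring the enum in the post.
enum Components : std::uint32_t
{
    NONE     = 0,
    POSITION = 1u << 0,
    VELOCITY = 1u << 1,
};

// True when `mask` contains every bit set in `required`.
inline bool hasAll(std::uint32_t mask, std::uint32_t required)
{
    return (mask & required) == required;
}

// Counts the entities a movement system would process.
// componentList[id] is assumed to hold the bitmask for entity `id`.
inline int countMovable(const std::vector<std::uint32_t>& componentList)
{
    int count = 0;
    for (std::uint32_t mask : componentList)
    {
        if (hasAll(mask, POSITION | VELOCITY))
            ++count;
    }
    return count;
}
```

Comparing against the full required set matters once a system needs more than one component: a plain `mask & required` is non-zero as soon as any one of the bits is present.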

The other problem I'm having trouble figuring out is how to destroy an entity. If an entity is destroyed/removed from play, what is a good way to let the component maps know so that they don't continue trying to update it? I can't imagine checking every map to see if the entity exists there and then removing it would be a great idea. If I used the bitfield idea from above, I could add another enum to represent having no components, and then when the systems are updating, have them check if the component they are trying to update should be removed. So:


for (map<int, VelocityComponent>::iterator Vel = velocities.begin(); Vel != velocities.end(); ++Vel)
{
    int key = Vel->first;  // entity ID
    if (componentList[key] != NONE)  // entity still has components (assumes NONE == 0)
    {
        if((componentList[key] & POSITION))
        {
            //update position
        }
    }
    else
    {
        //remove this entity from velocity map
    }
}

But again, I don't know if this is the best or even a good way to go about this. Any suggestions here would be very appreciated.
Of course, if I am off or mistaken on any other part of this, please tell me. This is my first time posting here, so I apologize if there are any formatting/etiquette mistakes in this post.


You can't talk about "data-oriented design" and use map<> at the same time. That's like saying you have a really fast Geo Metro.

If you're serious about data-oriented design, drop the map. You want compact, contiguously-allocated components. You'll need some form of indirection in order to keep things packed, e.g. like bitsquid's packed_array (I've always called that design "slot maps" but that's me). If you're serious about the ECS approach, just read through that whole blog; I think it's brilliant, even in the parts I disagree with.
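
As an illustration of the packed layout described above, here is a minimal sketch. It is not bitsquid's packed_array: the names `PackedArray`, `kNoSlot`, `slot`, and `owner` are made up for the example, and entity IDs are assumed to be small non-negative integers. Components sit contiguously in one vector, a second array maps an entity ID to its position in that vector, and removal moves the last element into the hole so the data stays packed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marks "this entity has no component here" in the lookup array.
const std::size_t kNoSlot = static_cast<std::size_t>(-1);

// Hypothetical packed storage for one component type.
template <typename T>
class PackedArray
{
public:
    // Adds a component for `entity`; the entity must not already have one.
    void add(std::size_t entity, const T& value)
    {
        if (entity >= slot.size())
            slot.resize(entity + 1, kNoSlot);
        assert(slot[entity] == kNoSlot);
        slot[entity] = data.size();
        data.push_back(value);
        owner.push_back(entity);
    }

    // Swap-and-pop removal: the last element fills the hole.
    // Does nothing if the entity has no entry.
    void remove(std::size_t entity)
    {
        if (entity >= slot.size() || slot[entity] == kNoSlot)
            return;
        const std::size_t hole = slot[entity];
        const std::size_t last = data.size() - 1;
        data[hole] = data[last];
        owner[hole] = owner[last];
        slot[owner[hole]] = hole;  // re-point the moved element
        data.pop_back();
        owner.pop_back();
        slot[entity] = kNoSlot;
    }

    // Returns the entity's component, or nullptr if it has none.
    T* find(std::size_t entity)
    {
        if (entity >= slot.size() || slot[entity] == kNoSlot)
            return nullptr;
        return &data[slot[entity]];
    }

    // Contiguous view for systems to iterate over.
    std::vector<T>& items() { return data; }

private:
    std::vector<T> data;            // packed components
    std::vector<std::size_t> owner; // owner[i] is the entity that owns data[i]
    std::vector<std::size_t> slot;  // slot[entity] is an index into data, or kNoSlot
};
```

Because removal is a constant-time swap, destroying an entity could simply call `remove` on each storage; storages with no entry for that ID return immediately. A production version would likely add generation counters so that stale IDs can be detected, which this sketch leaves out.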

Second, get rid of the branches in your loop. Modern CPUs have deep pipelines and don't like branches, especially _unpredictable_ branches. This one at least is relatively predictable... probably, because there's no reason to check if you have a position if you're looping over velocity components since velocity is completely (mathematically) meaningless without position. Don't even let users have velocity components without position. Make it a load-time error and have your editor enforce the constraint that velocity requires position.

For cases where there isn't a strict dependency, keep a completely separate list of entities that have all the required components. That way you don't have to loop over all of one component and check each time if the entity has the other. It also removes the extra memory dereference of having to check the entity itself which components it has. Just a nice simple branch-free loop over all the pairs of components that you already know are valid.

That said, ECS is, in my not so humble opinion, quite silly. There's a reason you only find them crowed about on Reddit and hobbyist blogs but can barely find any evidence of ECS in the professional games segment. Yes, data-oriented design is super handy, but use it where you need it (graphics, physics, AI, and other big data computations) and stick to simpler and easier to use models for the rest of the "glue" code that your game logic uses. It's telling that pretty much every ECS engine ends up saying something like "and then we just bolted on Lua and let people do whatever they want there" (rather than integrating their scripting with the core of the component model).

Sean Middleditch – Game Systems Engineer – Join my team!

The 'component system' described in the OP can be described in 10 lines of code:
struct Position { Vec3 data; };
struct Velocity { Vec3 data; };
struct VelocityBasedMovement { Position* pos; const Velocity* vel; };
typedef std::vector<VelocityBasedMovement> VelocityBasedMovementSystem;
void UpdateVelocityBasedMovementSystem( float dt, VelocityBasedMovement* begin, VelocityBasedMovement* end )
{
  for( VelocityBasedMovement* c=begin; c!=end; ++c )
    c->pos->data += c->vel->data * dt;
}
IMHO it's silly to try and build a big, complex framework to solve such simple problems. Composition of components is already a language-level feature.

You only need a fancy "component" framework when you want to use data-driven entities (i.e. entities defined by text files, not by code/scripts) -- so that game/level designers can construct entities, and so entities can be dynamic (e.g. temporarily adding and removing a 'poisoned' component to your hero). For core logic like physics (or anything written by a programmer), there's no reason to encumber yourself with a designer-oriented framework.

P.S. I hate those t-machine articles because the author basically states that OOP==deep inheritance hierarchies and large "do everything" classes, which is bad, therefore you should use components. However, OOP actually teaches that you should prefer using composition (i.e. component based design) instead of inheritance by default, and that you should use small classes that each have a single responsibility, which can be composed into larger entities... So he's making a straw-man argument about OOP and jumping to a false conclusion based on his experiences with anti-OOP code. It's not just his article either -- the vast majority of pro-ECS blogs that I read make the exact same straw-man argument and jump to the same false conclusion.
The "component based" code that I've posted above is also an OOP design.

OP, I propose that you reorganize your thinking about how to implement ECS before you go further. Your implementation has entities and it has components, but it doesn't really have systems - instead of real "systems", it has collections of components, which look like "objects" in this implementation, and it has methods to act on the collections of components. Try:

  • applying the "object" thinking to the higher level. Each "component" should be a system, a collection of raw data that is indexable by entity ID and has a set of methods to perform operations on subsets of the data
  • not using map, as suggested - use a sparse array, something you can iterate through without dealing with conditionals for each bit of data.
  • thinking about membership in a system as a statement that the component exists, so if an entity doesn't have a velocity, an entry for the velocity simply does not exist and does not get processed for that entity. This means you can pretty much eliminate the binding data (the "component list") you have that exists for no other reason than to tell each system whether it has an entry for a particular entity. That should be up to each system, not some otherwise-useless object that needs to be accessible to all the systems in order for this to work, with all the problems that come with that. Hint: we avoid globals for a reason!
  • Read everything in this (not quite complete) book on data-oriented design, as ECS grew out of some of these principles: http://www.dataorienteddesign.com/dodmain/
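
One way to read the first and third bullets, sketched under assumptions: the system owns parallel arrays, and an entity is a member exactly when it has an entry. `MovementSystem`, `Vec2`, and the method names are hypothetical, and the linear search in `contains` and `position` is there only to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical system that owns its data. No shared "component list"
// is consulted: having an entry here is what membership means.
class MovementSystem
{
public:
    void add(int entity, Vec2 position, Vec2 velocity)
    {
        entities.push_back(entity);
        positions.push_back(position);
        velocities.push_back(velocity);
    }

    // The system itself answers whether it has an entry for an entity.
    bool contains(int entity) const
    {
        return std::find(entities.begin(), entities.end(), entity) != entities.end();
    }

    // No per-entity check in the loop: every entry is valid by construction.
    void update(float dt)
    {
        for (std::size_t i = 0; i < entities.size(); ++i)
        {
            positions[i].x += velocities[i].x * dt;
            positions[i].y += velocities[i].y * dt;
        }
    }

    // Returns the entity's position, or nullptr if it is not a member.
    const Vec2* position(int entity) const
    {
        auto it = std::find(entities.begin(), entities.end(), entity);
        if (it == entities.end())
            return nullptr;
        return &positions[static_cast<std::size_t>(it - entities.begin())];
    }

private:
    std::vector<int> entities;    // entity ID for each entry
    std::vector<Vec2> positions;  // parallel to entities
    std::vector<Vec2> velocities; // parallel to entities
};
```

An entity without a velocity is never added, so `update` never sees it and nothing outside the system has to track that fact.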

So he's making a straw-man argument about OOP and jumping to a false conclusion based on his experiences with anti-OOP code. It's not just his article either -- the vast majority of pro-ECS blogs that I read make the exact same straw-man argument and jump to the same false conclusion.
The "component based" code that I've posted above is also an OOP design.


That's because the "straw-OOP" is what some people actually think OOP is, or at any rate what they think OOP leads to. The impression I get is that a lot of code "in the wild" both inside and outside the game industry reflects that warped vision of what object-orientation means. I'd imagine a lot of people are so pro-ECS because they're reacting against the (bad) code they've actually had to deal with, that was written by real programmers working on real projects. I've been exposed to late '90s/early '00s "OOP" and I have to say, if that was my only exposure to "OO" production code, I'd be frothing at the mouth about ECS even when it isn't necessary, too. I myself was very enthusiastic about ECS after dealing with the pervasive pattern-thinking and focus on overengineering that my classmates tended to use when I was in school, and that I myself was guilty of using.

I blame the schools. The first example every damn student learns of OOP is something terrible like "Circle derives from Shape" or "Dog derives from Mammal derives from Animal." Those exemplify very early the idea that OOP is all about relationships between nouns instead of being about abstraction of interface.

This is one of the reasons I like the 'trait' systems that have been popping up in languages over the last decade or so. Syntactically and logically, a trait is something a type _does_ or _has_ or even _models_ and isn't something that the type _is_, which much more closely models the "has-a not is-a" lesson we try to teach in remedial C++ development. :p

Sean Middleditch – Game Systems Engineer – Join my team!

You can't talk about "data-oriented design" and use map<> at the same time. That's like saying you have a really fast Geo Metro.

If you're serious about data-oriented design, drop the map. You want compact, contiguously-allocated components. You'll need some form of indirection in order to keep things packed, e.g. like bitsquid's packed_array (I've always called that design "slot maps" but that's me). If you're serious about the ECS approach, just read through that whole blog; I think it's brilliant, even in the parts I disagree with.

I kind of suspected this was the case. I've heard a lot of complaints about the STL in general while reading up on this, but I figured for at least demonstrating what I wanted to do it would be ok for now and could be changed once I got to work on a real implementation. It is nice to have confirmation on this point though.

Second, get rid of the branches in your loop. Modern CPUs have deep pipelines and don't like branches, especially _unpredictable_ branches. This one at least is relatively predictable... probably, because there's no reason to check if you have a position if you're looping over velocity components since velocity is completely (mathematically) meaningless without position. Don't even let users have velocity components without position. Make it a load-time error and have your editor enforce the constraint that velocity requires position.

For cases where there isn't a strict dependency, keep a completely separate list of entities that have all the required components. That way you don't have to loop over all of one component and check each time if the entity has the other. It also removes the extra memory dereference of having to check the entity itself which components it has. Just a nice simple branch-free loop over all the pairs of components that you already know are valid.

I think I understand this and it sounds like a good idea. So, using this example, when trying to add a velocity component to an entity, that entity should first be checked to make sure it has a position component?

That said, ECS is, in my not so humble opinion, quite silly. There's a reason you only find them crowed about on Reddit and hobbyist blogs but can barely find any evidence of ECS in the professional games segment. Yes, data-oriented design is super handy, but use it where you need it (graphics, physics, AI, and other big data computations) and stick to simpler and easier to use models for the rest of the "glue" code that your game logic uses. It's telling that pretty much every ECS engine ends up saying something like "and then we just bolted on Lua and let people do whatever they want there" (rather than integrating their scripting with the core of the component model).

Perhaps I will end up agreeing with you, but the concept really intrigues me, and I would like to try it out a few times first.

IMHO it's silly to try and build a big, complex framework to solve such simple problems. Composition of components is already a language-level feature.

You only need a fancy "component" framework when you want to use data-driven entities (i.e. entities defined by text files, not by code/scripts) -- so that game/level designers can construct entities, and so entities can be dynamic (e.g. temporarily adding and removing a 'poisoned' component to your hero). For core logic like physics (or anything written by a programmer), there's no reason to encumber yourself with a designer-oriented framework.

I get that for my example it's probably a bit superfluous, but I was hoping that such a simple example would help me understand the basics of it.

OP, I propose that you reorganize your thinking about how to implement ECS before you go further. Your implementation has entities and it has components, but it doesn't really have systems - instead of real "systems", it has collections of components, which look like "objects" in this implementation, and it has methods to act on the collections of components. Try:

  • applying the "object" thinking to the higher level. Each "component" should be a system, a collection of raw data that is indexable by entity ID and has a set of methods to perform operations on subsets of the data
  • not using map, as suggested - use a sparse array, something you can iterate through without dealing with conditionals for each bit of data.
  • thinking about membership in a system as a statement that the component exists, so if an entity doesn't have a velocity, an entry for the velocity simply does not exist and does not get processed for that entity. This means you can pretty much eliminate the binding data (the "component list") you have that exists for no other reason than to tell each system whether it has an entry for a particular entity. That should be up to each system, not some otherwise-useless object that needs to be accessible to all the systems in order for this to work, with all the problems that come with that. Hint: we avoid globals for a reason!
  • Read everything in this (not quite complete) book on data-oriented design, as ECS grew out of some of these principles: http://www.dataorienteddesign.com/dodmain/

I'll have to think about this a bit but I think I understand what you're saying. And yes, I've read that book, though I definitely do not grok everything in it and have been planning to reread it at some point. It was one of the things that initially got me interested in this subject.

Thank you all for responding!

I get that for my example it's probably a bit superfluous, but I was hoping that such a simple example would help me understand the basics of it.

Hopefully my framework-less code is still useful.
In my example, it's possible to refer to components themselves instead of referring to entities -- so I made that VelocityBasedMovement component, which links to the Position/Velocity components by their IDs (plain pointers in the code above). You can then iterate through the VelocityBasedMovement components to find the Positions that need to be updated.

Alternatively, if your framework does not allow components to have IDs/handles (instead only allows entities to have handles), then VelocityBasedMovement can still exist, but just be an empty component. Basically, the existence of the component is "tagging" the entity that it should be processed by the VelocityBasedMovementSystem. The update logic would then be something like:
void UpdateVelocityBasedMovementSystem( float dt, VelocityBasedMovement* begin, VelocityBasedMovement* end )
{
  for( VelocityBasedMovement* c=begin; c!=end; ++c )
  {
    Entity* e = c->GetOwner();
    Position* pos = e->GetComponent<Position>();
    Velocity* vel = e->GetComponent<Velocity>();
    assert( pos && vel && "Only add VelocityBasedMovement to entities that have positions and velocities" );
    pos->data += vel->data * dt;
  }
}
That's a more common approach within bulky ECS frameworks.

That's because the "straw-OOP" is what some people actually think OOP is, or at any rate what they think OOP leads to. The impression I get is that a lot of code "in the wild" both inside and outside the game industry reflects that warped vision of what object-orientation means. I'd imagine a lot of people are so pro-ECS because they're reacting against the (bad) code they've actually had to deal with, that was written by real programmers working on real projects. I've been exposed to late '90s/early '00s "OOP" and I have to say, if that was my only exposure to "OO" production code, I'd be frothing at the mouth about ECS even when it isn't necessary, too.

Yeah it's just self-harming though, and teaching it to others is... other-harming.
It's throwing the baby out with the bathwater. These people are letting their ignorance rob them of an opportunity for self improvement by actually learning OOP for reals. Just because a friend once tricked you into eating a whole spoonful of Wasabi paste doesn't mean that you have to go the rest of your life avoiding adding the condiment to your sushi.

I kind of suspected this was the case. I've heard a lot of complaints about the STL in general while reading up on this


The STL isn't bad, it just has gotchas and isn't necessarily ideal for every use case. For basic day-to-day needs it works fine. For special needs like _the core data structures of a game engine_, you have to put in a little more work yourself.

but I figured for at least demonstrating what I wanted to do it would be ok for now and could be changed once I got to work on a real implementation. It is nice to have confirmation on this point though.


Sure. You're just not learning anything at all about data-oriented design if you start out with a map. The whole _point_ of data-oriented design is that you think about your data and how it's used (in terms of how the CPU accesses it) first and foremost. So since you said you wanted to learn data-oriented design...

That does rather imply that you must have a strong grasp of data structures, computer architecture, algorithms, and what your game actually does, all before you even start _thinking_ about a game object model.

I think I understand this and it sounds like a good idea. So, using this example, when trying to add a velocity component to an entity, that entity should first be checked to make sure it has a position component?


Right. Velocity has no purpose without position so just don't allow it.
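
A minimal sketch of that add-time check, under stated assumptions: `World`, `addPosition`, `addVelocity`, and `hasPosition` are hypothetical names, and the linear search is acceptable only because the check runs when a component is attached (for example at load time), never inside the update loop.

```cpp
#include <algorithm>
#include <vector>

struct Position { float x, y; };
struct Velocity { float x, y; };

// Hypothetical container for two component types.
struct World
{
    std::vector<int> positionOwners;   // entity IDs that have a position
    std::vector<Position> positions;   // parallel to positionOwners
    std::vector<int> velocityOwners;   // entity IDs that have a velocity
    std::vector<Velocity> velocities;  // parallel to velocityOwners

    bool hasPosition(int entity) const
    {
        return std::find(positionOwners.begin(), positionOwners.end(), entity)
            != positionOwners.end();
    }

    void addPosition(int entity, Position p)
    {
        positionOwners.push_back(entity);
        positions.push_back(p);
    }

    // Refuses to attach a velocity to an entity that has no position.
    // Returns false so the caller (a loader or editor) can report the error.
    bool addVelocity(int entity, Velocity v)
    {
        if (!hasPosition(entity))
            return false;
        velocityOwners.push_back(entity);
        velocities.push_back(v);
        return true;
    }
};
```

With the dependency enforced here, every velocity entry is guaranteed to have a matching position, so the movement loop needs no existence check.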

With a pure ECS approach, the entity would just be "invisible" to the Movement System because it doesn't have all the required components. You still wouldn't need a branch in your loop, however...

That just means that you can have an object that is mysteriously broken and doesn't move with no clear indication of why. You can add diagnostics to the system saying which entities it ignores and why. However, as your game grows, you'll end up with a lot of systems that very intentionally only work with separate but overlapping combinations of components. Actual problems will get drowned in the log noise.

Like many things, the pure approach looks good on paper while the reality is a lot more complicated.

Perhaps I will end up agreeing with you, but the concept really intrigues me, and I would like to try it out a few times first.


Wise. Keep that attitude up.

Sean Middleditch – Game Systems Engineer – Join my team!

That said, ECS is, in my not so humble opinion, quite silly. There's a reason you only find them crowed about on Reddit and hobbyist blogs but can barely find any evidence of ECS in the professional games segment.

I tend to agree with this. It feels like going from one extreme to the other. Interestingly, in the Bitsquid engine (which you mentioned earlier) the ECS was actually an afterthought according to their blog. Before that they relied entirely on scripting.

What 'easier models' do you suggest? Roughly speaking.

