# How should systems pick which entities to process? (Entity Component Systems)

This topic is 998 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Hey all,

I'm writing a basic 2d isometric tile-based game engine, and trying to follow the entity component system design pattern. My understanding is that Entities are generic containers that contain Components. Components are the different types of data that you associate with that entity. Systems are the game logic, and loop through entities that have the correct combination of components that the system needs to process. So, my question is: How do these systems pick which entities have the correct combination of components? I can think of a few ways, that I'll suggest below in pseudocode, but I was wondering how other people do it.

#1: Each system loops through all of the entities in the game, and explicitly checks if they contain the correct components.

function RenderSystem.process(entityList) {
// Loops through all entities (maybe inefficient if there are a lot of entities in the world?)
foreach(entity in entityList) {
if (entity.componentList.contains('PositionComponent') && entity.componentList.contains('SpriteComponent')   {
graphicsContext.render(
entity.componentList['PositionComponent'].x,
entity.componentList['PositionComponent'].y,
entity.componentList['SpriteComponent'].image
);
}
}
}

#2: Have some sort of bitmap that represents the components an entity can have. Each bit position represents a component type.

function RenderSystem.process(entityList) {
// Loops through all entities (maybe inefficient if there are a lot of entities in the world?)
foreach(entity in entityList) {
// Let's say that position 1 and 3 represent the position and sprite components.
if ((entity.componentBitMap & '1010') == '1010') {
graphicsContext.render(
entity.componentList['PositionComponent'].x,
entity.componentList['PositionComponent'].y,
entity.componentList['SpriteComponent'].image
);
}
}
}

#3: Have strongly typed "Nodes" that represent collections of components.

class RenderNode {
public PositionComponent position;
public SpriteComponent sprite;
}

function RenderSystem.process(renderNodeList) {
// Loops through all nodes that have been added to some linked list of nodes.
foreach(node in renderNodeList) {
graphicsContext.render(
node.position.x,
node.position.y,
node.sprite.image
);
}
}

#4: Have separate hash maps of components where the entity that they belong to has it's id as the key. Do an intersection on the hash maps required by the system, and only process those entities whose id was in all required component hash maps.
I have seen some posts alluding to this sort of approach, but I haven't seen any articles or tutorials implementing it.

function RenderSystem.process(entityList, componentManager) {
// Get the set of entityIds that we want to process by intersecting the required component hash maps
entityIdList = componentManager['PositionComponent'].keys.intersect(componentManager['SpriteComponent'].keys);
foreach(entityId in entityIdList) {
graphicsContext.render(
componentManager['PositionComponent'][entityId].x,
componentManager['PositionComponent'][entityId].y,
componentManager['SpriteComponent'][entityId].image
);
}
}

Some notes:
• #1 through #3 have components on the entity object itself, while #4 has components stored in a separate data structure.
• #3 would require some sort of registration with other data structures upon entity creation. This would have some overhead when attaching components dynamically at run time (for example, what happens if your character gets frozen by an ice monster and can no longer move, and you want to remove its "MovementComponent," you'll have to find all of the nodeLists that have movement components as one of their properties, and remove the entity from the list).
• #3 allows the system to just process the required components. No need for filtering entities, or performing intersections on lists of components. By increasing overhead in attaching components to entities at run time, you've decreased the overhead of filtering within the system processing.
• #3 allows for strongly typed collections of components. My pseudocode above is for a system that is dynamically typed, but in a static language, I would have to explicitly cast the components before accessing their properties. For example...
entity.componentList['PositionComponent'].x
...would become...
(PositionComponent)(entity.componentList['PositionComponent']).x
...because componentList would just contain IComponent or something like that. It wouldn't know about x.

##### Share on other sites

I'm writing a basic 2d isometric tile-based game engine, and trying to follow the entity component system design pattern. My understanding is that Entities are generic containers that contain Components Components are the different types of data that you associate with that entity. Systems are the game logic, and loop through entities that have the correct combination of components that the system needs to process.

That's a very typical way to approach a component-based entity system. Note that the containment is conceptual. Entities need not physically contain components (and often it's better if they don't). There are other ways to approach the data and behavior separation as well; there's no single correct approach.

That said, as to your specific examples:

#1, #2 #4 are basically the same (iterate all entities, check for appropriate components). The difference is mainly in the efficiency of the check for components (#4 iterates a more scoped set but also performs the costly intersection test every update... for every system). This approach is can be extremely problematic if you have lots of entities with wildly differing component combinations since you're wasting time looking through entities you'll never process, and you're looping over all entities for every "system" in the game. #3 is, I think, more reasonable (although I'm not sure a linked list is great idea -- it rarely is). I didn't follow your link, though.

The ideal way to structure the iteration of a entity system like the one you've described is to simply put the components in the system. That way each system loops over all its components, which it knows are the only components it has to process. You loop over only what you need for each kind of behavior, and you do so in a cache-efficient manner (unless you happen to have to ask another system for a related component for an given entity).

An entity here is simply an ID or a handle; a lightweight reference or proxy that can be passed around. Systems know about eachother if they know they're going to need data from other systems to function, and can ask other systems for components belonging to particular entities. Each component knows which entity it's associated with (directly, or through an association made in the system), thus enabling the correlation. Systems that are wholly independent of others simply iterate all their components in the most-efficient manner for their behavior and perform the appropriate logic.

Edited by Josh Petrie

##### Share on other sites

Hey Josh,

Thanks so much for replying! I believe I understand what you're saying.

What if there is a component that is used by several systems, but doesn't necessarily belong to any particular system. Like, say for instance there is a PositionComponent that is used by the PhysicsSystem, RenderSystem, AiSystem, ControllerSystem, etc.  You suggest placing components on the system, and then having systems communicate to get components that belong to other systems. Which system would you place the PositionComponent in, in this case?

##### Share on other sites

What if there is a component that is used by several systems, but doesn't necessarily belong to any particular system. Like, say for instance there is a PositionComponent that is used by the PhysicsSystem, RenderSystem, AiSystem, ControllerSystem, etc.  You suggest placing components on the system, and then having systems communicate to get components that belong to other systems. Which system would you place the PositionComponent in, in this case?

Systems are not required to be independent. You might very well have a TransformSystem that owns and manages positional information. Other systems then would use that system. Systems might even have their own copies of some data for parallelism reasons and basically just need to copy updates in to or out of other systems, e.g. Physics might just blast simulation updates into a TransformSystem (via a single memcpy if you're doing things the data-oriented way).

It's not uncommon in a component-based architecture to have more concepts like a "component mapper" (sometimes just part of the component's factory), so you end up with:
• Game Object contains references to Components
• Component contains independent dependency-free data, sometimes logic
• Mappers own collections of specified components and provide accessors, but have no update logic
• Systems reference Mappers and contain all of the update logic for high-level concepts (Targeting, Physics, etc.)
Things then get a little messier still because of third-party libraries or even just good module design. For example, your renderer should probably not have a dependency on game objects, components, or systems; the renderer's sole job is to render whatever you throw at it. This might include high-level graphics concepts like the scene and camera configuration, animation states, etc. Which is important, because you'll easily end up in situations where you need to render things that have no game object (e.g. a game object might have multiple models, or you might have several very different things that all need a Model but can't reasonably all use the same ModelComponent/ModelSystem, etc.).

These needs lead to a "parallel hierarchies" structure where most of your components and game systems are basically nothing more than dumb glue that holds together different modules like your third-party physics library, your renderer, etc. Rendering has its own set of data and logic and just doles out handles to persistent objects. Likewise for physics. A higher "game glue" layer then contains all the code for game objects and components and ECS-like systems that is responsible for associating handles with game objects, update orders, etc. Something like
• Title-specific Game Logic (highest layer)
• High-level Game Glue
• Physics, Rendering, Audio, Input, Low-level Modules
• Shared Core Infrastructure (lowest layer)
The bottom two or three layers represent your "engine" and the upper one or two layers represent your "game." It's rather subjective which bits end up in the top two layers.

##### Share on other sites
What if there is a component that is used by several systems, but doesn't necessarily belong to any particular system. Like, say for instance there is a PositionComponent that is used by the PhysicsSystem, RenderSystem, AiSystem, ControllerSystem, etc.  You suggest placing components on the system, and then having systems communicate to get components that belong to other systems. Which system would you place the PositionComponent in, in this case?

There are various ways to handle this, but in the system you initially described (components are data, systems are behavior that operate on data), data without behavior is useless. Further, one system must be the authority for position (et cetera), and thus everybody shares that position data by asking the authoritative system for it. Typically this would be the physics system; there's no reason to have a "transform component" since there is no behavior associated only with transformations (or at least that is not also wholly encapsulated in physics behavior), so you simply make the physics component the authority. Anybody who needs position information asks for it from the physics system.

You can create a no-op transformation system to store transformations in if you need to abstract away the possibly presence of a physics system though, but otherwise you're functioning exactly the same as any other piece of shared data. Somebody must own that data, if for no other reason than that somebody is the first somebody who gets to process it in a given frame. So ask for the data from that system.

Further, it's worth noting that the idea of position being "shared" isn't always applicable. Logically, many things have position, but in practice the same form of position isn't always the most optimal. Physics might use a big structure of arrays, but rendering may convert to tile positions. So there's that to consider.

##### Share on other sites

In my ECS, the systems themselves know which components they need to work with. I'm working with C#, so I've implemented that through Attributes.

When an entity is created, the EntityManager notifies each system that a new entity was created, and the systems decide if they should care about it. If they do, they add the entity to their internal membership list. Similarly, whenever a component is dynamically added or removed from an entity, the same process is performed, and in some cases results in the entity being removed from the membership list of the system.

Finally, I have two different lists to specify the relationship between the component and the system. "AllOf" or "AnyOf". "AllOf" means the entity has to have all of the components in the list to be processed by the system. If any are missing, it is ignored. "AnyOf" means the entity will be included if any of the components are present on the entity.

##### Share on other sites

That's why I use a TilePosition component and ScreenPosition component.  A Tile system is responsible for finding the closest Tile position for a sprite's screen position. It also looks for any static sprites that have a Tile Position set at load time but not a Screen Position. The system sets the screen position for them according to their tile position. Additionally, you may also want the tile information to exist in an array or hash map if you want the fastest lookup times for handling collisions. If your level has tens of thousands of tiles it is a lot better to check collisions against moving entities by looking up tiles in a hash map or array, than to loop through the thousands of tiles for every entity that can collide.

I did what Phil_t does and put all components in an array, with their index indicating what entity they are a part of, and register at load time. Although I still tend to bitmask/null check at every frame, I eventually want to register very large quantities of similar Entities early. There really is no Entity structure defined in the code because they're implicitly defined by the array indices.

##### Share on other sites

I'm seeing some comments in this thread suggesting that like components should be in an array (and not stored on the entity that they "belong to"), so that they will be laid out sequentially in memory to take advantage of spatial locality when they are processed each game loop. Is this a premature optimization, or is this an important design decision to make? If the data is stored in component objects, will it make a difference anyway (won't the references to those objects be stored sequentially, and not the actual data)?

If I'm planning on this engine to be used primarily for mobile platforms does this matter more, or less (I know in Java, you are kind of at the mercy of the JVM for stuff like this, objective-c, I'm not sure)?

I'm a noobie game developer, so how important is being "cache-friendly," and should it affect how I implement my entity-component-system engine?

EDIT: If anybody is interested, I found an older thread with a similar topic (the thread starter uses method #1 from OP). Similar answers are given there before the thread goes a bit off topic: http://www.gamedev.net/topic/617256-component-entity-model-without-rtti/

Edited by DapperCam

##### Share on other sites

Is this a premature optimization, or is this an important design decision to make? If the data is stored in component objects, will it make a difference anyway

It can be. I argue strongly that developers should use much simpler component architectures and utterly avoid ECS. Having a vector of pointers to IComponent objects with a virtual Update() method works well for even many big AAA games.

However, once you need to start squeezing performance out, architectural choices can leave you painted into a corner. Converting a million lines of engine, tools, and game from a naive component architecture into something with decent memory access patterns is a months-long process that is simply infeasible to carry out for a game company with multiple developers and a real timeline or budget.

(won't the references to those objects be stored sequentially, and not the actual data)?

The idea is that you store the _actual objects_ in contiguous memory, not just references. e.g. you have a vector<FooData> and not just a vector<FooData*>. This of course is not an option in languages like Java.

If you're in a language like Java _and if you need the peformance_ then you're in for a hard time. There's a reason that Java isn't used often for high-performance anything. The Android games where performance matters don't use Java; they use the NDK.

If you're just writing a small/simple Android game or just learning for the first time, neither Java nor memory access patterns will be your bottleneck, though. Ignore the performance advice and just stick to simple, easy-to-understand, easy-to-modify code. Your big bottlenecks are going to revolve around how you draw, how your AI works, etc. Optimization is usually algorithmic; it's not until you have a ton of experience and are truly pushing the hardware that you'll really _need_ to start optimizing memory access patterns.

That said, lazy coding and bloated runtimes are why smartphone batteries still only last for a handful of hours or why your desktop in 2015 feels no more responsive than they did in 1995...

If I'm planning on this engine to be used primarily for mobile platforms does this matter more, or less (I know in Java, you are kind of at the mercy of the JVM for stuff like this, objective-c, I'm not sure)?

Objective-C is a native (non-interpreted) language; it has some higher-level primitives than most other native languages, but you can avoid those for the most part. You can of course write the vast majority of your game in C or C++ and then just use Objective-C for the few platform integration bits where Apple forces you to use it.

Likewise on Android you can use the NDK (Native Development Kit) and completely avoid Java/Dalvik and instead write C or C++ for the majority of your game.

There are cross-compilers for other languages like C# that target these native layers, so you can use "improved Java" languages like C# or modern high-level languages like Rust if you aren't into C/C++.

##### Share on other sites

Cache-Aware Components by Randy Gaul is a great article: http://www.randygaul.net/2014/06/25/cache-aware-components/

You can view an example of it in use in the SEL repository:  SelHandleManager is the one to look at. ( https://bitbucket.org/rgaul/sel/src/32fd2e5fb1085cd1ad789ac3a5d68111b03dc450/src/SEL/Containers/?at=default )

##### Share on other sites

My systems grab the components factories they need on constructor. The factories are the ones holding the compos pools.

Systems work on compos only, not entities (till now, they work on a single compo pool). The system can send an event to the entity in the case other compos need update (this can be optimized by having pointers to other components in the dependent component).

Example:

Entity

-ColliderCompo

-TrafoCompo

-SpriteCompo

CollisionSys(ColliderCompoFactory)

-CollisionSys.Update -> solves all penetrations trough ColliderCompo data, send trafoupdate event to the entity (trafo and sprite compo get pos updated)

Edited by Icebone1000

##### Share on other sites

I've read all of the responses/articles/source posted in this thread and it's been very helpful. I didn't even know there was a such thing as a CPU cache until yesterday (I develop business apps in managed languages). It's clear to me that no matter which design you choose, it's important to keep homogeneous data packed tightly in arrays that are looped over to apply game logic. Thanks to everyone who responded!

This excellent article was posted by its author that was very informative: http://t-machine.org/index.php/2014/03/08/data-structures-for-entity-systems-contiguous-memory/

##### Share on other sites

I'm surprised that so many people in this thread treat the choice of array vs vector as a major early-time design decision. Array vs vector is implementation detail; expose methods that will allow you to iterate through components, add and delete them, so that the user of this system won't even know if it's array, vector or experimental probabilistic data structure on the inside.

##### Share on other sites

I'm surprised that so many people in this thread treat the choice of array vs vector as a major early-time design decision.

I don't see anybody in this thread suggesting that the important decision is "array or vector." In fact, no posts use both the term "array" and "vector" at the same time until yours. Did you mean to reply to a different thread?

The general discussion here is about where to put the array (or vector) and what to put in it, which is a more significant decision than "array or vector" since it may have nontrivial impacts on the performance tradeoffs of the API being constructed.

Edited by Josh Petrie

##### Share on other sites

I'm surprised that so many people in this thread treat the choice of array vs vector as a major early-time design decision. Array vs vector is implementation detail; expose methods that will allow you to iterate through components, add and delete them, so that the user of this system won't even know if it's array, vector or experimental probabilistic data structure on the inside.

? I don't see anyone making that claim. Vector is a c++ thing, the OP is using java, and "array" is kind of a broad term that without context doesn't refer to any specific data structure.

There was discussion around whether or not components are stored in a contiguous per-type array, or within heterogeneous arrays inside each entity. Quite a different thing, with very different memory layout, which can be time-consuming to change later on. (Interestingly, Jon Blow is designing a new game-oriented programming language where one of the features is that you can change memory layouts (AOS vs SOA) without changing the code that operates on this data).

Edited by phil_t

##### Share on other sites

I'd say #3 is my preference, but I think it can be taken farther - instead of prioritizing "components" per se, design your architecture around operations linked together by APIs. Here, an "entity" just becomes something that associates otherwise-unrelated data together and "components" become the "data context" needed by some particular operation. This allows you to focus a particular bit of code on what actually needs to be done, so that when you read a particular bit of code you know exactly what it is intended to accomplish.

IMO, The main reason to use this layout is not premature optimization, but is to encourage an efficient structure. You dont want to do option 1-3, because you will be wastefully iterating through all entities for every system.

But option 3 lets you do existence-based processing - instead of having a render node for every single object, and looping through render nodes for all the entities that exist, you can only store the ones that you actually need to draw. For example, you might be building a sprite-based game where a "render node" consists of a sprite ID, a position, and an animation frame index to draw (the actual animation state would be in a separate subsystem). You could associate one of these with everything that's renderable, and loop through them all, apply some kind of visibility test, and draw them if they're visible. That would be the quick, easy way to do it.

Or, you could keep your entities in some kind of spatial partitioning data structure, and when the player moves into their "zone" send them a message to wake them up, which would include putting a render node into the render subsystem's active list, have the zone system remove their render nodes when the player leaves the zone, and therefore only ever process the render nodes that are actually relevant.

Some useful resources:

Data-oriented design eBook

Casey Muratori on "Semantic Compression"

Edited by Oberon_Command

##### Share on other sites

Or, you could keep your entities in some kind of spatial partitioning data structure, and when the player moves into their "zone" send them a message to wake them up, which would include putting a render node into the render subsystem's active list, have the zone system remove their render nodes when the player leaves the zone, and therefore only ever process the render nodes that are actually relevant.

What about entities that don't have a position component? Do you mean keeping the identifiers of the entities that do have a position component in a separate data structure outside of the entity component system pattern?

##### Share on other sites

Or, you could keep your entities in some kind of spatial partitioning data structure, and when the player moves into their "zone" send them a message to wake them up, which would include putting a render node into the render subsystem's active list, have the zone system remove their render nodes when the player leaves the zone, and therefore only ever process the render nodes that are actually relevant.

What about entities that don't have a position component? Do you mean keeping the identifiers of the entities that do have a position component in a separate data structure outside of the entity component system pattern?

Yes. Though, if it doesn't have a position, how will you know where to render it? ;) Alternatively, you could keep that data structure within the transform (I wouldn't put a "position" by itself - normally when you have a position, you also have a rotation or physics parameters) system.

The entity-component system pattern is not a silver bullet. It's just one way you can organize the data relating to a subsystem. Your "components" don't even have to be associated with any particular entity for some applications. Within the subsystem you can organize the data however is best for what you're trying to accomplish with that subsystem. You should also try to get away from the notion of designing your code by applying "patterns" to things. A "design pattern" is just a way of explaining to other programmers what you did after the fact. Just determine what your code needs to do, exercise your critical thinking skills and write your code.

Edited by Oberon_Command

##### Share on other sites

You could take a look at artemis (C++ port) https://github.com/vinova/Artemis-Cpp .

What they do is similar to how phil_t described it earlier.

• Each entity has an id, a type bitstring and a system bitstring.
• Everytime you add a component to an entity, its type string is updated, but not the system string. Also, that entity is added to a "refreshed" array
• At the start of your main loop, you need to call a loopstart() function ( world.loopStart() ). This function iterates over all the entities in the refreshed array
• In loopstart, each entity that was refreshed will be acted upon by all the systems registered with the world object
• For each system, the system's bits will be compared with the entity's type bits ( the system's bit indicates what all components/types will this system act upon)
• EDIT: The previous point is a bit incorrect. The systems themselves will have a system bit and type bits. The system bit will identify that particular system, while the type bits will identify what all components are required if an entity wants to be acted upon by that system
• If the entity is interested in that system(i.e. the corresponding type bit(s) is/are set), then its system bitstring will be updated, and that entity will be added to that system's actives list
• When you call any system's update() method, it will only iterate over the entities in its actives list.
• The only downside I see here is that your changes to an entity will not be available in the same loop. This is actually a good thing, as explained here
Edited by SpaceRoach

##### Share on other sites

I'm seeing some comments in this thread suggesting that like components should be in an array (and not stored on the entity that they "belong to"), so that they will be laid out sequentially in memory to take advantage of spatial locality when they are processed each game loop. Is this a premature optimization, or is this an important design decision to make? If the data is stored in component objects, will it make a difference anyway (won't the references to those objects be stored sequentially, and not the actual data)?

The "premature optimization" quote is the quote which is most often misquoted and taken out of context, here.

Making a high level architectural choice of where to put some data (in this case because the code gets simpler to write, when you wont discard the type information for components and there is no useless typechecking in the entity to make up for it), is a completely different thing than obfuscating a function to microoptimize it before you know it was needed (what the quote is about and in later sentences is relativating).

This is the main reason why I am storing components in an array. It's not so much for cache optimization, but to avoid type casting on every frame. At initialization each System gets the array of Components it needs to work on. They get cast to arrays of the derived Component classes which are stored in the System. When entities are processed I can access the properties of each Component without doing any casting.

Edited by CC Ricers

##### Share on other sites

This topic is 998 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.