Best way of storing components in ECS

Started by
8 comments, last by frob 4 years, 6 months ago

Hi!

I am trying to build a small 2D engine, and I was trying to model my entities using an (sort of) ECS (entity component system). Every entity is just an index, and all entities are represented by components of various kinds.

All my components live in a struct of arrays, something like:


struct {
  TransformComponent transforms[N];
  SpriteComponent sprites[N];
  ColliderComponent colliders[N];
  ...
} entities;

Now, my question:

Should I go for

  • arrays on the stack, as in the example,
  • or for arrays on the heap (e.g. just having `TransformComponent * transforms` and then creating new components with `new TransformComponent{...}`),
  • or for a preallocated buffer of raw memory -either on stack or on heap- (e.g. `unsigned char transforms_buffer[N * sizeof(TransformComponent)]`) and then creating new instances using placement new (`new(&transforms_buffer) TransformComponent{...}`),
or for using smart pointers/arrays?

I feel there are so many different ways one could go, and I don't know which one I should prefer really.

Thanks!

MP

Advertisement
Quote

arrays on the stack, as in the example

What you posted won't necessarily reside on the stack (and the stack might not be the best place for it if N is sufficiently large). But it seems like you could certainly use arrays with size specified at compile time if you're ok with N being fixed.

Quote

or for arrays on the heap (e.g. just having `TransformComponent * transforms` and then creating new components with `new TransformComponent{...}`)

I'm not sure what you have in mind here, but I often see contiguousness of data emphasized when discussing ECS, so you probably wouldn't want to be creating individual instances using 'new'. But, you could certainly use arrays with size specified at run time (perhaps using an RAII container like std::vector to handle, at minimum, cleanup).

Quote

or for using smart pointers/arrays?

If you mean smart pointers per component, then I would think that wouldn't really fit the paradigm. If you mean smart pointers or some sort of RAII wrapper per array, then sure (typically you'd use std::vector or similar for this).

I haven't implemented an ECS system myself, so I won't comment further on that part of things (there might be considerations specific to ECS systems that might push you towards or away from some of these options).

You definitely don't want to be storing components on the stack since it's pretty small (defaulting to 8mb on Linux and 1mb on Windows maybe?). As mentioned above, what matters isn't whether you're storing your components on the stack or the heap, it's that the data for each type of component is contiguous such that the CPU can predict that it should pull data into cache before you actually access it. Something like:


struct Components {
	PositionComponent *positions;
	VelocityComponent *velocities;
}

// ....

Components components;
components.positions = new PositionComponent[NumEntities];
components.velocities = new VelocityComponent[NumEntities];

// ....

for (unsigned int entity=0; entity < numEntities; entity++)
{
	positions[entity].x += velocities[entity].x;
	positions[entity].y += velocities[entity].y;
}

Now looking at this, you might notice a couple of problems you're going to run into very early on. The first is that you are probably going to want to dynamically add/remove entities on the fly, so you'll need some way of expanding the component arrays. If you're using C++, just use std::vector or some homebrew equivalent, if you so prefer.

Another issue is that not every single entity is going to be equipped with every kind of component. It's possible you'll have components that only 20% of your entities have, and while you can wrap each component in a std::optional (or, again, some homebrew equivalent) you'd still be allocating memory for components that go unused by the other 80% of your entities.

You'll want to consider a mix of storage strategies. The Specs crate (an ECS library written in Rust) conveniently has some pleasant documentation that covers the storage types it offers out of the box: https://slide-rs.github.io/specs/05_storages.html. There are likely similar libraries for C++ that you could look into the workings of if you want to learn from something more language-specific! I haven't done so myself, but it could be worth looking into Unity's approach for its ECS.

I think this is a pretty good resource for learning about ECS: https://kyren.github.io/2018/09/14/rustconf-talk.html. It's another Rust-focused piece that's laced with more propaganda than the Specs documentation, but if you can mentally filter out that noise you can get some solutions to problems you will inevitably encounter while working on your own ECS. At the very least, the section discussing generational indexes is quite helpful.

We are handling this as follows:

  • Every component has it's own memory region of the same kind of components. If the region is too small, it expands itself thread-safe. This way we keep the performance impact on adding new component instances of a type to an entity instead of where the systems aquire them
  • Entities don't contain pointers to anything, they instead contain an index for each component where it is stored in the global array and a mask.
  • Entities are collected once before they are processed for every system when an entity updates or is created. This array contains only indices to the global entity store and just those entities who's bit mask matches the bit mask of the system (a system for example depends on the transform and mesh component so every entitiy where (system_mask & entity_mask) eq system_mask is a valid candidate for processing)
  • The system acquires locks to the component arrays it operates on and the global entity store for as long as it is processing them. Locks can be taken for parallel-read or eclusive write, depending on what the system wants to perform on the components. This way multiple systems can in parallel proess the same components or subset of components if they only consume but not create data

Alternatives:
* Thread-safe methods without a state: Don't store anything for the caller, just give an API of separate calls taking types by reference. This is the low-level API targeting C and C++ that's difficult to port into higher languages. For C++ only, reference counted handles can be used to avoid leaking memory. Recommended for large projects.
* Calls without custom types: Make buffers for dynamic sprites and light sources to be filled, used and cleared in each frame. Only resize when a buffer overflows and reuse the allocation for the next time. Keep a quadtree(flat 2D) or octree(rotated isometric) of static sprites that can be pre-drawn to 512x512 background tiles while the camera moves. Background tiles can be drawn quickly using memcpy if you're using the CPU and the remaining dynamic sprites should only cover around 10% of the screen. The main drawback is having a global state, but it's okay for a small retro game that only has a few modules. The plus sides are not having to expose internal types to other languages in the API and you can't leak any memory. Recommended for tiny projects.
* Integer handles: Make a sparse list of objects and let them hold the same index until freed. The good side is to allow more powerful manipulation of objects across languages while having memory integrity encapsulated in a single language. The bad is giving up the ability to reference count automatically. Recommended for medium size projects written in different languages.

Thank you everyone for your answers! :D

@Zakwayda What do you mean "what I posted won't necessarily reside on the stack"? Aren't arrays local and therefore created on the stack? (Unless one creates a dynamic one with ptr = new Component[N])

Regarding the instantiation of components, you would suggest for something like


struct {
  vector<TransformComp> transforms;
  vector<ColliderComp> colliders;
  vector<SpriteComp> sprites;
  ...
} components

where all components are created (as defaults) just once at the beginning (and new ones are created whenever the vectors expand), and I then reference to them for "populating" an Entity? (Meaning, an entity is something like a list of indexes that refer to which components belong to that entity, and instantiating an entity just means instantiating this list, without any creation of new components, and maybe setting some members of its components, since the default ones may not work).

For the third point, yes, I was pondering about what to go for: simple arrays? pointers? vectors? (smart) pointers to vectors? other collections, like std::array?)

 

@DixiE Thank you for the comprehensive snippet! I am not very familiar with Rust. I had a look at the library and at the conf transcript, but I haven't been fully able to grok the contents. What seemed interesting in the conference transcript was the Generational Indexing. To me, it seemed a way to solve the "need for instantiation of new components" problem, since it's a way of marking unused components so that they are reused later for new entities (correct me if I'm wrong). I'll have to read everything again, because I am not sure I got any idea on how to solve the "not all entities use all components" problem (which is about addressing the inefficiencies due to allocating N components of all types at the beginning).

Any reference for C++ on these topics?

 

@Shaarigan Nice! What data structure do you use for the first point? (To keep all components of one kind, and expanding the collection in a thread-safe way. By the way, do you create N new components when you have the need for expanding one of those collections? Or do you just allocate additional memory and copy all old components to the new buffer, and then create components when you need them?)

What do you mean with " This way we keep the performance impact on adding new component instances of a type to an entity instead of where the systems aquire them" ? I'm not sure I understand.

Your technique (with entities that are just masks for which components they use) is similar to the ECS implementation described here:

is it? Or is it much different?

What do you mean exactly by "collecting entities before they are processed"? That you do something like:


// running update for system S
relevant_entities = vector
for entity in entities {
  if entity.mask matched S.mask
    relevant_entities.push(entity)
}
for relevant_entity in relevant_entities {
  // do stuff on relevant_entity's components that are important for S
}

instead of just looping once over all entities and updating their components if their mask matches the system's mask? (I.e. everything in one loop instead of 2 loops, one for "filtering entities" and one for updating components).

Interesting addition the one about locks and systems updating components in parallel! Any reference for that? (For ECS with parallel systems).

@Dawoodoz I am sorry I did not understand a single word of you answer. Is it supposed to be an answer to this thread? Please don't take that as an offense, it is probably me not understanding what you said because it was too complicated for me! (Just starting out)

 

2 hours ago, mujina said:

@Zakwayda What do you mean "what I posted won't necessarily reside on the stack"? Aren't arrays local and therefore created on the stack?

What you posted could also be (for example) at file scope, in which case it would reside elsewhere.

2 hours ago, mujina said:

Regarding the instantiation of components, you would suggest for something like



struct {
  vector<TransformComp> transforms;
  vector<ColliderComp> colliders;
  vector<SpriteComp> sprites;
  ...
} components

where all components are created (as defaults) just once at the beginning (and new ones are created whenever the vectors expand), and I then reference to them for "populating" an Entity? (Meaning, an entity is something like a list of indexes that refer to which components belong to that entity, and instantiating an entity just means instantiating this list, without any creation of new components, and maybe setting some members of its components, since the default ones may not work).

For the third point, yes, I was pondering about what to go for: simple arrays? pointers? vectors? (smart) pointers to vectors? other collections, like std::array?)

Not sure if this addresses your questions at all, but unless you want to use statically sized arrays (which is an option), I'd use std::vector unless you have a compelling reason not to. Just keep in mind that std::vector can reallocate in response to certain operations, which can be expensive and can lead to dangling pointers/references/iterators. (This isn't a drawback of std::vector - you'd likely face similar issues managing the memory yourself. It's just something to be aware of.)

It depends, if you have just a hand full of components, then allocating them in one big array will squeeze the one or other ms out of the processor but out implementation is a static templated collection of cuntions that maintain a data pointer. It can for example look like this


namespace Components
{
    template<typename Component> inline Array<Component>*& Instances()
    {
        static Array<Component>* instances;
        return instances;
    }
    template<typename Component> inline ReadWriteLock*& Lock()
    {
        static ReadWriteLock* lock;
        return lock;
    }
  
    template<typename Component> inline Component* AddComponent()
    {
         //naive implementation, grow only
         Lock<Component>()->AcquireWrite();
         if(!Memory::IsZero(Instances<Component>()->Last, sizeof(Component))
            Instances<Component>()->Resize();
            
         //... add component here
            
         Lock<Component>()->ReleaseWrite();
    }
}

So we have, due to the static local variable in our functions, a single instance of our component array and lock per component type. The compiler will create a separated unit for each of our template parameters and stores it on the heap. There is no need for us to manage it anyways and the inline calls can be optimized away so that we just access the static variable.

Array as same as ReadWriteLock are custom types we created for our needs. Resize dublicates the array and does a memory swapping with some extra space. The space is doubled in capacity each time we resize the array. This operation is costly and may involve the memory manager, so we put it into adding a new component. This is what I meant with the performance impact, doing memor allocations makes the AddComponent function costly but as you don't do it very frequently, you keep the bigger performance impacts away from the update loop. Otherwise you would need to allocate space for every entity and manage it's components there, this would involve some kind of performance impact on every access for caching for example.

By collecting entities once, you create a principle where entities are maintained in lists on the moment they are changed or created, you won't frequently loop through all entities just to see if they match certain component pattern. If a component is aded to an entity, the collector is noticed and refreshes it's internal list. Then if systems want to access certain entities with certain components attached, they can grab the entities that match exactly these mix of components from their requested list. This will however increase performance because entity IDs have to be stored in different arrays

ECS by itself is a rather generic concept, and there is no universal "best".  What works great in one system may work terribly in another system.

At the core ECS is a form of composition, you are composing things and building them up out of other things

You can use independent objects, you can use IDs to objects, handles to objects, pointers to objects, intermediate proxies, or many other ways to compose your things.  What matters much more is the access patterns and usage patterns.  Simple sequential linear access takes advantage of the hardware cache and tends to get the best processing throughput. Processing large blocks using the overhead of a single call tends to be better than processing many small blocks with the overhead of many small calls. Data locality matters, process on the CPU the data on the CPU, process on the GPU the data on the graphics card, process on remote hardware the data on remote hardware.  But those are all implementation details of ECS that are unique to the problems you are trying to solve.

 

This topic is closed to new replies.

Advertisement