C++ Component Based Engine Design

Started by
9 comments, last by phil_t 9 years, 10 months ago

a friend of mine and me are building a game. Therefore we decided to use a Entity Component system.

What do we want to achieve?

  • Cache friendliness
  • Fast iteration times
  • Code as small and clean as possible

Our overall design we got until now:

  • Handles instead of raw pointers
  • Entites are small classes with an array of componenthandles
  • Components are just data structs
  • Systems that got one job

To get this done we got a so called SlotMap that holds the components internally in a inner array linearly packed. You can get the component pointers via one lookup into this structure from outside. To work on the component data the systems just have to iterate over this inner array.

I want to share the main ressources where we got our inspiration from:

Some Code:


struct Entity
{
ComponentHandle componentIDs[64];

template<typename T>
ComponentHandle getComponent()
{
return componentIDs[TypeID::value<T>()];
}

template<typename T>
void addComponent(ComponentHandle ID)
{
componentIDs[TypeID::value<T>()] = ID;
}
};

This is the entity class. The systems get registered via the scene and can get back via the TyeID template.


class Scene
{
public:
Scene();
~Scene();

EntityID& createEntity();
void deleteEntity(EntityID ID);

template<typename T>
void registerSystem(void* dataStorage)
{
m_systems[TypeID::value<T>()] = dataStorage;
}

template<typename T>
ComponentHandle createComponent()
{
SlotMap<T, ComponentHandle>* storage = static_cast<SlotMap<T, ComponentHandle>*>(m_systems[TypeID::value<T>()]);
return storage->add();
}

template<typename T>
T* getComponent(ComponentHandle id)
{
SlotMap<T, ComponentHandle>* storage = static_cast<SlotMap<T, ComponentHandle>*>(m_systems[TypeID::value<T>()]);
return storage->get(id);
}

Entity& getEntity(EntityID ID);

private:

void* m_systems[64];
ECS::FreeListTable<Entity, MAX_ENTITIES>* m_entities;
};

There we got the scene. Here comes my first question: Is it a good idea to store the Systems this way as pointers and have them to register in the scene or could the be a better way via some macros or templates?


typedef SlotMap<Mesh, ComponentHandle> MeshMap;

class MeshRenderSystem
{
public:
MeshRenderSystem(Renderer* renderer, Scene* scene) : m_renderer(renderer)
{
m_meshArray = new MeshMap();
scene->registerSystem<Mesh>(m_meshArray);
m_effect = new GreenDrawEffect(m_renderer);
}
~MeshRenderSystem()
{

}

void update(RenderMatrixMap& matrices)
{
MeshMap::iterator i = m_meshArray->begin();
MeshMap::iterator j = m_meshArray->end();

RenderMatrixMap::iterator k = matrices.begin();

for (; i != j; i++)
{
m_renderer->m_paramManager->setWorldMatrixParameter(&k->worldMatrix);

m_renderer->m_paramManager->setShaderResourceParameter(L"objTexture", i->texture.resourceView);
m_renderer->m_paramManager->setSamplerStateParameter(L"samplerState", i->texture.samplerState);

m_renderer->drawIndexed(i->indexCount, i->vertexBuffer, i->indexBuffer, m_effect);

k++;
}
}

private:
MeshMap* m_meshArray;

Renderer* m_renderer;
GreenDrawEffect* m_effect;
};

There we got a system example. You can see the system iterates linearly over the data it has to render.

Okay so now, what do we want to ask?

  • Is it a good idea to have many of this small systems that all got one job and prepare the data for the next system?
  • Sometimes i think there are systems that need a big load of lookups to work on special data. That could kill the performance, when we got a lot of data. For example: we want to calculate the worldmatrices for the rendering. So we need the positions of the meshes. Therefore we can see which entities got meshes and lookup for these the positions and calculate the worldmatrix from. That way we got the number of meshes as lookups every frame.
  • The Slotmap holds the innerarray with the data that is packed and two indices array that are 32 bit per index. So the arrays are everytime (4 * 2 + sizeof(componentData)) * 2 ^ handle::index (usually 16 bit). even if there is maybe just the half of the components used. Would it be a good idea to make this arrays dynamically sizeable?

Thanks for reading! I'm very excited to see your opinions. I'm willing to share all knowledge we got until now, so if you want to know something that goes into this topic feel free to ask.

Alex

Advertisement

I think the idea of many small systems will work fine for any small game, but really you can make any small game out of anything. This always reminds me of the "building a doghouse" vs "building a house" phrase. You can make a doghouse out of anything! Real houses require good planning and execution.

So then I ask, what is the goal here? Are we trying to finish some small scope game, or are we trying to learn about some skills that might apply to a longer term code base? If the former I'd say "looks good"; try making the game and just experiment. Note what works well and what is causing problems, and solve problems that keep the game from finishing.

If the latter I'd say you will probably not really care about fine-grained components and systems. A code base of larger scope will have dedicated systems that solve specific tasks and they almost surely won't be decomposable into small components. So plan to have some broad components, where the components are really just interfacing between systems. For the gameplay and game design related components anything goes and whatever gets the job done the best should probably just be implemented. After the hacks are in, if needed, refactoring into a more general purpose piece of technology can occur.

In any case I think you're spending too much time worrying about the cache. The cache is pretty big on PC, and even if you are indirecting yourself to hell with pointers I bet you won't even see the performance hit for a long time in your game development. As long as you understand memory access, memory layouts and data structures I don't see how you could go wrong performance wise, at least for a hobby or 2 dev game. If you're using third party libraries then the cache is going to be appropriately utilized internally by whatever libraries you're using.

Hi,

thanks for your reply. We're doing this as a learning phase at the moment but the longterm goal is a bigger game.

As you wrote the gameplay stuff is planed to be done hacky.. We want to include some scripting language where the performance won't be that important.

I've read a lot about this caching stuff and thats not the only thing. We plan to introduce multithreading to the engine where I think this sets of systems could be very nice to use.

What did you mean by

So plan to have some broad components, where the components are really just interfacing between systems

struct Entity
{
ComponentHandle componentIDs[64];

template
ComponentHandle getComponent()
{
return componentIDs[TypeID::value()];
}


This immediately stood out to me as non-ideal. This limits you to only have 64 component types, period, limits you to only have one of each type of component (not good for more generic behavior components like `ScriptComponent`), and wastes a fair bit of memory. At least move to using a run-time-sized vector of sorted typeid/handle pairs (so you can have any number of components, duplicated, and any number of component types).

* Is it a good idea to have many of this small systems that all got one job and prepare the data for the next system?


It can be. The devil's in the details, of course, but this approach both gives you good cache behavior (read from one stream and write updates to another) as well as great threading opportunities (same reason).

Sometimes i think there are systems that need a big load of lookups to work on special data. That could kill the performance, when we got a lot of data. For example: we want to calculate the worldmatrices for the rendering. So we need the positions of the meshes. Therefore we can see which entities got meshes and lookup for these the positions and calculate the worldmatrix from. That way we got the number of meshes as lookups every frame.


Think through what you're actually operating on. You can have an array of all scene nodes that just iterate over them pulling matrices from animation data or physics (or pushing rather than pulling). Keep the arrays sorted in the smae order (e.g. by EntityID) and you can do a completely linear scan of both the source and destination arrays.

That said, lots of complex modules need to work with trees of some sort. You still store the data in contiguous memory regions preferably, of course. But the modules typically are not working on components but rather their own completely separate data structures. Rendering needs some kind of tree (DBVH trees, BSP trees, octrees, whatever you're using; maybe more than one type) that just stores rendering state information, physics/raycasts need trees and physical/filter data, even AI often needs trees.

This is one of the reasons I don't usually recommend people to follow the ECS guidelines; it's a set of principles based on assumptions that (in my experience, at least) aren't always true. Not that ECS is entirely invalid, I just don't think that applying ECS principles to every single component/system is a good use of your time.

The Slotmap holds the innerarray with the data that is packed and two indices array that are 32 bit per index. So the arrays are everytime (4 * 2 + sizeof(componentData)) * 2 ^ handle::index (usually 16 bit). even if there is maybe just the half of the components used. Would it be a good idea to make this arrays dynamically sizeable?


The slot map pattern I'm familiar with is definitely dynamically-sized. I'm not sure if that's what you're asking, though.

Sean Middleditch – Game Systems Engineer – Join my team!

This immediately stood out to me as non-ideal. This limits you to only have 64 component types, period, limits you to only have one of each type of component (not good for more generic behavior components like `ScriptComponent`), and wastes a fair bit of memory. At least move to using a run-time-sized vector of sorted typeid/handle pairs (so you can have any number of components, duplicated, and any number of component types).

Yeah there would a vector of pairs maybe fit a bit better.

So The idea of many small systems does'nt have to be that bad? .. Is it also a good idea to give only one job to one system?

Keep the arrays sorted in the smae order (e.g. by EntityID)

Don't I kneed to lookup by the ID for sorting them?

This is one of the reasons I don't usually recommend people to follow the ECS guidelines; it's a set of principles based on assumptions that (in my experience, at least) aren't always true. Not that ECS is entirely invalid, I just don't think that applying ECS principles to every single component/system is a good use of your time.

Yeah I already had something like this: The Camera.. For example there is just one in the scene at the moment so its easier to implement it as a single object.


template<typename T, typename HandleType>
class SlotMap
{
public:
    SlotMap() : m_firstFreeIndex(0), m_activeElements(0)
    {
        for (uint32 i = 0; i < Pow<2, HandleType::index_bits>::result - 1; ++i)
        {
            m_outerArray[i] = OuterHandle(i + 1);
        }

        m_outerArray[Pow<2, HandleType::index_bits>::result - 1] = OuterHandle();
    }

    ~SlotMap()
    {

    }

    HandleType add()
    {
        uint32 newIndex = m_firstFreeIndex;

        m_firstFreeIndex = m_outerArray[newIndex].m_nextFreeIndex;
        //m_outerArray[newIndex].m_nextFreeIndex = 0;
        m_outerArray[newIndex].m_validation++;
        m_outerArray[newIndex].m_innerIndex = m_activeElements;

        m_denseToSparse[m_activeElements++] = newIndex;

        return HandleType(newIndex, m_outerArray[newIndex].m_validation);
    }

    void remove(HandleType handle)
    {
        m_outerArray[handle.m_index].m_nextFreeIndex = m_firstFreeIndex;
        m_firstFreeIndex = handle.m_index;

        m_outerArray[m_denseToSparse[m_activeElements - 1]].m_innerIndex = m_outerArray[handle.m_index].m_innerIndex;

        m_denseToSparse[m_outerArray[handle.m_index].m_innerIndex] = m_denseToSparse[m_activeElements];

        m_innerArray[m_outerArray[handle.m_index].m_innerIndex] = m_innerArray[m_activeElements--];
    }

    T* get(HandleType handle)
    {
        if (has(handle))
        {
            return &m_innerArray[m_outerArray[handle.m_index].m_innerIndex];
        }
        else
        {
            return nullptr;
        }
    }

    bool has(HandleType handle)
    {
        if (m_outerArray[handle.m_index].m_validation != handle.m_validation)
        {
            return false;
        }

        return true;
    }

    /* Accessors stuff for the inner array. We don't want the user to operate on the index arrays. */
    typedef Iterator<T> iterator;
    typedef Iterator<const T> const_iterator;

    iterator begin()
    {
        return iterator(&m_innerArray[0]);
    }

    iterator end()
    {
        return iterator(&m_innerArray[m_activeElements]);
    }

    const_iterator cbegin()
    {
        return const_iterator(&m_innerArray[0]);
    }

    const_iterator cend()
    {
        return const_iterator(&m_innerArray[m_activeElements]);
    }


private:
    struct OuterHandle
    {
        OuterHandle()
        {
        }
        explicit OuterHandle(uint32 nextFreeIndex) : m_nextFreeIndex(nextFreeIndex), m_validation(0)
        {
        }

        uint32 m_nextFreeIndex : 16;
        uint32 m_validation : 16;

        uint32 m_innerIndex;
    };

    /*
     * The size of the array depends on the set handle.
     * If the index of the handle is set to 12 for example the array can hold 4096 structs.
     */
    T m_innerArray[Pow<2, HandleType::index_bits>::result];

    OuterHandle m_outerArray[Pow<2, HandleType::index_bits>::result];

    uint32 m_denseToSparse[Pow<2, HandleType::index_bits>::result];

    /*
     * The number of active elements.
     * This is used to get the index of the last active elements in the array.
     */
    uint32 m_activeElements;
    uint32 m_firstFreeIndex;
};

That's the SlotMap. I know the one you posted on your blog. But I didn't understand why its dynamically sized, because i've read often this one "You're a game programmer, you know about your data!"

So the map should be made that big as it needs to, or not?

Yeah there would a vector of pairs maybe fit a bit better.


That's what I meant, yeah. See http://www.boost.org/doc/libs/1_55_0/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx for what is a pretty great data structure for this kind of stuff (flat_map).

So The idea of many small systems does'nt have to be that bad? .. Is it also a good idea to give only one job to one system?


Evaluate it on a case-by-case basis. If you have a bunch of inter-dependent systems, you might have a problem. If you have a system that does several radically different things, you might have a problem. There is no single rule that you can just blindly follow.

Keep the arrays sorted in the smae order (e.g. by EntityID)

Don't I kneed to lookup by the ID for sorting them?


Maybe. Again, see flat_map. Dependending on how many items you have (and if you remember to keep PODs as PODs and trivially movable/copyable/destructible) you can get by just fine with the flat_map.

Yeah I already had something like this: The Camera.. For example there is just one in the scene at the moment so its easier to implement it as a single object.


I still highly recommend using components and generic game objects. Just not trying to force one architectural pattern (ECS) onto every single component type.

But I didn't understand why its dynamically sized, because i've read often this one "You're a game programmer, you know about your data!"
So the map should be made that big as it needs to, or not?


_Do_ you know about your data? Did you pick the right size you need after careful analysis and tuning? "64" in the component list case seemed like an arbitrary programmer number to me. smile.png

If level A needs a different size than level B, can you apply those? If so, do you force level designers to pick the right sizes? At the very, very least, avoid _compile-time_ sizes.

Do you have metrics in place to find out if you're under-utilizing the size you picked? If you have said metrics, couldn't you just as easily detect and warn if you're going over a soft limit rather than enforcing a hard one?

Are you targeting a system with constrained memory? 32-bit PC apps have 2GB (more if you set LAA mode in Windos)and 64-bit apps typically have 4+GB available for use without any paging, XBone and PS4 are both 64-bit w/ 4GB of application-addressable memory (maybe more down the road given they have 8GB). Do you really need to compromise your iteration speed and development time to make sure you don't use a few more kilobytes of memory than you expected?

Sean Middleditch – Game Systems Engineer – Join my team!


So plan to have some broad components, where the components are really just interfacing between systems.

This seems like good advice to me - you should make your system flexible so that you can change it later on without rewriting everything. No matter how much you plan ahead, no matter how much advice you get here, and no matter how much you try to optimize ahead of time, you will always find something that you want to change later. If you make flexibility a priority from the start, you will be much better off than if you try to optimize for cache access now...

The other thing that I would like to stress is that you already have a sample implementation. Why don't you use it for a while and see what you like and don't like about it? Do some profiling and see if your cache assumptions are paying off. Try having one task per component system, and then try it out with three or four per system. You will learn more about the way the system works, you will learn about how to reason about your design, and you will know what to change going forward (for this project or the next one). Doing it yourself is going to be the better learning experience, so don't be afraid to take a shot!

Hi,

thanks for your reply. We're doing this as a learning phase at the moment but the longterm goal is a bigger game.

As you wrote the gameplay stuff is planed to be done hacky.. We want to include some scripting language where the performance won't be that important.

I've read a lot about this caching stuff and thats not the only thing. We plan to introduce multithreading to the engine where I think this sets of systems could be very nice to use.

What did you mean by

So plan to have some broad components, where the components are really just interfacing between systems

That means that if you plan to use some third party libraries you'll be interfacing between the libraries, and your components can do this. If you plan to make everything from scratch, you probably shouldn't shove all implementation into the guidelines of "ECS" or "Component Base Architecture" simply because the assumptions these methodologies make are just not going to be true in the general purpose. There is no general purpose. A physics engine will have physics problems. Networking code will have networking problems. Multi-threaded code will have threading problems. All of these problems will require specification solutions, and the best code will probably not arise from trying to make it all components.

I think no matter what advice anyone gives you, you won't really be able to absorb it in its fullest without implementing your engine to find where your understanding can be enhanced. I'm referring to the type of "high level perspective", or the bigger picture, where you can really see the far reaching implications of different decisions. Perhaps this sort of learning experience is a more personal one based on specific implementation experience. I can imagine if you don't have anyone in real life that you work with to glean this sort of knowledge and experience from, you'll have to go build it yourself.

So even though you explicitly state you want your code base to grow and perform long term, I think you really should just go with what I originally said: just start implementing things and start small. Note what works and what doesn't, and fix things that prevent the game from being finished. A good API/Engine/Library is one that has a lot of users. Your engine won't be the best it can be until it's "battle tested", where you yourselves are the users that provide feedback, and the feedback is used to reach the next iteration. Perhaps reading about the history of the language Lua can provide some perspective on this topic. I recommend reading it.

Hey,

Thanks for your replies! I've thought a bit on your answers.

@Sean: I think a half half approach could be fine maybe working with some reserve mechanism, that uses a fixed length but haveing some more space is not bad.

The 64 is just so i got an array your right ;)

And yeah we are developing for PC only. Multiplattform would be to much work at the moment.

@Randy: We are trying this do it and see if it works approach a lot but there is such a big amount of stuff we have to think of that's why I asked if somebody has some tips.

But you are right as longer we work with it, there more it comes out where we have to go.

I came up with a new question as my friend is implementing the graphical part and makes use of textures meshes and other resources. So at the moment I have to think about resource management and can't figure out if I should do a "resource manager" or if the resources are managed by the ECS.

To show what i mean:

a resource manager should load, cache, give resources and so on. Lets see it for textures for example: The textures could be stored linearly withing our ECS systems or there is a manager that stores it in its structures and the ECS holds a pointer on it. That sounds somewhat against this stuff I've read. But maybe I'm again to much fixed on this linear memory problem?


… So at the moment I have to think about resource management and can't figure out if I should do a "resource manager" or if the resources are managed by the ECS.

To show what i mean:
a resource manager should load, cache, give resources and so on. Lets see it for textures for example: The textures could be stored linearly withing our ECS systems or there is a manager that stores it in its structures and the ECS holds a pointer on it. That sounds somewhat against this stuff I've read. But maybe I'm again to much fixed on this linear memory problem?

Resource management is responsible for the lifetime of resources, while ECS (or the involved sub-systems, to be precise) is a client of resources. There can be several entities that use the same resource by their respective components. It must not be the job of a component to know that another component is using the same texture, even if the behavior is outsourced into sub-systems. So these are two totally different concerns, and hence should be handled by different systems.

If you have many small objects, then storing them linearly will give you an advantage if you are able to process them to some extent in order of storage. Textures are definitely not small in footprint, and you never process them in storage order. Furthermore, most textures reside in VRAM after they have been uploaded. All you need to store in CPU RAM is a reference and metadata about the texture.

This topic is closed to new replies.

Advertisement