# Making a Data-Oriented ECS More Extensible

## Recommended Posts

Hello everyone.

I've been working on my game for a while now, and I am beginning to dislike my means of adding new component tables and such to my game. My components at the moment are hard-coded classes formatted as structures of arrays; you can basically think of it as a relational database. Each variable of the component has its own dedicated vector in the table, meaning I can loop through them all in a cache-friendly manner.

When it comes to the function in-game, it works nicely. Development, on the other hand, is a little annoying. As I am trying to remain data-oriented, I am avoiding inheritance as much as possible. So each component table is an entirely standalone class. However, the underlying idea of the component table is the same; some tables do have some other tricks (which I will cover later), but the general idea is similar. So I essentially have to copy a lot of my code, add it to my giant World class that has all the component tables, make sure its garbage collection gets called, etc. I feel like this should be made easier. Especially if I wanted to support modding in the future.

I do have an idea as to how to possibly improve things, but I figured I should bounce it off people who might have more experience in this field. Essentially my plan is to replace all these component table classes I've made with a single class that stores a contiguous array of bytes with some metadata. So if I have a component table that was modeled something like this:

//...

std::vector<Entity> entities;
std::vector<vec3> positions;
std::vector<quat> rotations;
std::vector<vec3> scales;

//...

I would instead model it something like this:

//...

std::vector<uint8_t> data;
size_t entitiesIndex;
size_t positionsIndex;
size_t rotationsIndex;

//...

Essentially, I would flatten my structure to a single array, with indexes into that array indicating where each variable can be accessed. Why would this confer any sort of advantage? Because I could then theoretically create a file format that creates these tables based on my specifications. All data fundamentally can be represented as a contiguous array of bytes; all I need to know is the size of the datatype and how many are present in order to get those offsets to point to the correct memory. It would not need any sort of inheritance based structure; its an array of bytes, and an array of offsets. I can just create them and store them in an array in the World class. If I want to get a specific table, I can just index into that array. If I want to use the data in that table, I can just do a memcpy or a reinterpret_cast and process as desired.

I've been mulling over this idea for a little while, but I still have some reservations. This has to do with the tricks I mentioned earlier. I don't use enums or bools in the component tables. I use the indices of the components to indicate what their condition is (if the index is less than x, the condition is false, for example). This allows contiguous processing with minimal branching. I would need to make the format aware of this sort of trick. Theoretically, not that hard; I just need to indicate a number subdivisions for the component. However, if I were to create an entity in this fashion, I would need to be aware of runtime information in order to put it in the right place. I suppose I could have some intermediary tables that I put those sorts of components into and then handle it as a follow-up step, but that just seems sloppy. The way I've handled that runtime problem in the past was to have a series of factory classes for each component type that would specifically be passed in that information. I am also hoping to get rid of these; if I am thinking of completely redoing my ECS, this would be the time to lose them.

I've yet to think much on how systems would be implemented better; Right now they are implemented as classes that have references to the component tables they are interested in, and a single method. I suppose I could just pass the World a series of function pointers and have them be executed in some given order. But seeing as I am going for something data-oriented, function pointers seem a bit off the path. But maybe I'm being silly.

Has anybody here implemented an ECS in a similar fashion? Is it worth my time to try to rewrite my entire ECS structure in this way? Is what I want even feasible or wise? How would you tackle this problem? I hope I expressed myself clearly enough; the thought of this has wracked my brain for a bit, so I might be a little caught in my own world here.

Thanks for the help.

##### Share on other sites
5 hours ago, DrDeath3191 said:

As I am trying to remain data-oriented, I am avoiding inheritance as much as possible.

Inheritance also could achieve the so-called DOD, while it's more flexible to choose a composition approach. e.g.:

struct A
{
int foo;
};

struct B
{
A a;
int bar;
};

struct C : public A
{
int bar;
};

5 hours ago, DrDeath3191 said:

So I essentially have to copy a lot of my code, add it to my giant World class that has all the component tables, make sure its garbage collection gets called, etc.

Would you consider to use some sort of the Singleton Pattern for the aggregation of the component tables?

5 hours ago, DrDeath3191 said:

Essentially my plan is to replace all these component table classes I've made with a single class that stores a contiguous array of bytes with some metadata﻿.

If you could implement any kind of mechanisms that ensures different std::vector<T> could be allocated adjacent to each other in heap memory, then they would achieve the same result. Don't build the wheel by yourself if the language and the standard library has already built it for you.

5 hours ago, DrDeath3191 said:

All data fundamentally can be represented as a contiguous array of bytes; all I need to know is the size of the datatype and how many are present in order to﻿ get those offs﻿﻿﻿﻿﻿﻿﻿﻿ets to point to the correct memory. It would not need any sort of inheritance based structure; its an array of bytes, and an array of offs﻿ets﻿﻿.﻿

What you are talking about is almost a custom heap memory management module. Take a look at how to implement your own std::allocator if you want to keep sticking with STL.

5 hours ago, DrDeath3191 said:

I use the indices of the components to indicate what their condition is (if the index is less than x, the condition is false, for example). This allows contiguous processing with minimal branching.

How do you know this approach actually minimizes branching? Is branching hurt your performance? Is there any practical profiling results that support your design? Are you sure this kind of design approach suitable enough for your target CPU architecture?

5 hours ago, DrDeath3191 said:

However, if I were to create an entity in this fashion, I would need to be aware of runtime information in order to put it in the right place. I suppose I could have some intermediary tables that I put those sorts of components into and then handle it as a follow-up step, but that just seems sloppy. The way I've handled that runtime problem in the past was to have a series of factory classes for each component type that would specifically be passed in that information. I am also hoping to get rid of these; if I am thinking of completely redoing my ECS, this would be the time to lose them.

There is no silver bullet for a one-in-one-out factory, at least hard to design. My personal idea about the object instantiation is tended to keep things simple, that I would parse the object creation information to the final stage when they could be put into the component's data directly, until then I won't care about the immediate data because they are temporary garbage, I'd use some RAII or other things to get rid of the footprint. Again, it's a situational-oriented solution for me, I don't always SoA so much if they are non-sense for certain business, just let the L2-cache hit-rate misses like hell until it really hurts the performance significantly.

5 hours ago, DrDeath3191 said:

But seeing as I am going for something data-oriented, function pointers seem a bit off the path﻿.

Don't be a fundamentalist of DOD, function pointers, lambda, callable objects and whatever, they are also a kind of data, the code is data, the procedure is data, everything is data when they are consumed by a certain module. If you're targeting C++11 or later, why not take a look at std::function and std::packaged_task? Functional Programming is a naturally good friend of DOD!

Finally, ECS is just ECS, it's not a worthwhile solution if it is not worthwhile for a certain problem.

##### Share on other sites
5 hours ago, zhangdoa said:

11 hours ago, DrDeath3191 said:

So I essentially have to copy a lot of my code, add it to my giant World class that has all the component tables, make sure its garbage collection gets called, etc.

Would you consider to use some sort of the Singleton Pattern for the aggregation of the component tables?

He should just write and use a few templates, if he always needs to copy and paste code, which only changes cause of different component types. Singletons woud only add more useless boilerplate code.

Edited by wintertime

##### Share on other sites
5 hours ago, zhangdoa said:

Inheritance also could achieve the so-called DOD, while it's more flexible to choose a composition approach. e.g.:

Ehh, I'm not too sure about that. I also have some other methods to the components like accessors, registry and removal. These may need to be slightly different based on the specific needs of the data, meaning I have virtual function pointers to worry about which are not so great performance wise, and definitely to be avoided in data-oriented stuff.

5 hours ago, zhangdoa said:

Would you consider to use some sort of the Singleton Pattern for the aggregation of the component tables?

No. There's no need for this to be a singleton.

5 hours ago, zhangdoa said:

If you could implement any kind of mechanisms that ensures different std::vector<T> could be allocated adjacent to each other in heap memory, then they would achieve the same result. Don't build the wheel by yourself if the language and the standard library has already built it for you.

Not sure I follow you here. What mechanisms are you talking about?

5 hours ago, zhangdoa said:

What you are talking about is almost a custom heap memory management module. Take a look at how to implement your own std::allocator if you want to keep sticking with STL.

Pretty much, I guess. I can look into it, although I could easily replace the functionality with pointers instead of vectors. Would probably increase my debug build's performance, too.

5 hours ago, zhangdoa said:

How do you know this approach actually minimizes branching? Is branching hurt your performance? Is there any practical profiling results that support your design? Are you sure this kind of design approach suitable enough for your target CPU architecture?

By its very design. Let's say for example I have an attack originating from Team 1 that can hit only teams 2, 3, and 4. I could store an enumeration in the hitbox table, loop through all of them and ask "what team are you?". This would result in a bunch of branches by necessity. However, if the table is sorted in such a way that teams are contiguously aligned, I don't actually need to ask; I know by the index that this is a valid target to check. No branches, and cache friendly to boot.

6 hours ago, zhangdoa said:

There is no silver bullet for a one-in-one-out factory, at least hard to design. My personal idea about the object instantiation is tended to keep things simple, that I would parse the object creation information to the final stage when they could be put into the component's data directly, until then I won't care about the immediate data because they are temporary garbage, I'd use some RAII or other things to get rid of the footprint. Again, it's a situational-oriented solution for me, I don't always SoA so much if they are non-sense for certain business, just let the L2-cache hit-rate misses like hell until it really hurts the performance significantly.

I suppose putting it off to a follow-up step is how I would have to do it, if I pursue this idea further. Ah well, I suppose it isn't too terrible.

6 hours ago, zhangdoa said:

Don't be a fundamentalist of DOD, function pointers, lambda, callable objects and whatever, they are also a kind of data, the code is data, the procedure is data, everything is data when they are consumed by a certain module. If you're targeting C++11 or later, why not take a look at std::function and std::packaged_task? Functional Programming is a naturally good friend of DOD!

Finally, ECS is just ECS, it's not a worthwhile solution if it is not worthwhile for a certain problem.

Well the idea behind DOD is to maximize the efficiency of data traversal. Just because stuff is 'data' doesn't make it DOD. However, I think you may have a point in regards to the systems; a function pointer is not that much more expensive that a direct function call, and it would be rather convenient to add systems that way.

If you might be wondering where I'm getting some of my ideas, I modeled my ECS after this article. I of course stopped following before doing the binary blob, but am currently rethinking that decision.

Thanks very much for your input!

##### Share on other sites
1 hour ago, DrDeath3191 said:

Ehh, I'm not too sure about that. I also have some other methods to the components like accessors, registry and removal. These may need to be slightly different based on the specific needs of the data, meaning I have virtual function pointers to worry about which are not so great performance wise, and definitely to be avoided in data-oriented stuff.

As @wintertime has already mentioned, template is a good idea to solve the polymorphism problem in compile-time. Basically, if you could combine composition over inheritance, function overloading and template specialization in an elegant way then you would get a similar or even better performance than vptr and vtable old school games, DOD is not always against OOD 😀.

1 hour ago, DrDeath3191 said:

Not sure I follow you here. What mechanisms are you talking about?

1 hour ago, DrDeath3191 said:

Pretty much, I guess. I can look into it, although I could easily replace the functionality with pointers instead of vectors. Would probably increase my debug build's performance, too.

If you want to use std::vector efficiently, then manage heap memory allocation "by yourself". It's not about how to implement some complex scary stuff, just design some allocation tracker or wrapper classes at least (the default std::allocator implementation of MSVC 19.xx on my computer is just a wrapper around the global new()), and use some factory to instantiate your component vectors, then you could fully ensure that they would be put into coherent heap memory ranges.

Another choice is using a pre-allocated std::array if your component's maximum count is finite (but it's already doesn't matter whether it's an std::array or std::vector). Or you could implement an object pool with pre-allocated raw heap memory. Again, all options are possible, just you need to find the best one for your situation. Main memory is typically (almost actually) a DRAM, while cache memories inside the CPU chip are typically (almost actually) SRAM, just be aware that whether how you represent the array structure in your C++ code, the physics of the hardware would always have the same behavior, all the overburdens come from the abstraction.

1 hour ago, DrDeath3191 said:

However, if the table is sorted in such a way that teams are contiguously aligned, I don't actually need to ask; I know by the index that this is a valid target to check. No branches, and cache friendly to boot.

Who paid the bill of your sort operation 😀? Any std::sort stuff? Did you have any overloaded operator==() or operator<()? Do you sort without any comparison? If so, that's awesome! But if not, then you still have the cost of branching at a certain moment when comparison happens 😀.

I'm fully agreed with DOD as far as Mike Acton broadcasted the idea louder in his cppcon talk, also I've read the Bitsquid's blog post long time ago while I was also ECS-ing at that moment (His post is awesome btw!). Just my experiences told me that nothing in reality is really as simple and elegant as the example codes they are. When you move your head to the products, you always need to complex things up or compromise things down. One major "con" of DOD, or of ECS, is it quite rely on a well designed Software Model (Or a brain blew up programmer 🤯), you must model the Domain Model into computer-friendly rather than programmer-friendly tasks, then you have to profile often in order to make sure that your design is really cache-friendly. If you have a further interest you could read chapter 6 of Computer Systems: A Programmer's Perspective, it discussed the memory and cache related topics thoroughly 🍺.

##### Share on other sites
56 minutes ago, zhangdoa said:

Who paid the bill of your sort operation 😀? Any std::sort stuff? Did you have any overloaded operator==() or operator<()? Do you sort without any comparison? If so, that's awesome! But if not, then you still have the cost of branching at a certain moment when comparison happens 😀.

Not at all! Enums can essentially just be thought of as integers, so I keep track of the indices where each value of the enum's segment ends in an array. Cast the enum to an index into that array, insert the entity's data into that index of the table, and then increment all later indices by 1. Easy peasy, constant time, no comparisons. I do need to take care that order is maintained when removing entities or they change their enum value, but that just means rotating the vector (std::rotate for people who may be confused) and patching up the index values.

1 hour ago, zhangdoa said:

As @wintertime has already mentioned, template is a good idea to solve the polymorphism problem in compile-time. Basically, if you could combine composition over inheritance, function overloading and template specialization in an elegant way then you would get a similar or even better performance than vptr and vtable old school games, DOD is not always against OOD 😀.

Yeah, its the elegant part that tends to be an issue. If I'm reading the two of you correctly (and I'm very likely not), are you perhaps suggesting using the curiously recurring template pattern to create tables with variadic templates, one template per component type? And providing accessors, registers and removers in similar fashion? I just want to be sure, as I'm a little rusty with template magic.

1 hour ago, zhangdoa said:

If you want to use std::vector efficiently, then manage heap memory allocation "by yourself". It's not about how to implement some complex scary stuff, just design some allocation tracker or wrapper classes at least (the default std::allocator implementation of MSVC 19.xx on my computer is just a wrapper around the global new()), and use some factory to instantiate your component vectors, then you could fully ensure that they would be put into coherent heap memory ranges.

Another choice is using a pre-allocated std::array if your component's maximum count is finite (but it's already doesn't matter whether it's an std::array or std::vector). Or you could implement an object pool with pre-allocated raw heap memory. Again, all options are possible, just you need to find the best one for your situation. Main memory is typically (almost actually) a DRAM, while cache memories inside the CPU chip are typically (almost actually) SRAM, just be aware that whether how you represent the array structure in your C++ code, the physics of the hardware would always have the same behavior, all the overburdens come from the abstraction.

Okay, but it's really not the actual form of the data I'm concerned with right now. The reason I was planning on changing the structure was to facilitate creating new tables, possibly without having to write any code. I don't care if the data is a series of separate vectors, a binary blob, a fixed-size array, etc. Hell, if I understood you two correctly, that could just be another templated thing as well. I just want to make adding new component tables easier, since the underlying functionality is mostly similar, without having to lose too many of the advantages of Data Oriented design.

2 hours ago, zhangdoa said:

Just my experiences told me that nothing in reality is really as simple and elegant as the example codes they are. When you move your head to the products, you always need to complex things up or compromise things down.

You're telling me. Thankfully I can rely on people who have gone before me to provide some useful direction. Thanks again, by the way.

##### Share on other sites
On 9/12/2019 at 1:56 AM, DrDeath3191 said:

Because I could then theoretically create a file format that creates these tables based on my specifications. All data fundamentally can be represented as a contiguous array of bytes; all I need to know is the size of the datatype and how many are present in order to get those offsets to point to the correct memory. It would not need any sort of inheritance based structure; its an array of bytes, and an array of offsets.

• You can "create tables" with well-designed templates (a "table of T" with assorted extra template parameters to represent metadata, behaviour variations, etc.) that are going to grow more sophisticated, and be upgraded for all components, as you develop your game.
• In case of allergy, you can "create tables" with all sorts of code generation tools instead of C++ templates.
• In any case, a std::vector<T> without any indexing and management is a very rudimentary data structure for the purpose of holding and accessing components, and handling multiple arrays "by hand", throwing away normal C++ facilities like allocators and names, is a complication for no benefit and not a simplification.

## Create an account

Register a new account

• ### Game Developer Survey

We are looking for qualified game developers to participate in a 10-minute online survey. Qualified participants will be offered a \$15 incentive for your time and insights. Click here to start!

• 12
• 14
• 10
• 33
• 23