Some problems with ECS design

Started by
36 comments, last by Oberon_Command 7 years, 8 months ago

Hi! Now I am developing my first Entity Component System.

My ECS must be:

1)fast (no vtables and random memory access)

2)cache-friendly

Now I have this design:

1. Entity - pointer to the World + unique id

2. Component - pure struct without logic

3. System - storage and processing logic of components with type T. All components stored in a continuous memory area (aka array or vector)

4. 1 Component = 1 System

5. Intersystem interaction occurs via messaging

6. If I need share some components I need a new component + system. Example:


struct Position
{
  Position(float x = 0, float y = 0): x(x), y(y) {}
  float x, y;
};


struct Rotation
{
  Rotation(float angle = 0): angle(angle) {}
  float angle;
};


struct Size
{
  Size(float x = 1.0f, float y = 1.0f): x(x), y(y) {}
  float x, y;
};


struct Transform
{
  Transform(Position* position,
            Rotation* rotation,
            Size* size):
            position(position),
            rotation(rotation),
            size(size) {}
  Position* position;
  Rotation* rotation;
  Size* size;
};

Problems:

1. How can I ensure the safety of pointers? std::vector can break my pointers while resizing.

(P.S. I don't want have array with a static size)

2. Is it okay to use std::unordered_map very often? (insert/remove/find. My key is unique uint32_t number)

Advertisement

1. One way to do it is to not store pointers to your components, but IDs and having a second vector that maps IDs to pointers (or indices into your "actual" vector). This way you can shuffle objects around in the component storage vector without caring about the outside world, as long as you update the mapping vector.

This kind of design does not only protect you from pointers becoming invalid on resize, but also allows you to reorder objects in your vector to avoid empty spots, which is good for locality and therefore one step further towards a cache friendly architecture.

By using smaller indices than 32bit (do you really need more than 64k components of one kind?) you may be able to store your mapping in only 128k bytes and store 32 of them in a single cache line.

2. Using an unordered_map is often faster than a "normal" std::map, but it still uses dynamic allocations, since it is probably implemented using a list for each bucket. It touches less memory than a map, but it's still far from free. If you really want fast accesses and can find a way to do it, then limit the range of your keys and use a vector or array.

One way to do it is to not store pointers to your components, but IDs and having a second vector that maps IDs to pointers (or indices into your "actual" vector). This way you can shuffle objects around in the component storage vector without caring about the outside world, as long as you update the mapping vector.

I think this may negatively affect on beauty of components:


struct Transform
{
Transform(Handle<Position> position,
            Handle<Rotation> rotation,
            Handle<Size> size):
            position(position),
            rotation(rotation),
            size(size) {}
  Handle<Position> position;
  Handle<Rotation> rotation;
  Handle<Size> size;
};

And mapping stage seems a little overhead. In my design the system is not very often use Entity (it only used when i must remove component/destroy entity/attach another component). Most often, the system updates the components that are in its possession: iterates by it's vector<T>.

Are there more ways to fix bug?

Using an unordered_map is often faster than a "normal" std::map, but it still uses dynamic allocations, since it is probably implemented using a list for each bucket. It touches less memory than a map, but it's still far from free. If you really want fast accesses and can find a way to do it, then limit the range of your keys and use a vector or array.

But what if unordered_map used only if i need attach/dettach some component? It will not used very often...

Cache-friendly HashMap exist?

Cashe friendly hash map : google::dense_hash_map

You can either use indices in to the vectors, or you can use direct pointers if you can gaurentee the component vectors stay below some capacity that you set at startup.

If your using vectors to store you comps, and are using the indice method, you have to make sure your comps are still valid when copied.

A dense hash map with comp type id to comp index per entity would perform just fine.

@[member='EarthBanana'], how your implementation of ECS works? If it is not a secret :)

google::dense_hash_map

Looks nice. Thanks!

It looks like these are all premature attempts at optimizations. The things you are discussing are not problems in the real world.

1)fast (no vtables and random memory access)


If you need virtual dispatch then vtables are currently the fastest available way to implement that. You're going to need some way to call the function.

On x86 processors since about 1995 virtual dispatch has approximately zero cost, the very first time they are accessed the CPU caches them, and assuming you touch them occasionally they'll stay on the CPU. Since you should be touching them around 60+ times per second, they CPU will happily keep them around for zero cost to you.

In other words, you say you don't want vtables as you think they are not fast; but vtables are the fastest known solution to the task. Use them.



As for random memory access, unless you can somehow organize your world and scene graphs so data traversal of components is linear, you'll need to live with some of that. Be smart about it so the jumping around lives in L2 cache.

Random memory access can be amazingly fast, or it can be tortuously slow. The only way to know is to run it on a computer, profile it with cache analysis tools, and determine how your real-world memory patterns are working.

2)cache-friendly



While you have clearly minimized size in your data (good), you have decreased cache friendliness by making then non-contiguous. You have a continuous array of structures. The cache is generally most happy with a structure of arrays; parallel instructions are best when operating on a batch of elements at once rather than operating on a single item alone.

There is far more to cache friendliness than size of the data. The only way to know for certain how your program interacts with the cache is to run it with cache analysis tools to determine how your real-world memory patterns are working.

1 Component = 1 System

I need a new component + system



You keep using the word "system" in a way I'm not familiar with. A system is any group of connected things. Any time you take any series of actions with any object you have created a system. The interfaces you create define how the system is used.

Did you create a process or class or structure and give it a name "system"?

Intersystem interaction occurs via messaging



This can work reliably, but the thing most developers think of with messaging tends to add performance overhead, not remove it.

If you need to make a function call on an object then do so, that is the fastest way. Going through a communications messaging service adds a lot of work.

There are many excellent reasons to use messaging services: allow extension by adding message listeners, resolving threading issues and non-reentrant code, and processing load balancing are a few of them. Faster execution time is not one of those reasons.

1. How can I ensure the safety of pointers? std::vector can break my pointers while resizing. (P.S. I don't want have array with a static size)

If you use any type of dynamic array and you add or remove items, you cannot avoid it. If you want the addresses to remain constant you cannot use a dynamic array. I suggest learning more about fundamental data structures.

Some other options are to use a different structure (perhaps a linked list style) or to store a reference to an object (such as a container of pointers). This will break your continuous access data pattern, but give you stable addresses. You need to decide which is more important.

Alternatively you can design your system to not store addresses of items, to work on clusters of items at once, and to not hold references to data they don't own.

If you really are concerned about performance you need to start up a profiler and look at the actual performance. You need to measure and compare with what you expect, you need to find the actual performance concerns. Performance is not something you can observe by just looking at the code alone. You need to actually see how it is moving through the computer. Generally the best performing code looks complex and is larger than you first expect; the solutions that are simple and small tend to perform poorly as data grows because they don't fit the CPU's best features.

The things you are mentioning are tiny performance gains by themselves, and if you do have any performance concerns on your project they're not coming from these choices mentioned in the thread.

You keep using the word "system" in a way I'm not familiar with. A system is any group of connected things. Any time you take any series of actions with any object you have created a system. The interfaces you create define how the system is used. Did you create a process or class or structure and give it a name "system"?

In classic ECS System - is a logic, Component - is a data.


struct Drawable
{
  Drawable(Transform* transform, Texture texture):
          transform(transform), texture(texture) {}
  Texture texture;
  Transform* transform;
};


struct Renderer: public System<Drawable>
{
  void update(float dt)
  {
    for_each
    (
      [](Drawable& d)
      {
        sprite.set_texture(d.texture);
        sprite.set_transform(*d.transform);
        sprite.draw();
      }
    );
  }
  /*...*/
};

In my implementation all systems has ONE type of components. So, in System<Drawable> its components can be stored like vector<Drawable>.

All systems work only with those components for which they are responsible. It is very good, because:

1. Logic rigidly separated from the data

2. All components stored in a continuous memory area

3. All components are updated consistently (with minimum cache-misses)

4. Usually processing logic can be parallelised (by thread pool or something else)

5. Almost always, the components are std::is_standard_layout, so it can be easily serialized in binary form (if inside structs no pointers)

Alternatively you can design your system

I am thinking about custom memory pool... Maybe, it can help me alot

In other words, you say you don't want vtables as you think they are not fast; but vtables are the fastest known solution to the task. Use them.

vtables used only inside my System<T> (virtual void update(float) = 0;)

Since components are pure data there is no needs to look at vtable when I want to update component state. It is little overhead, because it can be done without vtables

My implementation of "ECS" is not really any one thing i can just describe - its pretty much an amalgamation of code ive written making past games.
It happens to use indices in to component arrays - but as frob mentioned these are arrays of structs not structs of arrays - i dont know whether ot makes a difference or not i have never had any performance issues there.

Originally i had just a bunch of comps newed in to each ent - i went through a huge ordeal switching to the comp in arrays thing - honestly kind of a waste of time. I never had enough objs in any of my games to make a noticeable difference. I was swayed by the public opinion in to doing something that i didnt profile, see if it was really necessary, and as mentioned above ended up being pointless.

The only issue I had performance wise came when I was making a hex tile game - I needed to draw about 100000 tiles at once - broken between a few different tile types. I ended up adding an array to my transform comp that had matrix for each tile of that type - and used a mapped opengl buffer and a single instanced draw call per tile. This worked nicely but had has its limitations - each tile instance must have the same mesh and material and shader properties - but this was exactly what i needed anyways.

How does that fit in to ECS? Should i have made a instance component or something? I dont know but it worked for that game and it works for some other similar things i have done.

While I'm a huge fan of ECS systems (through reading, not through use), I'm with frob in thinking that you might be jumping the gun.

You're using ECS because you think it might be "better" in areas you don't need, because it sounds cool and because everyone is talking about it and it's new and trendy (to most people anyway - I've been reading about it since 2006, and it's been around (in various forms) since at least 1996).

My advice would be to forget about writing fast code, optimize for code that is easy to read; optimize for code that is easy to develop with; optimize for actually finishing a project.

Don't use ECS because you think it'll make your code faster. Use ECS only if you think it'll make your code cleaner, your development quicker, your life easier.

And by "development easier" I don't mean some complex file-driven assembling of entities in an infinite combination of pointless possibilities that you'll rarely actually need.

And if you are making an 'engine' - don't. Make a game instead (from scratch or otherwise). Unless you want an engine instead of a game, which is also a fun project, but one that doesn't lead to actual games getting made. What I mean is, if you are making an engine with the illusion that you'll use it to make a game, then you are shooting yourself in the foot. But if you make a game by necessity you'll also make a engine - albeit an adhoc one; and you can later refine and reuse that adhoc engine.

If you insist on ECS, then I have to ask these questions:

1. Entity - pointer to the World + unique id

Why does it have a pointer to the world? Why does the Entity even exist?

Entities don't own the world (as much as they might wish :wink:), the World owns the entities.

Further, Entities shouldn't own their ID, IDs are used to locate the Entity (or rather, the components belonging to that entity).

2. Component - pure struct without logic

Good.

3. System - storage and processing logic of components with type T. All components stored in a continuous memory area (aka array or vector)

Why are all components stored in an array or vector?

I hope you have double-indirection of your IDs, or you'll be wasting alot of performance and memory trying to filter which entities you need to process.

4. 1 Component = 1 System

Why? What if a System logically makes sense managing two or three related component types?

Why does the System own the components?

Some Systems should own some components, hidden behind-the-scenes, other sets of components should be more public and shared by many systems.

5. Intersystem interaction occurs via messaging

Function calls are "messaging". What kind of messaging are you talking about?

If SystemX needs to talk to SystemY, why doesn't SystemX just have a reference to SystemY - why over-complicate it?

6. If I need share some components I need a new component + system.

Why? If you need to share some components, why not just hand the components to all the systems that need it?

This topic is closed to new replies.

Advertisement