# Creating Entities from XML in a Data-Oriented ECS

## Recommended Posts

Hello,

I come mostly from an object-oriented world, but I did some reading on data-oriented design and decided to take a stab at it in an ECS using C++. My entities are just unsigned ints (with some bit magic on the last 8 bits to quickly determine if the entity is in use), my "components" are actually component managers; simple structures of arrays, plus a map to link an entity to an index in the arrays (essentially, think of it as a table in a database). Systems operate on these managers, iterating through their arrays linearly. You can take a look at this link, and the following posts to see on what I am basing this (except for doing it in blobs) : http://bitsquid.blogspot.com/2014/08/building-data-oriented-entity-system.html
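For readers who want something concrete, the layout described above could look roughly like the sketch below. It is only an illustration: `PositionManager`, its fields, and the `Entity` alias are made-up names, and the "in use" bit trick on the entity ID is not modeled at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Entity = std::uint32_t;  // hypothetical alias; the "in use" bits are not modeled

// Hypothetical component manager: one array per field plus an entity -> row map.
class PositionManager {
public:
    void add(Entity e, float x, float y) {
        assert(rows_.find(e) == rows_.end());  // one position per entity
        rows_[e] = owner_.size();
        owner_.push_back(e);
        xs_.push_back(x);
        ys_.push_back(y);
    }

    // Swap-and-pop keeps the arrays dense so a system can walk them linearly.
    void remove(Entity e) {
        auto it = rows_.find(e);
        if (it == rows_.end()) return;
        const std::size_t row = it->second;
        const std::size_t last = owner_.size() - 1;
        owner_[row] = owner_[last];
        xs_[row] = xs_[last];
        ys_[row] = ys_[last];
        rows_[owner_[row]] = row;  // re-point the entity that was moved
        rows_.erase(e);
        owner_.pop_back();
        xs_.pop_back();
        ys_.pop_back();
    }

    // A "system" pass: touch every row in order.
    void translateAll(float dx, float dy) {
        for (std::size_t i = 0; i < xs_.size(); ++i) {
            xs_[i] += dx;
            ys_[i] += dy;
        }
    }

    bool has(Entity e) const { return rows_.count(e) != 0; }
    float x(Entity e) const { return xs_[rows_.at(e)]; }
    std::size_t size() const { return owner_.size(); }

private:
    std::vector<Entity> owner_;
    std::vector<float> xs_, ys_;
    std::unordered_map<Entity, std::size_t> rows_;
};
```

The map is only consulted for per-entity access; the bulk pass never touches it.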

I created a very simple demo for this, and it appears to work fine. However, I am attempting to suss out how I would set this system up to read an entity specification file in XML/JSON/whatever. If I were going in the more object-oriented fashion of ECS this would be a bit simpler; I could create a bunch of factories that take as a parameter the current entry in the file and return a new instance of a component object. However, I am attempting to approach this in a more data-oriented fashion. I need to put values into tables, not return a separate instance of a class. As such, I would need a reference to the table in question.

My first thought is to create a class that acts as a sort of "central database"; it essentially just holds an instance of every component manager, and my entity manager and hands out references to whoever needs it. The class that reads the file could hold a reference to this object, access whatever table it needs and add to it whatever values are specified. This strikes me as not a great idea; there could be quite a few components to manage at the end of this, and that class would become quite large. There's also the issue of having to go into that class every time I need to add or remove components. This may be my rather amateurish opinion, but that doesn't strike me as being particularly extensible over time.

So, what are your thoughts? Do you have a better idea, or should I just buck up and make this central database class? Or am I a complete fool for pursuing this design pattern in the first place?

##### Share on other sites

I'd try it like this: Each system stores its components privately/internally in whatever way that system sees fit. I wouldn't hand out the internal table. Instead, you could query the system for a specific component by ID.

Something like this:

class MeowSystem
{
public:
    struct Component
    {
        std::string purringNoise;
    };

public:
    //Creates and returns a component for 'entityID'.
    Component &CreateComponent(EntityID entityID);

    //If 'entityID' has a Meow component, deletes it. If no such component exists, safely does nothing.
    void DeleteComponent(EntityID entityID);

    //If 'entityID' has a Meow component, returns a pointer to it. If no such component exists, returns null.
    Component *GetComponent(EntityID entityID);

    //Processes all the components in a data-oriented (i.e. consumption-optimized) way.
    void Purocess(...parameters...);

private:
    std::vector<Component> components;

    //Decoupling array indices from entity IDs gives each individual system greater flexibility for optimizations.
    //We optimize for processing the components, rather than accessing the components by ID.
    std::unordered_map<EntityID, ArrayIndex> indexMap;
};

typedef MeowSystem::Component MeowComponent;

class Systems
{
public:
    MeowSystem meowSystem;
    PhysicsSystem physicsSystem;
    ScriptSystem scriptSystem;

    EntityIdPool idPool; //GetNewID(), ReleaseID(id), IsActive(id)

public:
    EntityID CreateEntityFromWrongFileFormatForTheJob(std::string xmlFilepath)
    {
        //...blah blah make sure file exists, loads, and parses fine...
        //Actually I wouldn't even pass in a filepath, but that's a different discussion.

        EntityID newID = idPool.GetNewID();

        if(file.Has("inheritance"))
        {
            //...create the components that are being inherited, first, and apply the inherited values.
        }

        if(file.Has("meow"))
        {
            MeowComponent &meowComp = meowSystem.CreateComponent(newID); //The file format's 'meow' node could even be passed into 'CreateComponent(id, &node)'.
            meowComp.purringNoise = file.Get("meow.noise").AsString();
        }

        if(file.Has("scripting"))
        {
            ScriptComponent &scriptComp = scriptSystem.CreateComponent(newID);
            scriptComp.scriptName = file.Get("scripting.scriptName").AsString();
            scriptComp.parameters = file.Get("scripting.parameters").AsScriptParameters();
        }

        if(...)
        {
            //...etc...
        }

        return newID;
    }

};

One example of optimization made possible by hiding internal tables

Why I wouldn't pass in a filepath (actually this doesn't really explain the 'why' but it's something I agree with)

As for XML, it's not really designed for this. It's intended to be a document markup (like RTF or HTML), even if it's widely popular to twist it for other uses (even by major corporations - Microsoft <_<). It's also not really optimized for human legibility, so it's harder to spot mistakes as there is unnecessary visual clutter (even if you get used to it). But that's a nitpick. The pro is that everyone else overuses XML also, so there's plenty of tools to help you improperly use it :P.

Edited by Servant of the Lord

##### Share on other sites

I created a very simple demo for this, and it appears to work fine. However, I am attempting to suss out how I would set this system up to read an entity specification file in XML/JSON/whatever. If I were going in the more object-oriented fashion of ECS this would be a bit simpler; I could create a bunch of factories that take as a parameter the current entry in the file and return a new instance of a component object. However, I am attempting to approach this in a more data-oriented fashion. I need to put values into tables, not return a separate instance of a class. As such, I would need a reference to the table in question.

My first thought is to create a class that acts as a sort of "central database"; it essentially just holds an instance of every component manager, and my entity manager and hands out references to whoever needs it. The class that reads the file could hold a reference to this object, access whatever table it needs and add to it whatever values are specified. This strikes me as not a great idea; there could be quite a few components to manage at the end of this, and that class would become quite large. There's also the issue of having to go into that class every time I need to add or remove components. This may be my rather amateurish opinion, but that doesn't strike me as being particularly extensible over time.

I don't see this as a fundamentally different problem than it would be with "the more object-oriented fashion of ECS".

In either case, you still need to implement some kind of reflection logic to map properties in the XML/JSON/whatever to actual code. In the traditional case, these would map to properties on a struct. In your case, these would map to method calls on a particular component manager. Or am I not understanding the design in the linked blog properly?
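One way to picture that mapping is a small registry keyed by component name, where each entry forwards the parsed node to the right manager. The sketch below is an assumption-laden illustration, not the blog's design: `Node` is a stand-in for whatever the parser produces, and `HealthManager`, `LoaderRegistry`, and `registerHealthLoader` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Entity = unsigned;
// Stand-in for one parsed node of the entity file: property name -> raw text.
using Node = std::map<std::string, std::string>;

struct HealthManager {  // hypothetical component manager (structure of arrays)
    std::vector<Entity> owner;
    std::vector<int> hp;
    void add(Entity e, int value) {
        owner.push_back(e);
        hp.push_back(value);
    }
};

// Maps a component name from the file to the code that fills the right table.
class LoaderRegistry {
public:
    using Loader = std::function<void(Entity, const Node&)>;

    void registerLoader(const std::string& name, Loader fn) {
        assert(fn != nullptr);
        loaders_[name] = std::move(fn);
    }

    // Returns false when the file names a component nobody registered.
    bool load(const std::string& name, Entity e, const Node& node) const {
        auto it = loaders_.find(name);
        if (it == loaders_.end()) return false;
        it->second(e, node);
        return true;
    }

private:
    std::unordered_map<std::string, Loader> loaders_;
};

// Each manager registers its own loader, so the parser never names a manager.
void registerHealthLoader(LoaderRegistry& registry, HealthManager& health) {
    registry.registerLoader("health", [&health](Entity e, const Node& node) {
        health.add(e, std::stoi(node.at("hp")));
    });
}
```

The file reader then only needs the registry: for every node it finds, it calls `load(nodeName, entity, node)` and reports unknown names.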

##### Share on other sites

I'd try it like this: Each system stores its components privately/internally in whatever way that system sees fit. I wouldn't hand out the internal table. Instead, you could query the system for a specific component by ID.

One example of optimization made possible by hiding internal tables

Why I wouldn't pass in a filepath (actually this doesn't really explain the 'why' but it's something I agree with)

As for XML, it's not really designed for this. It's intended to be a document markup (like RTF or HTML), even if it's widely popular to twist it for other uses (even by major corporations - Microsoft <_<). It's also not really optimized for human legibility, so it's harder to spot mistakes as there is unnecessary visual clutter (even if you get used to it). But that's a nitpick. The pro is that everyone else overuses XML also, so there's plenty of tools to help you improperly use it :P.

I never thought of having systems be the direct owners of the tables. I may have to mull that one over a bit. But, for your interesting design and subtle cat puns, you appear to reach the same conclusion I do, albeit via a differing path; a massive collection of all your systems, and a parser that becomes a massive collection of ifs. If there isn't a better way than this, so be it, but I was under the impression there was something more clever to be done. Perhaps I was just overthinking it.

And of course the file format is irrelevant; most tutorials on ECS say you can use XML to instantiate entities with components (although none I have seen have gotten into how precisely, which I suppose made me think there was some clever trick I was missing). As such, it was used for the title. I will probably just use a text file in my own format.

I don't see this as a fundamentally different problem than it would be with "the more object-oriented fashion of ECS".

In either case, you still need to implement some kind of reflection logic to map properties in the XML/JSON/whatever to actual code. In the traditional case, these would map to properties on a struct. In your case, these would map to method calls on a particular component manager. Or am I not understanding the design in the linked blog properly?

I think you probably understand the blog fine. But it seems that the idea of a central class is necessary in some capacity to perform that mapping. Which I suppose is fine, I was just thinking maybe someone more clever than I managed to figure out something better.

Thank you both for your help!

##### Share on other sites

you appear to reach the same conclusion I do, albeit via a differing path; a massive collection of all your systems,

Someone somewhere has to own the systems. Unless you make them inherit from a base class for no reason (which has its own downsides), some class will have a concrete instance of the system.

and a parser that becomes a massive collection of ifs. If there isn't a better way than this, so be it, but I was under the impression there was something more clever to be done. Perhaps I was just overthinking it.

There's always a balance between too concrete and too abstract - and that balance might vary depending on the project.

You could make your properties all be 100% scriptable. It's not what I would personally code, but whatever floats your ints.

Thief used that method, IIRC (see Doug Church's slides (as archived on Chris Hecker's site)).

Personally, I'd hardcode a lot of important values, but also enable purely scriptable values (for rapid design), and if certain values get used frequently, I'd make them hardcoded.

You can break up your massive if()'s in at least two different ways:

1) You can make each system handle its own property loading by passing the property node to the CreateComponent() function.

or 2) You can design a generic serializable property system for the components, that works even with hardcoded variables.
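As a rough illustration of option 2, a property table can describe ordinary hardcoded members by name, and one generic function can then fill any component from a parsed node. This is a sketch under assumptions: `Property`, `meowProperties`, `applyProperties`, the `volume` field, and the `Node` stand-in are all invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for one parsed node of the entity file: property name -> raw text.
using Node = std::map<std::string, std::string>;

struct MeowComponent {
    std::string purringNoise;
    float volume = 1.0f;  // hypothetical extra field, for illustration only
};

// One entry per serializable field: its name in the file and how to set it.
template <typename Component>
struct Property {
    std::string name;
    void (*set)(Component&, const std::string&);
};

// Hypothetical property table; the fields themselves stay plain members.
inline const std::vector<Property<MeowComponent>>& meowProperties() {
    static const std::vector<Property<MeowComponent>> props = {
        {"noise", [](MeowComponent& c, const std::string& v) { c.purringNoise = v; }},
        {"volume", [](MeowComponent& c, const std::string& v) { c.volume = std::stof(v); }},
    };
    return props;
}

// Generic loader: copies every listed property that the node provides.
// Returns how many properties were applied; unknown node keys are ignored.
template <typename Component>
int applyProperties(Component& c,
                    const std::vector<Property<Component>>& props,
                    const Node& node) {
    int applied = 0;
    for (const auto& p : props) {
        assert(p.set != nullptr);
        auto it = node.find(p.name);
        if (it != node.end()) {
            p.set(c, it->second);
            ++applied;
        }
    }
    return applied;
}
```

The same table could drive saving as well, by adding a getter next to each setter.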

##### Share on other sites

I don't like the "system owns its components" approach. More often than not, it turns out that multiple systems end up needing to "own" the same component store (everything needs to know an entity's position, for example), and managing those dependencies becomes messy.

I prefer to separate it in two:

1. Component stores know one component type, and manage how they get stored and how a component is mapped to an entity.
2. Systems request the store they need, they don't "own" the components, they just operate on them based on the entities they're processing.

In my ECS a "World" instance knows all the entity systems, the entity manager (that hands out entity ids) and the component manager (that hands out component stores). That way in an "init" step on the World instance, all the dependencies (both component stores and other systems) can be injected in all the systems that request them.

This way a system doesn't have to do all the steps of managing the component store's memory, the entity -> component mapping and whatever processing step it needs to do (input, AI, physics, etc). Stores manage memory the way they see fit (sometimes you need a possibly wasteful but cache-friendly flat array, sometimes an unordered map works well enough), and the system just knows how to ask the stores "hey, give me the components of this entity I have here" to process them.
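Translated to C++, the split between stores and systems might look something like the sketch below. It is not the code from my engine: `ComponentStore`, `ComponentManager`, and `MoveSystem` are hypothetical names, and the store uses an unordered map purely to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

using Entity = unsigned;

struct StoreBase {
    virtual ~StoreBase() = default;
};

// A store knows exactly one component type and how it maps to entities.
template <typename T>
class ComponentStore : public StoreBase {
public:
    T& add(Entity e) { return items_[e]; }
    T* get(Entity e) {
        auto it = items_.find(e);
        return it == items_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<Entity, T> items_;  // could be a flat array instead
};

// Hypothetical component manager: hands out one shared store per type.
class ComponentManager {
public:
    template <typename T>
    ComponentStore<T>& store() {
        auto& slot = stores_[std::type_index(typeid(T))];
        if (!slot) slot = std::make_unique<ComponentStore<T>>();
        assert(slot != nullptr);
        return static_cast<ComponentStore<T>&>(*slot);
    }

private:
    std::unordered_map<std::type_index, std::unique_ptr<StoreBase>> stores_;
};

struct Spatial {
    float x = 0.0f;
};

// Systems keep references to the stores they request; they do not own them.
class MoveSystem {
public:
    explicit MoveSystem(ComponentManager& cm) : spatials_(cm.store<Spatial>()) {}
    void process(Entity e, float dx) {
        if (Spatial* s = spatials_.get(e)) s->x += dx;
    }

private:
    ComponentStore<Spatial>& spatials_;
};
```

Two systems constructed from the same `ComponentManager` see the same `Spatial` store, which is the point: shared data without shared ownership.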

I've made it so I can have stuff like:

// Beware of Xtend code
class RenderQueuingSystem extends EntitySystem {
    /* All of these get injected at runtime. */
    // This is a dependency on other entity system.
    var RenderSystem renderSystem;
    // These are dependencies on component stores.
    var ComponentHandler<Geometry> geometries;
    var ComponentHandler<Spatial> spatials;
    var ComponentHandler<Material> materials;

    override processEntities(IntBag entities) {
        for(int i = 0; i < entities.size(); ++i) {
            val entityId = entities.get(i);
            // Fetch components of the entity.
            val g = geometries.get(entityId);
            val s = spatials.get(entityId);
            val m = materials.get(entityId);
            // Do the required processing.
            // Later when RenderSystem gets processed, it will be drawn.
        }
    }
}

The actual filtering of the entities that the RenderQueuingSystem processes is done in a pre-process step, where the system gets notified each time an entity is added/removed, so it can build a list of the entities that meet the required component criteria. That's why there is no "if it really has" check when fetching a component from the store.
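A minimal C++ sketch of that notification-driven filtering follows, assuming each component type is assigned one bit in a mask. The bit layout, `FilteredSystem`, and the callback names are assumptions made for the example, not the framework's actual API.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <vector>

using Entity = unsigned;
using Mask = std::bitset<32>;  // one bit per component type (hypothetical layout)

enum ComponentBit { GeometryBit = 0, SpatialBit = 1, MaterialBit = 2 };

class FilteredSystem {
public:
    explicit FilteredSystem(Mask required) : required_(required) {
        assert(required_.any());  // an empty requirement would match everything
    }

    // Called by the world whenever an entity is added or its components change.
    void onEntityChanged(Entity e, Mask has) {
        const bool matches = (has & required_) == required_;
        auto it = std::find(entities_.begin(), entities_.end(), e);
        if (matches && it == entities_.end()) {
            entities_.push_back(e);
        } else if (!matches && it != entities_.end()) {
            entities_.erase(it);
        }
    }

    // Called by the world when an entity is destroyed.
    void onEntityRemoved(Entity e) {
        entities_.erase(std::remove(entities_.begin(), entities_.end(), e),
                        entities_.end());
    }

    // Every entity in this list is known to have all required components.
    const std::vector<Entity>& entities() const { return entities_; }

private:
    Mask required_;
    std::vector<Entity> entities_;
};
```

Because membership is settled at notification time, the per-frame loop can fetch components without checking for their presence.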

Beyond that, entity loading goes beyond the scope of the system. I use a YAML lib to load entity template definitions from a file, compose a new entity out of the components, then add it to the World instance (which feeds it into the game). Notice I never mention any "system" in this process; the system only gets notified of the entity's existence when it's added to the World. I guess in your case, if you want fine-grained control over component allocation, the de-serializer should know the component stores so it can ask for new component instances when loading an entity template/prefab.

I think all of those concerns (component stores, entity-component mapping, component serialization and deserialization) should be separate. Otherwise you'll end up with fat systems with too many responsibilities.

For reference, this is what a YAML entity template looks like:

- !Geometry
  name: cube.internal
- !Material
  diffuse: mossy_rock_02.dds
  normal: mossy_rock_04.n.dds
  shininess: 127.0
- !Spatial
  scale: 2.0
  type: STATIC
- !StaticPhysics
  param0: 1.0
  param1: 1.0
  param2: 1.0
  shape: BOX


From there a list of ComponentTemplate is loaded, each knows how to create an instance of a component with those specific parameters. So an "entity template" is just a list of ComponentTemplate objects.
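In C++ terms, the "entity template is a list of ComponentTemplate objects" idea could be sketched as below. Only the `ComponentTemplate` name comes from the description above; the `Stores` struct, the two concrete templates, and `spawn` are hypothetical, and the YAML parsing that would build the list is left out.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Entity = unsigned;

// Hypothetical stores; real ones would come from the component manager.
struct Stores {
    std::unordered_map<Entity, float> scale;
    std::unordered_map<Entity, std::string> geometryName;
};

// Each template remembers its parameters and can instantiate one component.
struct ComponentTemplate {
    virtual ~ComponentTemplate() = default;
    virtual void instantiate(Entity e, Stores& stores) const = 0;
};

struct SpatialTemplate : ComponentTemplate {
    float scale;
    explicit SpatialTemplate(float s) : scale(s) {}
    void instantiate(Entity e, Stores& stores) const override {
        stores.scale[e] = scale;
    }
};

struct GeometryTemplate : ComponentTemplate {
    std::string name;
    explicit GeometryTemplate(std::string n) : name(std::move(n)) {}
    void instantiate(Entity e, Stores& stores) const override {
        stores.geometryName[e] = name;
    }
};

// An entity template is just a list of component templates.
using EntityTemplate = std::vector<std::unique_ptr<ComponentTemplate>>;

// Stamps out the components of one entity from a template.
void spawn(const EntityTemplate& tmpl, Entity e, Stores& stores) {
    for (const auto& c : tmpl) {
        assert(c != nullptr);
        c->instantiate(e, stores);
    }
}
```

The template is parsed once and can then be reused for any number of entities; the loader never has to know which systems will later consume the components.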

It's not a very good reference since my serialization is very, very green (i.e., that's just prefabs/templates, not even dealing with scene serialization yet). Although it might give you ideas (like, for instance, not using XML :P ).

EDIT: Ugh, the editor is eating text after those last code tags. It's awful.
Edited by TheChubu