Pointers and defragmentation

Started by
9 comments, last by matt77hias 6 years, 4 months ago

I wanted to store more entities and components by value instead of unique pointers in contiguous memory. Every entity has a pointer to each of its components and to each of its child entities. Every component has a pointer to its owning entity. When I delete entities and components, wholes appear and fragmentation occurs. After n frames, I want to move the content of my arrays which is still alive, to close the gaps. This of course breaks all relations since data is moved around in memory.

Each entity and component has a GUID, so I can technically use these instead of pointers. Unfortunately, this requires a search operation for every GUID that needs to resolved. Irrespective of some clever search algorithm, possibly combined with sorting the entities and components, this approach will always be too expensive. This especially becomes a bottleneck for transform components.

Alternatively, it is possible to use the GUID to restore the pointers during defragmentation. But this manual labour does not sound very clever.

So is it possible to use some alternative handle instead of a plain pointer that can be automatically and globally updated during defragmentation?

 

🧙

Advertisement
1 hour ago, matt77hias said:

Irrespective of some clever search algorithm, possibly combined with sorting the entities and components, this approach will always be too expensive. This especially becomes a bottleneck for transform components.

That approach is extremely common in my experience, except typically with as-small-as-possible ID's instead of GUIDs (even u8 if you know you won't instantiate a particular class 256 times!). You typically don't do a search, but a simple lookup into an array that holds the indices.
e.g.


struct NonMovableComponents
{
   int alloc() {  return /*next unused component slot*/; }
   T& operator[]( int i ) { return component[i]; }
   T component[SIZE];
};

struct MovableComponents
{
   int alloc() { int slot = /*next unused indices slot*/; int index = /*next unused component slot*/; indices[slot] = index; return slot; }
   T& operator[]( int i ) { return component[indices[i]]; }
   T component[SIZE];
   int indices[SIZE];
};

However... whether this is an issue depends a lot on how you're using your "components".

In some games, the primary way that components are accessed is by iterating over all of them -- for each component, do x... If this case, the cost of a lookup from an entity's handle doesn't matter, because you don't do it very much.

In other games, the primary way that components are accessed is from their parent entity -- entity->GetComponent<Transform>()->DoX()... In this case, sure, you want those lookups to be fast... but you've probably also got bad access patterns anyway, so you're not going to get as much benefit from defragging, so you could just, not defrag :)

1 hour ago, matt77hias said:

So is it possible to use some alternative handle instead of a plain pointer that can be automatically and globally updated during defragmentation?

After the defrag operation, you could go and patch all your component pointers. Entities could listen for an "on defrag" event from systems that they have pointers into.

1 hour ago, Hodgman said:

struct MovableComponents { int alloc() { int slot = /*next unused indices slot*/; int index = /*next unused component slot*/; indices[slot] = index; return slot; } T& operator[]( int i ) { return component[indices[i]]; } T component[SIZE]; int indices[SIZE]; };

Makes me actually think if defragmentation is actually necessary?

If you have a limit on the number of components of a certain type, you can keep iterating the array (containing both dead/garbage and alive components). If you need to add a component, you just need to search for the first free dead/garbage component and replace it. This still allows the use of pointers (to the elements of that array).

🧙

This is exactly why my components don't have any knowledge of which entity they're attached to.  Only the component system has that information.  It can return a handle (when a component is created) into the front facing buffer (public side of the system) which can then if it needs it, have a second buffer which holds the component pointer.


using Handle = uint32_t;

struct TransformComponent
{
	glm::mat4 m_Transform;
}

class TransformSystem
{
public:	
	//...
	Handle CreateComponent( const Entity& parent_entity, ... );

private:
	// m_CurrentHandle can just get ++'d in a simple system, or could be more complex reusing freed handles, etc...
	Handle m_CurrentHandle = 0;
	std::map<Handle, std::size_t> m_TranslationMap;
    // This Handle is the parent_entity ID from the CreateComponent for example.  Just using map as an example.
    std::map<Handle, TransformComponent*> m_Components;
};

Now if we defragment m_Components we just have to update the m_TranslationMap's std::size_t value to point to the new location.  The handle provided from CreateComponent() never needs to change.

"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety." --Benjamin Franklin

8 hours ago, matt77hias said:

If you have a limit on the number of components of a certain type, you can keep iterating the array (containing both dead/garbage and alive components). If you need to add a component, you just need to search for the first free dead/garbage component and replace it.

Not saying that this is right for you/everyone, but a nice property of those styles is that you can end p with a fairly predictable/constant  performance cost.

Often as programmers we focus on best case or average case algorithmic complexity, but in games, it's the worst case algorithmic complexity that's going to cause headaches - occasionally framerate stuttering, or performance that's fine until some point in a level, where it suddenly nosedives of a cliff... So just using dumb solutions, such as always updating every component regardless of whether it's in use or not,  where the best/average/worst performance are all the same is sometimes actually a very valid approach. 

Why cant you use erase+ remove_if? Take the element you want to delete, replace it with the last active element in the array, no resizing or frag.

Overall the pattern is extremely common in games.  Often the pattern isn't used out of defragmentation concerns, instead it is used out of concerns over the cost of allocations and relocations.

 

As Hodgman pointed out, since you can control what the handles actually are you can nearly eliminate their cost. Since you can provide any handle value you wish you can guarantee it is a direct lookup rather than the double-indirection lookup, and thanks to inline optimizations in many languages you can even guarantee there is no function call overhead.

I thought it was interesting you mentioned an assumption all components must be in the same container or be a general component. You call out transforms, many engines I've worked on (including both Unity and Unreal) embed a single transform in the GameObject or a special base component because it is near-universal for everything living in the world.  Thus the often-used transform is not only in a different location, with careful design the transform is already prefetched in memory with other object data.

As for dead components or out of world objects there are easy ways to handle that depending on the systems involved.  Some use a Boolean flag do indicate a dead object.  Many engines a special location that is automatically tested that gets filtered out.  The most common I've seen are QNaN values which will be automatically filtered out due to natural properties of the math: QNaN silently compares as false for comparison operations so it can always be culled or always be bypassed during a sort.  They're easy enough to generate, either store the value somewhere as a compile-time constant or store 0.f/0.f which the compiler can compute at compile time. You need to be mindful not to set compiler optimizations that automatically exclude NaN tests such as the x64 /fp:Fast optimization, but that is easily addressed as well.

8 hours ago, h8CplusplusGuru said:

Why cant you use erase+ remove_if? Take the element you want to delete, replace it with the last active element in the array, no resizing or frag.

Works as well, but it (approach 2) seems overkill (in terms of the number of moves, not the number of predicate checks) compared to looking for a single dead object to replace for adding an alive object (approach 1). If you do not do this every frame, there is still a chance that some objects will be dead while iterating. So you need to perform the same checks for approach 1 and 2 while iterating any way, making the defragmentation of approach 2 more expensive. Furthermore, approach 2 actually moves data around in memory, while approach 1 does not, requiring to fix pointers/handles.

 

I realize though that I need to have a move assignment operator which I explicitly deleted in most cases, because in these cases I see no logical use for assignment. Is in place construction (instead of assigning) possible as well with std::vector for instance?

 

🧙

3 hours ago, frob said:

I thought it was interesting you mentioned an assumption all components must be in the same container or be a general component. You call out transforms, many engines I've worked on (including both Unity and Unreal) embed a single transform in the GameObject or a special base component because it is near-universal for everything living in the world.  Thus the often-used transform is not only in a different location, with careful design the transform is already prefetched in memory with other object data.

My Transform does not inherit Component (though it shares some similarities) and is stored at my Entity itself (by value), since practically everything needs to be aware of the position and orientation. All my Components are self contained with the exception of transformation data. I used this design, since retrieving the Transform associated with a Component is a very common operation of the engine. From the perspective of the engine, I see no actual use for retrieving sibling Components given a Component, I just provide such functionality for the scripting side.

🧙

7 hours ago, frob said:

As for dead components or out of world objects there are easy ways to handle that depending on the systems involved.  Some use a Boolean flag do indicate a dead object.

Currently, I am not really sure of everything I am going to face in the future, so I use a state enum value which can be:

  • Active: Entity/Component should be processed.
  • Passive: Entity/Component should not be processed.
  • Terminated: Entity/Component maybe overriden.

Note that the Entity, itself is not really being "processed". It merely contains a state to flush it down to its components and child entities.

 

I also just realized that it would be better to replace the pointers by indices into the std::vector. Even if you do not rearrange the components, you could still need to make the std::vector larger.

🧙

This topic is closed to new replies.

Advertisement