• 12
• 10
• 10
• 13
• 10

# Implementing an internally linked list via inheritance

## Recommended Posts

I'm currently building a lightweight ECS system (keeping it hard-coded to the game for now, will look into separating into an engine once it's up and running). I'm taking some inspiration from Unity, as I have a lot of experience working with it, so I decided to keep track of components in an internally linked list, where the Transform component is the head. A GameObject is an empty container with a Transform component, and since Transforms can't (shouldn't) be removed, I figured it would be efficient to use it to keep track of the component linked list head.

However, instead of following the conventional data pattern of defining a Node class that contains a pointer to its data, I decided to try and integrate the linked pointers directly into the runtime node types through inheritance. Something tells me this is a bad practice, but it seems to satisfy data-oriented design philosophies, and I can't see any explicit pitfalls yet. There is a little lack of safety, but there are a couple asserts to ensure polymorphic compatibility. Also, having the nodes delete themselves without controlling their own allocation is definitely a bad smell, but is there really an issue if it's made explicit to the programmer to always dynamically allocate nodes that get pushed into the list?

Anyway here's the code, feel free to rip it apart and expose all of my bad C++ practices:

#ifndef _LINKED_NODE_H_
#pragma once

#include <vector>
#include <typeinfo>

{
private:

protected:

{

while (curr->next != nullptr)
{
curr = curr->next;
}

curr->next = node;
curr->next->prev = curr;
curr->next->next = nullptr;
}

// returns first found node of specified runtime type
template <class DerivedType>
{

while (this != nullptr)
{
if (typeid(curr) == typeid(DerivedType))
{
return curr;
}

curr = curr->next;
}

return nullptr;
}

// returns all nodes of specified runtime type
template <class DerivedType>
{

while (curr != nullptr)
{
if (typeid(curr) == typeid(DerivedType))
{
nodes.push_back(curr);
}

curr = curr->next;
}

return nodes;
}

// removes this item from list
void remove()
{
if (prev)
prev->next = next;

if (next)
next->prev = prev;

delete this; // IT'S DEFINED BEHAVIOUR JUST ALLOCATE IT DYNAMICALLY
}

// delete all nodes
void clear_list()
{

while (curr->next != nullptr)
{
curr->next->remove();
}

delete this; // IT'S DEFINED BEHAVIOUR JUST ALLOCATE IT DYNAMICALLY
}
};

#endif


And yeah, I'm not using smart pointers, I just want to get it functioning with raw pointers first then I'll work on ownership and safety.

The component class is just a container base that inherits from LinkedNode. The rest of its information should be arbitrary to this pattern so I won't include it.

The GameObject class then implements component functions like so:

#ifndef _GAME_OBJECT_H_
#define _GAME_OBJECT_H_
#pragma once

#include "Object.h"
#include "StandardTypes.h"

class Component;
class Transform;

class GameObject final : public Object
{
public:
GameObject();
~GameObject();

// All GOs need a transform, so it is the head of Component LinkedList
Transform* transform;

// Add new component by type
template <class ComponentType>
{
bool isComponent = std::is_base_of<Component, ComponentType>::value;
ASSERT(isComponent, "Trying to add a non-Component to a GameObject");

// this won't work unless GameObject knows all complete component types
transform->push_back(new ComponentType());
}

// Get first component of runtime type "type"
Component* GetComponent(std::string type); // TODO: implement (RTTI class table)

// Get first component of specified runtime type
template <class ComponentType>
ComponentType* GetComponent()
{
bool isComponent = std::is_base_of<Component, ComponentType>::value;
ASSERT(isComponent, "Calling GetComponent<> with a non-Component type.");

return static_cast<ComponentType*>(transform->get_node_by_type<ComponentType>());
}

// Get all components of specified runtime type
template <class ComponentType>
std::vector<ComponentType*> GetComponents()
{
bool isComponent = std::is_base_of<Component, ComponentType>::value;
ASSERT(isComponent, "Calling GetComponent<> with a non-Component type.");

// is there a better way to do this???
// does this even work???

std::vector<ComponentType*> components;
components.reserve(nodes.size());

for (int i = 0; i < components.size(); i++)
components[i] = static_cast<ComponentType*>(nodes[i]);

return components;
}

};

#endif 

But doesn't static_cast<> require a complete type? Should dynamic_cast<> be used here? I was trying to be as lean as possible with this structure so I wanted to avoid inheritance tree searching, but I think it might be the only option excluding some sort of bonkers C++ witchcraft.

And it shouldn't be too relevant, but for completeness, GameObject.cpp:

#include "GameObject.h"
#include "Component.h"
#include "Transform.h"

GameObject::GameObject()
{
transform = new Transform();
}

GameObject::~GameObject()
{
transform->clear_list();
}

{
transform->push_back(component);
}

Component* GameObject::GetComponent(std::string type)
{
// TODO: implement (RTTI class table)
return nullptr;
}



TLDR; is this approach a terrible idea that should be scrapped and replaced with a conventional LinkedList structure, or is it acceptable assuming it's used correctly? Also please pick apart any bad C++ practices I've used (except lack of smart pointers).

Edited by cmac

##### Share on other sites

Yeah, this makes more sense. I was referencing an engine I built in college that used a LinkedList and trying to optimize it, but thanks for the reminder to review the standard library containers before reinventing the wheel :p

##### Share on other sites

Also just use #pragma once instead of the include guards AND pragma, every compiler that you'll ever use supports it at this point and it's a lot less typing leading to fewer issues.

##### Share on other sites

Also just use #pragma once instead of the include guards AND pragma, every compiler that you'll ever use supports it at this point and it's a lot less typing leading to fewer issues.

I looked this up a couple months ago because I was curious about the differences, and read there are still some compilers (legacy perhaps) that don't recognize #pragma once, so it's safest to use both. But if it's the industry standard to just go with pragma once these days I'll hop on the bandwagon

##### Share on other sites
If you want to find more info, this kind of list is called an intrusive linked list.
Likewise a ref counting pointer like shared_ptr that stored ref counts inside the object would be called an intrusive ref counting system.

These used to be very common back when game engines cared about optimizing everything, but didn't bother optimizing for cache coherency.

##### Share on other sites

From Wikipedia, all major compilers support it (except some long-dead compilers that don't matter here).

I have read that there are some obscure custom compilers for things like DSPs and microcontrollers that don't support it, but even that is rare these days since most of them are based on GCC and other multiplatform compilers.  I've worked on some custom hardware and FPGA boards but they used GCC. Few companies are willing to spend the time to write a custom compiler when others are freely available and easily modified.

There are active official proposals to include #pragma once or a variant #once in the C++ standard. If they are accepted they would like use the #pragma once version because of the existing universal support.

And as for linked lists, they are almost always the wrong choice.  Several people (including Bjarne Stroustrup who designed C++) have called for them to be deprecated from the language standard.  There are still some hardware systems and some extremely rare (typically contrived) scenarios where they are a good solution, but it is almost never the right solution.  Memory speed and CPU speed have been diverging for decades, and the farther apart they get the worse linked lists become.

Edited by frob
Swap HTML italics for ip.board italics.

##### Share on other sites

Unity uses a tree, not a list. Nodes can have more than one child.

The transform hierarchy in Unity effectively creates a scene graph - the each transform is added to its parents. If this is the behavior you're looking for then it would be good to focus more on how to effectively implement this behavior rather than on this specific data structure. (Please specify if this is the case.)

There's no reason to use intrusive linkage. Prefer composition over inheritance.

##### Share on other sites

There's no reason to use intrusive linkage. Prefer composition over inheritance.

While I agree in general with the principle that composition is better than inheritance for most purposes, I want to clarify one thing: you can totally build an intrusive container/refcounter/etc. using just composition.

##### Share on other sites

A linked list is the wrong data structure to use here (a linked list is rarely the right data structure to use these days). [...] You're mainly concerned with adding/retrieving components by type id, right?. So an unordered_map seems to be the obvious choice. If you need to support multiple components of the same type on a game object, then use unordered_multimap.
Allow me to add that std::unordered_map (and even more so std::map) is also rarely the right data structure.

A std::unordered_map (empty, of any type) has an overhead of 56 bytes on my platform, which is already not that neglegible for something that's added to every node just like that.

But it gets worse: As easily observed by overloading global operator new, adding one element to an unordered_map typically does two dynamic allocations (for std::unordered_map<int,int> 16 and 24 bytes, respectively, on my platform). Adding two elements does three dynamic allocations (16, 24, 16 bytes). Adding three elements does 5 allocations (16, 24, 16, 16, 56 bytes). Adding 25 kv-pairs does 29 dynamic allocations. Adding 2,500 kv-pairs does 2,510 allocations. It's asymptotic for sure, but for small N it's a bummer of overhead just for managing a few key/value pairs.

On the other hand side, for large N, the fact that std::unordered_map has a cache coherency only rivalled in its badness by std::map, the guaranteed-cache-miss-every-time factor comes into play. So... it's neither really a stellar performer for small N, nor for large N.

I don't quite get the whole ECS stuff anyway (well, I do get it insofar as I see how it's nice and flexible, I just don't get how it can actually ever, in practice, work with an acceptable performance, but let's ignore whether or not I get it... that's irrelevant) but if I was to do such a thing, my first choice, until proven wrong, would certainly be a small_vector (such as clang uses internally), the expectation being that the average node does exactly zero extra allocations and that linearly iterating over 5-8 elements is just as fast as doing one hash table lookup.

But that's just me.