Game Engine Containers - handle_map

Published September 15, 2016 by Jeff Kiah, posted by y2kiah

Introduction

This article explores the creation of a data container for game programming. This container is meant to take the place of C++ standard library containers such as std::map and std::unordered_map, with an alternative that stores data contiguously in memory.

Data should be contiguous to maximize cache locality when iterating over the data, resulting in faster performance vs. containers that individually allocate elements on the heap. The goal here is not to produce a drop-in replacement for the STL containers. That being said, whenever practical I tried to remain consistent with the conventions and semantics established in the STL. The std::vector is my poor-man's memory allocator of choice.

What sets this container apart from other associative containers like std::map and std::unordered_map is that we do not specify the key when inserting data. This is why I'm referring to the key as a handle: we receive the handle back from the container upon inserting.

Quite often when using associative containers, I find that the key values themselves are not actually important; we end up storing the keys and using them only for lookups. We reach for hash maps purely for the associative semantics of the container, not because we actually care about the key values. Yet we're still forced to supply a key, which can lead to convenient but flawed choices like std::string.

This is where handles can be a great alternative. We still have to store the handle somewhere, but the programmer is relieved of the burden of choosing a key type and supplying a value. Handles also carry internal information that makes this container smarter and safer than a bog-standard hash map.

So now that I've given an overview of the container and why you might want to use it, let's outline the goals for the implementation.

Goals

  1. Store data in a contiguous block of memory
    This is key to our performance goals. We will separate our container into a dense set and a sparse set, where the data itself is kept in the dense set, tightly packed to ensure best cache locality during traversals. Even though this is an associative container, semantically it doesn't make sense to traverse in the order of the handles' values. That would also incur a performance penalty due to the double-indirection of handle lookups (more on that later). More likely, what you'll want to do is either look up a single entry by handle, or traverse all entries by their in-memory order just as you would an array/vector, completely bypassing the use of handles. This is not an ordered container, so traversal of the dense set is not guaranteed to visit elements in the same order each time.

  2. Create an associative mapping between a handle and the data
    The sparse set will play two important roles. First, it is the handle index and remains stable with insertions and deletions. This allows us to adjust the memory location of elements in the dense set to keep them tightly packed, without invalidating our lookup value. The second role of the sparse set is to internally store a freelist. Gaps in the sparse set resulting from deletions will be filled by subsequent insertions. The sparse set will always fill gaps before growing, and thus will itself try to remain dense.

  3. O(1) lookups
    Constant time lookups are achieved because handles contain their location in the sparse set, making lookups into the sparse set nothing more than an array index. The inner id stored there holds the item's exact location in the dense set, so that too is just an array index. The one cost of a lookup by handle versus a direct pointer dereference is the extra indirection, so traversal "by handle" (over the sparse set) is not recommended. Therefore, the handle_map provides iterators over the dense set for traversals.

  4. O(1) insertions
    Constant time insertions are done by pushing the new item to the back of the dense set, and using the first available slot in the sparse set's freelist for the index.

  5. O(1) removals
    Constant time removal is made possible by using the "swap and pop" trick in the dense set: we swap the item being removed with the last item in the array, then pop_back to remove it. We must also update the sparse set entry for the (former) last item that we swapped with, changing its index field to reference its new location in the dense set. One option would be to linearly search the sparse set until we find the match, but to achieve O(1) we instead introduce a third storage set (in addition to the dense set and sparse set) which we'll term the meta set. The meta set is maintained in parallel with the dense set and contains an index back to the sparse set, giving us a two-way link (see the sketch after this list). We do not combine the meta information directly into the dense set because it's currently only used for deletion, and there is no sense in harming cache locality and alignment of the main object store for infrequently used data.
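
To make the relationship between the three sets concrete, here is a small illustrative sketch of a possible state after a few inserts and one erase (the values are made up purely for illustration):

// sparseIds (stable):  [ {index:1}, {free, next:END}, {index:0} ]
// items (dense):       [ itemC, itemA ]  // tightly packed, order not guaranteed
// meta (parallel):     [ {denseToSparse:2}, {denseToSparse:0} ]
//
// lookup: an outer id with index 2 -> sparseIds[2].index == 0 -> items[0] (itemC)
// erase itemC: swap items[0]/meta[0] with the back and pop_back, then follow
// the swapped meta entry's denseToSparse back to the sparse set and set that
// entry's index to 0, all in O(1)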

Implementation

Let's begin with an explanation of the handle structure itself. The handle is a 64-bit value, so there is no additional overhead compared to a pointer on 64-bit platforms.


/**
* @struct Id_T
* @var free        0 if active, 1 if slot is part of freelist, only applicable to inner ids
* @var typeId      relates to m_itemTypeId parameter of handle_map
* @var generation  incrementing generation of data at the index, for tracking accesses to old data
* @var index       When used as a handle (outer id, given to the client):
*                    free==0, index of id in the sparseIds array
*                  When used as an inner id (stored in sparseIds array):
*                    free==0, index of the item in the dense items array
*                    free==1, index of next free slot, forming an embedded linked list
* @var value       unioned with the above four vars, used for direct comparison of ids
*/
struct Id_T
{
    union
    {
        /**
        * the order of this bitfield is important for sorting prioritized by free, then typeId,
        * then generation, then index
        */
        struct
        {
            uint32_t index;
            uint16_t generation;
            uint16_t typeId : 15;
            uint16_t free   : 1;
        };

        uint64_t value;
    };
};

typedef std::vector<Id_T> IdSet_T;
#define NullId_T Id_T{}

It is worth mentioning the two ways that the handle structure is used, which I've termed "outer id" and "inner id." The outer id is an Id_T value that's given to the user of the container as a handle. The inner id is the related Id_T value that is stored in the sparse set. The two values differ in how the index property is used. The outer id stores an index into the sparse set, while the inner id stores an index into the dense set.

The typeId field is optional extra metadata, set in the container's constructor, that can introduce an element of type safety into your outer id handles. For example, say you have two handle_maps, one storing Apples and the other storing Oranges. The typeId would differentiate handles between the two that would otherwise be equivalent. The programmer simply needs a mechanism to assign a unique 15-bit integer to each type contained in a handle_map. In many cases you'll just bypass this and set it to zero. However, in cases where many types are tracked in an array of handle_maps, like components in an entity component system (ECS), or fields in a SoA inversion, the typeId can come in very handy. In addition to type safety, the typeId can be used as an additional O(1) lookup to first get you into the correct handle_map (within an array of handle_maps), before getting you to the final value. This is exactly how automatic storage for all component types is done in the griffin engine's ECS implementation.
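
As a minimal sketch of the type-safety aspect (Apple and Orange are hypothetical types; the assert that fires is the typeId check in at(), shown later):

struct Apple  { float ripeness; };
struct Orange { float ripeness; };

handle_map<Apple>  apples(1, 100);  // ids produced with typeId 1
handle_map<Orange> oranges(2, 100); // ids produced with typeId 2

Id_T appleHandle = apples.insert(Apple{ 0.9f });
//oranges.at(appleHandle); // typeId 1 != 2, would trip the "typeId mismatch" assert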

The generation field is used as a safety mechanism to detect when a stale handle is trying to access data that has since been overwritten in the corresponding slot. Every time a slot in the sparse set is removed, the generation increments. Handle lookups assert that the outer id and inner id generations match. If debug asserts aren't safe enough for your purposes, an exception could be thrown and handled at runtime, but I view stale lookups as a bug and prefer to just crash hard and fix the issue.
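
A minimal sketch of the kind of stale access the generation check catches:

handle_map<int> numbers(0, 10);

Id_T h1 = numbers.insert(42);
numbers.erase(h1);           // slot freed, generation incremented
Id_T h2 = numbers.insert(7); // reuses the freed sparse slot

//numbers.at(h1); // stale: generations differ, trips the "old generation" assert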

The free field is a 1-bit flag to indicate if the sparse set entry represents a value, or if it is actually empty and part of the internal freelist. When free==1, the inner id's index field no longer stores a lookup index into the dense set, and instead stores the index of the next free slot in the internal freelist. This gives us an embedded singly linked list within the sparse set array, and as long as we store the front index of the freelist separately, it is an O(1) operation to find the next available slot and maintain the singly linked list.
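
To make the embedded list concrete, here is a sketch that walks the freelist by hand through the public accessors shown in the class listing below, given some handle_map<T> map. The container does all of this internally, so this is purely illustrative:

// each free inner id's index points at the next free slot;
// 0xFFFFFFFF marks the end of the list
uint32_t slot = map.getFreeListFront();
while (slot != 0xFFFFFFFF)
{
    const Id_T& innerId = map.getIds()[slot];
    assert(innerId.free == 1);
    slot = innerId.index;
}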

The following free functions are used to directly compare Id_T types.


// struct Id_T comparison functions 
inline bool operator==(const Id_T& a, const Id_T& b) { return (a.value == b.value); } 
inline bool operator!=(const Id_T& a, const Id_T& b) { return (a.value != b.value); } 
inline bool operator< (const Id_T& a, const Id_T& b) { return (a.value < b.value); } 
inline bool operator> (const Id_T& a, const Id_T& b) { return (a.value > b.value); } 

Let's jump into the handle_map itself. I've removed comments for brevity, so we can focus on the important parts. Links to the full implementation are provided at the bottom of the article.


template <typename T>
class handle_map
{
public:
    struct Meta_T
    {
        uint32_t denseToSparse; //!< index into m_sparseIds array stored in m_meta
    };

    typedef std::vector<T>      DenseSet_T;
    typedef std::vector<Meta_T> MetaSet_T;

    // Functions
    T& at(Id_T handle);
    const T& at(Id_T handle) const;
    T& operator[](Id_T handle) { return at(handle); }
    const T& operator[](Id_T handle) const { return at(handle); }

    template <typename... Params>
    Id_T emplace(Params... args) { return insert(T{ args... }); }

    template <typename... Params>
    IdSet_T emplaceItems(int n, Params... args);

    typename DenseSet_T::iterator       begin()        { return m_items.begin(); }
    typename DenseSet_T::const_iterator cbegin() const { return m_items.cbegin(); }
    typename DenseSet_T::iterator       end()          { return m_items.end(); }
    typename DenseSet_T::const_iterator cend() const   { return m_items.cend(); }

    size_t erase(Id_T handle);
    size_t eraseItems(const IdSet_T& handles);

    Id_T insert(T&& i);
    Id_T insert(const T& i);

    void clear() _NOEXCEPT;
    void reset() _NOEXCEPT;
    bool isValid(Id_T handle) const;

    size_t size() const _NOEXCEPT     { return m_items.size(); }
    size_t capacity() const _NOEXCEPT { return m_items.capacity(); }

    template <typename Compare>
    size_t defragment(Compare comp, size_t maxSwaps = 0);

    DenseSet_T&       getItems()       { return m_items; }
    const DenseSet_T& getItems() const { return m_items; }
    MetaSet_T&        getMeta()        { return m_meta; }
    const MetaSet_T&  getMeta() const  { return m_meta; }
    IdSet_T&          getIds()         { return m_sparseIds; }
    const IdSet_T&    getIds() const   { return m_sparseIds; }

    uint32_t getFreeListFront() const { return m_freeListFront; }
    uint32_t getFreeListBack() const  { return m_freeListBack; }
    uint16_t getItemTypeId() const    { return m_itemTypeId; }
    uint32_t getInnerIndex(Id_T handle) const;

    explicit handle_map(uint16_t itemTypeId, size_t reserveCount)
        : m_itemTypeId(itemTypeId)
    {
        m_sparseIds.reserve(reserveCount);
        m_items.reserve(reserveCount);
        m_meta.reserve(reserveCount);
    }

private:
    bool freeListEmpty() const { return (m_freeListFront == 0xFFFFFFFF); }

    // Variables
    uint32_t m_freeListFront = 0xFFFFFFFF; //!< start index in the embedded freelist
    uint32_t m_freeListBack  = 0xFFFFFFFF; //!< last index in the freelist
    uint16_t m_itemTypeId;                 //!< the Id_T::typeId to use for ids produced by this handle_map
    uint8_t  m_fragmented = 0;             //!< set to 1 on insert/erase, cleared when defragment completes

    IdSet_T    m_sparseIds; //!< sparse set of inner ids, indexed by outer id
    DenseSet_T m_items;     //!< dense set of items, tightly packed
    MetaSet_T  m_meta;      //!< meta set maintained in parallel with m_items
};

The insert and erase functions do most of the heavy lifting to keep things consistent.


template <typename T>
Id_T handle_map<T>::insert(T&& i)
{
    Id_T handle = { 0 };
    m_fragmented = 1;

    if (freeListEmpty())
    {
        Id_T innerId = { (uint32_t)m_items.size(), 1, m_itemTypeId, 0 };

        handle = innerId;
        handle.index = (uint32_t)m_sparseIds.size();

        m_sparseIds.push_back(innerId);
    }
    else
    {
        uint32_t outerIndex = m_freeListFront;
        Id_T &innerId = m_sparseIds.at(outerIndex);

        m_freeListFront = innerId.index; // the index of a free slot refers to the next free slot
        if (freeListEmpty()) { m_freeListBack = m_freeListFront; }

        // convert the index from freelist to inner index
        innerId.free = 0;
        innerId.index = (uint32_t)m_items.size();

        handle = innerId;
        handle.index = outerIndex;
    }

    m_items.push_back(std::forward<T>(i));
    m_meta.push_back({ handle.index });

    return handle;
}

template <typename T>
size_t handle_map<T>::erase(Id_T handle)
{
    if (!isValid(handle)) { return 0; }

    m_fragmented = 1;

    Id_T innerId = m_sparseIds[handle.index];
    uint32_t innerIndex = innerId.index;

    // push this slot to the back of the freelist
    innerId.free = 1;
    ++innerId.generation;       // increment generation so remaining outer ids go stale
    innerId.index = 0xFFFFFFFF; // max numeric value represents the end of the freelist
    m_sparseIds[handle.index] = innerId; // write outer id changes back to the array

    if (freeListEmpty())
    {
        // if the freelist was empty, it now starts (and ends) at this index
        m_freeListFront = handle.index;
        m_freeListBack = m_freeListFront;
    }
    else
    {
        m_sparseIds[m_freeListBack].index = handle.index; // previous back of the freelist points to new back
        m_freeListBack = handle.index;                    // new freelist back is stored
    }

    // remove the component by swapping with the last element, then pop_back
    if (innerIndex != m_items.size() - 1)
    {
        std::swap(m_items.at(innerIndex), m_items.back());
        std::swap(m_meta.at(innerIndex), m_meta.back());

        // fix the ComponentId index of the swapped component
        m_sparseIds[m_meta.at(innerIndex).denseToSparse].index = innerIndex;
    }

    m_items.pop_back();
    m_meta.pop_back();

    return 1;
}

In the spirit of the data-oriented programming mantra "where there is one, there are many," I have provided a couple of methods to emplace and erase items in batch. We also get to have some fun with C++11 variadic templates, which enable us to pass any constructor arguments needed for the items.


template <typename T>
template <typename... Params>
IdSet_T handle_map<T>::emplaceItems(int n, Params... args)
{
    assert(n > 0 && "emplaceItems called with n = 0");
    IdSet_T handles(n);

    m_fragmented = 1;

    // reserve the space we need (if not already there)
    m_items.reserve(m_items.size() + n);
    m_meta.reserve(m_meta.size() + n);

    std::generate_n(handles.begin(), n, [&, this]()
    {
        return emplace(args...);
    });

    return handles; // efficient to return vector by value, copy elided with NRVO (or with C++11 move semantics)
}

template <typename T>
size_t handle_map<T>::eraseItems(const IdSet_T& handles)
{
    size_t count = 0;

    for (auto h : handles)
    {
        count += erase(h);
    }

    return count;
}
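
A quick usage sketch of the batch methods (Particle is a hypothetical aggregate type):

struct Particle { float x, y; };

handle_map<Particle> particles(0, 1000);

// construct 1000 particles at the origin and receive all handles at once
IdSet_T handles = particles.emplaceItems(1000, 0.0f, 0.0f);

// remove them all again; count comes back as 1000
size_t count = particles.eraseItems(handles);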

The at function shows our O(1) lookup by handle, plus typeId and generation asserts for added safety.


template <typename T>
inline T& handle_map<T>::at(Id_T handle)
{
    assert(handle.index < m_sparseIds.size() && "outer index out of range");

    Id_T innerId = m_sparseIds[handle.index];

    assert(handle.typeId == m_itemTypeId && "typeId mismatch");
    assert(handle.generation == innerId.generation && "at called with old generation");
    assert(innerId.index < m_items.size() && "inner index out of range");

    return m_items[innerId.index];
}

Example:


  handle_map<int> testMap(0, 1);
  Id_T handle = testMap.insert(1);
  int i = testMap[handle];
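
For traversals, bypass handles entirely and iterate the dense set directly, either through getItems() or the begin()/end() iterators. Continuing the example above:

  int total = 0;
  for (int& n : testMap.getItems()) // contiguous, cache-friendly traversal
  {
      total += n;
  }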

Use Cases

There are several areas in a game engine where handle_map can be employed. This container is not a silver bullet, but given the performance characteristics, I often prefer to use it where a plain array or vector is unsuitable.
Some examples include:

  • Entity component systems - The components of an ECS could be stored in a handle_map, and use handles externally to store references to the entity or its components.

  • Scene graphs and embedded linked lists - An embedded list or tree structure is straightforward to represent by storing handles instead of direct pointers. Combined with a defragment sort, the flat representation can also be optimized for cache locality of adjacent nodes.

  • Resource caches - It is common practice to reference resources by handle rather than directly by pointer or shared_ptr, but the handle_map isn't ideal for directly storing large blobs since entries are constantly being swapped around. Consider this pattern instead (sketched below): cache the resource in a memory location that doesn't change, and store a shared_ptr to it in the handle_map cache. There is more to it, but I am planning a future article covering my resource cache implementation in full detail.
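
A minimal sketch of that pattern, assuming a hypothetical Texture resource type:

struct Texture { /* large pixel data, living at a stable heap address */ };

handle_map<std::shared_ptr<Texture>> textureCache(0, 256);

Id_T texId = textureCache.insert(std::make_shared<Texture>());
// the shared_ptr may get swapped around inside the dense set on erasures,
// but the Texture it points to never moves
std::shared_ptr<Texture> tex = textureCache[texId];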

Performance

I did some profiling to compare handle_map's performance against a vector of heap-allocated unique_ptrs and an unordered_map. This should give a rough idea of the performance advantages over those two alternatives.

Creating 100,000 items:


handle_map: create 100000 items, time = 4.749955 ms  
unique_ptr: create 100000 items, time = 111.625944 ms  
unordered_map: create 100000 items, time = 89.358049 ms  

Iterating over all items, summing each item into total:

handle_map: iteration over dense set, total = 100000, time = 0.263713 ms  
unique_ptr: iteration, total = 100000, time = 0.524748 ms  
unordered_map: iteration by iterator, total = 100000, time = 3.466641 ms  

Iterating by lookup:


handle_map: iteration by handle, total = 100000, time = 0.653704 ms  
unordered_map: iteration by lookup, total = 100000, time = 15.921830 ms  

Clearing all items:


handle_map: clear 100000 items, time = 0.377051 ms  
unique_ptr: clear 100000 items, time = 10161.480165 ms  
unordered_map: clear 100000 items, time = 7615.724425 ms  

Considerations

  1. Not lightweight
    Under the hood, handle_map is implemented with three vectors plus (at least) 11 bytes of additional overhead.

  2. Extra indirection on lookups
    Lookups by handle require indexing into two arrays, which can add an extra cache miss over just dereferencing a pointer. This isn't something to worry about too much; just make sure to iterate over the dense set using getItems() whenever possible. If handle_map is used to store components for an ECS, the "system" implementation should be able to bypass handles and operate directly on the dense set.

  3. What to do when an obsolete key is used for lookup
    My preference is to use this as a debug-time assert. The assumption is that I don't ever expect it to happen, and in a release build we'd have undefined behavior and possibly a crash. It would be trivial to instead throw an exception, and try to recover from the situation in much the same way you would recover from failing to lock a dangling weak_ptr.

  4. Thread safety
    This container is not thread-safe. Other than concurrent_queue, I'm not a big fan of implementing thread safety at the container level. I find that doing so often leads to two subtle problems. First, over-locking: where each function call is protected by a mutex, and calls to the container's functions become internally serialized. Adjacent function calls lose the ability to reduce contention by sharing a lock. Second, under-locking: where race conditions can still exist because locks are not made at a higher "transactional" level. I prefer instead to do multithreading with tasks.

  5. clear() vs. reset()
    There are two ways to purge all data from the container. The slower but safer method is clear(), which maintains the integrity of the sparse set, increments the generation of all entries, and can detect future lookups with a stale handle. The faster method is reset(), which destroys the sparse set and empties the freelist.

  6. Fragmentation of entries
    Over time, items will be added out of order, and removed using the "swap and pop" method. This leads to a loss of adjacency and cache locality for items that were originally (or should be) added next to each other. I have provided a defragment function that allows the client to define a sort order for the dense set.

    If the implementation looks menacing, it's really not all that bad. You're basically looking at a standard insertion sort, with a small optimization for trivially_copyable types. In my profiling, I found the optimized version to be about 8-10% faster. Additionally, the m_fragmented flag is used to eliminate all work when no changes to the container have occurred since the last defragment completion (which should be the statistically common case).

    I chose insertion sort due to its properties of being stable and adaptive, with very little overhead. The assumption I'm making is that initially, the data set may be small but randomly ordered. Over time (with occasional defragmentation), the data set may grow but remain in mostly sorted order. The data set is also likely to have "few unique" items where they just need to be grouped next to each other. Using my new favorite web page, I found that insertion sort looked like a good fit.


template <typename T>
template <typename Compare>
size_t handle_map<T>::defragment(Compare comp, size_t maxSwaps)
{
    if (m_fragmented == 0)
    {
        return 0;
    }

    size_t swaps = 0;

    int i = 1;
    for (; i < m_items.size() && (maxSwaps == 0 || swaps < maxSwaps); ++i)
    {
        T tmp = m_items[i];
        Meta_T tmpMeta = m_meta[i];

        int j = i - 1;
        int j1 = j + 1;

        // trivially copyable implementation
        if (std::is_trivially_copyable<T>::value)
        {
            while (j >= 0 && comp(m_items[j], tmp))
            {
                m_sparseIds[m_meta[j].denseToSparse].index = j1;
                --j; --j1;
            }

            if (j1 != i)
            {
                memmove(&m_items[j1+1], &m_items[j1], sizeof(T) * (i - j1));
                memmove(&m_meta[j1+1], &m_meta[j1], sizeof(Meta_T) * (i - j1));
                ++swaps;

                m_items[j1] = tmp;
                m_meta[j1] = tmpMeta;
                m_sparseIds[m_meta[j1].denseToSparse].index = j1;
            }
        }
        // standard implementation
        else
        {
            while (j >= 0 && (maxSwaps == 0 || swaps < maxSwaps) && comp(m_items[j], tmp))
            {
                m_items[j1] = std::move(m_items[j]);
                m_meta[j1] = std::move(m_meta[j]);
                m_sparseIds[m_meta[j1].denseToSparse].index = j1;
                --j; --j1; ++swaps;
            }

            if (j1 != i)
            {
                m_items[j1] = tmp;
                m_meta[j1] = tmpMeta;
                m_sparseIds[m_meta[j1].denseToSparse].index = j1;
            }
        }
    }

    if (i == m_items.size())
    {
        m_fragmented = 0;
    }

    return swaps;
}

Example:


const int N = 100000; 

struct Test 
{ 
    int val, sort; 
}; 

handle_map<Test> testMap(0, N); 

for (int i = 0; i < N; ++i) 
{ 
    testMap.emplace(1, i);
} 

std::random_shuffle(testMap.begin(), testMap.end());
timer.start();

size_t swaps = testMap.defragment([](const Test& a, const Test& b) 
{
    return a.sort > b.sort; 
});

timer.stop();

As you can see, a near-worst-case sort (profiled below) can be quite heavy on performance.


handle_map: defragment swaps = 99987, total = 100000, time = 32538.735446 ms

For that reason, the operation is resumable and can be approached iteratively. Passing maxSwaps effectively sets a ballpark time budget, and calling defragment periodically causes the container to continually approach the ideal order. Keep in mind, this does not change the fact that this is semantically still an unordered container.
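
For example, a per-frame call with a swap budget might look like the following (the budget of 1000 here is purely illustrative):

// call once per frame; the container converges toward the sorted order
size_t swaps = testMap.defragment(
    [](const Test& a, const Test& b) { return a.sort > b.sort; },
    1000); // maxSwaps caps the work done in this call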

Conclusion

The handle_map is an associative container implemented with the principles of data-oriented design in mind. The container is semantically similar to std::map or std::unordered_map, and exhibits constant time complexity for inserts, lookups, and removals. Handles are used as keys for lookups and contain "smart" internal safety features such as typeId and generation checks. A defragment sort operation can be defined by the user to incrementally approach the ideal in-memory order of the contiguous data set.

Future Work

  • Create a way to specify a custom allocator?
  • Turn Meta_T into template parameter to enable additional metadata?
  • Anything else?

Acknowledgments

For more reading on this concept, and alternative implementations, be sure to view the blog posts that inspired me to create this container for my game project.

Source

Github repo: https://github.com/y2kiah/griffin-containers
Online documentation available at: https://y2kiah.github.io/griffin-containers/

Article Update Log

7 Sept 2016: Initial release


Comments

apatriarca

This sort of implementation is actually useful in more general settings than game development. It is not, however, a complete alternative to std::unordered_map, since the handles are generated by your handle_map and not by the user. Considering the ECS example, the handles of the components will all be different; you can't use a single ID to retrieve all the components of an entity from the various systems.

September 14, 2016 01:58 PM
y2kiah

That's true, though in my ECS implementation I didn't use the "entity as a pure id" strategy. I have an Entity struct which contains a list of component handles, and a handle_map of entities gives me a single id for each entity. I try to have systems operate directly on component lists, so the times when I have to translate from component to entity, and then entity to a related component are minimal, but it does come up.


typedef std::bitset<MAX_COMPONENTS> ComponentMask;

struct Entity {
	explicit Entity() {
		components.reserve(RESERVE_ENTITY_COMPONENTS);
	}

	bool addComponent(ComponentId id);
	bool removeComponent(ComponentId id);
	bool removeComponentsOfType(ComponentType ct);

	inline bool hasComponent(ComponentType ct) const {
		return componentMask[ct];
	}
		
	ComponentMask componentMask;
	std::vector<ComponentId> components;
};
September 14, 2016 04:51 PM
jpetrie

Your benchmarks don't appear to be included in your GitHub repository? Are they available anywhere for peer review?

September 14, 2016 09:48 PM
SeanMiddleditch

This design is a good deal more complicated than really necessary. This is mostly owing to the metadata concept, which should be done away with, IMO. For instance, you mention the type id to offer an "element of type safety," which is something you can get far better by using _actual types_, e.g. a strong enum:

enum class PlayerId : HandleMapId {};
enum class TextureId : HandleMapId {};
 
PlayerId player;
TextureId texture;
player = texture; // COMPILE ERROR

If you're concerned with supporting older versions of C++, the same is trivially accomplishable with tag types:

template <typename Tag>
struct Id { ... };
 
struct PlayerIdTag;
typedef Id<PlayerIdTag> PlayerId;
 
struct TextureIdTag;
typedef Id<TextureIdTag> TextureId;

This resolves concerns with memory usage, too. You only need two arrays at most (and that includes the id free list), and those you can also allocate in a fused block if you must (though for many uses, using distinct block allocators is likely to be faster both for allocation and runtime usage).

It's a good article, but I don't think you've decently justified why your more complex version is better than the industry standard approaches (which are also decently well-documented, e.g. http://bitsquid.blogspot.com/2011/09/managing-decoupling-part-4-id-lookup.html).

September 18, 2016 02:22 AM
y2kiah

@SeanMiddleditch not sure but it sounds like you're mixing up the meta vector with the "meta" information stored in the handle. Which are you opposed to, or is it both?

The third vector only stores a sparse set location, maintained in parallel with the dense set. If you are ok with linear complexity on removals then by all means, take out the third array. I don't think this approach is "better" than any other, I just opted for constant time removals.

I actually do what you suggested most often when I need type safety. For example my Entity snippet above shows EntityId and ComponentId which are just aliases for Id_T. The point of typeId in the handle is mainly for my ECS. Many components are defined in Lua script and are not known at compile time. All component types, both static and dynamic, are identified by a unique id. Holding this type id in the component's handle is useful for locating the storage location of the component type. If you absolutely need a 4-byte handle instead of 8-bytes, then by all means, eliminate the typeId and generation and just make the handle a straight up index.

September 18, 2016 04:55 AM
Fuaa

The benchmark is wrong: you are reserving space in the handle_map but not doing the same for the other data structures. Also, when you populate the unique_ptr vector you are doing all those small allocations on the heap, which I guess takes most of the time.

September 20, 2016 09:18 PM
y2kiah

Fuaa, that's just incorrect. All containers are reserved at the beginning of the function. I think you're missing the point of the unique_ptr comparison. The point IS to compare handle_map to individual heap allocations.

September 21, 2016 12:18 AM
SenneH

Wouldn't it be better to use flags instead of bit fields? Bit fields have some implementation-dependent behavior, and performance-wise it should matter.

September 23, 2016 04:34 PM
SeanMiddleditch

> Which are you opposed to, or is it both? The third vector only stores a sparse set location, maintained in parallel with the dense set. If you are ok with linear complexity on removals then by all means, take out the third array. I don't think this approach is "better" than any other, I just opted for constant time removals.

You don't need them for constant-time insertions or removals.

I 100% promise you that you can do everything you're trying to do, with less code, less complexity, and less memory usage, keeping only a single level of indirection for lookup. I highly suggest you read the article I linked; it's pretty close to what we use. :)

One vector to store the objects, one vector for indirections, embedding generation data in the indirection entry, and also embedding the free-list into the empty indirection slots with no space overhead.

September 24, 2016 03:53 AM
y2kiah

Bitsquid stores the sparse set index in the object itself to achieve O(1) removals, making their implementation intrusive and more brittle IMO, whereas handle_map places the id internally in a separate meta vector, keeping it away from the dense set data. This is a pretty standard data oriented technique. Is it more complex? Well, yes, but when representing data as a structure of arrays you expect to have more arrays and tighter objects. Other than that one key difference, they operate in the same way.

So it turns out you don't actually save a lot of memory with their solution, only displace it. There is a *very* small additional fixed overhead in handle_map, but it's insignificant next to the size of the actual data contained.

On the complexity, you should factor in that their solution is incompatible with C++ "OO" style, and RAII containers (e.g. shared_ptr) cannot be stored because they elect not to run destructors on removal. I'm not saying you should store non-pod types in handle_map, only that you can. If you do store a pod type in handle_map, you won't pay any destructor overhead anyway. So it's clear what the added complexity actually buys you in overall flexibility.

Storing the identity of the object as a member of itself departs from the idiomatic C++ standard approach of non-intrusive containers. As a user of handle_map, do you worry about the complexity behind the curtain anyway? It's already bought and paid for. There are no internal concerns of the container leaking into your problem space, other than the usual awareness of iterator invalidation and the like.

Bitsquid's solution is great, I would never fault anyone for deciding on their lightweight, C-style approach. Just saying, the capabilities are not 1:1, which may or may not justify the implementation differences, depending on what you are doing. The trade-offs are not lost on me, use the best tool for the job!

I appreciate all contributions and improvements to the implementation. All should feel free to send a pull request if you can make the code cleaner, faster, or more portable.

September 24, 2016 05:06 AM
efigenio

Great entry, as always, but could you point out why this structure should be used in place of a hash?

December 05, 2016 02:22 PM