y2kiah

Members
  • Content count

    403
  • Joined

  • Last visited

Community Reputation

2031 Excellent

About y2kiah

  • Rank
    Member

Personal Information

Social

  • Github
    https://github.com/y2kiah
  1. It doesn't matter how big your LUT is, if the cache line you're looking up isn't hot in CPU cache, you will wait longer for the memory lookup than it takes to just calculate the sqrt on the fly. LUTs for expensive operations (sin,cos,tan) used to be common, but nowadays it's usually a pessimization. Even back in the day, sqrt was not often tabled. Usually it was approximated with cheaper operations since they were "good enough" for game purposes. You didn't mention your platform btw, if this is for something like a raspberry pi with limited CPU horsepower, maybe this could be a win.
  2. Look at JobSwarm on John Ratcliff's blog. I've never used it, but it sounds like what you're looking for.
  3. Rendering Text for GUI

    I use NanoVG for my GUI needs, which gives OpenGL font rendering and vector drawing capabilities in one lightweight package. It uses Sean Barrett's library internally for TTF parsing, and can also optionally use Freetype. Prior to that I always used Win32 APIs for font rendering, but I have been enjoying the ease of this approach and performance doesn't seem like a problem so far.
  4.   Interesting... I will look into that, thanks. Can you provide a reference?
  5. maybe I'm missing something but why do you wrap the result of make_unique in std::move above? The result of make_unique is already an rvalue, so the std::move is redundant.   Back to the topic at hand, I happen to agree with the rules written here by Stroustrup and Sutter: http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-in F.16: For “in” parameters, pass cheaply-copied types by value and others by reference to const   So you're doing it right, and what you are communicating through this function signature (const IRenderTarget &pRT) is: 1) this method has no interest in the ownership of the object being passed, and does not sink or re-seat the pointer, otherwise you would pass (unique_ptr<IRenderTarget>), see rule F.18 2) this is an "in" parameter, otherwise you would have used (IRenderTarget&) or (IRenderTarget*) 3) it cannot be passed no value (aka nullptr), otherwise you would have used (const IRenderTarget* pRT)   This is a nitpick and a matter of personal preference but I would not prefix a const ref parameter name with "p".   On a slight tangent, I would prefer to sink those objects in to member variables via constructor parameters, dependency injection style, rather than calling "make_unique" within the Renderer's constructor.
  6. Over the holiday I had some time off and decided to just develop a small clone for the browser for no particular reason. It's one javascript file of about 850 lines, and took a few hours to complete. I purposely kept things very simple, unsophisticated, and sub-optimal, and just had fun with it.   Each play through generates new random patterns and speeds (F5 refresh to regenerate).   You have 3 lives. To win, get 5 frogs across the river to the bridge.   Use Chrome or Firefox.   Play it from github   or grab the source: https://github.com/y2kiah/frogger    
  7. A retrospective on the Infinity project

    Welcome back! Your journal is one of the all time greats on gamedev.net. Hope to see more entries if you have the time.
  8. Bitsquid stores the sparse set index in the object itself to achieve O(1) removals, making their implementation intrusive and more brittle IMO, whereas handle_map places the id internally in a separate meta vector, keeping it away from the dense set data. This is a pretty standard data oriented technique. Is it more complex? Well, yes, but when representing data as a structure of arrays you expect to have more arrays and tighter objects. Other than that one key difference, they operate in the same way.   So it turns out you don't actually save a lot of memory with their solution, only displace it. There is a *very* small additional fixed overhead in handle_map, but it's insignificant next to the size of the actual data contained.   On the complexity, you should factor in that their solution is incompatible with C++ "OO" style, and RAII containers (e.g. shared_ptr) cannot be stored because they elect not to run destructors on removal. I'm not saying you should store non-pod types in handle_map, only that you can. If you do store a pod type in handle_map, you won't pay any destructor overhead anyway. So it's clear what the added complexity actually buys you in overall flexibility.   Storing the identity of the object as a member of itself departs from the idiomatic C++ standard approach of non-intrusive containers. As a user of handle_map, do you worry about the complexity behind the curtain anyway? It's already bought and paid for. There are no internal concerns of the container leaking into your problem space, other than the usual awareness of iterator invalidation and the like.   Bitsquid's solution is great, I would never fault anyone for deciding on their lightweight, C-style approach. Just saying, the capabilities are not 1:1, which may or may not justify the implementation differences, depending on what you are doing. The trade-offs are not lost on me, use the best tool for the job!   I appreciate all contributions and improvements to the implementation. All should feel free to send a pull request if you can make the code cleaner, faster, or more portable.
  9. Fuaa, that's just incorrect. All containers are reserved at the beginning of the function. I think you're missing the point of the unique_ptr comparison. The point IS to compare handle_map to individual heap allocations.
  10. @SeanMiddleditch not sure but it sounds like you're mixing up the meta vector with the "meta" information stored in the handle. Which are you opposed to, or is it both?   The third vector only stores a sparse set location, maintained in parallel with the dense set. If you are ok with linear complexity on removals then by all means, take out the third array. I don't think this approach is "better" than any other, I just opted for constant time removals.   I actually do what you suggested most often when I need type safety. For example my Entity snippet above shows EntityId and ComponentId which are just aliases for Id_T. The point of typeId in the handle is mainly for my ECS. Many components are defined in Lua script and are not known at compile time. All component types, both static and dynamic, are identified by a unique id. Holding this type id in the component's handle is useful for locating the storage location of the component type. If you absolutely need a 4-byte handle instead of 8-bytes, then by all means, eliminate the typeId and generation and just make the handle a straight up index.
  11. Introduction This article explores the creation of a data container for game programming. This container is meant to take the place of C++ standard library containers such as std::map and std::unordered_map, with an alternative that stores data contiguously in memory. Data should be contiguous to maximize cache locality when iterating over the data, resulting in faster performance vs. containers that individually allocate elements on the heap. The goal here is not to produce a drop-in replacement for the STL containers. That being said, whenever practical I tried to remain consistent with the conventions and semantics established in the STL. The std::vector is my poor-man's memory allocator of choice. What sets this container apart from other associative containers like std::map and std::unordered_map, is that we do not specify the key when inserting data. This is why I'm referring to the key as a handle. We receive a handle back from the container upon inserting. Quite often when using associative containers, I find that the values of keys are not actually important. We just end up storing the keys and using them only for lookup purposes anyway. Often we use hash maps purely for the associative semantics of the container, and not because we actually care about the key values. We're still forced into supplying a key, which can lead to convenient but flawed choices like std::string. This is where handles can be a great alternative. We still have to store the handle somewhere, but the burden of type selection and supplying the value is relieved from the programmer. Handles also hold information internally that make this container smarter and safer than bog standard hash maps. So now that I've given an overview of the container and why you might want to use it, let's outline the goals for the implementation. Goals Store data in a contiguous block of memory This is key to our performance goals. We will separate our container into a dense set and a sparse set, where the data itself is kept in the dense set, tightly packed to ensure best cache locality during traversals. Even though this is an associative container, semantically it doesn't make sense to traverse in the order of the handles' values. That would also incur a performance penalty due to the double-indirection of handle lookups (more on that later). More likely, what you'll want to do is either look up a single entry by handle, or traverse all entries by their in-memory order just as you would an array/vector, completely bypassing the use of handles. This is not an ordered container, so traversal of the dense set is not guaranteed to visit elements in the same order each time. Create an associative mapping between a handle and the data The sparse set will play two important roles. First, it is the handle index and remains stable with insertions and deletions. This allows us to adjust the memory location of elements in the dense set to keep them tightly packed, without invalidating our lookup value. The second role of the sparse set is to internally store a freelist. Gaps in the sparse set resulting from deletions will be filled by subsequent insertions. The sparse set will always fill gaps before growing, and thus will itself try to remain dense. O(1) lookups Constant time lookups are achieved because handles contain information about their location in the sparse set. This makes lookups into the sparse set nothing more than an array lookup. The information contained in the sparse set knows the item's exact location in the dense set, so again just an array lookup. One consideration for lookups by handle vs. a direct pointer dereference is the cost of an extra indirection, so traversal "by handle" (over the sparse set) is not recommended. Therefore, iterators to the dense set are provided by the handle_map for traversals. O(1) insertions Constant time insertions are done by pushing the new item to the back of the dense set, and using the first available slot in the sparse set's freelist for the index. O(1) removals Constant time removal is made possible by using the "swap and pop" trick in the dense set. That is, we swap the item being removed with the last item in the array, and then pop_back to remove it. We also have to update the index of the last item referenced in the sparse set. We have to discover the location in the sparse set of the corresponding inner id for the (former) last item that we are swapping with. The index field must be changed to reference its new location in the dense set. One option would be to linearly search through the sparse set until we find the match, but to achieve O(1), we instead introduce a third storage set (in addition to the dense set and sparse set) which we'll term the meta set. The meta set is maintained in parallel with the dense set, and contains an index back to the sparse set, giving us a two-way link. We do not combine the meta information directly into the dense set because it's currently only used for deletion, and there is no sense in harming cache locality and alignment of the main object store for infrequently used data. Implementation Let's begin with an explanation of the handle structure itself. The handle is a 64-bit value, so there is no additional overhead compared to a pointer on 64-bit platforms. /** * @struct Id_T * @var free 0 if active, 1 if slot is part of freelist, only applicable to inner ids * @var typeId relates to m_itemTypeId parameter of handle_map * @var generation incrementing generation of data at the index, for tracking accesses to old data * @var index When used as a handle (outer id, given to the client): * free==0, index of id in the sparseIds array * When used as an inner id (stored in sparseIds array): * free==0, index of the item in the dense items array * free==1, index of next free slot, forming an embedded linked list * @var value unioned with the above four vars, used for direct comparison of ids */ struct Id_T { union { /** * the order of this bitfield is important for sorting prioritized by free, then typeId, * then generation, then index */ struct { uint32_t index; uint16_t generation; uint16_t typeId : 15; uint16_t free : 1; }; uint64_t value; }; }; typedef std::vector IdSet_T; #define NullId_T Id_T{} It is worth mentioning the two ways that the handle structure is used, which I've termed "outer id" and "inner id." The outer id is an Id_T value that's given to the user of the container as a handle. The inner id is the related Id_T value that is stored in the sparse set. The two values differ in how the index property is used. The outer id stores an index into the sparse set, while the inner id stores an index into the dense set. The typeId field is optional extra metadata, set in the container's constructor, that can introduce an element of type safety into your outer id handles. For example, say you have two handle_maps, one storing Apples and the other storing Oranges. The typeId would differentiate handles between the two, that would otherwise be equivalent. The programmer simply needs a mechanism to assign a unique 15-bit integer for each type contained within a handle_map. In many cases, you'll just bypass this and set zero. However, in cases where many types are tracked in an array of handle_maps, like components in an entity component system (ECS), or fields in a SoA inversion, the typeId can come in very handy. In addition to type safety, the typeId can be used as an additional O(1) lookup to first get you into the correct handle_map (within an array of handle_maps), before getting you to the final value. This is exactly how automatic storage for all component types is done in the griffin engine's ECS implementation. The generation field is used as a safety mechanism to detect when a stale handle is trying to access data that has since been overwritten in the corresponding slot. Every time a slot in the sparse set is removed, the generation increments. Handle lookups assert that the outer id and inner id generations match. If debug asserts aren't safe enough for your purposes, an exception could be thrown and handled at runtime, but I view stale lookups as a bug and prefer to just crash hard and fix the issue. The free field is a 1-bit flag to indicate if the sparse set entry represents a value, or if it is actually empty and part of the internal freelist. When free==1, the inner id's index field no longer stores a lookup index into the dense set, and instead stores the index of the next free slot in the internal freelist. This gives us an embedded singly linked list within the sparse set array, and as long as we store the front index of the freelist separately, it is an O(1) operation to find the next available slot and maintain the singly linked list. The following free functions are used to directly compare Id_T types. // struct Id_T comparison functions inline bool operator==(const Id_T& a, const Id_T& b) { return (a.value == b.value); } inline bool operator!=(const Id_T& a, const Id_T& b) { return (a.value != b.value); } inline bool operator< (const Id_T& a, const Id_T& b) { return (a.value < b.value); } inline bool operator> (const Id_T& a, const Id_T& b) { return (a.value > b.value); } Let's jump into the handle_map itself. I've removed comments for brevity, so we can focus on the important parts. Links to the full implementation are provided at the bottom of the article. template class handle_map { public: struct Meta_T { uint32_t denseToSparse; //!< index into m_sparseIds array stored in m_meta }; typedef std::vector DenseSet_T; typedef std::vector MetaSet_T; // Functions T& at(Id_T handle); const T& at(Id_T handle) const; T& operator[](Id_T handle) { return at(handle); } const T& operator[](Id_T handle) const { return at(handle); } template Id_T emplace(Params... args) { return insert(T{ args... }); } template IdSet_T emplaceItems(int n, Params... args); typename DenseSet_T::iterator begin() { return m_items.begin(); } typename DenseSet_T::const_iterator cbegin() const { return m_items.cbegin(); } typename DenseSet_T::iterator end() { return m_items.end(); } typename DenseSet_T::const_iterator cend() const { return m_items.cend(); } size_t erase(Id_T handle); size_t eraseItems(const IdSet_T& handles); Id_T insert(T&& i); Id_T insert(const T& i); void clear() _NOEXCEPT; void reset() _NOEXCEPT; bool isValid(Id_T handle) const; size_t size() const _NOEXCEPT { return m_items.size(); } size_t capacity() const _NOEXCEPT { return m_items.capacity(); } template size_t defragment(Compare comp, size_t maxSwaps = 0); DenseSet_T& getItems() { return m_items; } const DenseSet_T& getItems() const { return m_items; } MetaSet_T& getMeta() { return m_meta; } const MetaSet_T& getMeta() const { return m_meta; } IdSet_T& getIds() { return m_sparseIds; } const IdSet_T& getIds() const { return m_sparseIds; } uint32_t getFreeListFront() const { return m_freeListFront; } uint32_t getFreeListBack() const { return m_freeListBack; } uint16_t getItemTypeId() const { return m_itemTypeId; } uint32_t getInnerIndex(Id_T handle) const; explicit handle_map(uint16_t itemTypeId, size_t reserveCount) : m_itemTypeId(itemTypeId) { m_sparseIds.reserve(reserveCount); m_items.reserve(reserveCount); m_meta.reserve(reserveCount); } private: bool freeListEmpty() const { return (m_freeListFront == 0xFFFFFFFF); } // Variables uint32_t m_freeListFront = 0xFFFFFFFF; //!< start index in the embedded ComponentId freelist uint32_t m_freeListBack = 0xFFFFFFFF; //!< last index in the freelist uint16_t m_itemTypeId; //!< the Id_T::typeId to use for ids produced by this handle_map uint8_t m_fragmented = 0; // The insert and erase functions do most of the heavy lifting to keep things consistent. template Id_T handle_map::insert(T&& i) { Id_T handle = { 0 }; m_fragmented = 1; if (freeListEmpty()) { Id_T innerId = { (uint32_t)m_items.size(), 1, m_itemTypeId, 0 }; handle = innerId; handle.index = (uint32_t)m_sparseIds.size(); m_sparseIds.push_back(innerId); } else { uint32_t outerIndex = m_freeListFront; Id_T &innerId = m_sparseIds.at(outerIndex); m_freeListFront = innerId.index; // the index of a free slot refers to the next free slot if (freeListEmpty()) { m_freeListBack = m_freeListFront; } // convert the index from freelist to inner index innerId.free = 0; innerId.index = (uint32_t)m_items.size(); handle = innerId; handle.index = outerIndex; } m_items.push_back(std::forward(i)); m_meta.push_back({ handle.index }); return handle; } template size_t handle_map::erase(Id_T handle) { if (!isValid(handle)) { return 0; } m_fragmented = 1; Id_T innerId = m_sparseIds[handle.index]; uint32_t innerIndex = innerId.index; // push this slot to the back of the freelist innerId.free = 1; ++innerId.generation; // increment generation so remaining outer ids go stale innerId.index = 0xFFFFFFFF; // max numeric value represents the end of the freelist m_sparseIds[handle.index] = innerId; // write outer id changes back to the array if (freeListEmpty()) { // if the freelist was empty, it now starts (and ends) at this index m_freeListFront = handle.index; m_freeListBack = m_freeListFront; } else { m_sparseIds[m_freeListBack].index = handle.index; // previous back of the freelist points to new back m_freeListBack = handle.index; // new freelist back is stored } // remove the component by swapping with the last element, then pop_back if (m_items.size() > 1) { std::swap(m_items.at(innerIndex), m_items.back()); std::swap(m_meta.at(innerIndex), m_meta.back()); // fix the ComponentId index of the swapped component m_sparseIds[m_meta.at(innerIndex).denseToSparse].index = innerIndex; } m_items.pop_back(); m_meta.pop_back(); return 1; } In the spirit of the data-oriented programming mantra "where there is one, there are many" I have provided a couple of methods to emplace and erase items in batch. We also get to have some fun with C++11 variadic templates, which enable us to pass any constructor arguments needed for the items. template template IdSet_T handle_map::emplaceItems(int n, Params... args) { IdSet_T handles(n); assert(n > 0 && "emplaceItems called with n = 0"); m_fragmented = 1; m_items.reserve(m_items.size() + n); // reserve the space we need (if not already there) m_meta.reserve(m_meta.size() + n); std::generate_n(handles.begin(), n, [&, this](){ return emplace(args...); }); return handles; // efficient to return vector by value, copy elided with NVRO (or with C++11 move semantics) } template size_t handle_map::eraseItems(const IdSet_T& handles) { size_t count = 0; for (auto h : handles) { count += erase(h); } return count; } The at function shows our O(1) lookup by handle, plus typeId and generation asserts for added safety. template inline T& handle_map::at(Id_T handle) { assert(handle.index < m_sparseIds.size() && "outer index out of range"); Id_T innerId = m_sparseIds[handle.index]; assert(handle.typeId == m_itemTypeId && "typeId mismatch"); assert(handle.generation == innerId.generation && "at called with old generation"); assert(innerId.index < m_items.size() && "inner index out of range"); return m_items[innerId.index]; } Example: handle_map testMap(0, 1); Id_T handle = testMap.insert(1); int i = testMap[handle]; Use Cases There are several areas in a game engine where handle_map can be employed. This container is not a silver bullet, but given the performance characteristics, I often prefer to use it where a plain array or vector are unsuitable. Some examples include: Entity component systems - The components of an ECS could be stored in a handle_map, and use handles externally to store references to the entity or its components. Scene graphs and embedded linked lists - An embedded list or tree structure is straightforward to represent by storing handles instead of direct pointers. Combined with a defragment sort, the flat representation can also be optimized for cache locality of adjacent nodes. Resource caches - It is common practice to reference resources by handle rather than directly by pointer or shared_pointer, but the handle_map isn't ideal for directly storing large blobs since entries are constantly being swapped around. Consider this pattern instead: cache the resource in a memory location that doesn't change, and store a shared_ptr to it in the handle_map cache. There is more to it, but I am planning a future article covering my resource cache implementation in full detail. Performance I did some profiling to compare handle_map's performance to a vector of heap allocated unique_ptr's, and unordered_map. This should give a rough idea of the performance advantages over those two alternatives. Creating 100,000 items: handle_map: create 100000 items, time = 4.749955 ms unique_ptr: create 100000 items, time = 111.625944 ms unordered_map: create 100000 items, time = 89.358049 ms Iterating over all items, summing each item into total: handle_map: iteration over dense set, total = 100000, time = 0.263713 ms unique_ptr: iteration, total = 100000, time = 0.524748 ms unordered_map: iteration by iterator, total = 100000, time = 3.466641 ms Iterating by lookup: handle_map: iteration by handle, total = 100000, time = 0.653704 ms unordered_map: iteration by lookup, total = 100000, time = 15.921830 ms Clearing all items: handle_map: clear 100000 items, time = 0.377051 ms unique_ptr: clear 100000 items, time = 10161.480165 ms unordered_map: clear 100000 items, time = 7615.724425 ms Considerations Not lightweight Under the hood, handle_map is implemented with three vectors plus (at least) 11 bytes of additional overhead. Extra indirection on lookups Lookups by handle require indexing into two arrays, which can add an extra cache miss over just dereferencing a pointer. This isn't something you should worry about too much, just make sure to iterate over the dense set using getItems() whenever possible. If handle_map is used to store components for an ECS, the "system" implementation should be able to bypass handles and operate directly on the dense set. What to do when an obsolete key is used for lookup My preference is to use this as a debug-time assert. The assumption is that I don't ever expect it to happen, and in a release build we'd have undefined behavior and possibly a crash. It would be trivial to instead throw an exception, and try to recover from the situation in much the same way you would recover from failing to lock a dangling weak_ptr. Thread safety This container is not thread-safe. Other than concurrent_queue, I'm not a big fan of implementing thread safety at the container level. I find that doing so often leads to two subtle problems. First, over-locking: where each function call is protected by a mutex, and calls to the container's functions become internally serialized. Adjacent function calls lose the ability to reduce contention by sharing a lock. Second, under-locking: where race conditions can still exist because locks are not made at a higher "transactional" level. I prefer instead to do multithreading with tasks. clear() vs. reset() There are two ways to purge all data from the container. The slower but safer method is to use clear(), which maintains the integrity of the sparse set, increments the generation of all entries, and can detect future lookups with a stale handle. The faster method is to use reset() which destroys the sparse set and empties the freelist. Fragmentation of entries Over time, items will be added out of order, and removed using the "swap and pop" method. This leads to a loss of adjacency and cache locality for items that were originally (or should be) added next to each other. I have provided a defragment function that allows the client to define a sort order for the dense set. If the implementation looks menacing, it's really not all that bad. You're basically looking at a standard insertion sort, with a small optimization for trivially_copyable types. In my profiling, I found the optimized version to be about 8-10% faster. Additionally, the m_fragmented flag is used to eliminate all work when no changes to the container have occurred since the last defragment completion (which should be the statistically common case). I chose insertion sort due to its properties of being stable and adaptive, with very little overhead. The assumption I'm making is that initially, the data set may be small but randomly ordered. Over time (with occasional defragmentation), the data set may grow but remain in mostly sorted order. The data set is also likely to have "few unique" items where they just need to be grouped next to each other. Using my new favorite web page, I found that insertion sort looked like a good fit. template template size_t handle_map::defragment(Compare comp, size_t maxSwaps) { if (m_fragmented == 0) { return 0; } size_t swaps = 0; int i = 1; for (; i < m_items.size() && (maxSwaps == 0 || swaps < maxSwaps); ++i) { T tmp = m_items; Meta_T tmpMeta = m_meta; int j = i - 1; int j1 = j + 1; // trivially copyable implementation if (std::is_trivially_copyable::value) { while (j >= 0 && comp(m_items[j], tmp)) { m_sparseIds[m_meta[j].denseToSparse].index = j1; --j; --j1; } if (j1 != i) { memmove(&m_items[j1+1], &m_items[j1], sizeof(T) * (i - j1)); memmove(&m_meta[j1+1], &m_meta[j1], sizeof(Meta_T) * (i - j1)); ++swaps; m_items[j1] = tmp; m_meta[j1] = tmpMeta; m_sparseIds[m_meta[j1].denseToSparse].index = j1; } } // standard implementation else { while (j >= 0 && (maxSwaps == 0 || swaps < maxSwaps) && comp(m_items[j], tmp)) { m_items[j1] = std::move(m_items[j]); m_meta[j1] = std::move(m_meta[j]); m_sparseIds[m_meta[j1].denseToSparse].index = j1; --j; --j1; ++swaps; } if (j1 != i) { m_items[j1] = tmp; m_meta[j1] = tmpMeta; m_sparseIds[m_meta[j1].denseToSparse].index = j1; } } } if (i == m_items.size()) { m_fragmented = 0; } return swaps; } Example: const int N = 100000; struct Test { int val, sort; }; handle_map testMap(0, N); for (int i = 0; i < N; ++i) { testMap.emplace(1, i); } std::random_shuffle(testMap.begin(), testMap.end()); timer.start(); size_t swaps = testMap.defragment([](const Test& a, const Test& b) { return a.sort > b.sort; }); timer.stop(); As you can see, a near worst case scenario sort (profiled above) can be quite heavy on performance. handle_map: defragment swaps = 99987, total = 100000, time = 32538.735446 ms For that reason, the operation is reentrant and can be approached iteratively. Passing maxSwaps will effectively set a ballpark time budget, and calling it periodically will cause the container to always approach the ideal order. Keep in mind, this does not change the fact that this is semantically still an unordered container. Conclusion The handle_map is an associative container implemented with the principles of data-oriented design in mind. The container is semantically similar to std::map or std::unordered_map, and exhibits constant time complexity for inserts, lookups, and removals. Handles are used as keys for lookups and contain "smart" internal safety features such as typeId and generation checks. A defragment sort operation can be defined by the user to incrementally approach the ideal in-memory order of the contiguous data set. Future Work Create a way to specify a custom allocator? Turn Meta_T into template parameter to enable additional metadata? Anything else? Acknowledgments For more reading on this concept, and alternative implementations, be sure to view the blog posts that inspired me to create this container for my game project. http://bitsquid.blogspot.de/2011/09/managing-decoupling-part-4-id-lookup.html http://molecularmusings.wordpress.com/2013/07/24/adventures-in-data-oriented-design-part-3c-external-references/ Source Github repo: https://github.com/y2kiah/griffin-containers Online documentation available at: https://y2kiah.github.io/griffin-containers/ Article Update Log 7 Sept 2016: Initial release
  12. The benchmarking code is available in this file: https://github.com/y2kiah/project-griffin/blob/master/project-griffin/project-griffin/source/tests/container_tests.cpp
  13. That's true, though in my ECS implementation I didn't use the "entity as a pure id" strategy. I have an Entity struct which contains a list of component handles, and a handle_map of entities gives me a single id for each entity. I try to have systems operate directly on component lists, so the times when I have to translate from component to entity, and then entity to a related component are minimal, but it does come up. typedef std::bitset<MAX_COMPONENTS> ComponentMask; struct Entity { explicit Entity() { components.reserve(RESERVE_ENTITY_COMPONENTS); } bool addComponent(ComponentId id); bool removeComponent(ComponentId id); bool removeComponentsOfType(ComponentType ct); inline bool hasComponent(ComponentType ct) const { return componentMask[ct]; } ComponentMask componentMask; std::vector<ComponentId> components; };
  14.   Honestly, it's easy to lose site of this perspective being a relatively "modern" approach to OO design. And when I say modern, I'm talking post-2003 - after the "first five principles" were published, which then acquired the SOLID acronym. For me, this was knowledge I picked up on my own accord after college. Anyone like me who first learned about OO programming in the 90's, or earlier, was very likely to have been actively taught that OO kind of IS about deep hierarchies, finding nouns from the real world and modeling them as classes, using "is-a" and "has-a" as the litmus test, etc. Basically, a lot of the bad/lazy practices that we now take for granted as "not real OO" in hindsight, was just not pervasive wisdom at that time. I'm sure if you were in the right place or had the right teacher, some of those ideas that later coalesced into SOLID were already being written about and discussed. But for 99.9% of programmers, your curriculum was based on the common wisdom (or lack thereof) of the day and your teacher just followed the established lesson plan. It's almost a generational thing. Programmers taught a certain way go to industry, discover new and better ways of doing things, become the older and wiser folks with some pull, and then bring that wisdom back to academia to teach the next crop. It can be a painfully slow process to adapt teaching methods. That's why even today, you may never hear a mention of SOLID principles in lower level courses that teach basic OO using Java.
  15.   [Edited, nevermind, I'm thinking of preprocessor macros] It's actually pretty straightforward to see the exact code generated, at least in visual studio you can write the template output to a file.   I don't look at templates as an optimization or pessimization inherently. You can avoid the virtual function indirection with CRTP, yes. But you are also writing "generic" code which means you are specifically *not* optimizing for the types where you try to apply that code. It can go both ways. Think of the CRTP (aka compile-time polymorphism) as C++'s way to do duck typing, vs. paying a runtime cost for vtable polymorphism. You need to ask yourself the question of whether or not you need the polymorphism at runtime.