y2kiah

Member
  • Content count

    404

Community Reputation

2032 Excellent

About y2kiah

  • Rank
    Member

Personal Information

  • Website
  • Industry Role
    Programmer
  • Interests
    Art
    Audio
    Design
    DevOps
    Programming

Social

  • Github
    https://github.com/y2kiah


  1. y2kiah

    More Adventures in Robust Coding

    @Awoken, in your first entry you mentioned missing Chrome as a dev environment due to the lack of debugging in node.js. Not sure if you're aware, but you can do real-time debugging of node.js from VS Code.
  2. y2kiah

    A retrospective on the Infinity project

    Welcome back! Your journal is one of the all-time greats on gamedev.net. Hope to see more entries if you have the time.
  3. Bitsquid stores the sparse set index in the object itself to achieve O(1) removals, making their implementation intrusive and, IMO, more brittle, whereas handle_map places the id in a separate meta vector, keeping it away from the dense set data (see the layout sketch below). This is a pretty standard data-oriented technique. Is it more complex? Well, yes, but when representing data as a structure of arrays you expect to have more arrays and tighter objects. Other than that one key difference, they operate in the same way.

     So it turns out you don't actually save a lot of memory with their solution, you only displace it. There is a *very* small additional fixed overhead in handle_map, but it's insignificant next to the size of the actual data contained.

     On the complexity, you should factor in that their solution is incompatible with C++ "OO" style: RAII types (e.g. shared_ptr) cannot be stored, because they elect not to run destructors on removal. I'm not saying you should store non-POD types in handle_map, only that you can. If you do store a POD type in handle_map, you won't pay any destructor overhead anyway. So it's clear what the added complexity actually buys you in overall flexibility.

     Storing the identity of the object as a member of itself departs from the idiomatic C++ approach of non-intrusive containers. As a user of handle_map, do you worry about the complexity behind the curtain anyway? It's already bought and paid for. There are no internal concerns of the container leaking into your problem space, other than the usual awareness of iterator invalidation and the like.

     Bitsquid's solution is great, and I would never fault anyone for deciding on their lightweight, C-style approach. I'm just saying the capabilities are not 1:1, which may or may not justify the implementation differences, depending on what you are doing. The trade-offs are not lost on me; use the best tool for the job!

     I appreciate all contributions and improvements to the implementation. Feel free to send a pull request if you can make the code cleaner, faster, or more portable.
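     A minimal layout sketch of the difference described above. The field names on the intrusive side are illustrative only (no actual Bitsquid code is quoted); the Meta_T side matches the handle_map article further down this page.

```cpp
#include <cstdint>
#include <vector>

struct IntrusiveObject {        // Bitsquid-style: bookkeeping lives inside the object
    uint32_t sparseIndex;       // index back into the sparse set (illustrative name)
    float x, y, z;              // the actual data
};

struct Object { float x, y, z; };          // handle_map-style: pure data in the dense set
struct Meta_T { uint32_t denseToSparse; }; // bookkeeping displaced to a parallel array

std::vector<Object> items;  // dense set, nothing but user data
std::vector<Meta_T> meta;   // maintained in parallel with items
```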
  4. Fuaa, that's just incorrect. All containers are reserved at the beginning of the function. I think you're missing the point of the unique_ptr comparison. The point IS to compare handle_map to individual heap allocations.
  5. @SeanMiddleditch I'm not sure, but it sounds like you're mixing up the meta vector with the "meta" information stored in the handle. Which are you opposed to, or is it both?

     The third vector only stores a sparse set location, maintained in parallel with the dense set. If you are OK with linear complexity on removals, then by all means take out the third array. I don't think this approach is "better" than any other; I just opted for constant-time removals.

     I actually do what you suggested most often when I need type safety. For example, my Entity snippet above shows EntityId and ComponentId, which are just aliases for Id_T. The point of typeId in the handle is mainly for my ECS. Many components are defined in Lua script and are not known at compile time. All component types, both static and dynamic, are identified by a unique id. Holding this type id in the component's handle is useful for locating the storage location of the component type (see the sketch below). If you absolutely need a 4-byte handle instead of 8 bytes, then by all means eliminate the typeId and generation and just make the handle a straight-up index.
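     For illustration, a hypothetical sketch of the dispatch described above. Since each handle_map<T> is a distinct type, real code needs some form of type erasure; ComponentStore here is a made-up stand-in interface, not actual engine code, and Id_T is the handle struct from the article below.

```cpp
#include <vector>

struct ComponentStore {                     // hypothetical type-erased store interface
    virtual ~ComponentStore() = default;
    virtual void* at(Id_T handle) = 0;      // assumed to wrap handle_map<T>::at
};

std::vector<ComponentStore*> g_stores;      // one store per component type, indexed by typeId

void* getComponent(Id_T id) {
    ComponentStore* store = g_stores[id.typeId];  // first O(1) hop: locate the store
    return store->at(id);                         // second O(1) hop: handle lookup
}
```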
  6. Introduction

This article explores the creation of a data container for game programming. This container is meant to take the place of C++ standard library containers such as std::map and std::unordered_map, offering an alternative that stores data contiguously in memory. Data should be contiguous to maximize cache locality when iterating, resulting in faster performance than containers that individually allocate elements on the heap. The goal here is not to produce a drop-in replacement for the STL containers. That said, wherever practical I tried to remain consistent with the conventions and semantics established in the STL. The std::vector is my poor man's memory allocator of choice.

What sets this container apart from other associative containers like std::map and std::unordered_map is that we do not specify the key when inserting data; we receive a handle back from the container upon insertion. This is why I refer to the key as a handle. Quite often when using associative containers, I find that the values of the keys are not actually important: we just end up storing the keys and using them only for lookups anyway. Often we use hash maps purely for the associative semantics of the container, and not because we actually care about the key values. We're still forced into supplying a key, which can lead to convenient but flawed choices like std::string. This is where handles can be a great alternative. We still have to store the handle somewhere, but the burden of choosing a key type and supplying its value is lifted from the programmer. Handles also hold information internally that makes this container smarter and safer than a bog-standard hash map.

So now that I've given an overview of the container and why you might want to use it, let's outline the goals for the implementation.

Goals

Store data in a contiguous block of memory

This is key to our performance goals. We will separate our container into a dense set and a sparse set, where the data itself is kept in the dense set, tightly packed to ensure the best cache locality during traversals. Even though this is an associative container, semantically it doesn't make sense to traverse in the order of the handles' values. That would also incur a performance penalty due to the double indirection of handle lookups (more on that later). More likely, what you'll want to do is either look up a single entry by handle, or traverse all entries in their in-memory order just as you would an array/vector, completely bypassing the use of handles. This is not an ordered container, so traversal of the dense set is not guaranteed to visit elements in the same order each time.

Create an associative mapping between a handle and the data

The sparse set plays two important roles. First, it is the handle index, and it remains stable across insertions and deletions. This allows us to adjust the memory location of elements in the dense set to keep them tightly packed, without invalidating our lookup value. Second, the sparse set internally stores a freelist: gaps in the sparse set resulting from deletions are filled by subsequent insertions. The sparse set always fills gaps before growing, and thus tries to remain dense itself.

O(1) lookups

Constant-time lookups are achieved because handles contain information about their location in the sparse set. This makes a lookup into the sparse set nothing more than an array lookup. The entry in the sparse set in turn knows the item's exact location in the dense set, so again, just an array lookup. One consideration for lookups by handle vs. a direct pointer dereference is the cost of the extra indirection, so traversal "by handle" (over the sparse set) is not recommended. Instead, iterators to the dense set are provided by the handle_map for traversals.
O(1) insertions

Constant-time insertion is done by pushing the new item to the back of the dense set, and using the first available slot in the sparse set's freelist for the index.

O(1) removals

Constant-time removal is made possible by using the "swap and pop" trick in the dense set: we swap the item being removed with the last item in the array, then pop_back to remove it. We also have to update the index stored in the sparse set for the (former) last item we swapped with, which means first discovering the sparse set location of its corresponding inner id, then changing the index field to reference the item's new location in the dense set. One option would be to linearly search the sparse set until we find the match, but to achieve O(1) we instead introduce a third storage set (in addition to the dense set and sparse set) which we'll term the meta set. The meta set is maintained in parallel with the dense set and contains an index back into the sparse set, giving us a two-way link. We do not combine the meta information directly into the dense set because it is currently only used for deletion, and there is no sense in harming cache locality and alignment of the main object store for infrequently used data.

Implementation

Let's begin with an explanation of the handle structure itself. The handle is a 64-bit value, so there is no additional overhead compared to a pointer on 64-bit platforms.

```cpp
/**
 * @struct Id_T
 * @var free        0 if active, 1 if slot is part of freelist, only applicable to inner ids
 * @var typeId      relates to m_itemTypeId parameter of handle_map
 * @var generation  incrementing generation of data at the index, for tracking accesses to old data
 * @var index       When used as a handle (outer id, given to the client):
 *                      free==0, index of id in the sparseIds array
 *                  When used as an inner id (stored in sparseIds array):
 *                      free==0, index of the item in the dense items array
 *                      free==1, index of next free slot, forming an embedded linked list
 * @var value       unioned with the above four vars, used for direct comparison of ids
 */
struct Id_T {
    union {
        /**
         * the order of this bitfield is important for sorting prioritized by free,
         * then typeId, then generation, then index
         */
        struct {
            uint32_t index;
            uint16_t generation;
            uint16_t typeId : 15;
            uint16_t free : 1;
        };
        uint64_t value;
    };
};
typedef std::vector<Id_T> IdSet_T;
#define NullId_T Id_T{}
```

It is worth mentioning the two ways the handle structure is used, which I've termed the "outer id" and "inner id." The outer id is an Id_T value that's given to the user of the container as a handle. The inner id is the related Id_T value that is stored in the sparse set. The two values differ in how the index field is used: the outer id stores an index into the sparse set, while the inner id stores an index into the dense set.

The typeId field is optional extra metadata, set in the container's constructor, that can introduce an element of type safety into your outer id handles. For example, say you have two handle_maps, one storing Apples and the other storing Oranges. The typeId would differentiate handles between the two that would otherwise be equivalent. The programmer simply needs a mechanism to assign a unique 15-bit integer to each type contained in a handle_map. In many cases you'll just bypass this and set zero. However, in cases where many types are tracked in an array of handle_maps, like components in an entity component system (ECS), or fields in an SoA inversion, the typeId can come in very handy. In addition to type safety, the typeId can be used as an additional O(1) lookup to first get you into the correct handle_map (within an array of handle_maps) before getting you to the final value. This is exactly how automatic storage for all component types is done in the griffin engine's ECS implementation. A small sketch of the Apples/Oranges case follows below.
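To make the Apples/Oranges example concrete, here is a small hypothetical sketch; the structs and typeId values are made up for illustration, but the "typeId mismatch" assert is the one shown in at() further down.

```cpp
struct Apple  { int weight; };
struct Orange { int weight; };

void typeSafetyExample() {
    handle_map<Apple>  apples (1 /* typeId */, 100);
    handle_map<Orange> oranges(2 /* typeId */, 100);

    Id_T a = apples.insert({ 42 });   // a.typeId == 1
    Apple& ok = apples[a];            // fine: typeId matches
    // oranges[a];                    // would trip the "typeId mismatch" assert in at()
}
```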
The generation field is used as a safety mechanism to detect when a stale handle tries to access data that has since been overwritten in the corresponding slot. Every time a slot in the sparse set is removed, the generation increments. Handle lookups assert that the outer id and inner id generations match. If debug asserts aren't safe enough for your purposes, an exception could be thrown and handled at runtime, but I view stale lookups as a bug and prefer to just crash hard and fix the issue.

The free field is a 1-bit flag indicating whether the sparse set entry represents a value, or is actually empty and part of the internal freelist. When free==1, the inner id's index field no longer stores a lookup index into the dense set; instead it stores the index of the next free slot in the internal freelist. This gives us an embedded singly linked list within the sparse set array, and as long as we store the front index of the freelist separately, it is an O(1) operation to find the next available slot and maintain the list.

The following free functions are used to directly compare Id_T types.

```cpp
// struct Id_T comparison functions
inline bool operator==(const Id_T& a, const Id_T& b) { return (a.value == b.value); }
inline bool operator!=(const Id_T& a, const Id_T& b) { return (a.value != b.value); }
inline bool operator< (const Id_T& a, const Id_T& b) { return (a.value <  b.value); }
inline bool operator> (const Id_T& a, const Id_T& b) { return (a.value >  b.value); }
```

Let's jump into the handle_map itself. I've removed comments for brevity so we can focus on the important parts. Links to the full implementation are provided at the bottom of the article.

```cpp
template <typename T>
class handle_map {
public:
    struct Meta_T {
        uint32_t denseToSparse; //!< index into m_sparseIds array stored in m_meta
    };

    typedef std::vector<T>      DenseSet_T;
    typedef std::vector<Meta_T> MetaSet_T;

    // Functions
    T& at(Id_T handle);
    const T& at(Id_T handle) const;
    T& operator[](Id_T handle)             { return at(handle); }
    const T& operator[](Id_T handle) const { return at(handle); }

    template <typename... Params>
    Id_T emplace(Params... args) { return insert(T{ args... }); }

    template <typename... Params>
    IdSet_T emplaceItems(int n, Params... args);

    typename DenseSet_T::iterator       begin()        { return m_items.begin(); }
    typename DenseSet_T::const_iterator cbegin() const { return m_items.cbegin(); }
    typename DenseSet_T::iterator       end()          { return m_items.end(); }
    typename DenseSet_T::const_iterator cend() const   { return m_items.cend(); }

    size_t erase(Id_T handle);
    size_t eraseItems(const IdSet_T& handles);

    Id_T insert(T&& i);
    Id_T insert(const T& i);

    void clear() _NOEXCEPT;
    void reset() _NOEXCEPT;

    bool isValid(Id_T handle) const;

    size_t size() const _NOEXCEPT     { return m_items.size(); }
    size_t capacity() const _NOEXCEPT { return m_items.capacity(); }

    template <typename Compare>
    size_t defragment(Compare comp, size_t maxSwaps = 0);

    DenseSet_T&       getItems()       { return m_items; }
    const DenseSet_T& getItems() const { return m_items; }
    MetaSet_T&        getMeta()        { return m_meta; }
    const MetaSet_T&  getMeta() const  { return m_meta; }
    IdSet_T&          getIds()         { return m_sparseIds; }
    const IdSet_T&    getIds() const   { return m_sparseIds; }

    uint32_t getFreeListFront() const { return m_freeListFront; }
    uint32_t getFreeListBack() const  { return m_freeListBack; }
    uint16_t getItemTypeId() const    { return m_itemTypeId; }

    uint32_t getInnerIndex(Id_T handle) const;

    explicit handle_map(uint16_t itemTypeId, size_t reserveCount)
        : m_itemTypeId(itemTypeId)
    {
        m_sparseIds.reserve(reserveCount);
        m_items.reserve(reserveCount);
        m_meta.reserve(reserveCount);
    }

private:
    bool freeListEmpty() const { return (m_freeListFront == 0xFFFFFFFF); }

    // Variables
    uint32_t m_freeListFront = 0xFFFFFFFF; //!< start index in the embedded ComponentId freelist
    uint32_t m_freeListBack  = 0xFFFFFFFF; //!< last index in the freelist

    uint16_t m_itemTypeId;     //!< the Id_T::typeId to use for ids produced by this handle_map

    uint8_t  m_fragmented = 0; //!< set to 1 whenever the container is mutated, cleared by defragment

    IdSet_T    m_sparseIds;    //!< stores the inner ids, with the embedded freelist
    DenseSet_T m_items;        //!< stores the items, tightly packed
    MetaSet_T  m_meta;         //!< stores Meta_T for each item, in parallel with m_items
};
```

The insert and erase functions do most of the heavy lifting to keep things consistent.
```cpp
template <typename T>
Id_T handle_map<T>::insert(T&& i)
{
    Id_T handle = { 0 };
    m_fragmented = 1;

    if (freeListEmpty()) {
        Id_T innerId = { (uint32_t)m_items.size(), 1, m_itemTypeId, 0 };

        handle = innerId;
        handle.index = (uint32_t)m_sparseIds.size();
        m_sparseIds.push_back(innerId);
    }
    else {
        uint32_t outerIndex = m_freeListFront;
        Id_T &innerId = m_sparseIds.at(outerIndex);

        m_freeListFront = innerId.index; // the index of a free slot refers to the next free slot
        if (freeListEmpty()) {
            m_freeListBack = m_freeListFront;
        }

        // convert the index from freelist to inner index
        innerId.free = 0;
        innerId.index = (uint32_t)m_items.size();

        handle = innerId;
        handle.index = outerIndex;
    }

    m_items.push_back(std::forward<T>(i));
    m_meta.push_back({ handle.index });

    return handle;
}

template <typename T>
size_t handle_map<T>::erase(Id_T handle)
{
    if (!isValid(handle)) {
        return 0;
    }
    m_fragmented = 1;

    Id_T innerId = m_sparseIds[handle.index];
    uint32_t innerIndex = innerId.index;

    // push this slot to the back of the freelist
    innerId.free = 1;
    ++innerId.generation;       // increment generation so remaining outer ids go stale
    innerId.index = 0xFFFFFFFF; // max numeric value represents the end of the freelist
    m_sparseIds[handle.index] = innerId; // write the updated inner id back to the array

    if (freeListEmpty()) {
        // if the freelist was empty, it now starts (and ends) at this index
        m_freeListFront = handle.index;
        m_freeListBack = m_freeListFront;
    }
    else {
        m_sparseIds[m_freeListBack].index = handle.index; // previous back of the freelist points to new back
        m_freeListBack = handle.index;                    // new freelist back is stored
    }

    // remove the component by swapping with the last element, then pop_back
    if (innerIndex != m_items.size() - 1) {
        std::swap(m_items.at(innerIndex), m_items.back());
        std::swap(m_meta.at(innerIndex), m_meta.back());

        // fix the ComponentId index of the swapped component
        m_sparseIds[m_meta.at(innerIndex).denseToSparse].index = innerIndex;
    }

    m_items.pop_back();
    m_meta.pop_back();

    return 1;
}
```

In the spirit of the data-oriented programming mantra "where there is one, there are many," I have provided a couple of methods to emplace and erase items in batch. We also get to have some fun with C++11 variadic templates, which let us pass any constructor arguments needed for the items.

```cpp
template <typename T>
template <typename... Params>
IdSet_T handle_map<T>::emplaceItems(int n, Params... args)
{
    IdSet_T handles(n);
    assert(n > 0 && "emplaceItems called with n = 0");
    m_fragmented = 1;

    m_items.reserve(m_items.size() + n); // reserve the space we need (if not already there)
    m_meta.reserve(m_meta.size() + n);

    std::generate_n(handles.begin(), n, [&, this](){ return emplace(args...); });

    return handles; // efficient to return vector by value, copy elided with NRVO (or with C++11 move semantics)
}

template <typename T>
size_t handle_map<T>::eraseItems(const IdSet_T& handles)
{
    size_t count = 0;
    for (auto h : handles) {
        count += erase(h);
    }
    return count;
}
```

The at function shows our O(1) lookup by handle, plus the typeId and generation asserts for added safety.
```cpp
template <typename T>
inline T& handle_map<T>::at(Id_T handle)
{
    assert(handle.index < m_sparseIds.size() && "outer index out of range");

    Id_T innerId = m_sparseIds[handle.index];

    assert(handle.typeId == m_itemTypeId && "typeId mismatch");
    assert(handle.generation == innerId.generation && "at called with old generation");
    assert(innerId.index < m_items.size() && "inner index out of range");

    return m_items[innerId.index];
}
```

Example:

```cpp
handle_map<int> testMap(0, 1);
Id_T handle = testMap.insert(1);
int i = testMap[handle];
```

Use Cases

There are several areas of a game engine where handle_map can be employed. This container is not a silver bullet, but given its performance characteristics, I often prefer to use it where a plain array or vector is unsuitable. Some examples include:

Entity component systems - The components of an ECS could be stored in a handle_map, using handles externally to store references to the entity or its components.

Scene graphs and embedded linked lists - An embedded list or tree structure is straightforward to represent by storing handles instead of direct pointers. Combined with a defragment sort, the flat representation can also be optimized for cache locality of adjacent nodes.

Resource caches - It is common practice to reference resources by handle rather than directly by pointer or shared_ptr, but the handle_map isn't ideal for directly storing large blobs, since entries are constantly being swapped around. Consider this pattern instead: cache the resource in a memory location that doesn't change, and store a shared_ptr to it in the handle_map cache (see the sketch below). There is more to it, but I am planning a future article covering my resource cache implementation in full detail.
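A quick sketch of that caching pattern, assuming a hypothetical Texture type: the blob lives at a stable heap address, and only the shared_ptr (a small, cheap-to-swap object) sits in the dense set.

```cpp
#include <memory>

struct Texture { /* large pixel data; address never changes once allocated */ };

void cacheExample() {
    handle_map<std::shared_ptr<Texture>> textureCache(0, 256);

    auto tex = std::make_shared<Texture>();             // resource allocated once on the heap
    Id_T handle = textureCache.insert(std::move(tex));  // only the pointer is stored in the dense set

    // swap-and-pop during erase now moves a shared_ptr, not the whole Texture
    Texture& t = *textureCache[handle];
}
```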
Performance

I did some profiling to compare handle_map's performance to a vector of heap-allocated unique_ptrs, and to unordered_map. This should give a rough idea of the performance advantages over those two alternatives.

Creating 100,000 items:

    handle_map:    create 100000 items, time = 4.749955 ms
    unique_ptr:    create 100000 items, time = 111.625944 ms
    unordered_map: create 100000 items, time = 89.358049 ms

Iterating over all items, summing each item into a total:

    handle_map:    iteration over dense set, total = 100000, time = 0.263713 ms
    unique_ptr:    iteration, total = 100000, time = 0.524748 ms
    unordered_map: iteration by iterator, total = 100000, time = 3.466641 ms

Iterating by lookup:

    handle_map:    iteration by handle, total = 100000, time = 0.653704 ms
    unordered_map: iteration by lookup, total = 100000, time = 15.921830 ms

Clearing all items:

    handle_map:    clear 100000 items, time = 0.377051 ms
    unique_ptr:    clear 100000 items, time = 10161.480165 ms
    unordered_map: clear 100000 items, time = 7615.724425 ms

Considerations

Not lightweight

Under the hood, handle_map is implemented with three vectors plus (at least) 11 bytes of additional overhead.

Extra indirection on lookups

Lookups by handle require indexing into two arrays, which can add an extra cache miss over just dereferencing a pointer. This isn't something you should worry about too much; just make sure to iterate over the dense set using getItems() whenever possible. If handle_map is used to store components for an ECS, the "system" implementation should be able to bypass handles and operate directly on the dense set.

What to do when an obsolete key is used for lookup

My preference is to treat this as a debug-time assert. The assumption is that I don't ever expect it to happen, and in a release build we'd have undefined behavior and possibly a crash. It would be trivial to instead throw an exception and try to recover from the situation, in much the same way you would recover from failing to lock a dangling weak_ptr.

Thread safety

This container is not thread-safe. Other than concurrent_queue, I'm not a big fan of implementing thread safety at the container level. I find that doing so often leads to two subtle problems. First, over-locking: each function call is protected by a mutex, and calls to the container's functions become internally serialized, so adjacent function calls lose the ability to reduce contention by sharing a lock. Second, under-locking: race conditions can still exist because locks are not taken at a higher, "transactional" level. I prefer instead to do multithreading with tasks.

clear() vs. reset()

There are two ways to purge all data from the container. The slower but safer method is clear(), which maintains the integrity of the sparse set, increments the generation of all entries, and can detect future lookups with a stale handle. The faster method is reset(), which destroys the sparse set and empties the freelist.

Fragmentation of entries

Over time, items will be added out of order and removed using the "swap and pop" method. This leads to a loss of adjacency and cache locality for items that were originally (or should be) next to each other. I have provided a defragment function that allows the client to define a sort order for the dense set. If the implementation looks menacing, it's really not all that bad: you're basically looking at a standard insertion sort, with a small optimization for trivially copyable types. In my profiling, I found the optimized version to be about 8-10% faster. Additionally, the m_fragmented flag eliminates all work when no changes to the container have occurred since the last completed defragment (which should be the statistically common case). I chose insertion sort for its properties of being stable and adaptive, with very little overhead. The assumption I'm making is that initially the data set may be small but randomly ordered; over time (with occasional defragmentation) the data set may grow but remain in mostly sorted order. The data set is also likely to have "few unique" items that just need to be grouped next to each other. Using my new favorite web page, I found that insertion sort looked like a good fit.
```cpp
template <typename T>
template <typename Compare>
size_t handle_map<T>::defragment(Compare comp, size_t maxSwaps)
{
    if (m_fragmented == 0) { return 0; }
    size_t swaps = 0;

    int i = 1;
    for (; i < m_items.size() && (maxSwaps == 0 || swaps < maxSwaps); ++i) {
        T tmp = m_items[i];
        Meta_T tmpMeta = m_meta[i];
        int j = i - 1;
        int j1 = j + 1;

        // trivially copyable implementation
        if (std::is_trivially_copyable<T>::value) {
            while (j >= 0 && comp(m_items[j], tmp)) {
                m_sparseIds[m_meta[j].denseToSparse].index = j1;
                --j;
                --j1;
            }
            if (j1 != i) {
                memmove(&m_items[j1+1], &m_items[j1], sizeof(T) * (i - j1));
                memmove(&m_meta[j1+1],  &m_meta[j1],  sizeof(Meta_T) * (i - j1));
                ++swaps;

                m_items[j1] = tmp;
                m_meta[j1] = tmpMeta;
                m_sparseIds[m_meta[j1].denseToSparse].index = j1;
            }
        }
        // standard implementation
        else {
            while (j >= 0 && (maxSwaps == 0 || swaps < maxSwaps) && comp(m_items[j], tmp)) {
                m_items[j1] = std::move(m_items[j]);
                m_meta[j1]  = std::move(m_meta[j]);
                m_sparseIds[m_meta[j1].denseToSparse].index = j1;
                --j;
                --j1;
                ++swaps;
            }

            if (j1 != i) {
                m_items[j1] = tmp;
                m_meta[j1] = tmpMeta;
                m_sparseIds[m_meta[j1].denseToSparse].index = j1;
            }
        }
    }
    if (i == m_items.size()) {
        m_fragmented = 0;
    }

    return swaps;
}
```

Example:

```cpp
const int N = 100000;
struct Test { int val, sort; };
handle_map<Test> testMap(0, N);

for (int i = 0; i < N; ++i) {
    testMap.emplace(1, i);
}
std::random_shuffle(testMap.begin(), testMap.end());

timer.start();
size_t swaps = testMap.defragment([](const Test& a, const Test& b) {
    return a.sort > b.sort;
});
timer.stop();
```

As you can see, a near-worst-case sort (profiled above) can be quite heavy on performance.

    handle_map: defragment swaps = 99987, total = 100000, time = 32538.735446 ms

For that reason, the operation is reentrant and can be approached iteratively. Passing maxSwaps effectively sets a ballpark time budget, and calling defragment periodically will cause the container to continually approach the ideal order. Keep in mind, this does not change the fact that this is semantically still an unordered container.

Conclusion

The handle_map is an associative container implemented with the principles of data-oriented design in mind. The container is semantically similar to std::map or std::unordered_map, and exhibits constant time complexity for inserts, lookups, and removals. Handles are used as keys for lookups, and contain "smart" internal safety features such as the typeId and generation checks. A defragment sort operation can be defined by the user to incrementally approach the ideal in-memory order of the contiguous data set.

Future Work

Create a way to specify a custom allocator?
Turn Meta_T into a template parameter to enable additional metadata?
Anything else?

Acknowledgments

For more reading on this concept, and alternative implementations, be sure to view the blog posts that inspired me to create this container for my game project.

http://bitsquid.blogspot.de/2011/09/managing-decoupling-part-4-id-lookup.html
http://molecularmusings.wordpress.com/2013/07/24/adventures-in-data-oriented-design-part-3c-external-references/

Source

Github repo: https://github.com/y2kiah/griffin-containers
Online documentation available at: https://y2kiah.github.io/griffin-containers/

Article Update Log

7 Sept 2016: Initial release
  7. The benchmarking code is available in this file: https://github.com/y2kiah/project-griffin/blob/master/project-griffin/project-griffin/source/tests/container_tests.cpp
  8. That's true, though in my ECS implementation I didn't use the "entity as a pure id" strategy. I have an Entity struct which contains a list of component handles, and a handle_map of entities gives me a single id for each entity. I try to have systems operate directly on component lists, so the times when I have to translate from component to entity, and then entity to a related component, are minimal, but it does come up. (A usage sketch follows below.)

```cpp
typedef std::bitset<MAX_COMPONENTS> ComponentMask;

struct Entity {
    explicit Entity() {
        components.reserve(RESERVE_ENTITY_COMPONENTS);
    }

    bool addComponent(ComponentId id);
    bool removeComponent(ComponentId id);
    bool removeComponentsOfType(ComponentType ct);

    inline bool hasComponent(ComponentType ct) const {
        return componentMask[ct];
    }

    ComponentMask componentMask;
    std::vector<ComponentId> components;
};
```
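     A hypothetical usage sketch of the struct above, assuming the handle_map and Id_T types from the article are in scope; the Transform component type, its typeId value, and the store name are all made up for illustration.

```cpp
struct Transform { float x, y, z; };
enum : uint16_t { Transform_T = 0 };          // hypothetical ComponentType value

void entityExample() {
    handle_map<Transform> transformStore(Transform_T, 100);
    Entity player;

    ComponentId tid = transformStore.insert({ 0.f, 0.f, 0.f });
    player.addComponent(tid);                 // assumed to set componentMask[Transform_T]

    if (player.hasComponent(Transform_T)) {   // O(1) mask test
        Transform& t = transformStore[tid];   // component handle -> data
    }
}
```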
  9. I will admit I have not seen those talks, and I will definitely check them out. I try to consume as many credible opinions as possible.

     One bit of wisdom (if it can be called that) I have picked up in my career is to try not to be swayed too far one way or the other by individual articles, talks, or journal comments. If I based my opinions on certain books from the early 2000s, I would think that using global singleton managers everywhere is "good design."

     I have heard many compelling arguments for ideas that go against the grain. But I try to aggregate and average the ideas from many sources of influence over a long period of time, to recognize real trends in thinking. If you look at the evidence over the last decade or so, managed-memory languages have exploded in popularity, many of which use ref-counting mechanisms to facilitate garbage collection. And the C++ language itself, steered by giants of the industry, is aligning itself in many ways with those managed-language patterns. You can basically factor the "delete" keyword out of the C++ language entirely these days.
  10. Meant for everyone with an opinion who wants to engage in the conversation :-)

      To be clear, I am not arguing the merit of heap memory allocation over stack allocation. I agree that stack allocation is generally preferred over dynamic allocation when practical. I'm talking about the method used to share dynamically allocated resources across your engine's subsystems. My contention is that if you're going to use a handle, a smart pointer already IS the handle you are looking for.

      That appears to be the exact opposite of the direction the modern C++ language is progressing. You're swimming upstream on this one.

      This is kind of a hand-wavy argument, since lots of things can cause subtle bugs in multi-threaded code, but I don't see evidence of how ref-counted pointers alone can be implicated. If you use a condition variable with a master/slave task queue, you protect the concurrent queue itself; there is no need to mutex every individual memory access contained within a task. If you are paranoid, use a unique_ptr to guarantee only one reference, then use the pointer's move semantics to hand it back to the master thread when done, or possibly convert it into a shared_ptr at that point if it is indeed a "shared" resource.

      Why are you reorganizing memory? Copying large chunks of memory around should be avoided. Use a memory pool or placement new to allocate it in the right location in the first place. But please, at least let yourself live in a world where a resource stays in one place in memory for its lifetime.

      Cache coherency has very little to do with value vs. reference semantics, and much more to do with how you lay out your allocations in memory and your access patterns over that memory later. Align your allocations to the order that you iterate over items in a loop. Copying a handle with value semantics is moot if the resources behind the handle are spread all over memory anyway. A handle is going to be a slower way to get at that memory than a pointer, for the obvious reason that it will always involve some kind of container lookup by the "owner". If you're careful it will be an O(1) operation, but still relatively slow. This also means the handle needs some behavior built in to know how to locate or request the resource. So instead of just delivering the actual dependency to the object that uses it, you're delivering this thing that knows how to go and find the thing you really need: a pointless indirection. Then you have to handle the case where that thing is not actually available, because the central authority disposed of it early, having no idea (no ref count) that you still needed it; this adds cruft to your code. It's much "cleaner" to live in a world where you are guaranteed that the resource you need still exists in memory as long as you are holding a reference to it. You are indeed a shared owner of that thing as long as that thing is needed for you to function.

      The ref-counted smart pointer knows... that's the reason for a ref count, so you KNOW exactly how many references are around. In any sane system, it should be trivial to locate those pointers and replace them if you have a situation where some central "authority" can rip the proverbial rug from under your feet. In such a situation, I would use a weak_ptr / shared_ptr pair to share the resource. If the weak_ptr fails to convert, at that point you can go and "ask" for the resource again and grab a new pointer (see the sketch at the end of this post). But honestly, in my 16 years of C++ development, I can't think of any case where I needed to "notify" a bunch of pointers that the memory has moved. That kind of thing would be a red flag to me of poor design.

      It's a solution to a problem that doesn't need to exist anymore, if you use modern C++ constructs. Ten years ago I was using integer IDs to spread around my resources, thinking it was cleaner and easier than using naked pointers (and it was). I used a handle that knew how to go and grab the resource from a "manager/owner" every time I needed a resource... hundreds or thousands of lookups every loop, some slower than others. With multi-threading in the mix, it's all the more reason to abandon the "single owner" concept, because access to that owner object's container would have to be mutexed, adding extra contention throughout your system.

      Once a programmer has "seen the light" of the shared ownership concept, I believe they shed the fear of "losing control" over memory and realize that they still retain full control. You DO still control your object's lifetime. You DO still control the memory layout, and you CAN get good cache coherency. Those are not matters of which language construct you use. Stop fighting the concept by shoehorning dated patterns into your code; you will end up with cleaner, more readable, more self-documenting, more testable, more portable code.

      To summarize my argument, the most important trade-off you make by adopting an integer ID / handle + owner lookup model is that you have dirtier, more tightly coupled code:
      1) You spread around a type-unsafe representation of your resource, an amorphous integer that really gives you no guarantees at all.
      2) You outright lie about the true nature of your dependencies, while at the same time adding the requirement of a framework to make this thing useful.
      3) Your code is loaded with self-rolled cruft: it's less readable by other programmers, less testable, and less reusable. Have you heard the expression "kill your darlings"?
      4) An integer ID / handle is a wolf in sheep's clothing. A handle is a pointer, and a pointer is a handle, if they both represent a way to get at a resource. I think it was George Carlin who said something like "don't change the word and expect the definition to change with it." A pointer lets you just use the thing that you need. A handle adds tight coupling to an externality, a "global" state from the perspective of the client object, that can slowly look up a resource based on an id and give it to you. If you are not plugged into this "framework," you can't operate, because your tricky mechanism for converting arbitrary integers into real, useful things is broken. Why not just inject the thing you actually need to use, instead of making all of your objects responsible for going out and looking for the resources they need?
      5) If I see a parameter of (const shared_ptr<Texture>& texPtr) rather than (int textureId), the signature is being more straightforward and clear about the true nature of the dependency. Without a single comment, I can ascertain a ton of self-documented information. For one, I know this function needs A TEXTURE. I don't need to know or worry about where to get an ID for some texture to pass in, wonder whether the ID is going to be valid or not and what is controlling it, and pass it along hoping that the function I'm calling knows how to use the ID to get the thing it really needs. I see "shared_ptr" and immediately know: "oh, I am not the only user of this thing, so I will make sure to only hold this reference in scope for as long as I actually need it!" Also notice that I use value semantics to pass this shared_ptr around; for all intents and purposes it is a value being copied, but I pass a const reference so that I don't incur any additional copy / ref-counting overhead unless I actually store a local copy of it.

      Well, I think my point is made (probably to a fault). I don't expect to change any minds; programmers' brains aren't easily swayed by opinions on forums. I'm just putting it out there.
  11. Edit: I see you are trying to move away from smart pointers as your "handle"? I'm interested to hear why you are making that decision. There are several arguments I would make in favor of using standard shared_ptr and the occasional weak_ptr over the use of integer id based handles.
  12. y2kiah

    On APIs.

    I understand the average reaction to this "no API" talk; it's a shocking prospect. I do think we're being a bit near-sighted when it comes to this discussion, though. The idea of throwing out the APIs within the next 10 years makes very little sense, but fast forward 10, 15, even 20 years from now. The "GP" in GPU is transitioning from Graphics Processing to General Purpose in a very obvious way; we see this trend happening before our very eyes. Soon there will be very little reason NOT to completely integrate the GPU and CPU into a single cohesive chip that has all of the performance of a CPU/GPU mix, but none of the overhead. We can already see this type of thing happening with Sandy Bridge. I think the idea of starting over with our APIs is to approach the design and implementation from a completely new angle that is more in line with the way new hardware will work, once it gets to that point. There will still be APIs, but eventually Direct3D and OpenGL will have run their course, I think.
  13. y2kiah

    Cessna Test Flight

    Really awesome! I can't wait to see the next advancement in this project.
  14. y2kiah

    Warbots Online - BETA Download

    Quote: "Cool... I'm curious if it runs on Windows 7, I have not heard from anybody who has tried it."

    It does! I was getting a pretty solid 40-50 fps with pretty high settings, 1680x1050, 4xAA.

    Specs:
    Windows 7 64 RC1
    Athlon 64 X2 5000+ OC'd to 3 GHz
    4 GB RAM
    8800 GT 512MB
  15. y2kiah

    Logarithmic Depth Buffer

    Interesting, now I see what you mean. I plotted resolution with respect to distance for a range of C values to get a better visual picture of the resolution function (a small sketch of the mapping is below). It's interesting to actually see how quickly the standard z-buffer precision diverges compared to the logarithmic scale. C = 0.001 seems like a good "middle ground" value where you still have about half a millimeter of precision near zero depth (assuming your units are meters), which should be more than enough AFAIK, while the depth function plots to a relatively linear curve between -100 and +100. It shouldn't be too difficult to enforce tessellation at sub-200-meter sizes, I would think :)
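    For reference, a small sketch of the mapping being discussed, as I understand it from the journal entry this comment responds to (the exact formula and the constants here are assumptions, not quoted from it): depth is remapped as z' = log(C*z + 1) / log(C*far + 1), and sweeping C shows how precision shifts between the near and far range.

```cpp
#include <cmath>
#include <cstdio>
#include <initializer_list>

int main() {
    const double farPlane = 1.0e6;   // assumed far plane distance, in meters
    const double C = 0.001;          // the "middle ground" constant from the comment

    for (double z : { 0.1, 1.0, 100.0, 10000.0, 1.0e6 }) {
        double zLog = std::log(C * z + 1.0) / std::log(C * farPlane + 1.0);
        std::printf("z = %12.1f  ->  z' = %.8f\n", z, zLog);
    }
    return 0;
}
```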