Game Engine API, smart pointers or not?

Started by
35 comments, last by frob 4 years, 12 months ago
21 minutes ago, Gnollrunner said:

I believe he should be able to do a custom allocator. I think nobody really uses standard new and delete for this kind of stuff.  It's really rare that I do. Slab allocation is almost instantaneous after initial start up. You can easy make it use inlined free-list push and pops, especially if you aren't worried about threading.  Even if you are, there are some cheats that you can do depending on the usage.

Also in my experience the fragmentation issue is often overstated.  In many applications you have a lot of smallish objects, and yes they may get created and deleted often. However with size specific heaps, or as I like to use, a heap with a hash table of size specific free-lists, the holes get filled before the heap grows.

The main issues I've found with std::shared_ptr and std::weak_ptr is the double sized pointers and the fact that you can't easy pass around raw pointers and then assign them to shared_ptr or week_ptr on the fly since you'll end up with multiple control blocks. However this is kind of a design thing, so for his API it might be fine as long as he's consistent with using smart pointers.

Why do I need a custom allocator? What for? I don't really get the fragmentation thing.

Advertisement
8 minutes ago, Tim Leijten said:

Why do I need a custom allocator? What for?

You don't "need" one. However they are often a LOT faster.  The standard library provides some facility for using their smart pointers with your own heap and if you want things to be fast it's probably something you should look into at some point. I actually have a rather large library of heap implementations that I have built up over the years, and I'm sure you can find some around the web, However, there is no reason you can't get things working and then add in a custom heap later.

Fragmentation just means you have unused space in the middle of the heap.  You can think of the heap as a long piece of paper tape.  You start from one end and write stuff to it. However if you erase something in the middle you will have unused space there. Ideally you want to reuse that space later on before continuing to write to the end of the tape.

However it's not so simple.  You might need to write something that's longer than the hole you are trying to fill, so you many need to find a larger hole, or you may in fact have to write to the end of the tape.  Also perhaps the hole is far bigger than what you want, so then you want to use a piece of it and save the rest for later.   Or maybe you just want to find a smaller hole if there is one. There are a lot of different use cases.

A general purpose allocator has to handle all this stuff, however in practice where you have a lot of objects that are the same size. or say have 5 different sizes for instance, you can make things a lot simpler and faster using "free lists" and other tricks.
 

25 minutes ago, Gnollrunner said:

You don't "need" one. However they are often a LOT faster.  The standard library provides some facility for using their smart pointers with your own heap and if you want things to be fast it's probably something you should look into at some point. I actually have a rather large library of heap implementations that I have built up over the years, and I'm sure you can find some around the web, However, there is no reason you can't get things working and then add in a custom heap later.

Fragmentation just means you have unused space in the middle of the heap.  You can think of the heap as a long piece of paper tape.  You start from one end and write stuff to it. However if you erase something in the middle you will have unused space there. Ideally you want to reuse that space later on before continuing to write to the end of the tape.

However it's not so simple.  You might need to write something that's longer than the hole you are trying to fill, so you many need to find a larger hole, or you may in fact have to write to the end of the tape.  Also perhaps the hole is far bigger than what you want, so then you want to use a piece of it and save the rest for later.   Or maybe you just want to find a smaller hole if there is one. There are a lot of different use cases.

A general purpose allocator has to handle all this stuff, however in practice where you have a lot of objects that are the same size. or say have 5 different sizes for instance, you can make things a lot simpler and faster using "free lists" and other tricks.
 

Well, I don't know how to write a heap implementation or custom allocator. So I'm going to leave that for later.

4 minutes ago, Tim Leijten said:

Well, I don't know how to write a heap implementation or custom allocator. So I'm going to leave that for later.

They are actually quite easy to write.  When you are ready look up "slab allocation"  or "slab allocator"........but yeah.... leave it for later.

16 minutes ago, Gnollrunner said:

They are actually quite easy to write.  When you are ready look up "slab allocation"  or "slab allocator"........but yeah.... leave it for later.

Okay, I'll look into it for a bit, see if I understand it.

 

But I just thought of another solution to my problem, what if I use both handles and raw pointers, but in a different way?

What I mean is, all functions return raw pointers and all game api functionality uses raw pointers, BUT, I make a function e.g. createHandleFromPointer, that will turn a pointer into a handle. And then having a function to get a pointer back from that handle.

So createHandleFromPointer would add the pointer to a table, and then whenever an object get's deleted, that object will remove itself from the table. and then when you try to get a pointer from the handle, it will just return 0 because the pointer isn't in the table anymore.

Does that sound like a possible solution or is it flawed?

Edit: Is it safe to assume the destructor of an object will always be called? Or are there cases where it wont be called?

12 minutes ago, Tim Leijten said:

Okay, I'll look into it for a bit, see if I understand it.

 

But I just thought of another solution to my problem, what if I use both handles and raw pointers, but in a different way?

What I mean is, all functions return raw pointers and all game api functionality uses raw pointers, BUT, I make a function e.g. createHandleFromPointer, that will turn a pointer into a handle. And then having a function to get a pointer back from that handle.

So createHandleFromPointer would add the pointer to a table, and then whenever an object get's deleted, that object will remove itself from the table. and then when you try to get a pointer from the handle, it will just return 0 because the pointer isn't in the table anymore.

Does that sound like a possible solution or is it flawed?

Edit: Is it safe to assume the destructor of an object will always be called? Or are there cases where it wont be called?

I'm not sure I see the point of using both. A weak_ptr  already gets invalidated so you can just check that.  As I see it you should probably stick with one or the other.  I guess the main advantage I see of a handle is you can get away with 8 bytes and maybe even 4. But there will probably be some extra access overhead. It's a trade off.

Destructors pretty much always get called *IF* they are virtual or there is no inheritance. If not, you have to be a bit more careful to make sure child destructors get called.  In general I like to make most of my classes polymorphic so that isn't an issue. The cost is one vtable pointer per object.

2 minutes ago, Gnollrunner said:

I'm not sure I see the point of using both. A weak_ptr  already gets invalidated so you can just check that.  As I see it you should probably stick with one or the other.  I guess the main advantage I see of a handle is you can get away with 8 bytes and maybe even 4. But there will probably be some extra access overhead. It's a trade off.

Destructors pretty much always get called *IF* they are virtual or there is no inheritance. If not, you have to be a bit more careful to make sure child destructors get called.  In general I like to make most of my classes polymorphic so that isn't an issue. The cost is one vtable pointer per object.

So you reccommend just using a weak_ptr instead?(Makes things quite a bit easier)

I think the default allocators are a lot better than they were, you probably can't write a general purpose allocator that is outright better now. I believe for example MSVC ends up in HeapAlloc which by default is the Low Fragmentation Heap which uses "predetermined buckets" (and I am guessing a free list for each) to deal with anything less than I believe 16KB (with I believe 4KB virtual memory pages).

 

But you can beat it easily for a specific use case, sometimes while also optimising memory layout to make better use of a CPU cache (iterating compact arrays good, jumping around the heap with a linked list or array of pointers less so).

  • As mentioned allocators for a specific size can be really simple and faster. This works well with common objects, but maybe not so well as an "Entity allocator" when the different entity subtypes might all have drastically different sizes, quantities, life cycles, etc.
  • If you can store something directly in an array, that is often about as near optimal as you can get, as iterating through it will make good use of the CPU cache and reduce cache-misses over iterating through pointers, even if all those pointers came from the same memory block, jumping back and forth is less optimal.
    • You might mark array elements as "dead" in some way, skip it when iterating and allocate new objects into the first/next dead element.
    • You might compact the array so it only contains valid elements, and always place new objects at the end.
      • this compacting step makes the use of shared_ptr/weak_ptr impossible, as the object will change address.
  • Another one I have seen, is to allocate a large pool, and then use that as a stack for a single operation, say an update. Allocations simply take the top of the stack, and deallocations are no-op. At the end of the operation the entire pool is considered empty. I have seen this more commonly in various C programs to avoid a lot of manual tracking for free(ptr) when dealing with say temporary strings, arrays, buffers, etc.
9 minutes ago, Tim Leijten said:

So createHandleFromPointer would add the pointer to a table, and then whenever an object get's deleted, that object will remove itself from the table. and then when you try to get a pointer from the handle, it will just return 0 because the pointer isn't in the table anymore.

Does that sound like a possible solution or is it flawed?

Assuming you have a rule that a raw pointer is never stored like in my example code before, then yes, roughly.

But I always just made my unique ID "handle" part of the object itself. As for the "table", well usually my scene/map needs to have track of everything anyway to process updates, etc. So I already have that, no separate table.

e.g.


    class Entity
    {
    public:
        Entity(unsigned id) : _id(id) {}
        unsigned id()const { return _id; }
    private:
        const unsigned _id;
    };

 

44 minutes ago, Tim Leijten said:

So you reccommend just using a weak_ptr instead?(Makes things quite a bit easier)

You know.... I'm not really an architecture Nazi.  I mostly just discuss options unless I see some serious problem with dong something one way.  Most things have pros and cons.  Personally I use my own smart pointer system for various reasons, but there is nothing wrong with using the standard library smart pointers if it suits your needs. Likewise some people like handle systems which is OK too.  I'm just saying if you want to combine the two you should know what you are going to get out of that, i.e. what's the advantage. Otherwise it's just an added complication.

The one down side I see of the standard library smart pointers is they are large.  Say for instance you are doing an octree. You might use unique_ptr which has no overhead, but lets say for some reason your objects may have more than one owner and you can't. In that case your 8 shared_ptr objects are going to take 128 bytes.  If you had used raw pointers that would only be 64 bytes. If you had used 4 byte handles that would only be 32.  (I actually implement a 4 byte pointers system for stuff like that. It's kind of like a handle but acts like a pointer). So you can see there is quite a memory savings if you have a large octree. But again, it depends on what you are doing.

3 minutes ago, SyncViews said:

I think the default allocators are a lot better than they were, you probably can't write a general purpose allocator that is outright better now. I believe for example MSVC ends up in HeapAlloc which by default is the Low Fragmentation Heap which uses "predetermined buckets" (and I am guessing a free list for each) to deal with anything less than I believe 16KB (with I believe 4KB virtual memory pages).

 

But you can beat it easily for a specific use case, sometimes while also optimising memory layout to make better use of a CPU cache (iterating compact arrays good, jumping around the heap with a linked list or array of pointers less so).

  • As mentioned allocators for a specific size can be really simple and faster. This works well with common objects, but maybe not so well as an "Entity allocator" when the different entity subtypes might all have drastically different sizes, quantities, life cycles, etc.
  • If you can store something directly in an array, that is often about as near optimal as you can get, as iterating through it will make good use of the CPU cache and reduce cache-misses over iterating through pointers, even if all those pointers came from the same memory block, jumping back and forth is less optimal.
    • You might mark array elements as "dead" in some way, skip it when iterating and allocate new objects into the first/next dead element.
    • You might compact the array so it only contains valid elements, and always place new objects at the end.
      • this compacting step makes the use of shared_ptr/weak_ptr impossible, as the object will change address.
  • Another one I have seen, is to allocate a large pool, and then use that as a stack for a single operation, say an update. Allocations simply take the top of the stack, and deallocations are no-op. At the end of the operation the entire pool is considered empty. I have seen this more commonly in various C programs to avoid a lot of manual tracking for free(ptr) when dealing with say temporary strings, arrays, buffers, etc.

Assuming you have a rule that a raw pointer is never stored like in my example code before, then yes, roughly.

But I always just made my unique ID "handle" part of the object itself. As for the "table", well usually my scene/map needs to have track of everything anyway to process updates, etc. So I already have that, no separate table.

e.g.



    class Entity
    {
    public:
        Entity(unsigned id) : _id(id) {}
        unsigned id()const { return _id; }
    private:
        const unsigned _id;
    };

 

That's even better than what I came up with! One issue though, if the user requests a pointer from a handle, do you then search through the entire scene for that one object wit the correct handle? Isn't a table faster?

 

3 minutes ago, Gnollrunner said:

You know.... I'm not really an architecture Nazi.  I mostly just discuss options unless I see some serious problem with dong something one way.  Most things have pros and cons.  Personally I use my own smart pointer system for various reasons, but there is nothing wrong with using the standard library smart pointers if it suits your needs. Likewise some people like handle systems which is OK too.  I'm just saying if you want to combine the two you should know what you are going to get out of that, i.e. what's the advantage. Otherwise it's just an added complication.

The one down side I see of the standard library smart pointers is they are large.  Say for instance you are doing an octree. You might use unique_ptr which has no overhead, but lets say for some reason your objects may have more than one owner and you can't. In that case your 8 shard_ptr objects are going to take 128 bytes.  If you had used raw pointers that would only be 64 bytes. If you had used 4 byte handles that would only be 32.  (I actually implement a 4 byte pointers system for stuff like that. It's kind of like a handle but acts like a pointer). So you can see there is quite a memory savings if you have a large octree. But again, it depends on what you are doing.

Okay. I'll go with smart pointers. Once memory becomes an issue, I might write my own library. But currently I don't think it's needed yet.

Thanks for the tip though!

This topic is closed to new replies.

Advertisement