Allocator design issues


I've spent quite a while (probably far longer than I should have) trying to design an allocator system.  I've bounced ideas off various people in the past, but never really gotten something satisfactory.

Basically, the requirements I'm trying to target are:

  1.   Composability -- allocators that seamlessly allocate from memory allocated by other allocators.  This lets me do things like write an allocator that pads allocations from its parent allocator with bit patterns to detect heap corruption.  It also allows me to easily create spillovers, or optionally assert on overflow with specialized fallbacks.
  2.   Handling the fact that some allocators have different interfaces than others in an elegant way.  For example, a regular allocator might have Allocate/Deallocate, but a linear allocator can't do itemized deallocation (but can deallocate everything at once).
  3.   I want to be able to tell how much I've allocated, and how much of that is actually being used.  I also want to be able to bucket that by subsystem, but as far as I can tell, that doesn't really impact the design beyond adding a new parameter to allocate calls.

Note:  I'm leaving allocation buckets and alignment out of this, since they're largely orthogonal to what I'm asking and can be added to any of the designs.

 

To meet those three requirements, I've come up with the following solutions, all of which have significant drawbacks.

Static Policy-Based Allocators

I originally built this off of this talk.

Examples:


struct AllocBlock
{
	std::byte* ptr;
	size_t size;
};

class Mallocator
{
	size_t allocatedMemory;
public:
	Mallocator();
	AllocBlock Allocate(size_t size);
	void Deallocate(AllocBlock blk);
};

template <typename BackingAllocator, size_t allocSize>
class LinearAllocator : BackingAllocator
{
  	AllocBlock baseMemory;
	char* ptr;
  	char* end;
 public:
	LinearAllocator() : baseMemory(BackingAllocator::Allocate(allocSize)) { /* stuff */ }
  	AllocBlock Allocate(size_t size);
};

template <typename BackingAllocator, size_t allocSize>
class PoolAllocator : BackingAllocator
{
	AllocBlock baseMemory;
  	char* currentHead;
 public:
  	PoolAllocator() : baseMemory(BackingAllocator::Allocate(allocSize)) { /* stuff */ }
  	void* Allocate(); // note the different signature.
  	void Deallocate(void*);
};

// ex:  auto allocator = PoolAllocator<Mallocator, size>{};
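
To make the composability requirement concrete, here's roughly what I mean by a spillover in this style.  This is just a sketch -- it assumes a null ptr signals allocation failure, and the Owns() query is hypothetical (none of the allocators above define one):


template <typename Primary, typename Fallback>
class SpilloverAllocator
{
	Primary primary;
	Fallback fallback;
public:
	AllocBlock Allocate(size_t size)
	{
		// Try the primary allocator first; spill over into the fallback when it's exhausted.
		AllocBlock blk = primary.Allocate(size);
		if (!blk.ptr)
			blk = fallback.Allocate(size);
		return blk;
	}

	void Deallocate(AllocBlock blk)
	{
		// Hypothetical Owns() query to route a block back to its source.
		if (primary.Owns(blk))
			primary.Deallocate(blk);
		else
			fallback.Deallocate(blk);
	}
};

// ex:  auto allocator = SpilloverAllocator<LinearAllocator<Mallocator, 4096>, Mallocator>{};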

Advantages:

  • SFINAE gives me a pseudo-duck-typing thing.  I don't need any kind of common interfaces, and I'll get a compile-time error if I try to do something like create a LinearAllocator backed by a PoolAllocator.
  • It's composable.

Disadvantages:

  • Composability is type composability, meaning every allocator I create has an independent chain of compositions.  This makes tracking memory usage pretty hard, and presumably can cause me external fragmentation issues.  I might be able to get around this with some kind of singleton kung-fu, but I'm unsure, as I don't really have any experience with them.
  • Owing to the above, all of my customization points have to be template parameters because the concept relies on empty constructors.  This isn't a huge issue, but it makes defining allocators cumbersome.

Dynamic Allocator Dependency

This is probably just the strategy pattern, but then again everything involving polymorphic type composition looks like the strategy pattern to me.

Examples:


struct AllocBlock
{
	std::byte* ptr;
	size_t size;
};

class Allocator
{
public:
	virtual ~Allocator() = default;
	virtual AllocBlock Allocate(size_t) = 0;
	virtual void Deallocate(AllocBlock) = 0;
};

class Mallocator : public Allocator
{
	size_t allocatedMemory;
public:
	Mallocator();
	AllocBlock Allocate(size_t size) override;
	void Deallocate(AllocBlock blk) override;
};

class LinearAllocator
{
  	Allocator* backingAllocator;
  	AllocBlock baseMemory;
	char* ptr;
  	char* end;
 public:
	LinearAllocator(Allocator* backingAllocator, size_t allocSize) 
		: backingAllocator(backingAllocator) { baseMemory = backingAllocator->Allocate(allocSize); /* stuff */ }
  	AllocBlock Allocate(size_t size);
};

class PoolAllocator
{
	Allocator* backingAllocator;
	AllocBlock baseMemory;
  	char* currentHead;
 public:
  	PoolAllocator(Allocator* backingAllocator, size_t allocSize)
		: backingAllocator(backingAllocator) { baseMemory = backingAllocator->Allocate(allocSize); /* stuff */ }
  	void* Allocate(); // note the different signature.
  	void Deallocate(void*);
};

// ex:  auto allocator = PoolAllocator(&someGlobalMallocator, size);

There's an obvious problem with the above: namely, PoolAllocator and LinearAllocator don't inherit from the generic Allocator interface.  They can't, because their interfaces provide different semantics.  There are two ways I can solve this:

  1.   Inherit from Allocator anyway and assert on unsupported operations (delegates composition failure to runtime errors, which I'd rather avoid; sketched after this list).
  2.   As above:  Don't inherit and just deal with the fact that some composability is lost (not ideal, because it means you can't do things like back a pool allocator with a linear allocator)
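
For reference, option 1 would look roughly like this (just a sketch; Reset is a made-up name for the bulk-deallocate operation, and the assert is exactly the runtime failure I want to avoid):

class LinearAllocator : public Allocator
{
	Allocator* backingAllocator;
	AllocBlock baseMemory;
	char* ptr;
	char* end;
public:
	LinearAllocator(Allocator* backingAllocator, size_t allocSize);

	AllocBlock Allocate(size_t size) override;

	void Deallocate(AllocBlock) override
	{
		// Itemized deallocation isn't meaningful for a linear allocator, so a bad
		// composition only blows up at runtime instead of failing to compile.
		assert(false && "LinearAllocator does not support itemized Deallocate");
	}

	void Reset(); // deallocate everything at once
};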

Advantages:

  • Memory usage tracking is easy, since I can use the top-level mallocator(s) to keep track of total memory allocated, and all of the leaf allocators to track used memory.  How to do that in particular is outside the scope of what I'm asking about, but I've got some ideas.
  • I still have composability.

Disadvantages:

  • The interface issues above.  There's no duck-typing-like mechanism to help here, and I'm strongly of the opinion that programmer errors in construction like that should fail at compile-time, not runtime.

Composition on Allocated Memory instead of Allocators

This is probably going to be somewhat buggy and poorly thought out, since it's just an idea rather than something I've actually tried.

Examples:


struct AllocBlock
{
	void* ptr;
	size_t size;
	std::function<void()> dealloc;
};

class Mallocator
{
  	size_t allocatedMemory;
 public:
  	Mallocator();
  	AllocBlock Allocate(size_t size) { void* ptr = malloc(size);  return {ptr, size, [ptr](){ free(ptr); }}; }
};

class LinearAllocator
{
	AllocBlock baseMemory;
  	char* ptr;
  	char* end;
public:
	LinearAllocator(AllocBlock baseMemory) : baseMemory(baseMemory)
	{
		ptr = static_cast<char*>(baseMemory.ptr);
		end = ptr + baseMemory.size;
	}
	AllocBlock Allocate(size_t);
};

class PoolAllocator
{
	AllocBlock baseMemory;
  	char* head;
public:
  	PoolAllocator(AllocBlock baseMemory) : baseMemory(baseMemory) { /* stuff */ } 
  	void* Allocate();
};

// ex: auto allocator = PoolAllocator(someGlobalMallocator.Allocate(size));

I don't really like this design at first blush, but I haven't really tried it.

Advantages:

  • "Composable", since we've delegated most of what composition entails into the memory block rather than the allocator.
  • Tracking memory is a bit more complex, but I *think* it's still doable.

Disadvantages:

  • Makes the interface more complex, since we have to allocate first and then pass that block into our "child" allocator.
  • Can't do specialized deallocation (i.e. stack deallocation) since the memory blocks don't know anything about their parent allocation pool.  I might be able to get around this though.

 

I've done a lot of digging through all of the source-available engines I can find, and it seems like most of them either have very small allocator systems or simply don't try to make them composable at all (CryEngine, for example, takes the latter approach).  That said, this seems like something that should have a lot of good examples out there, but I can't find many.  Does anyone have any good feedback/suggestions on this, or is composability in general just a pipe dream?


Why do you want to compose allocators together? The point of composition is to have different atomic tasks that you attach to your master component to add functionality depending on your current needs. An allocator, for example, does just one thing: allocating and managing memory. I don't see any use for composition here, because each allocator should tackle exactly one kind of allocation. A heap allocator should manage heap memory, a stack allocator should go for stack memory, and a block allocator should work on a certain block of memory stored anywhere else.

Allocation profiling and memory performance checks are part of the profiler API you will/should use, and shouldn't be something you compose into an allocator. Either you use it statically/dynamically, or you opt out of it using the preprocessor.

Another part that should really give you headaches is multithreaded memory allocation. Do you want to make all allocators multithreading-compatible, and always have to attach a second composition that handles locked memory access?

I used the article in this blog to build my memory management system. It is multithreading-safe, using atomics to count the number of objects and the total byte count, and it also produces profiler entries that don't just tell me how much memory is consumed but also which pointer addresses have been allocated, so I can attach to my game engine's memory from the profiler process and investigate a memory cell on demand.

Memory leaks are nearly impossible with how it is implemented, because any allocator will assert on failure, for example if there are objects still allocated when the allocator gets disposed.

I additionally wrote a static allocator that works on a small fixed-size memory block reserved at startup, which all other allocators are allocated from, instead of the new-based approach the article describes.

On 6/16/2018 at 1:32 AM, SeraphLance said:

Handling the fact that some allocators have different interfaces than others in an elegant way.  For example, a regular allocator might have Allocate/Deallocate, but a linear allocator can't do itemized deallocation (but can deallocate everything at once).

If your allocation interface is at the C++ object level, instead of the blob-of-bytes level, then it's easier to support a common interface.

A linear allocator can deallocate a C++ object (by calling the destructor), it just doesn't actually free up any address space until a later unwind operation takes place.

If the allocation interface is templated by object type (i.e. Allocate<T>(arraySize) instead of Allocate(sizeof(T)*arraySize)), then the pool can also implement the common interface (except probably only for a single T per pool instance, instead of any T).
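
i.e. something roughly like this (just a sketch, ignoring alignment and assuming default-constructible T):

class LinearAllocator
{
	char* ptr;
	char* end;
public:
	LinearAllocator(char* begin, char* end) : ptr(begin), end(end) {}

	template <typename T>
	T* Allocate(size_t arraySize = 1)
	{
		if (ptr + sizeof(T) * arraySize > end)
			return nullptr;                      // out of arena space
		T* objects = reinterpret_cast<T*>(ptr);
		for (size_t i = 0; i < arraySize; ++i)
			new (objects + i) T();               // construct each element in place
		ptr += sizeof(T) * arraySize;
		return objects;
	}

	template <typename T>
	void Deallocate(T* objects, size_t arraySize = 1)
	{
		// Destructors run now; the address space itself is only reclaimed when
		// the whole arena is later unwound.
		for (size_t i = 0; i < arraySize; ++i)
			objects[i].~T();
	}
};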

Quote

std::function<void()> dealloc;

If you're going to go down this path, consider a regular function pointer and a single void* parameter. std::function is pretty heavyweight for such a low-level concern as tracking individual allocations.
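
Something along these lines (sketch):

struct AllocBlock
{
	void* ptr;
	size_t size;
	void (*dealloc)(void* context, void* ptr); // plain function pointer
	void* context;                             // e.g. the allocator that owns the block
};

// Mallocator::Allocate would then fill it in with something like:
//   return { p, size, [](void*, void* q) { free(q); }, nullptr };
// (a capture-less lambda converts to a plain function pointer)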

9 hours ago, Shaarigan said:

Why do you want to compose allocators together? The point of composition is to have different atomic tasks that you attach to your master component to add functionality depending on your current needs. An allocator, for example, does just one thing: allocating and managing memory. I don't see any use for composition here, because each allocator should tackle exactly one kind of allocation. A heap allocator should manage heap memory, a stack allocator should go for stack memory, and a block allocator should work on a certain block of memory stored anywhere else.

By composing allocators, I can allocate one arena from another.  For example, I might have a stack allocator that grabs N bytes of stack memory, and then using that allocator I might create a pool of objects.  It also lets me do something like this (using the first method in my OP):
 


template<typename BackingAllocator>
class DebugAllocator : BackingAllocator
{
	static constexpr unsigned char sentinelValue = 0xFE;
	static constexpr size_t sentinelSize = 4;
public:
	AllocBlock Allocate(size_t size)
	{
		// insert sentinels at beginning and end of the allocation range.
		auto block = BackingAllocator::Allocate(size + sentinelSize * 2);
		::memset(block.ptr, sentinelValue, sentinelSize);
		::memset(block.ptr + size + sentinelSize, sentinelValue, sentinelSize);

		// hand back only the usable region between the sentinels.
		block.ptr += sentinelSize;
		block.size = size;
		return block;
	}
	//...
};

So I can inject behavior like memory region guards.  I guess in your parlance, I've got a stack allocator, a heap allocator (which I call "mallocator"), and everything else is a block allocator that uses memory doled out by the two aforementioned allocators.  I could call them "memory adaptors", but that just sounds weird.
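
Composed with the first design from my OP, that usage ends up looking something like this (poolSize is just a placeholder):

// every allocation the pool grabs from the heap gets sentinel-padded:
using CheckedPool = PoolAllocator<DebugAllocator<Mallocator>, poolSize>;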

 

9 hours ago, Shaarigan said:

Allocation profiling and memory performance checks are part of the profiler API you will/should use, and shouldn't be something you compose into an allocator. Either you use it statically/dynamically, or you opt out of it using the preprocessor.

I'm not sure what you mean by this.  I want to be able to show myself something like this:
Current Memory Usage:
Networking: 2 KB (budget 4 KB)
Textures: 120 MB (budget 512 MB)
Scripts: 37 MB (budget 64 MB)
Game Objects: 27 MB (budget 25 MB -- OVER BUDGET -- )

..and so on.

That naturally can't be done statically, other than defining budget values.  In order to know that sort of information, it has to hook into the allocators somehow, right?  Even the bitsquid article you cite uses "proxy allocators" to achieve the same thing.  I considered doing it that way too, and haven't entirely ruled it out.

In fact, I read that article while I was trying to come up with this stuff, and it was an influence on the "case 2" example in my OP.  Re-reading it now that you posted it, I might actually go back to using the proxy idea instead of what I had in mind, but I deliberately left it out because there are a number of ways to get that information that don't impact the base design much.
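
If I do go the proxy route, I'm picturing something roughly like this on top of the dynamic design (names and the reporting mechanism are made up):

class TrackingAllocator : public Allocator
{
	Allocator* backing;
	const char* subsystem;   // e.g. "Textures"
	size_t used = 0;
	size_t budget;
public:
	TrackingAllocator(Allocator* backing, const char* subsystem, size_t budget)
		: backing(backing), subsystem(subsystem), budget(budget) {}

	AllocBlock Allocate(size_t size) override
	{
		AllocBlock blk = backing->Allocate(size);
		used += blk.size;
		// report (or assert) here if used > budget
		return blk;
	}

	void Deallocate(AllocBlock blk) override
	{
		used -= blk.size;
		backing->Deallocate(blk);
	}
};

// ex:  TrackingAllocator textureAllocator(&someGlobalMallocator, "Textures", 512 * 1024 * 1024);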

 

9 hours ago, Shaarigan said:

Another part that should really give you headaches is multithreaded memory allocation. Do you want to make all allocators multithreading-compatible, and always have to attach a second composition that handles locked memory access?

I used the article in this blog to build my memory management system. It is multithreading-safe, using atomics to count the number of objects and the total byte count, and it also produces profiler entries that don't just tell me how much memory is consumed but also which pointer addresses have been allocated, so I can attach to my game engine's memory from the profiler process and investigate a memory cell on demand.

 

My current plan of attack with regard to threading is to just make my allocators all thread-local.  Mallocator is currently stateless and malloc itself is thread-safe, so that should be fine, and everything else operates either on stack memory or on pre-allocated heap memory, so I don't think I'll have any issues as long as I don't pass raw memory pointers around between threads.  I haven't really thought of a use case for sharing an allocator between threads that isn't questionable at some level.
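
i.e. something like this with the static design (the size is just a placeholder):

// one scratch arena per thread; no locking needed as long as the blocks it
// hands out never cross threads.
thread_local LinearAllocator<Mallocator, 1024 * 1024> tls_frameScratch;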

If I do end up calling the Virtual* functions (VirtualAlloc and friends) instead of using mallocator, I'll have to revisit things a bit, and that will definitely have to be thread-aware to some degree, I'm sure.

 

5 hours ago, Hodgman said:

If your allocation interface is at the C++ object level, instead of the blob-of-bytes level, then it's easier to support a common interface.

A linear allocator can deallocate a C++ object (by calling the destructor), it just doesn't actually free up any address space until a later unwind operation takes place.

If the allocation interface is templated by object type (i.e. Allocate<T>(arraySize) instead of Allocate(sizeof(T)*arraySize)), then the pool can also implement the common interface (except probably only for a single T per pool instance, instead of any T).

That's an interesting observation.  So for untyped allocators (i.e. stack/linear allocators), just templatize the allocate function?  I like it, thanks.
 

5 hours ago, Hodgman said:

If you're going to go down this path, consider a regular function pointer and a single void* parameter. std::function is pretty heavyweight for such a low level consideration as tracking individual allocations..

Yeah, I just wanted to write something up to express an idea, and it was faster for me to remember the syntax of std::function than the function pointer syntax. ;)

That said, I've become increasingly convinced over the last few days that this third route of letting allocated blocks free themselves is a poor one, because for stuff like stack allocators I kind of need the context of neighboring allocations to know if they can be freed at all, and adding the plumbing for that sounds like more work than it's worth.

On 6/15/2018 at 8:32 AM, SeraphLance said:

Handling the fact that some allocators have different interfaces than others in an elegant way.  For example, a regular allocator might have Allocate/Deallocate, but a linear allocator can't do itemized deallocation (but can deallocate everything at once).

This wouldn't change the interface. The purpose of "Deallocate" is to inform the allocator that the memory is no longer in use, at which point it can do with it what it pleases (including nothing at all). If the user is expected to know more about the underlying implementation details, you have a leaky abstraction.

Take a look at the standard allocator traits. The essential elements of the interface are allocate/deallocate, and construct/destroy (notice how they're considered separate responsibilities). You really don't need much beyond that, aside perhaps from some templated helper functions that combine allocate+construct and destroy+deallocate. Then, different implementations of this interface can be combined using the adapter pattern to achieve the layered functionality you're looking for.
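
i.e. something in the spirit of this sketch, assuming a raw Allocate(size)/Deallocate(ptr, size) pair on the allocator:

// allocate/deallocate and construct/destroy kept separate, plus helpers that combine them:
template <typename T, typename Alloc, typename... Args>
T* New(Alloc& a, Args&&... args)
{
	void* mem = a.Allocate(sizeof(T));                // allocate
	return new (mem) T(std::forward<Args>(args)...);  // construct
}

template <typename Alloc, typename T>
void Delete(Alloc& a, T* obj)
{
	obj->~T();                      // destroy
	a.Deallocate(obj, sizeof(T));   // deallocate
}

// usage:  Widget* w = New<Widget>(myAllocator, ctorArgs...);
//         Delete(myAllocator, w);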

On 6/15/2018 at 6:32 PM, SeraphLance said:

I've spent quite a while (probably far longer than I should have) trying to design an allocator system.

 

Sounds like me. I have a huge library of various types of heaps built up over 25 years, and I ended up writing another one for my current project. In fact, it's a rare thing for me to use a straight new or delete in C++. I'm almost always using placement new for anything that goes on the heap.

