Custom heap allocator that allocates only within a chunk (for a game)


I am trying to develop a heap allocator that has the following properties :-

  • I will assign a contiguous memory chunk (begin-end address) to it.
  • It will allocate using only memory inside the assigned chunk. (This is not a property of std::allocator.)
  • I can ask it to allocate any size.
  • If the chunk is not large enough / it is full, it will return null.
  • Reasonable performance. (It is totally OK to be slower than pool, scope, or linear allocators.)

Something like this :-


HeapAllocator heap;
void* beginByte = ...;   // start of the chunk
int   numByte   = ...;   // size of the chunk in bytes
heap.assignMemory(beginByte, numByte);
....

int   iWantNumByte = sizeof(SomeClass);
void* iGetYou      = heap.allocate(iWantNumByte);
heap.deallocate(iGetYou);

To avoid reinventing the wheel, here are the libraries of this kind that I found :-

  1. slow : http://jinie.github.io/memmgr/ (code at https://github.com/jinie/memmgr)
  2. slow : http://eli.thegreenplace.net/2008/10/17/memmgr-a-fixed-pool-memory-allocator/ (code at https://github.com/eliben/code-for-blog/tree/master/2008/memmgr)
  3. https://www.codeproject.com/Articles/14525/Heap-Manager-for-Allocating-Memory-from-a-Shared-M
  4. slow : https://www.gamedev.net/resources/_/technical/general-programming/c-custom-memory-allocation-r3010

As I skimmed the code :-

  • 1, 2, and 4 : max cost of allocate/deallocate = O(amountOfFreeBlocks). They are very easy to understand. (A minimal sketch of this style is shown below.)
  • 3 : cost = O(log(amountOfFreeBlocks)). The documentation is quite good, but it doesn't cover every aspect.
  • 3 : also has a bug (see the comments on that page).
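
Here is a rough sketch of that first-fit, free-list-in-the-chunk style, to make it concrete. The names (ChunkHeap, Block) are made up, and alignment/coalescing are omitted; it is not the actual code of those libraries :-

#include <cstddef>
#include <new>

// Minimal first-fit heap over a user-supplied chunk (sketch only).
// Free-list nodes live inside the free blocks themselves, so the allocator
// needs no memory besides the chunk it was given.
class ChunkHeap
{
    struct Block { std::size_t size; Block* next; };     // header of a free block

public:
    void assignMemory(void* begin, std::size_t numByte)
    {
        m_free = new (begin) Block{ numByte, nullptr };   // one big free block
    }

    void* allocate(std::size_t numByte)
    {
        numByte += sizeof(Block);                         // keep room for the header
        Block** link = &m_free;
        for (Block* b = m_free; b; link = &b->next, b = b->next)   // O(amountOfFreeBlocks)
        {
            if (b->size < numByte) continue;              // first fit
            if (b->size - numByte >= sizeof(Block))       // split off the remainder
            {
                Block* rest = new (reinterpret_cast<char*>(b) + numByte)
                                  Block{ b->size - numByte, b->next };
                b->size = numByte;
                *link = rest;
            }
            else
            {
                *link = b->next;                          // hand out the whole block
            }
            return reinterpret_cast<char*>(b) + sizeof(Block);
        }
        return nullptr;                                   // chunk is full
    }

    void deallocate(void* p)                              // no coalescing in this sketch
    {
        Block* b = reinterpret_cast<Block*>(static_cast<char*>(p) - sizeof(Block));
        b->next = m_free;
        m_free  = b;
    }

private:
    Block* m_free = nullptr;
};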

Are there more libraries that I can copy/study?

Are there any good books/articles I should read?


hm, i think the problem/solution with O( f( amountOfFreeBlocks)) really depends on the size of the storage. at a certain point it might be beneficial to sort freeBlocks in a binTree or something, but this really is something i feel needs to be profiled under a real workload to see if there is any point in optimizing.

at least for me the last time it was more important how i grouped stuff to store in my custom allocator(s) (caching i guess) than O(-1) operation.

naive O(amountOfFreeBlocks) with some even more naive tweaks worked fine. like sort freeBlocks by size after release - might sound insane, but actually works well in my specific case. that way either the first freeBlock is large enough, or i would have to resize or defragment.

ninnghazad ... like sort freeBlocks by size after release ...

Yes!! The project in codeproject (3) sorts like that (by size and by address), so it is very fast.

How do you sort it? Do you need some more allocated memory?

I found that it is really hard to sort things without standard allocation happening somewhere.

More specifically, I am trying to make my game allocate only once .... so my heap has to allocate some memory inside the chunk for that sort operation.

The amount of memory used for sorting is not fixed and can't be known in advance (as far as I know).


Thus, this problem is like finding memory to manage memory .... so recursive. :wacko:

i actually use std::sort on std::vector<MyChunkStructWhichIsJustTwoSize_T>. std::sort does inplace sorting using move-swap where possible. so low mem-needs. and i think you need at most sizeof(item) mem to sort a vector even without move-swap-magic. but you might use sorted linked list, and plop in released blocks in the right spot so you don't have to sort - or well it just *is* sorted all the time. start simple and solid - optimize later.
but that aside, i think you may have entered the vast regions of YAGNI. as long as you aren't working on some strictly restricted platform with tiny mems, you might create more problems for yourself than you solve. is it for some old console? otherwise the 13-and-a-half bits you need to store the meta-info a few vectors need really won't make a difference.

also sizeof(std::vector<WhatEvs>) is implementation-defined i think.
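
Roughly, the scheme ninnghazad describes might look like this (the struct and function names are made up for illustration, not his actual code) :-

#include <algorithm>
#include <cstddef>
#include <vector>

// Free blocks kept in a vector and re-sorted by size after a release,
// so the largest block is always at the front.
struct FreeBlock { std::size_t offset, size; };           // "just two size_t"

std::vector<FreeBlock> freeBlocks;                        // bookkeeping lives outside the chunk

void release(std::size_t offset, std::size_t size)
{
    freeBlocks.push_back({ offset, size });
    std::sort(freeBlocks.begin(), freeBlocks.end(),
              [](const FreeBlock& a, const FreeBlock& b) { return a.size > b.size; });
}

// Either the first free block is large enough, or it is time to resize/defragment.
bool canAllocate(std::size_t size)
{
    return !freeBlocks.empty() && freeBlocks.front().size >= size;
}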

Thanks a lot, ninnghazad.

Seeing an expert point out that it usually does not cost too much (i.e. that it is a premature optimization) makes me calm down and start looking for a solution that is a little more expensive but much easier to code.

Now, I can use std::multimap, then maybe optimize later. :D
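
Roughly what the multimap version could look like (just a sketch; the split-and-re-insert step is left out, and note that the multimap's own nodes come from the default allocator unless it is given a custom one) :-

#include <cstddef>
#include <map>

// size -> start of a free block; best-fit lookup is O(log(amountOfFreeBlocks)).
std::multimap<std::size_t, void*> freeBySize;

void* takeBestFit(std::size_t numByte)
{
    auto it = freeBySize.lower_bound(numByte);            // smallest block that still fits
    if (it == freeBySize.end())
        return nullptr;                                   // chunk is full
    void* block = it->second;
    freeBySize.erase(it);
    // ... split the block here and re-insert the unused remainder under its new size ...
    return block;
}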

Windows API Heap* calls meet these requirements: https://msdn.microsoft.com/en-us/library/windows/desktop/aa366599(v=vs.85).aspx
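
For reference, the Heap* route looks roughly like this (a sketch with an example size; note that HeapCreate takes a maximum size rather than a caller-supplied begin address, so it does not exactly match the first requirement in the original post) :-

#include <windows.h>

int main()
{
    const SIZE_T numByte = 1 << 20;                  // chunk size, example value only

    // A private heap with a fixed maximum size: it can never grow past numByte,
    // and HeapAlloc returns NULL once it is full (HEAP_GENERATE_EXCEPTIONS not set).
    HANDLE heap = HeapCreate(HEAP_NO_SERIALIZE,      // single-threaded use: skip the lock
                             0,                      // initial committed size
                             numByte);               // maximum size of the heap

    void* p = HeapAlloc(heap, 0, 256);               // NULL when the heap is full
    if (p) HeapFree(heap, 0, p);
    HeapDestroy(heap);
    return 0;
}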


Thus, this problem is like finding memory to manage memory .... so recursive.
An early Unix file system used the free blocks on the disk for its free-block administration. When you have many free blocks you need more space for that administration, but then you also have lots of free blocks to put it in. :)
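
The same trick works inside the allocator: if the free-list links live in the free blocks themselves, even keeping the list sorted needs no memory outside the chunk. A rough sketch (the names are made up) :-

#include <cstddef>

// The free blocks hold their own bookkeeping; sorting by address here
// also makes coalescing neighbouring blocks easy.
struct FreeBlock { std::size_t size; FreeBlock* next; };

void insertSortedByAddress(FreeBlock*& head, FreeBlock* released)
{
    FreeBlock** link = &head;
    while (*link && *link < released)    // walk to the right position
        link = &(*link)->next;
    released->next = *link;              // splice in; the list stays sorted
    *link = released;
}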

Thanks a lot, everyone, for the good advice!

I found out later that, even using std::map AND a lot of function pointers (my poor skill) to implement the custom heap ->

the experimental custom heap allocator is still faster than the standard new/delete!

(It is evidence that this is somewhat of a premature optimization, as ninnghazad mentioned.)

I still want to know where I should read more / find more libraries, though ....

Windows API Heap* calls meet these requirements: https://msdn.microsoft.com/en-us/library/windows/desktop/aa366599(v=vs.85).aspx

Thanks. Does anyone know any other open-source libraries?

An early Unix file system used the free blocks on the disk for its free-block administration. When you have many free blocks you need more space for that administration, but then you also have lots of free blocks to put it in. :)

It is interesting! :D I believe there is no guarantee about how std::map allocates its nodes, so I think I can't adapt that concept, though.

the experimental custom heap allocator is still faster than the standard new/delete!

It's interesting to note that most standard libraries implement new/delete in terms of malloc() (as dictated by the standard), and malloc() is generally a really good caching allocator with best-fit and compaction.
The trick is that malloc() is required to be threadsafe, which means locking and unlocking on every call. In other words, even the worst poorly-coded and poorly-designed replacement for new/delete is likely to outperform the library by an order of magnitude. Until the moment you have a threaded app, in which case you lose big.
A good reference for a fast, efficient, and portable chunk allocator can be found in one of the most widely-used implementations in the world.

Stephen M. Webb
Professional Free Software Developer

Bregma, thanks for the good link to that so-satisfying big code.

In my code, here is how I allocate/deallocate a new byte array (I do it without a mutex).


::operator new(numByte);                  // raw allocation, no constructor call
::operator delete(void_star_pointer);     // raw deallocation, no destructor call

Is it the same as the malloc()/free() you mentioned?

(To clarify) The performance of my prototype linear allocator is generally better than standard new/delete :-


allocator->makeStrong<SomeClass>()   .... faster than ....... new SomeClass().

Is it partially because my class is not thread-safe, compared to new SomeClass()?
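
(For context, a linear allocator in the makeStrong style boils down to a bounds check, a placement new, and a pointer bump - no lock, no free-block search. A sketch with made-up details, not the actual makeStrong code :-)

#include <cstddef>
#include <new>
#include <utility>

// Single-threaded linear (bump) allocator: allocation is just a bounds check,
// a placement new, and a pointer increment.
class LinearAllocator
{
public:
    LinearAllocator(void* begin, std::size_t numByte)
        : m_cur(static_cast<char*>(begin)), m_end(m_cur + numByte) {}

    template<class T, class... Args>
    T* makeStrong(Args&&... args)                        // name borrowed from the post above
    {
        if (m_cur + sizeof(T) > m_end) return nullptr;   // chunk is full
        T* obj = new (m_cur) T(std::forward<Args>(args)...);
        m_cur += sizeof(T);                              // bump (alignment ignored here)
        return obj;
    }

private:
    char* m_cur;
    char* m_end;
};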

This topic is closed to new replies.
