Memory Management

Started by
19 comments, last by Baraclese 19 years, 9 months ago
Hey, what would be the smarter thing to do when designing a memory manager for a game (or just a framework)?

1) Allocate a big chunk and then do boundary-tag allocations: every block has a header saying whether it's free or not, its size, and maybe a pointer to the next block. This design can eventually be extended with a free list or something similar (to speed up the allocation procedure). One of the great advantages of this design is the low fragmentation, due to the coalescing of adjacent free blocks. But that process also makes freeing more time-consuming, and I don't think a game will appreciate that.

2) It can also be done with a design where numerous pools are allocated at the beginning, each containing blocks of a specific size, e.g. one pool with 2-byte blocks and another with 4-byte blocks. When an allocation is requested, the first block in the free list that holds the requested size is taken (simply removed from the free list). When it has to be deallocated again, the block is simply inserted as the first element in the right free list. This design will perform fast and safely, but may result in some heavy internal fragmentation.

What design would you use? Do you have a better one?
To pick the right one, you'd have to know how the program you use it for is going to behave. If it will be allocating zillions of little blocks, the second one is preferable; if it allocates rather big chunks of different sizes, the first one works better. What I am currently doing for my own project is keeping a bunch of 'pools' (as in your second approach) for small blocks (8, 12 and 16 bytes), and just calling the standard operator new when asked for bigger chunks.
Go with number 1. If you really have an algorithm somewhere in your game that needs fast memory allocation and deallocation you'd write a fixed size pool allocator anyway.
I wrote a memory manager which doesn't use boundary tags; instead it allocates a bunch of "boundary tags" at initialization. This limits your application to a fixed number of free and used blocks, but it improves locality for the allocation and deallocation algorithms a lot.
I use a hybrid approach. I create multiple pools on startup (using Doug Lea's malloc) to keep "alike" allocations together (e.g. textures are allocated from the same pool, sounds from another, etc.). Each pool then returns the requested amount of memory from its "chunk" and updates its internal state on every allocation. All allocations are aligned, and the way memory is "freed" dramatically reduces fragmentation. When memory is allocated, a header stating the block's size and the pool it came from is placed just before the actual memory. When it is freed, the block is added to the top of a list, and a pointer to the previously freed block replaces the pool variable in the header. The pool itself will use these freed blocks first, joining and splitting free blocks as necessary.

Personally, I think this is the best allocator I have programmed to date. I can't wait until I get my program up and running so I can spend more time tweaking it and testing the fragmentation numbers after a few days of continuous allocations.
I also wrote my memory manager pretty similarly to what the previous poster described.
It allocates big blocks/pools of memory, and once an allocation is requested by the user, it just returns a piece of memory inside those blocks instead of doing an actual malloc.
Each allocation has a header in front of it, and the returned memory address can be aligned to a value specified by the user making the allocation.
It is also possible to group specific allocation types together. For example, I group all strings together in the same pools. This makes it more efficient; otherwise you can end up with big blocks that are left empty after some temporary memory was allocated.

I made a macro that works like this:


class Vertex
{
DECLARE_MEMORYOBJECT(Vertex, 16, MEMCATEGORY_MESHDATA)
//.....
};


This will automatically create the operators new and delete, so when you do "new Vertex" the allocation is automatically managed by the memory manager.

Deleting memory is pretty fast. However, allocating isn't as fast as the standard malloc, because I store no free list. What I do is search for free places inside the blocks and find the best-fitting memory hole, to minimize fragmentation. This is sped up by introducing cached searches: it is very likely that there is still a free piece of memory in the last block where an allocation was made, and checking there first gives a very big speedup in allocation. I cache some other things as well.

Re-allocating data, like realloc, is done by trying to simply change the header of the given allocation so that it registers more memory. This doesn't involve any new allocation, so it will most likely return the same memory location the block initially had. If the resized allocation won't fit, it deletes the allocation and allocates it again in another block. It also pre-allocates some extra space for dynamic data, to speed up incremental resizing for things such as dynamic arrays.

Allocations that are above a given size get their own private memory blocks, so they are basically done with a traditional malloc, but still allow alignment etc.

I can generate per-category statistics and global memory statistics. This is very useful to see how much memory your mesh data actually uses, for example.
Also it can report things such as memory overhead due to alignment, block overhead per allocation, etc.

Last but not least it automatically shows all possible memory leaks when you shutdown your application, and it will clean up those allocations afterwards, so that at least all memory has been released again.

When you get a huge number of allocations, the allocation process can become quite slow compared to just using malloc. This is an area I still need to improve.

Cheers,
- John / http://www.emotionfx.com

I simply override the global new and delete operators to make integration with the memory manager transparent to the application. This also lets me measure the memory manager's speedup by simply commenting out the overrides and re-testing.

BTW, searching for the best fit is not a good way to prevent long-term fragmentation. Your allocations will gradually become more spread out using that method, because you are searching for a best fit and using that address as the starting point for the next search, and so on.
No, I'm not using that address as a starting point for the next allocation. This way I minimize the number of small memory holes, so that bigger allocations have less chance of not fitting inside the smaller holes. This reduces memory fragmentation more than just using the first fit.
Also, overriding the global new and delete might cause problems when different libraries in use all do this.
There are of course some advantages to it as well. But in my library I only want to count the memory used by the library itself; if the user allocates custom objects, those shouldn't be added to the library's memory statistics.
Most big games (especially on consoles where memory management is a critical part of the code) use several memory allocation schemes. Each scheme is tailored to solve a particular problem, and deciding which allocator to use depends on how the memory is going to be used.

Rather than re-inventing the wheel, look up how the various existing memory allocation schemes work. Chances are somebody has already implemented something like what you are considering and figured out all the advantages and disadvantages.

[Edited by - JohnBolton on July 7, 2004 5:14:36 PM]
John Bolton
Locomotive Games (THQ)
Current Project: Destroy All Humans (Wii). IN STORES NOW!
I totally agree with JohnBolton. There is no single allocator that will match every problem domain well. Different memory needs and usage patterns require different allocation schemes if you want optimal memory management. Implementing various types of allocator systems will make the project a bit more complex, but will pay off performance-wise. Also, as was mentioned before, there are some pretty robust allocators out there you might want to look at. Charles Bloom, for example, has some nifty allocation schemes implemented inside Galaxy3. You must also take caching into consideration when designing/implementing allocators/memory managers.

Also, on hybrid approaches: one thing I learnt about them from personal experience is that they tend to get over-engineered, and usually add no performance gain, or can even become the bottleneck. A few robust allocation schemes applied at the right places will give you an edge over generalized hybrid solutions (plus, note that they are much easier to implement robustly).

This topic is closed to new replies.
