Design of a Memory Allocator for Game Engine

Started by
12 comments, last by JohnnyCode 13 years, 5 months ago
Hi,
I want to Implement a memory allocator that can give gud performance in practical scenarios.
Is there any good resouce(articles/code) which actually gives insides of memory allocation scheme implemented in professional game engines ?

Thanks.
Advertisement
Memory allocator is fairly simple:

1)Allocate a huge chunk of memory on program initialization, decide on a fixed object size.
2)Generate a list of pointers to that memory, n*object_size apart, where 0<=n<=(memory_size/object_size)
3)During your program, if the memory is needed, take it off the list and return the pointer.
4)If the memory is freed (given back to the allocator), simply add it to the list.
5)If you need memory allocation for more than one size, create another instance of your allocator that uses the particular object size.
It's usually not as simple as just making an allocator for the game.

Replacing the default allocators (new/malloc) can be a decent starting point, but you'd still just be replacing one general-purpose allocator with another general-purpose allocator.

The game-specific performance increases come from special-purpose allocators, not general-purpose allocators. It's the inside-knowledge of how the game/game-engine work that lets you optimize it.

By default, all your allocations should use a good general-purpose scheme.
...But you also have things like:
* Things you can put a limit on (if the allocator fails, it won't ruin the game). E.g. some special effects. For these, you can use a small reusable memory pool, and allow it to fail gracefully.
* Things you can recycle (if the allocator fails, it can ask the game to give it back an old object to reuse). E.g. bullet holes, etc.
* Temporary objects that only exist until the end of the frame -- you can use a mark and release allocator to make these allocations as fast as a pointer increment, and deallocation as fast as setting a value to zero...
* Data structures that are processed in tight loops -- pooling these can result in great cache efficiency. E.g. if you do AABB culling, allocate all your AABBs from similar address-space, and process them in order (ascending addresses).
...and so on.
So there's not one scheme -- each part of the engine uses/requires memory in different ways, so optimise for each of those situations separately.
Thnks for your repsonse.

@xytor
I have already made an allocator which actually does almost similar stuff you have mentioned along with memory leak tracking.


@Hodgman:
I am exploring the areas you have mentioned. Basically I am looking for some proven schemes which can provide cache friendly allocations with fast searching. As per my understanding engines create couple of heaps and places the object in one of the heaps based on its size. Consider the case of UnReal engine. It can be used to develope almost all generes of games. What would be good memory management scheme for such an engine ? Like how many heaps provide fairly gud performance and less wastage of memory along with balanced performance. I have seen couple of implementation. Some uses simple doubly linked list while some uses fairly complex data structures such as Red/Black tree.
Assuming a "normal" 60 fps, do you need more than 20,000 allocations per frame? Because if you don't, then wasting time on optimizing memory allocation is kind of silly, in my opinion.
There are very few cases where the performance of the standard allocator is not sufficient, in my opinion. And in those cases, one should probably consider to just do fewer allocations per frame.
Quote:Original post by samoth
Assuming a "normal" 60 fps, do you need more than 20,000 allocations per frame? Because if you don't, then wasting time on optimizing memory allocation is kind of silly, in my opinion.
There are very few cases where the performance of the standard allocator is not sufficient, in my opinion. And in those cases, one should probably consider to just do fewer allocations per frame.


Agreed.

The only time I've ever been concerned about this is with particle systems. Over time, I've moved to having one dynamically allocated object contain a vector of PODs, preallocated to the maximum to represent the particles.

Prove to yourself you have a bottleneck on allocations before you waste too much time on this. You probably shouldn't need to be allocating many things per frame in a sane design.
Quote:Original post by skynet_wolf
Like how many heaps
Around 42.

Quote:I have seen couple of implementation. Some uses simple doubly linked list while some uses fairly complex data structures such as Red/Black tree.

And some use arrays, others use vectors, third use malloc, some just don't bother.

It depends. That's the whole point, general solution is provided by C++ as-is.

For anything else, analyze the usage patterns, fix low-hanging fruit as needed.

BTW, rolling general purpose allocator is a waste of time. There is Doug Lea's malloc which, for all practical purposes, remains unbeaten after being tested for years on just about every platform.

For everything else, it depends completely on algorithms and data structures. This is simply not a problem that can be solved generally. There exist in-place data structures, cache-oblivious algorithms and similar, but all are horribly context and usage dependent.
Quote:Original post by skynet_wolf
As per my understanding engines create couple of heaps and places the object in one of the heaps based on its size.
Pools are one tool in the toolbox, yes.
Quote:Consider the case of UnReal engine. It can be used to develope almost all generes of games. What would be good memory management scheme for such an engine ?
Like I said, there's no answer for the engine as a whole. You've got to look at the memory requirements of each individual compenent that makes up the engine -- they all have different needs, usage patterns, acceptable compromises, etc...
Without targeting these specific circumstances, you're just doing general purpose allocation.
Quote:Like how many heaps provide fairly gud performance and less wastage of memory along with balanced performance.
These are questions on how to design a general purpose allocator. Plenty of these already exist.
Unreal probably uses the one from TBB.
"C++ For Game Programmers" has a chapter on memory allocation and it has an example of a custom manager.

http://www.amazon.com/C-Game-Programmers-Development/dp/1584502274
In my GFX engine I was not concerned with own memory allocation except for one special case - tight (or not so tight) loops repeated each frame with a lot of allocations inside. Example:
- dynamic list of objects to display (constructed from current scene graph)
- dynamic list of objects to update (all objects that had been modified by engine user)
- some other similar places, that I don't remember now

In such cases using general allocator took a lot of CPU time for allocation purpose only. But instead of writing some complicated allocator I have created a simple allocator template (based on allocated size) and call it for particular structure required.

My allocator is acting as a pool of objects of that particular size for use. If pool is empty it creates new object (using general purpose allocator), but if pool has any unused object it just returns it. Deleting object does not really delete it, but put it back to pool. You could consider such implementation. Although it is fairly simple it does what it should and gives a real boost in such tight loops.

This topic is closed to new replies.

Advertisement