Dynamic Memory Allocation Strategy for Modern Game Engines

Greetings,

I was looking through the old Quake 3 (Id Tech 3) code base the other day and I noticed that it mallocs a large chunk of memory during initialization. The size of this memory pool is controlled through the -hunk command-line argument. I have two questions relating to this: first, why did Id choose to manage their own memory, and second, do modern engines still write their own memory managers?

I noticed that a recent update to Valve's Source engine (which evolved from the Quake 1 and 2 engines) removed the -hunk parameter. I did a little research on writing your own C++ allocator, and the general advice was that it is hard to beat the standard one. This leads me to believe that modern engines do not write their own memory managers. Does anyone have any insight on this?
I was wondering about this issue as well.

I'm reading Jason Gregory's "Game Engine Architecture" and he covers memory management a lot: different allocation strategies, avoiding fragmentation, etc.

I know that Jason Gregory worked mostly on console games (am I wrong here?), and afaik memory management is very important on consoles, since they have no virtual memory mechanism and are sometimes limited to a relatively small amount of RAM.

I think the question about writing custom memory managers falls into the same bucket as "should I write my own std::vector (or any other STL/Boost component) because I heard std::vector is slooooww and fragments memory".

As for me, I stick with the default new/delete for my game/engine/whatever as long as I don't see memory allocations at the top of my profiler.

So, as a response to your question: if you are developing for PC and don't know/can't see/aren't sure whether the default memory allocation strategies provided by the OS will be good for you, I say stick with them. Your main goal is to finish a game; you can always optimize the problematic parts of the code later, once you have profiled it.

Quote:Donald Knuth
If you optimize everything, you will always be unhappy.

I would love to change the world, but they won’t give me the source code.

Quote:mallocs a large chunk of memory during initialization
Easier to debug, and it gives full control over allocations. It also ensures that the memory is allocated in a single chunk; if it were acquired incrementally, that could lead to fragmentation issues. Having one single block also gives you determinism: if n allocations are made in sequence, the addresses will always be the same, which is not necessarily the case with third-party allocators or the OS. Such an approach also guarantees consistent behavior across platforms.
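
In code, the idea is tiny. A minimal sketch of the concept (not Id's actual hunk implementation; the names and the 16-byte alignment are illustrative):

#include <cstddef>
#include <cstdlib>

// Sketch of a "hunk"-style allocator: one big block claimed up front,
// then bump-pointer allocations out of it. Everything is released at
// once by resetting the offset (e.g. on level change).
class Hunk {
public:
    explicit Hunk(std::size_t size)
        : base_(static_cast<char*>(std::malloc(size))), size_(size), used_(0) {}
    ~Hunk() { std::free(base_); }

    void* alloc(std::size_t bytes) {
        bytes = (bytes + 15) & ~std::size_t(15);    // round up to 16 bytes
        if (used_ + bytes > size_) return nullptr;  // hunk exhausted: fail loudly
        void* p = base_ + used_;
        used_ += bytes;
        return p;
    }
    void reset() { used_ = 0; }  // free everything in one shot

private:
    char* base_;
    std::size_t size_, used_;
};

Since every allocation is just an offset from one base pointer, the n-th allocation in a sequence always lands at the same offset, which is where the determinism comes from.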

Quote:the general advice was that it is hard to beat the standard one.
Yes, it is. A general-purpose allocator is just that: a jack of all trades.

Quote:This leads me to believe that modern engines do not write their own memory managers.

"Memory manager" is a weasel word. It doesn't mean anything by itself.

There are two aspects to memory management. The macro aspect is about claiming memory from the OS; this typically involves large chunks that are infrequently deallocated. The micro aspect is about individual data structures: using a linked list or map with per-item allocation is a disaster, so various pooling and preallocation techniques are used here.

The "slowness" comes primarily from these small allocations. There exists malloc version which has special cases for small block allocations. While it is a fairly decent drop-in replacement, it doesn't beat per-instance allocation.

But probably more important than performance is having full control over memory. That can simplify debugging, and it becomes much simpler to inspect memory.
Quote:Original post by DariusBoone
This leads me to believe that modern engines do not write their own memory managers. Anyone have any insight related to this?


They still do, in part because they are targeted at consoles and in part because, if you know what you are doing, you can get a performance boost.

When it comes to consoles, as pointed out, you have a fixed amount of RAM that everything has to share, which means everything gets a fixed amount of memory to work with. So graphics gets a chunk, audio gets a chunk, gameplay gets a chunk, and so on. How the memory is handled within those chunks varies, and that leads into the 'if you know what you are doing' part.

The standard memory allocator is pretty good, but only in the general case; if you know more about your memory allocation requirements, it can be a good idea to take over.

For example, at work our graphics system has a chunk of memory. This memory is subdivided into smaller pools for various sections of the graphics system, and each pool is further subdivided into slots. When we load something in, we request a slot from the underlying memory manager and load directly into it.

What this gains us is speed: memory allocation is simply a matter of returning a free slot, and as this is a streaming engine we need that setup. What we lose, however, is granularity: you have to make a slot as big as your worst case, and if your worst case is significantly bigger than your average case then you end up wasting RAM, as all slots are the same size.
(There are ways around this, however. I saved 1.2 MB from a memory pool by increasing the number of slots but reducing the slot size to the average case and forcing larger objects to be split up. However, this wrecked our streaming budget, so now I'm working on a new system to repack those broken-up objects into a single file so I can change the loading.)
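
A stripped-down sketch of that kind of slot scheme (illustrative names and bookkeeping, not our actual system):

#include <cstddef>
#include <vector>

// One pool carved into equally sized slots, with a free list of slot
// indices. "Loading into a slot" is just claiming an index and writing
// at its address; releasing it is pushing the index back.
class SlotPool {
public:
    SlotPool(std::size_t slotSize, std::size_t slotCount)
        : slotSize_(slotSize), memory_(slotSize * slotCount) {
        for (std::size_t i = slotCount; i-- > 0; )
            freeSlots_.push_back(i);
    }
    void* acquire() {                       // nullptr when every slot is in use
        if (freeSlots_.empty()) return nullptr;
        std::size_t i = freeSlots_.back();
        freeSlots_.pop_back();
        return memory_.data() + i * slotSize_;
    }
    void release(void* p) {
        std::size_t i = (static_cast<char*>(p) - memory_.data()) / slotSize_;
        freeSlots_.push_back(i);
    }
private:
    std::size_t slotSize_;
    std::vector<char> memory_;
    std::vector<std::size_t> freeSlots_;
};

The wasted-RAM trade-off is visible right in the constructor: slotSize has to be your worst case, and every slot pays for it.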

Another advantage of the pool system is locality. If all your objects are close to each other and you process them in a lump, you'll get better cache performance and better processor usage overall, which can speed things up.

Finally, when it comes to small allocations in a non-managed language, it can help again: if you dedicate a pool to the small, frequent object allocations needed over a frame, you can avoid a round trip to the OS and instead just hand out chunks of memory. Again, the trade-off here is speed vs. memory usage vs. flexibility, as you'll need to allocate a chunk up front, but it can help.
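
A per-frame pool like that can be as simple as a linear allocator that gets reset once per frame (a sketch; the power-of-two alignment and the 16-byte default are arbitrary choices):

#include <cstddef>
#include <vector>

// Per-frame linear allocator: bump a pointer through a preallocated
// buffer during the frame, then throw everything away with a single
// reset(). No per-allocation free, no round trip to the OS.
class FrameAllocator {
public:
    explicit FrameAllocator(std::size_t bytes) : buffer_(bytes), offset_(0) {}

    void* alloc(std::size_t bytes, std::size_t align = 16) {
        std::size_t p = (offset_ + align - 1) & ~(align - 1); // align must be a power of two
        if (p + bytes > buffer_.size()) return nullptr;       // frame budget blown
        offset_ = p + bytes;
        return buffer_.data() + p;
    }
    void reset() { offset_ = 0; }  // call once at the end of each frame

private:
    std::vector<char> buffer_;
    std::size_t offset_;
};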

Now, the big question; should you bother with all this?

Most probably not, no.
This is a lot of hard work, and it's very easy to make something that performs worse than the general case.

For the average game it's not going to make much of a difference. There are some areas you should pay attention to (vectors over lists, for example), but going out of your way to write a full-blown memory manager isn't required.

It's somewhat like the old axiom: if you need to ask about it, then you probably don't need it/aren't ready to do it [smile]
new/delete is still slow. Consoles definitely still seem to use memory pools.

Remember as well that machines back then were more limited. And what if you want to simulate a fixed amount of memory? You could warn the user when the memory pool fills up at run time, instead of checking each week how much RAM you are using.
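
A crude sketch of that idea: route allocations through a wrapper that tracks a budget and warns the moment it is exceeded (the 32 MB figure is an arbitrary example, and a real version would also want thread safety):

#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Budget-tracking allocation wrapper: instead of auditing RAM usage
// by hand every week, you get a warning at the exact allocation that
// blows the budget.
static std::size_t g_used = 0;
static const std::size_t g_budget = 32 * 1024 * 1024;

void* BudgetAlloc(std::size_t bytes) {
    g_used += bytes;
    if (g_used > g_budget)
        std::fprintf(stderr, "WARNING: memory budget exceeded (%zu of %zu bytes)\n",
                     g_used, g_budget);
    return std::malloc(bytes);
}

void BudgetFree(void* p, std::size_t bytes) {
    g_used -= bytes;
    std::free(p);
}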

As said, you don't really need to worry about it at an amateur level, only if you are squeezing to max out the next generation of what we already have.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

Even if you do a good job of memory management for your application and engine code on a console, most likely you will use third-party packages. These can be problematic.

On the last game I worked on, we used Scaleform for the user interface and Quazal for networking. Both used the global heap and the OS for allocating and deallocating memory, and both were particularly bad about the number of allocations: our calls to Scaleform led to millions of new/delete calls per frame, many of them for blocks of 1 to 4 bytes (very bad design). Memory fragmentation was a killer, especially when you are trying to load a level and use nearly all of VRAM for it. The order of objects during a load is important; the last thing you want is to load a large object when there is enough free memory in total, but it is fragmented to hell and you cannot find a contiguous chunk to load into.

I wrote code to do a statistical analysis of the memory usage patterns of both packages, and then used this information to write specialized memory managers (no, these are not weasel words) that avoided fragmentation.

For whatever reasons (don't ask), we used PSGL on the PS3. Its memory management was also problematic, and I hacked that code so that we could keep our content at the desired amount rather than cutting it to a third, which is what PSGL was going to force us to do. In a sense, consider the hacking a form of specialized memory management.

I also recall, years ago, the Bungie folks complaining that Havok did not let you specify a block of memory for its memory budget, which made debugging and predictable behavior difficult.

So whether or not you agree with writing specialized memory management, you may very well be forced into it when you do not have control over the third-party packages you use. On consoles, memory budgets are a good thing for predictable and controllable project management. Having alloc/dealloc hooks in a third-party package is simply not enough; the package needs to let you specify its chunk, with the size based on your needs and your application's behavior.
It's not that hard to beat the standard allocator, since it's designed for the general case, but performance is not the main reason to write your own memory manager. On a console there is no virtual memory system to hide fragmentation of the heap, so when programming a game with a lot of streaming you may need to write your own memory manager to handle defragmentation.
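
Defragmentation usually means an extra level of indirection: code holds handles instead of raw pointers, so the manager is free to slide live blocks together and patch the handle table. A rough single-threaded sketch of the idea (illustrative names, minimal error handling):

#include <cstddef>
#include <cstring>
#include <vector>

struct Block { std::size_t offset, size; bool live; };

class DefragHeap {
public:
    explicit DefragHeap(std::size_t bytes) : memory_(bytes), top_(0) {}

    // Allocate at the top of the heap; returns a handle (table index), -1 on failure.
    int alloc(std::size_t size) {
        if (top_ + size > memory_.size()) return -1;
        blocks_.push_back({top_, size, true});
        top_ += size;
        return static_cast<int>(blocks_.size()) - 1;
    }
    void free(int handle) { blocks_[handle].live = false; }

    // Slide live blocks down so the free space becomes one contiguous run.
    // blocks_ is in allocation order (ascending offsets), so moving each
    // block downwards with memmove is safe. Only the table is patched;
    // user code resolves handles afterwards and sees the new addresses.
    void defragment() {
        std::size_t cursor = 0;
        for (Block& b : blocks_) {
            if (!b.live) continue;
            if (b.offset != cursor)
                std::memmove(memory_.data() + cursor, memory_.data() + b.offset, b.size);
            b.offset = cursor;
            cursor += b.size;
        }
        top_ = cursor;
    }

    void* resolve(int handle) { return memory_.data() + blocks_[handle].offset; }

private:
    std::vector<char> memory_;
    std::vector<Block> blocks_;
    std::size_t top_;
};

The price is that you must never cache the pointer from resolve() across a defragment() call, which is why streaming engines that do this are strict about when raw pointers may be held.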

Memory pools are well documented (e.g. in "Modern C++ Design") and easy to program (at least in a single-threaded context). You gain not only allocation/deallocation speed but also memory available to your program, since the bookkeeping overhead of a memory pool is very low.

For multi-threaded programming, it's interesting to use a memory manager that works only on the memory space of its own thread, so you don't take a critical section or a mutex for every allocation/deallocation.
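
A sketch of that idea: give each thread its own free list via thread_local, so the common path never takes a lock (the 64-byte small-block size is an arbitrary example, and cross-thread frees, where one thread frees what another allocated, are deliberately not handled):

#include <cstddef>
#include <cstdlib>

// Per-thread small-block recycler: each thread owns its own free list,
// so allocating and freeing small blocks needs no mutex at all.
struct FreeNode { FreeNode* next; };

thread_local FreeNode* t_freeList = nullptr;

void* ThreadAlloc(std::size_t bytes) {
    if (bytes <= 64) {                       // small blocks all use 64-byte nodes
        if (t_freeList) {
            FreeNode* n = t_freeList;
            t_freeList = n->next;
            return n;
        }
        return std::malloc(64);
    }
    return std::malloc(bytes);
}

void ThreadFree(void* p, std::size_t bytes) {
    if (bytes <= 64) {                       // recycle small blocks locally
        FreeNode* n = static_cast<FreeNode*>(p);
        n->next = t_freeList;
        t_freeList = n;
        return;
    }
    std::free(p);
}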

There is a lot to do, and it's quite interesting (IMO), but as said before, it's only useful if you want to max out the potential of your target platform, and there are a lot of pitfalls to avoid. You can make a great game without it, for sure!
Hi,

Dynamic memory allocation is not important only to games. It's true that on consoles you'll need to manage your memory better, not only because you have less of it but also because sometimes you want control over how things are laid out in memory.

Whether it's to better utilize your memory, to avoid fragmentation, or just to get a performance boost because all your scene graph nodes are sequential in memory, the point is that you almost always have to do some kind of special management.

Maybe not everything is optimized for cache efficiency right from the start, but if you have specific limitations you must certainly design your allocations accordingly.

I don't work in the game industry, but fragmentation and cache optimizations are quite important to our kind of application (medical imaging workstations), especially in some specific code sections.

Heck, maybe after a while, once an engine matures enough, it has most of these locked down. Most, because you must remember that different usage patterns sometimes call for different strategies.

So it's always: find a solution to the problem you're facing.

I guarantee that not all solutions readily available in even the most mature engines will match your needs 100%...
-----------------------------
He moves in space with minimum waste and maximum joy
Galactic Conflict demo reel - http://www.youtube.com/watch?v=hh8z5jdpfXY

