|
My brain is built of paths and slides and ladders and lasers and I have invited all of you to enter its pavilion. My brain, as you enter, will smell of tangerines and brand-new running shoes.
| Thursday, October 30, 2008 |
 Memory Management Part II |
Posted - 10/30/2008 11:08:15 AM | Managing Your Memory - Part II
Capturing Allocation Information From Your Program
In the last post I discussed the rationale and general design of a memory management system for C++ programs. This time around, we'll talk about how to actually collect data on your memory usage from a running program.
The method I present will work for all memory allocated via new, new[], and STL containers. Unfortunately there are a few ways that memory can slip past this mechanism:
- Directly requesting memory from the OS, such as with HeapAlloc or VirtualAlloc
- Using C memory management functions (malloc, calloc, etc.)
- Third party libraries that are linked externally
The first two are easy enough to solve by doing a quick global search of the code base, and making sure that any calls are tracked using the memory tracker interface (which we will discuss shortly). The best way to do this is to write wrapper functions which do the tracking, and then directly use the external memory allocation mechanism.
The third problem is very tricky to solve. However, we will consider two possible options for dealing with it. The first option is to manually register any memory usage with the tracker interface; this is usually possible by querying the library for its usage information. Many libraries offer this functionality, so if you are working with such a library, there's really no problem.
The second option is to query the operating system itself directly, using functions like HeapWalk on Windows. This has the advantage of reporting all memory that your process uses, regardless of how it was allocated. However, there are two caveats: you can't see who allocated the memory, and the allocations may not match those tracked internally by our memory tracker code.
Sidebar: Why HeapWalk may disagree with our tracker
As mentioned, it is possible that you will see a totally different set of allocations at the OS-level than at the application level. This depends entirely on the implementation of the allocation functions (which are provided as part of the standard library) as well as any custom allocators you might use.
For example, the application-level allocator may allocate a large block from the OS and then parcel it out as memory is requested by the application. From the application's point of view (and from the perspective of our memory tracker), there will be many smaller allocations; from the OS perspective, there is simply one large allocation. This is especially common with pool allocators and garbage collector implementations.
Because of this it is not always feasible to compare the results of an OS-level memory query with the data stored in our memory tracker.
Defining Memory Types
Before we can track memory, we have to have some method of identifying where memory is being used, and what it is used for. Write down a list of areas that you want to track, and then create an enumeration for them, as shown in the following code snippet.
Note that we use a macro here instead of a plain enum, because we're going to do some magic with the macro a little bit later.
#define MEMTYPEENUM \
MEMTYPEMACRO(MEMTYPE_GENERAL, "General") \
MEMTYPEMACRO(MEMTYPE_GENERALARRAY, "General array") \
MEMTYPEMACRO(MEMTYPE_STRINGS, "Strings") \
MEMTYPEMACRO(MEMTYPE_GAMEGRAPH, "Gamegraph") \
MEMTYPEMACRO(MEMTYPE_UI, "UI") \
MEMTYPEMACRO(MEMTYPE_GFX, "Misc. Graphics") \
MEMTYPEMACRO(MEMTYPE_MESHES, "Meshes") \
MEMTYPEMACRO(MEMTYPE_TEXTURES, "Textures") \
MEMTYPEMACRO(MEMTYPE_SHADERS, "Shaders") \
MEMTYPEMACRO(MEMTYPE_AUDIO, "Audio") \
MEMTYPEMACRO(MEMTYPE_PHYSICS, "Physics") \
MEMTYPEMACRO(MEMTYPE_AI, "AI") \
MEMTYPEMACRO(MEMTYPE_SCRIPTS, "Scripts") \
MEMTYPEMACRO(MEMTYPE_CUTSCENES, "Cutscenes") \
MEMTYPEMACRO(MEMTYPE_XML, "XML") \
MEMTYPEMACRO(MEMTYPE_FS, "Filesystem") \
MEMTYPEMACRO(MEMTYPE_LOOKUPS, "Lookups") \
MEMTYPEMACRO(MEMTYPE_FILEIO, "File IO") \
\\
MEMTYPEMACRO(MEMTYPE_ENUMSIZE, "ERROR")
#define MEMTYPEMACRO(typenumber, typestring) typenumber,
enum MemType
{
MEMTYPEENUM
};
#undef MEMTYPEMACRO
Be sure to have the MEMTYPE_ENUMSIZE constant as the last entry in the list, so we can use it to count how many memory types are used. This will come in handy a little bit later on.
These memory types should be fairly self-explanatory. Some areas may not interest you, so you can roll them into the MEMTYPE_GENERAL type. Or you might have additional areas that need their own added memory type. Customize the list to suit your needs; when you're done, we'll move on to capturing memory allocations.
Capturing Memory Allocations and Deallocations
Our weapon of choice will be an overloaded set of operators, as follows:
void* operator new(size_t size);
void* operator new[](size_t size);
void operator delete(void* ptr);
void operator delete[](void* ptr);
void* operator new(size_t size, MemType type);
void* operator new[](size_t size, MemType type);
void operator delete(void* ptr, MemType type);
void operator delete[](void* ptr, MemType type);
Notice the second set of operators: these are provided so we can specifically allocate memory in certain memory types. If we use a regular old new, the memory will go to MEMTYPE_GENERAL.
Place these operators in a file that is included before all other headers. If you are using precompiled headers, placing this #include at the top of your precompiled header file is a good choice. Otherwise, well... use precompiled headers! They're your friends and have many good uses.
Once the operators are declared, we need to define them. The implementation of these operators is pretty simple:
void* operator new(size_t size) { return Mem::Allocate(size, MEMTYPE_GENERAL); }
void* operator new[](size_t size) { return Mem::Allocate(size, MEMTYPE_GENERALARRAY); }
void operator delete(void* ptr) { Mem::Free(ptr); }
void operator delete[](void* ptr) { Mem::Free(ptr); }
void* operator new(size_t size, MemType type) { return Mem::Allocate(size, type); }
void* operator new[](size_t size, MemType type) { return Mem::Allocate(size, type); }
void operator delete(void* ptr, MemType type) { Mem::Free(ptr); }
void operator delete[](void* ptr, MemType type) { Mem::Free(ptr); }
I personally split up these two bits of code into memory.h and memory.cpp, respectively.
Tracking Memory Used by Classes
The next trick we have is pretty nifty. Right now, if we have a class Foo and we create a new Foo(), the memory will go to MEMTYPE_GENERAL. This is annoying! We want Foo to belong to a specific memory type, MEMTYPE_BAZ.
The solution is easy. In memory.h (or wherever you placed the overloaded new/delete declarations) add the following macro:
#define TRACK_MEMORY(memtype) \
public: \
static void* operator new(size_t size) { return Mem::Allocate(size, memtype); } \
static void* operator new[](size_t size) { return Mem::Allocate(size, memtype); } \
static void* operator new(size_t size, MemType type) { return Mem::Allocate(size, type); } \
static void* operator new[](size_t size, MemType type) { return Mem::Allocate(size, type); } \
static void operator delete(void* ptr) { return Mem::Free(ptr); } \
static void operator delete[](void* ptr) { return Mem::Free(ptr); } \
static void operator delete(void* ptr, MemType type) { return Mem::Free(ptr); } \
static void operator delete[](void* ptr, MemType type) { return Mem::Free(ptr); } \
private:
Now we just need to make a small addition to Foo's class declaration. Insert the line TRACK_MEMORY(MEMTYPE_BAZ) into the class declaration, at the very top. (You can technically place it anywhere, but I prefer the top for easy access. Also, the macro is set up assuming you place it at the top; hence the trailing protected access specifier, so you don't accidentally make any members public by using the macro.)
That's it! Any time we get a new Foo() we'll magically see the memory usage show up as MEMTYPE_BAZ. This will work regardless of how we use the class.
Suppose we have a Foo that really ought to belong to MEMTYPE_QUUX instead, for some technical reason. (A good example is a generic class that is used by several different modules.) Instead of using the default syntax, we make one small tweak:
Foo* thefoo = new (MEMTYPE_QUUX) Foo();
And now that particular Foo shows up under the MEMTYPE_QUUX memory type instead. This gives us a very fine-grained control over what memory type is used for any given object. This works on built-in types as well as classes, since we defined an extra new overload earlier. Of course it also works for structs.
Note that if Foo itself allocates memory, you'll need to specifically track that memory, too.
Wrapping Up
We now have all the tools we need to track the memory used by our program. In the next installment of the series, we'll look at how to store the memory tracking information, and provide some definition for those mysterious Mem::Allocate and Mem::Free functions.
Stay tuned - the best is yet to come!
| |
| Thursday, October 16, 2008 |
 Managing Your Memory - Part I |
Posted - 10/16/2008 7:54:15 PM | Managing Your Memory - Part I
Designing a robust memory management system for C++ software
This is the first part of a series of posts which I will be putting up here over the next several weeks or so (provided I can come up with the material for an actual series).
I will be discussing the topic of memory management and how to go about setting up a C++ program for debugging and tracking your memory usage. There are of course as many strategies for this as there are programmers in the world, so please don't take this to be the final word on memory management! It simply reflects my experience and my current set of best practices.
Getting Started
The first thing we want to do is define exactly what our system will do, and define a few critical concepts.
First of all, what exactly is memory management? A large number of things fall into this category, but I will be dealing specifically with a couple of special points. (As always, feedback is welcome - if you notice some major feature or behaviour which I've omitted, please let me know!)
- We need a way to track how much memory overall a program is using
- We need a way to track specifically how much memory various modules are using
- We need to detect duplicate freeings of memory so we can debug them
- We need to do our best to detect leaks and orphaned memory so we can fix those issues
- We need to keep overhead (both speed and memory overhead) minimal
- We need to support aligned memory allocators
- We would like some nifty statistics and graphs and such too
I'll be covering all this (and possibly more) in the remainder of the series. For now, we'll start with the basics.
The General Strategy
What exactly do we need to do in order to track where our memory usage is going? I use a four-pronged attack to solve this question:
- Track all "general" allocations by overloading global new and delete
- Track specific allocations by overloading new and delete for particular classes
- Allow third party libraries to register their memory usage via a simple interface
- Provide STL allocators that track the memory they allocate
Believe it or not, this covers the vast majority of allocations. (Since we are using C++, I will assume pray that you don't use malloc or something. However, if you do, be sure to pass those allocations through a wrapper, e.g. my_malloc, which takes care of registering the memory with the memory tracker system, via the interface defined in the third point above).
The remainder is stuff like HeapAlloc, VirtualAlloc, and so on (depending on your platform you may get others, eg. on the Xbox). Those allocations will need to be registered manually via point #3. However, since they can be found easily using a global find/replace, this isn't too big of a headache.
Designing the Memory Tracker
Before we get too bogged down in the details of actually capturing memory allocations, we need to figure out exactly how to keep track of all the memory usage in the first place.
Normally, we could be really lazy and do something simple, like this:
struct AllocationRecord
{
void* Address;
size_t NumBytes;
};
vector<AllocationRecord> Allocations;
However, this will invoke the global ::new operator when the vector does its allocations, which means the memory tracker will try to record itself - and die in a fiery infinite loop that eventually obliterates your stack and crashes the program.
To get around this, we use the Windows feature that a process can have multiple heaps. By using HeapCreate to create a secondary heap, we can set up a special sandbox for keeping all of the memory metadata in.
So now we have some space, but what data structure should we use?
There are two basic concerns we need to balance out here: speed of allocation, and speed of deallocation.
Option One: Unsorted Vector
The simplest approach is to just blob all the memory into an unsorted vector, and insert/remove items as they are allocated/freed, respectively. However, this quickly leads to terrible performance, because the vector will get resized and recopied many times.
A better approach is to flag a slot in the vector as "freed". When a new piece of memory needs to be added, we first look through the vector for a free slot, and place the memory record there. If no free slot is found, then we expand the vector by a factor of 1.5 or 2, give or take, depending on how the memory usage patterns look. (This is one of many numbers that can take a lot of tweaking to get just right.)
Unfortunately, this option performs very poorly because it requires a large number of O(n) lookups and copies, where n is the number of allocations your program has made. So let's try something different.
Option Two: Sorted Vector
If we sort the vector by allocation address, we can then look up an item in O(log2 n), using a binary search. This makes it very fast to free an item of memory. However, we pay a dear cost for this in the memory allocation routine, which has to ensure that the vector remains sorted. Not the best option available.
Option Three: Bucket-Sorted Vectors
The final approach, which I have personally settled on, is to use a set of buckets, each with a small vector associated with it. We look at the address of each allocated block of memory, do some bit shuffling, and use the resulting number to decide which bucket the address belongs to.
Once we have a bucket determined, we can simply add the memory record to the end of the vector, resizing it as needed. We can keep the vector size (and resize increment) fairly small.
To make things even better, instead of just a vector, we can use a linked list of vectors, each with a fixed (small) size. When one vector fills up, we just allocate another one and point to it from the end of the last vector. This allows us to keep a theoretically unlimited number of allocation records without requiring huge amounts of overhead up-front.
Typical overhead in the places I've used this technique varies from 1 to 25 MB, depending on the number and pattern of allocations being made. In general, it is very efficient, both in terms of space and time - lookups are dramatically sped up by the bucket system, and traversing the linked list of vectors is fairly fast. Overall there was less than 1 FPS drop by introducing this system in our current project code.
Keeping Track of Call Stacks
When you identify a memory leak or other problematic allocation, it is extremely useful to be able to see a call stack describing where and how the allocation occurred. So along with our allocation size and address, let's also store a short (maybe 5-10 entry) call stack capture. This has a minimal cost since we can simply store return addresses from the stack frames (more technical detail on that in a later article).
Memory Types
One final thing we want to consider is dividing up the memory usage into types. In our current project at the moment we have 26 different types, ranging from a "general" catchall type to highly specific things like XML parsing, strings, textures, meshes, and so on.
This can help save a lot of time diagnosing memory problems, since you always know what part of the code is using the memory.
A Peek Ahead
I'm deliberately avoiding implementation details in this first part of the series, specifically to keep things simple and short. A good programmer should be able to look at the described system and put it together by himself in a few hours. For those of you interested in a step-by-step walkthrough of the implementation, stay tuned!
For now though, here's a juicy little preview, summing up the information we want to track:
struct MemTrackRecord
{
void* Address;
unsigned LastAccessCallStack[MEMTRACK_CALLSTACK_DEPTH];
int Size;
MemType MemoryType;
};
struct MemTrackBucketPage
{
MemTrackBucketPage* NextPage;
unsigned NumberOfEntries;
MemTrackRecord* RecordArray;
};
Until next time!
| |
| Saturday, October 11, 2008 |
 Memory Management in Real Games |
Posted - 10/11/2008 5:44:01 PM | Over the past week or so I've been working on a memory management system for our as-yet-unannounced project over at Egosoft.
We had several goals for the system, including the ability to debug common memory issues (leaks, double deletions, premature deletions, orphaned blocks, and so on) and present detailed information about what memory is going where.
During the course of this small adventure, I've run into several things that I thought would be worth sharing. Chances are this is going to only really affect people doing very large-scale projects (or console/handheld projects where memory is a premium resource, unlike on the PC). Also note that it's pretty much strictly C++ stuff, so those of you using real programming languages probably don't need to worry as much 
Tip #1 - Start Early
We've left memory management until fairly late; a large block of code is already written, and all of my updates for tracking memory have required careful surgical manipulation of the 640+ file, 200,000+ LOC code base. Going into existing code, understandings its memory usage patterns, and then retrofitting memory budgeting and tracking logic is very painful.
This is ideally something that should be done almost before anything else in the game is coded; it affects literally every facet of the game, and it's always much simpler to build on what already exists than to build in a vacuum and come back later to fill in the gaps.
Tip #2 - Use Lots of Typedefs
You should always use a typedef instead of a standard container directly. For example, instead of spreading references to std::vector<WayPoint> scattered everywhere, create a typedef std::vector<WayPoint> WayPointVectorT and use WayPointVectorT everywhere.
This makes it much simpler to come back later and introduce STL allocators to your existing containers.
Tip #3 - Detail is Good
We've divided up our memory budget into about 20 different "types" of memory - ranging from file I/O buffers to graphics meshes to AI logic. Having a very detailed output of where memory is going makes it much simpler to see leaks. For example, if you know that the 300KB that's floating around belongs to the pathfinding system, you have radically narrowed down the amount of code you have to trace to find the leak.
In general, my rule of thumb here is pretty simple: if I can't decide what memory type a given chunk of memory belongs to, I create a new one just for that purpose. If implemented well, the overhead of this is extremely cheap, and precision is a very valuable tool when it comes to debugging.
Tip #4 - Create the Tools First, Then Teach People to Use Them
Inevitably on any project with more than one programmer you will encounter someone who is a bit behind the curve relative to their peers on the team. Instead of waiting for people to ask for a tool that would be useful, implement the tool anyways and teach people to use it as soon as possible.
This may seem like wasted time, but the rationale is really just common-sense: sometimes you don't notice you've been burned until you've had your hand stuck in the fire for a while. People won't recognize the need for tools in most cases; providing them as early as you can not only saves time, but improves the work environment overall.
Tip #5 - Implement it on a weekend
The best way to get something this significant and complex built into a pre-existing codebase is to get everyone else to temporarily check in all their code (make sure it compiles and runs first!), and work on the implementation while nobody else is active on the project. Merging delicate changes to memory management can be a nightmare, and often introduces unforeseen bugs and complications that will haunt you later.
Tip #6 - Take the opportunity to refactor a bit
During the course of implementing memory management, you'll begin to notice a lot of dependencies and interconnections between modules based on their memory usage behaviour. If any of these situations smell wrong, you've got a hand-crafted opportunity to correct them. Now you're improving the code base in two ways instead of one... so ask for a raise when you're done 
Tip #7 - Plan your budget rigorously
We've been more or less flying on the seat of our pants with memory usage up until now. Unfortunately, we really need to get RAM usage under control, for a variety of reasons. This means we've planned on setting up a strict budget - each memory type is allowed only a certain number of bytes of allocation. Exceeding that size will fail and halt the game.
Just be careful not to invent the budget numbers on the fly. You'll be much better off if you divide your 1GB address space (or whatever your minimum spec says) into roughly correct blocks based on memory type. Without some planning ahead, it's easy to reach a situation where you've crammed in 9 out of 10 features, and the last one just totally thrashes your RAM constraints. So budget first, and then work on getting things into the correct space. Having a hard number of "it has to be this small" is very useful versus simply "uh oh, it has to be smaller somehow".
So there you have it - some little bits of drivel from the trenches. Now if you'll excuse me, I have 15 compiler errors to go fix.
| |
In locus hic, omnes res dementes sunt.
|
| S | M | T | W | T | F | S | | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | | 12 | 13 | 14 | 15 | | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | | 31 | |
OPTIONS
Track this Journal
ARCHIVES
July, 2009
June, 2009
May, 2009
April, 2009
March, 2009
February, 2009
January, 2009
October, 2008
September, 2008
August, 2008
July, 2008
June, 2008
May, 2008
April, 2008
March, 2008
February, 2008
January, 2008
December, 2007
November, 2007
October, 2007
September, 2007
August, 2007
July, 2007
June, 2007
May, 2007
April, 2007
March, 2007
February, 2007
January, 2007
December, 2006
November, 2006
October, 2006
September, 2006
August, 2006
July, 2006
June, 2006
May, 2006
April, 2006
March, 2006
February, 2006
January, 2006
December, 2005
November, 2005
October, 2005
|