mrheisenberg

Is memory management a must have?


mrheisenberg    362
In the book Game Programming Gems 8 it talks about overriding the [i]new[/i] operator and creating some sort of HeapManager system to avoid memory fragmentation in RAM at run time. Is this some kind of problem for older systems, or do you still have to do it on Windows 7 with new hardware? It also says that if you have an array of structs with, let's say, 5 [i]ints[/i] in each and you only use 1 [i]int[/i] from each struct, the CPU still wastes cycles on the extra unused data. I can't find much info about this anywhere else. Edited by mrheisenberg

wqking    761
Don't solve a problem you don't have yet.

Also, it depends heavily on your memory management strategy.
If you allocate all memory when entering a game level and free it all when exiting, you won't get memory fragmentation.

5 ints causing the CPU to waste cycles? CPUs now fetch at least 32 bytes per cache line, so how would that waste happen? [img]http://public.gamedev.net//public/style_emoticons/default/biggrin.png[/img] Edited by wqking

JohnnyCode    1046
In C++, local variables are allocated on the stack, which is preallocated, so the OS does not have to search for memory to provide. BUT, when you use "new", it does search for memory.
If you have a function with a local variable - char* pMany = new char[100000000]; - and you call delete[] pMany before the end of the block, then you should rather use a preallocated pool, like this:

{
    char* pMany = (char*)m_Pool->Request(100000000); // all great
    CClassLike* pObject = (CClassLike*)m_Pool->Request(sizeof(CClassLike)); // beware, THIS will not call the constructor or destructor
    // use it, and when finished with the local data reset the pool, or reset it before the function runs
    m_Pool->Reset();
}

This would be the m_Pool type, CPool:

class CPool
{
public:
    CPool()
    {
        m_bAllocated = false;
        m_pPoolMemory = NULL;
    }
    bool CreatePool(long bytescount)
    {
        // use only once, or handle freeing the old pool first
        m_pPoolMemory = new char[bytescount];
        m_bAllocated = true;
        m_iPoolSize = bytescount;
        m_iCurrentProvidedBytes = 0;
        return true;
    }
    char* Request(long size)
    {
        if (m_iCurrentProvidedBytes + size > m_iPoolSize)
            return NULL;
        char* result = m_pPoolMemory + m_iCurrentProvidedBytes;
        m_iCurrentProvidedBytes += size;
        return result;
    }
    void Reset()
    {
        m_iCurrentProvidedBytes = 0;
    }
    void FreePool()
    {
        if (m_bAllocated)
        {
            delete[] m_pPoolMemory;
            m_bAllocated = false;
        }
    }
private:
    long m_iPoolSize;
    long m_iCurrentProvidedBytes;
    char* m_pPoolMemory;
    bool m_bAllocated;
};

As you can see, CPool::Request(size) is just a few fast instructions. And I would not override the "new" operator, but that is just my personal preference.
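The caveat in the snippet above - that the pool won't run constructors - can be handled with placement new, which constructs an object inside memory the pool hands back. A minimal sketch (the BumpPool and Enemy types here are illustrative stand-ins, not the CPool above; alignment is ignored for brevity):

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new
#include <vector>

// A tiny bump allocator, the same idea as CPool above.
struct BumpPool {
    std::vector<char> storage;
    std::size_t used = 0;
    explicit BumpPool(std::size_t bytes) : storage(bytes) {}
    void* Request(std::size_t size) {
        // note: ignores alignment; fine for this small demo only
        if (used + size > storage.size()) return nullptr;
        void* p = storage.data() + used;
        used += size;
        return p;
    }
    void Reset() { used = 0; }
};

struct Enemy {
    int hp;
    Enemy() : hp(100) {}  // the constructor we want to run
};

// Placement new constructs the object inside pool memory; no heap
// allocation happens. The destructor must be called manually before
// the pool is reset, since the pool never calls it for us.
inline Enemy* MakeEnemy(BumpPool& pool) {
    void* mem = pool.Request(sizeof(Enemy));
    if (!mem) return nullptr;
    return new (mem) Enemy();
}
```

Remember the matching manual destructor call (obj->~Enemy();) before Reset() if the type owns resources.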

To add on to frob's statement, you don't even need a generic "pool" to take advantage of this. Depending on what you are doing, you can just reuse the same memory blocks for the same purpose (but with new data), and get massive speedups. This has the added benefit that you don't need to worry about calculating sizes or fragmentation issues.

Real life example from my current project:
Since my game is a 2D tile based game, and every "chunk" of the world is the same size (20 by 20 tiles), when I need to unload chunks and load new chunks when the player moves around, I use the memory allocated for the previous chunks to store the new chunks, instead of calling delete() on the old area and calling new() on the new area. Instead of calling new() for 20 * 20 (400) tile structs (per layer per chunk), the tile memory is just reused for new tiles. My speed ups were very noticeable, though I don't remember the exact amount gained.

In the same way, I reuse "tile" memory, "layer" memory, and "chunk" memory, and only allocate if I need more layers, or delete if I have unused layers.
[size=2]Note: the 'tile' struct was very small and was already taking advantage of the [url="http://en.wikipedia.org/wiki/Flyweight_pattern"]Flyweight pattern[/url], mostly holding a pointer to the shared data and a few extra ints for animation details (current frame, timing, etc...).[/size]
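A sketch of the reuse idea described above - overwrite the old chunk's tiles in place instead of delete/new per chunk. All names here (Tile, Chunk, ReloadChunk) are hypothetical, not the poster's actual code:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical tile struct: a flyweight index into shared data
// plus a few ints for animation state.
struct Tile {
    int tileId = 0;
    int currentFrame = 0;
    int frameTimer = 0;
};

constexpr int kChunkSize = 20;  // 20 x 20 tiles per chunk

struct Chunk {
    // 400 tiles, allocated once and kept for the life of the chunk slot
    std::array<Tile, kChunkSize * kChunkSize> tiles;
};

// Instead of delete-ing the old chunk and new-ing a fresh one,
// overwrite the existing tiles in place with the incoming data.
inline void ReloadChunk(Chunk& chunk, const int* newTileIds) {
    for (std::size_t i = 0; i < chunk.tiles.size(); ++i) {
        chunk.tiles[i].tileId = newTileIds[i];
        chunk.tiles[i].currentFrame = 0;
        chunk.tiles[i].frameTimer = 0;
    }
}
```

No allocation happens during the reload, which is where the speed-up comes from.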

The rest of my game [i]doesn't[/i] do this, since it's [i]not[/i] a bottleneck and would be [url="http://en.wikipedia.org/wiki/Premature_optimization#When_to_optimize"]premature optimization[/url] (and introduces undesired complexity), but when I [u]need[/u] the extra speed, there's almost always a way to get the extra speed. But don't be distracted by what you don't yet need.

the_edd    2109
[quote name='frob' timestamp='1349129916' post='4985892']
If you really feel like you need them, use a pre-written pool library like [url="http://www.boost.org/libs/pool"]boost::pool[/url] that is already debugged and working properly.
[/quote]

Just a note to say that boost::pool has been found to be considerably slower than the default allocator on some systems, just highlighting how non-trivial this is. I believe many people on the boost mailing list want the library (in its current form) deprecated in some way.

rip-off    10979
Is such memory management a must have? How much effort you expend managing memory heavily depends on the kind of game you are making.

For small games, it is often enough to ensure that one is not doing silly things - such as loading large resources multiple times (e.g. cache your textures/sounds etc.), or dynamically allocating scores of simple objects such as particles or bullets.
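The "don't load the same resource twice" point can be as simple as a lookup-before-load cache. A minimal sketch, with illustrative names (Texture, LoadFromDisk, TextureCache are assumptions, not any particular engine's API):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for a real texture type.
struct Texture {
    std::string path;
};

// Counts disk loads so the test below can show the cache working.
static int g_diskLoads = 0;

std::shared_ptr<Texture> LoadFromDisk(const std::string& path) {
    ++g_diskLoads;  // pretend this is an expensive file read + GPU upload
    return std::make_shared<Texture>(Texture{path});
}

// Each path is loaded from disk at most once and shared thereafter.
class TextureCache {
public:
    std::shared_ptr<Texture> Get(const std::string& path) {
        auto it = cache_.find(path);
        if (it != cache_.end()) return it->second;  // cache hit: no disk I/O
        auto tex = LoadFromDisk(path);
        cache_[path] = tex;
        return tex;
    }
private:
    std::map<std::string, std::shared_ptr<Texture>> cache_;
};
```

The same pattern applies to sounds, models, or any resource that is expensive to create and safe to share.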

As your game scales up, merely not doing silly things will eventually cease to suffice. Of course, it is an open question as to which limit you are going to start brushing up against first. Depending on the application, you might find that the memory limitations you come up against are actually on the GPU. Alternatively, the limit might be unrelated to managing memory, but instead due to memory access patterns, as others have mentioned. Or it might be neither - there are lots of other things which could cause the game to under-perform, such as an algorithmic bottleneck in some unexpected subsystem. Middle-range games might only need a bit of attention applied to one or two key subsystems to make the game perform as expected.

At the upper end of this scale are the kind of AAA games that require major engineering effort to even work on the computers they will be released on. You will be handed a time/memory budget for your subsystem and you'd better not exceed it! It is highly likely that all subsystems in the underlying engine will have some kind of custom memory management - even if only to enable the developers to see if they are meeting their budget.

Kyall    287
It's not a must have.

But having a working tool to locate memory leaks is.

You can have memory leak detection going on with a custom heap allocator/deallocator, but generally the tools out there that work with new & delete are far better in terms of usability when it comes to tracking down memory leaks and fixing them.

mhagain    13430
[quote name='mrheisenberg' timestamp='1348966427' post='4985207']
In the book Game Programming Gems 8 it talks about overriding the [i]new[/i] operator and creating some sort of HeapManager system to avoid memory fragmentation in RAM at run time. Is this some kind of problem for older systems, or do you still have to do it on Windows 7 with new hardware?[/quote]

This only applies if you're allocating and releasing memory at runtime. And if you are - well, you shouldn't be. The correct pattern is to allocate everything you need up-front at start or load time, then just use it at runtime. If you need a temp pool of scratch memory for short-lived runtime allocations, then create one and pull from that (but create the pool at startup or load time too, not at runtime). Otherwise you shouldn't be overloading new or delete in the general case.

[quote name='mrheisenberg' timestamp='1348966427' post='4985207']It also says that if you have an array of structs with, let's say, 5 [i]ints[/i] in each and you only use 1 [i]int[/i] from each struct, the CPU still wastes cycles on the extra unused data. I can't find much info about this anywhere else.
[/quote]

That sounds like the worst kind of micro-optimization. If you're worrying about things down to individual bytes and cycles, then you're worrying about the wrong things. There's rarely meaningful performance gains to be had from that kind of optimization. At the same time, if you have a struct with 5 ints but you only use one of them, the big question is - [i]why on earth do you have 5 ints in the struct[/i]? If there's no reason for the other 4 to be there - [i]get rid of them[/i]. But that's from a good code-cleanliness perspective rather than anything else.

alh420    5995
[quote name='mhagain' timestamp='1349695318' post='4987949']
That sounds like the worst kind of micro-optimization. If you're worrying about things down to individual bytes and cycles, then you're worrying about the wrong things. There's rarely meaningful performance gains to be had from that kind of optimization. At the same time, if you have a struct with 5 ints but you only use one of them, the big question is - [i]why on earth do you have 5 ints in the struct[/i]? If there's no reason for the other 4 to be there - [i]get rid of them[/i]. But that's from a good code-cleanliness perspective rather than anything else.
[/quote]

I think they are referring more to a design issue, where those 5 ints are in some way conceptually coupled (they are parameters of a "vehicle" or such), but in the particular function you want to optimise you only use one of them (for example the "speed"). In that case it might be bad for memory throughput and the cache to work on this sparse array.
Still a micro optimisation though, and nothing you should worry about until you find you need it through performance measurements.

It's good to know about these "tricks" or "gems", but one should not confuse them with code guidelines, and should not worry about them in daily work. (unless you are a performance optimisation specialist) Edited by Olof Hedman

Saruman    4339
[quote name='mhagain' timestamp='1349695318' post='4987949']
That sounds like the worst kind of micro-optimization. If you're worrying about things down to individual bytes and cycles, then you're worrying about the wrong things. There's rarely meaningful performance gains to be had from that kind of optimization.
[/quote]
I wouldn't call it a micro-optimization at all but rather a design issue and one that is becoming more important every year. There are two issues at stake, the first being the fact that fetching from RAM is slow and will continue to get slower. Therefore one of the most important issues is how you fetch and cache the data to operate on as there is really no reason not to do this... it usually makes the code much easier to read. On current generation platforms (and likely future) it is extremely important as you can't afford to DMA a bunch of data that you don't need to work on, etc. Therefore I wouldn't call this the "worst kind of optimization" but rather "the best kind of design".

Also note when data is separated out and designed like this it usually goes hand in hand with being able to parallelize operations much easier. You aren't passing a large object with the kitchen sink inside (where realistically anything could be called/changed)... you are able to pass large contiguous blocks of memory that hold specific pieces of data to be worked on. Edited by Saruman
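The separation described above is usually called array-of-structs (AoS) vs struct-of-arrays (SoA). A minimal sketch under assumed names (Vehicle, VehicleData are illustrative, not from any post here): summing speeds over an SoA layout touches only the bytes it needs, while the AoS version drags every field of every Vehicle through the cache.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: a pass over 'speed' still pulls fuel, gear, etc.
// into the cache alongside it.
struct Vehicle {
    float speed;
    float fuel;
    int gear;
    int damage;
    int ownerId;
};

// Struct-of-arrays: speeds sit contiguously in their own array,
// so iterating them streams one tightly packed block of memory.
struct VehicleData {
    std::vector<float> speed;
    std::vector<float> fuel;
    std::vector<int> gear;
    std::vector<int> damage;
    std::vector<int> ownerId;
};

float TotalSpeedAoS(const std::vector<Vehicle>& v) {
    float sum = 0.0f;
    for (const Vehicle& e : v) sum += e.speed;  // loads whole structs
    return sum;
}

float TotalSpeedSoA(const VehicleData& v) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < v.speed.size(); ++i) sum += v.speed[i];
    return sum;  // streams only the speed array
}
```

Both functions compute the same result; the difference is purely how much memory traffic each one generates per element, which is also what makes the SoA form easier to hand off to a parallel job in one contiguous block.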

JohnnyCode    1046
[quote name='mhagain' timestamp='1349695318' post='4987949']

That sounds like the worst kind of micro-optimization. If you're worrying about things down to individual bytes and cycles, then you're worrying about the wrong things. There's rarely meaningful performance gains to be had from that kind of optimization. At the same time, if you have a struct with 5 ints but you only use one of them, the big question is - [i]why on earth do you have 5 ints in the struct[/i]? If there's no reason for the other 4 to be there - [i]get rid of them[/i]. But that's from a good code-cleanliness perspective rather than anything else.
[/quote]

In my game I have on-the-fly mesh loading from an HDD stream. A mesh of 100,000 vertices needs about 8 arrays of megabyte size to dispatch to GPU RAM. If I had been allocating those temporary byte arrays, sending them to the GPU, and freeing them, my game would have been dropping frames badly. Instead I started using preallocated memory for those temporary large arrays, and now I can load several 100,000-vertex models into the scene without a noticeable frame drop (without textures of course; those are preloaded for the whole world).

This was just an example; we are not talking about preallocating 5 ints. Dismissing preallocated memory as a useless optimization really misses the point.
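The staging-buffer idea above can be sketched as a single reusable block sized once for the largest mesh expected. The class and names here are an assumption for illustration, not the poster's actual code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// One staging buffer, allocated once at load time and reused for
// every mesh upload instead of new[]/delete[] per mesh.
class StagingBuffer {
public:
    explicit StagingBuffer(std::size_t capacity) : buffer_(capacity) {}

    // Copy vertex bytes into the reusable buffer. A real version
    // would then hand data() to the GPU upload call (glBufferData,
    // or whatever the API provides).
    bool Stage(const void* src, std::size_t bytes) {
        if (bytes > buffer_.size()) return false;  // mesh too big for buffer
        std::memcpy(buffer_.data(), src, bytes);
        return true;
    }

    const char* data() const { return buffer_.data(); }

private:
    std::vector<char> buffer_;
};
```

The only allocation happens in the constructor, so streaming meshes in and out during gameplay never touches the heap.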

As the other posters have pointed out, the gem refers to having a pool of memory that you allocate up front, as opposed to allocating on demand. This is the way that the Java JVM works; at startup time, it requests (from the operating system) the maximum amount of memory the program is configured to use (per environment flags), and then does its own allocations out of that memory later. This way it doesn't have to wait on the OS scheduler, kernel, whatever, to do the job for it, and it can optimize its memory arrangement however is optimal for that specific program. The previously mentioned boost::pool does the same thing. There are C libraries that do the same, etc, ad infinitum.

See the wikipedia article on Memory Pools for more generalized information: [url="http://en.wikipedia.org/wiki/Memory_pool"]http://en.wikipedia.org/wiki/Memory_pool[/url] Edited by akesterson

larspensjo    1561
[quote name='akesterson' timestamp='1350222745' post='4990033']
As the other posters have pointed out, the gem refers to having a pool of memory that you allocate up front, as opposed to allocating on demand. This is the way that the Java JVM works;
[/quote]

It is a mechanism I am hesitant about. It reminds me of eating lunch at a place like McDonald's: if it looks like the tables are not going to suffice, people start claiming tables before ordering their food. It is also like Microsoft Windows today: many applications take a long time to start, so they add some pre-startup functionality to the system boot or login.

Of course, there may be a benefit of speed. But it can also result in everyone losing. Please excuse me for associations in tangent space.

[quote name='larspensjo' timestamp='1350296126' post='4990331']
It is a mechanism I am hesitant about. ... Of course, there may be a benefit of speed. But it can also result in everyone losing. Please excuse me for associations in tangent space.
[/quote]

It's not suitable for every situation, certainly, but there are times when you know you are better off allocating everything up front, rather than piecemeal. YMMV.

