
## What improves better memory performance?

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

41 replies to this topic

### #21Hodgman  Moderators

Posted 07 September 2012 - 05:21 AM

I think the aversion against GC is a little too strong here and there. Especially from hardcore C/C++ programmers.

I've just worked on a few console games and seen the fallout. GCs are absolutely great for developers; there are a lot of good reasons why we chose to code the game in Lua rather than C++ -- a big one being that it's simply a more productive language.
However, by the end of the project, we realised that we hadn't taken enough care to avoid creating unnecessary garbage, and so the programming team had to crunch for months to optimise the Lua code and re-write many systems in C++, in order to get the GC fast enough for real-time use. So my advice would be to make sure you keep an eye on your GC overheads. If possible, make the execution of the GC deterministic so you can accurately profile it.

Edited by Hodgman, 07 September 2012 - 05:23 AM.

### #22lawnjelly  Members

Posted 07 September 2012 - 05:55 AM

Have to say I'm with Hodgman on this one.

Using OS calls for allocation and deallocation at runtime in a game is one of the cardinal sins, but GC too? Urggg!! Do you know what these calls do behind the scenes? I wasn't even going to mention it.

If you do look into memory pools, stuff can get quite complicated so sometimes it might be nice to leave it to the GC platform's memory pool.

Well it would 'be nice' to be lazy and leave everything to a GC, but unfortunately there are reasons why people don't tend to use this kind of thing for time dependent stuff. I understand looking after memory is 'an extra bother' and 'complicated' but it's necessary if you want to make fast, stable code. I've also had to spend weeks sorting out problems caused by 'programmers' who thought memory management was 'a bother', and delayed shipping products, and left them bug ridden messes.

Favourite datatype: unsinged int

### #23SIC Games  Members

Posted 07 September 2012 - 09:38 AM

This is what this heavy debate conversation reminds me of:

Game Engine's WIP Videos - http://www.youtube.com/sicgames88
SIC Games @ GitHub - https://github.com/SICGames?tab=repositories
Simple D2D1 Font Wrapper for D3D11 - https://github.com/SICGames/D2DFontX

### #24swiftcoder  Senior Moderators

Posted 07 September 2012 - 10:01 AM

Do you know what these calls do behind the scenes?

Yes, I do (and my guess would be so does Hodgman, and a vast number of other people on this forum).

And that's a key issue when it comes to performance in general, not just where garbage collectors are concerned - you need to understand what goes on under the hood.

Don't fear the garbage collector just because you don't understand it. Odds are that if you don't have the knowledge/skills to rewrite the garbage collector from the ground up, you also don't have the knowledge/skills to beat it at performance.

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #25Hodgman  Moderators

Posted 07 September 2012 - 10:39 AM

Do you know what these calls do behind the scenes?

Yes, I do (and my guess would be so does Hodgman, and a vast number of other people on this forum).

I took that sentence as a rhetorical question -- as in, these calls do god awful stuff behind the scenes!

### #26swiftcoder  Senior Moderators

Posted 07 September 2012 - 10:48 AM

I took that sentence as a rhetorical question -- as in, these calls do god awful stuff behind the scenes!

And it may have been intended in that light, but it is still incorrect. There is nothing 'god awful' going on in a garbage collector - it's a deterministic process, much like everything else in computer science.

There are, of course, performance tradeoffs involved, and various pitfalls you need to be aware of, but there is nothing intrinsically evil about garbage collection.

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #27Hodgman  Moderators

Posted 07 September 2012 - 11:05 AM

There's a big difference between
List<Callback> toBeCalledAtLevelEndLeaveMeAloneUntilThen;
and
Graph<RootObject> traverseMeEveryFrameMarkingReachedNodes;
List<EveryObject> traverseMeEveryFrameLookingForNonMarkedNodes;
Yes, I've pitted the simplest of memory management against the dumbest non-generational mark-and-sweep GC, but the point is that for a lot of problems GCs or even smart pointers are complete over-engineering. In my made-up straw-man, it's the difference between zero overhead per frame, and several milliseconds of cache-misses per frame, for no effect.

It's not evil, it may just be a lot more complex than is actually required.

Being a systems programmer, when I see anything with random-access memory patterns, such as a GC traversing a graph of all of your objects, it does conjure up the description of "god awful". On my last game we used Lua, which has a very simple GC. We had to customize it quite a bit, and then also re-write a lot of the Lua code to minimize garbage generation, in order to avoid random 8ms spikes in frame-times. We also ran it on a hyper-threaded core during rendering to try and soak up the GC cost for free, which helped, but it also trashed the cache-lines that the renderer was using, slowing it down too.
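To make the straw-man concrete, here is a minimal sketch of the "dumbest" non-generational mark-and-sweep described above. `Node`, `ToyHeap` and `demoCollect` are made-up names for illustration, not anything from Lua or a real collector. The mark phase chases pointers through the graph and the sweep phase walks every object ever allocated, which is where the per-frame cache misses come from.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy graph node: 'marked' is the GC mark bit, 'children' are outgoing references.
struct Node {
    bool marked = false;
    std::vector<Node*> children;
};

// Hypothetical, deliberately naive non-generational mark-and-sweep heap.
class ToyHeap {
public:
    Node* allocate() {
        objects_.push_back(std::unique_ptr<Node>(new Node()));
        return objects_.back().get();
    }

    // Mark everything reachable from the roots, then sweep the whole heap.
    // Returns the number of objects freed.
    std::size_t collect(const std::vector<Node*>& roots) {
        // Mark phase: depth-first pointer chasing through the object graph.
        std::vector<Node*> stack(roots.begin(), roots.end());
        while (!stack.empty()) {
            Node* n = stack.back();
            stack.pop_back();
            if (n == nullptr || n->marked) continue;
            n->marked = true;
            for (Node* c : n->children) stack.push_back(c);
        }
        // Sweep phase: visit every object, keep the marked ones.
        std::size_t freed = 0;
        std::vector<std::unique_ptr<Node>> survivors;
        for (auto& obj : objects_) {
            if (obj->marked) {
                obj->marked = false;  // reset for the next cycle
                survivors.push_back(std::move(obj));
            } else {
                ++freed;  // destroyed when objects_ is replaced below
            }
        }
        objects_ = std::move(survivors);
        return freed;
    }

    std::size_t liveCount() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Node>> objects_;
};

// Usage sketch: root references a, while the third node is unreachable garbage.
std::size_t demoCollect() {
    ToyHeap heap;
    Node* root = heap.allocate();
    Node* a = heap.allocate();
    heap.allocate();  // never referenced, so it is garbage
    root->children.push_back(a);
    std::size_t freed = heap.collect({root});
    assert(heap.liveCount() == 2);  // root and a survive
    return freed;
}
```

Note that the cost of `collect` scales with the total number of objects, live or dead, regardless of how little garbage was actually created that frame.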

I'd still use them in the right circumstances, but with guidelines and care!

Edited by Hodgman, 07 September 2012 - 11:30 AM.

### #28swiftcoder  Senior Moderators

Posted 07 September 2012 - 11:29 AM

Being a systems programmer

In my mind, this is really the crux of the matter. I don't disagree with any of your points, but I also am sure that you know enough to correctly implement an alternative to GC (i.e. careful manual resource disposal, or a robust scoped/smart_ptr solution).

A lot of programmers don't have that background, and a surprising amount of the time the garbage collector actually beats naive attempts to manually manage memory (i.e. for many small allocations, it performs much more like a pool allocator than does malloc/new).
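As a rough illustration of why many small allocations can be cheap under a collector: the allocation fast path in a compacting or copying GC is often little more than a pointer bump. The sketch below shows that idea in C++; `BumpArena` is a hypothetical name, and this is not how any particular runtime implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bump ("arena") allocator: allocating is just advancing an offset.
class BumpArena {
public:
    explicit BumpArena(std::size_t bytes) : storage_(bytes), offset_(0) {}

    // 'align' must be a power of two. Returns nullptr when the arena is full.
    // (Overflow checks for huge sizes are omitted for brevity.)
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(storage_.data());
        const std::uintptr_t current = base + offset_;
        const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;
        const std::size_t newOffset = static_cast<std::size_t>(aligned - base) + size;
        if (newOffset > storage_.size()) return nullptr;
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Frees everything at once; there is no per-object deallocation.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t offset_;
};
```

A general-purpose `malloc` has to search for a suitable free block and maintain its bookkeeping on every call, which is the gap being described here. The trade-off is that the arena can only be released as a whole.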

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #29Hodgman  Moderators

Posted 07 September 2012 - 11:37 AM

A lot of programmers don't have that background, and a surprising amount of the time the garbage collector actually beats naive attempts to manually manage memory (i.e. for many small allocations, it performs much more like a pool allocator than does malloc/new).

Yep, and also avoids all the other nasties like leaks, dangling pointers and random memory corruption

In some circumstances, Keep It Simple Stupid might mean "just use the damn GC and don't try and reinvent the wheel", and in other circumstances, KISS might mean "oh god why are you using another complex tree structure when you don't even need to be managing resources to solve this problem".

As usual generalisations turn out to not be useful

### #30Matt-D  Members

Posted 07 September 2012 - 12:04 PM

Fixed-Size Block Allocator (FSBAllocator): http://warp.povusers...ator/
Boost.Pool: http://www.boost.org/libs/pool

According to the FSBAllocator's benchmark using either the Boost.Pool allocator or the FSBAllocator might yield significant speed-ups over the default allocator for select cases (e.g., many small allocations that swiftcoder mentioned): http://warp.povusers...ator/#benchmark

IMHO, the fact that you can do that (just plug-in a specialized allocator when you need it / when the defaults don't work for you) is a significant advantage over the one-GC-fits-all solutions (where, when the defaults don't work for you, the best you can do is to experiment with GC tuning).
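A minimal sketch of what "plugging in" an allocator looks like with a standard container, assuming C++11. `CountingAllocator` and `AllocStats` are hypothetical names; the allocator simply forwards to `::operator new` and counts calls, where a real specialised allocator would hand out blocks from a pool.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Shared statistics so the effect of the plug-in allocator is observable.
struct AllocStats {
    std::size_t allocations = 0;
    std::size_t deallocations = 0;
};

// Hypothetical minimal C++11 allocator that counts every request it serves.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    AllocStats* stats;

    explicit CountingAllocator(AllocStats* s) : stats(s) { assert(s != nullptr); }

    // Converting constructor: containers rebind the allocator to their node type.
    template <typename U>
    CountingAllocator(const CountingAllocator<U>& other) : stats(other.stats) {}

    T* allocate(std::size_t n) {
        ++stats->allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) {
        ++stats->deallocations;
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return a.stats == b.stats;
}

template <typename T, typename U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) {
    return !(a == b);
}

// Usage sketch: the container type names the allocator; the calling code is unchanged.
bool demoPluggedAllocator() {
    AllocStats stats;
    {
        CountingAllocator<int> alloc(&stats);
        std::list<int, CountingAllocator<int>> values(alloc);
        values.push_back(1);
        values.push_back(2);
        values.push_back(3);
    }  // the list is destroyed here and returns all of its nodes
    return stats.allocations >= 3 && stats.allocations == stats.deallocations;
}
```

The exact number of allocations depends on the standard library implementation, so the sketch only checks that every allocation was matched by a deallocation.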

Edited by Matt-D, 07 September 2012 - 12:05 PM.

### #31Karsten_  Members

Posted 07 September 2012 - 01:23 PM

Well it would 'be nice' to be lazy and leave everything to a GC,

Sometimes the developer (or team) may not be quite skilled enough to have a choice and I hope to god that if working in a team, I wouldn't have to use some bodgy monstrosity.

In my spare time, I port software to FreeBSD and notice quite a few cases where developers have tried to hand roll their own stuff; nothing flags up bugs in this type of thing better than porting to an entirely new platform (and older version of GCC). So I suggest using a garbage collector unless you really know how to use the language properly.

While I haven't properly touched managed languages for well over 4 years, the Boehm GC works satisfactorily on C++.

For software which needs no clever memory management however, the only solution is tr1/shared_ptr!

Edited by Karsten_, 07 September 2012 - 01:28 PM.

http://tinyurl.com/shewonyay - Thanks so much for those who voted on my GF's Competition Cosplay Entry for Cosplayzine. She won! I owe you all beers

Mutiny - Open-source C++ Unity re-implementation.
Defile of Eden 2 - FreeBSD and OpenBSD binaries of our latest game.

### #32Cornstalks  Members

Posted 07 September 2012 - 01:43 PM

For software which needs no clever memory management however, the only solution is tr1/shared_ptr!

Or if you're using C++11, any of C++'s smart pointers (not in tr1).
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]
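For readers who have not used them, a short sketch of the two main C++11 smart pointers mentioned above. `Texture` and `demoSmartPointers` are made-up names for illustration; the `alive` counter is only there to show that cleanup happens at scope exit.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical resource type; 'alive' counts live instances.
struct Texture {
    static int alive;
    Texture() { ++alive; }
    ~Texture() { --alive; }
};
int Texture::alive = 0;

int demoSmartPointers() {
    {
        // unique_ptr: exactly one owner, ownership can only be moved.
        std::unique_ptr<Texture> owner(new Texture());
        std::unique_ptr<Texture> moved = std::move(owner);
        assert(owner == nullptr);
        assert(moved != nullptr);

        // shared_ptr: reference counted, freed when the last owner goes away.
        std::shared_ptr<Texture> a = std::make_shared<Texture>();
        std::shared_ptr<Texture> b = a;
        assert(a.use_count() == 2);
        assert(Texture::alive == 2);
    }  // both textures are destroyed here, deterministically
    return Texture::alive;
}
```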

### #33lawnjelly  Members

Posted 07 September 2012 - 02:39 PM

Well it would 'be nice' to be lazy and leave everything to a GC,

Sometimes the developer (or team) may not be quite skilled enough to have a choice and I hope to god that if working in a team, I wouldn't have to use some bodgy monstrosity.

In my spare time, I port software to FreeBSD and notice quite a few cases where developers have tried to hand roll their own stuff; nothing flags up bugs in this type of thing better than porting to an entirely new platform (and older version of GCC). So I suggest using a garbage collector unless you really know how to use the language properly.

While I haven't properly touched managed languages for well over 4 years, the Boehm GC works satisfactorily on C++.

For software which needs no clever memory management however, the only solution is tr1/shared_ptr!

Yup, don't get me wrong, in the majority of apps I'd be all for using all the tricks in the book to make things simpler. Garbage collection, you name it.

Sorry if I come off as opinionated on the subject, I was a bit unfair on you Karsten .. I've had to deal with the mess caused in the past and it's not been pleasant. It's not very fair when people's jobs are on the line, and their families depending on them etc.

It's just in the specific case of (professional) games, particularly on fixed low memory devices (and some other software on embedded systems), my personal belief is that controlling the memory yourself can be the best option. That doesn't mean it's necessarily the best approach for people learning .. it's more an approach for making a solid professional product.

The two main reasons I would argue for this are:

Stability
Predictable timing

Stability - no worries about failed allocations .. your game will run each time, every time, no matter how many levels you load, what combinations of objects need to be loaded. There's no, ah but if character B walks round the back of building A, carrying object C and opens the door on level BLAH, then it crashes. Sometimes. Which is pretty much what you don't want to hear about when you are trying to ship something. Or what happens if someone is running such and such a program in the background in a multitasking environment.

Of course it's possible you could get round this to some extent with your Garbage Collection system - if it can allow you to pre-reserve your memory, (depending on its implementation regarding fragmentation), and if you keep a tight handle on your numbers of various objects. But once you get to this extent you are almost doing the work of doing it yourself anyway.

The other is that there is no question over the time taken over a deallocation / allocation. It is determined by your code and can be tightly determined - usually a constant very short time. There's no worry about dropping frames etc. Using a third party allocation / deallocation system leaves you at the mercy of their implementation. That's not to say there aren't good implementations, but there are also bad ones, and worst cases. Windows for example is quite happy to grind to a halt and do some disk swapping when it thinks it's necessary during an allocation / deallocation.

I fully understand that it can be a bit of extra effort (sometimes quite a bit) to manage memory yourself, although it's usually mainly a one off cost setting up your project. But development isn't just the time putting the code together, it's also beta testing, trying lots of different scripts, game levels, combinations of factors. In this situation the more potential problems you can remove the better.

If you are working to a time schedule with milestones and a budget and staff costs to pay, the last thing you want is some vague uncertainty over 'yeah it may take 2 years to beta test this thing'. That's one of the (several) reasons why games get canned / companies go under.

But anyway at the end of the day it's up to whoever is technical lead on a project to make these kind of decisions. Right I'm tired that's enough essaying it's bedtime!

Favourite datatype: unsinged int

### #34swiftcoder  Senior Moderators

Posted 07 September 2012 - 03:16 PM

The other is that there is no question over the time taken over a deallocation / allocation. It is determined by your code and can be tightly determined - usually a constant very short time.

So, I mostly agree with the rest of your post, but this point isn't quite as straightforward as you suggest.

Malloc/new are not deterministic. The cost of an individual allocation is generally much higher than that of a garbage collector, and it is not a fixed cost. But you do get the (to my mind, dubious) benefit that the performance cost is incurred at the call site (whereas garbage collection incurs a performance cost at an indeterminate later date).

If you actually need deterministic allocation cost, then you have to go with other solutions (probably ahead-of-time allocation: pool allocators, SLAB allocators, etc.)
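A minimal sketch of the ahead-of-time approach, assuming a single block size and single-threaded use. `FixedPool` is a hypothetical name, not a library class. All memory is reserved in the constructor, and `allocate`/`release` are constant-time operations on an intrusive free list.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool with O(1) allocate and release.
class FixedPool {
    struct FreeNode { FreeNode* next; };

public:
    FixedPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(roundUp(blockSize)),
          storage_(blockSize_ * blockCount),
          freeList_(nullptr),
          freeCount_(blockCount) {
        // Thread every block onto the free list up front.
        for (std::size_t i = 0; i < blockCount; ++i) {
            FreeNode* node = new (&storage_[i * blockSize_]) FreeNode;
            node->next = freeList_;
            freeList_ = node;
        }
    }

    // Returns nullptr when the pool is exhausted; never touches the OS heap.
    void* allocate() {
        if (freeList_ == nullptr) return nullptr;
        FreeNode* node = freeList_;
        freeList_ = node->next;
        --freeCount_;
        return node;
    }

    // 'p' must have come from allocate() on this pool.
    void release(void* p) {
        assert(p != nullptr);
        FreeNode* node = new (p) FreeNode;
        node->next = freeList_;
        freeList_ = node;
        ++freeCount_;
    }

    std::size_t freeCount() const { return freeCount_; }

private:
    // Round up so each block can hold a FreeNode and stays suitably aligned.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(FreeNode)) n = sizeof(FreeNode);
        return (n + a - 1) / a * a;
    }

    std::size_t blockSize_;
    std::vector<unsigned char> storage_;
    FreeNode* freeList_;
    std::size_t freeCount_;
};
```

Exhaustion shows up as a `nullptr` at a known call site rather than as a pause at some later time, which is the determinism being asked for.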

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #35Karsten_  Members

Posted 07 September 2012 - 04:07 PM

Or if you're using C++11, any of C++'s smart pointers (not in tr1).

Agreed, although I did mention I was using an older version of GCC (due to the old BSD compatible license). When possible I will always take advantage of newer features of the C++ language!

Sorry if I come off as opinionated on the subject, I was a bit unfair on you Karsten .. I've had to deal with the mess caused in the past

Heh, no worries. I seem to be on the wrong side of this argument anyway because I am usually the first to advocate the use of manual memory management, RAII and simple clean solutions. ;)

I find deterministic destruction plays a much bigger part in my software than simply cleaning up memory too. For example, if a unit of execution (e.g. a thread) is running within an object (or holds references to it), then that object will never be flagged for disposal by the GC. This I find is quite a critical design flaw within most GC languages, since what I really want to happen is that once the object goes out of scope, the class should join the thread and deallocate in an elegant, exception-safe manner.
The only .NET language that seems to support this is C++/CLI since you can use auto_handle<T> as a means to implement the RAII pattern.
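In plain C++11 the pattern described above might look like the sketch below. `Worker` and `demoScopedWorker` are hypothetical names, and the `finished` flag exists only so the behaviour can be observed. The destructor signals the thread to stop and joins it, so leaving scope is enough to shut it down. (On some toolchains this needs to be built with thread support enabled, e.g. `-pthread`.)

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker that owns a thread for the lifetime of the object.
class Worker {
public:
    explicit Worker(std::atomic<bool>* finished)
        : running_(true), finished_(finished), thread_(&Worker::run, this) {
        assert(finished != nullptr);
    }

    // RAII: stop and join the thread when the object goes out of scope.
    ~Worker() {
        running_ = false;
        if (thread_.joinable()) thread_.join();
    }

    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

private:
    void run() {
        while (running_) {
            std::this_thread::yield();  // stand-in for real work
        }
        finished_->store(true);
    }

    std::atomic<bool> running_;
    std::atomic<bool>* finished_;
    std::thread thread_;  // declared last so the flags exist before the thread starts
};

bool demoScopedWorker() {
    std::atomic<bool> finished(false);
    {
        Worker w(&finished);
    }  // ~Worker joins here, so the thread has completed by the next line
    return finished.load();
}
```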

Slightly offtopic...
I don't know if anyone else noticed that Apple has recently deprecated garbage collection in 10.8 for their Objective-C. Quite an interesting decision showing that perhaps they feel that manual memory management (or reference counting) isn't much harder than relying on a GC or at least that the performance will be superior etc....
http://developer.app...troduction.html

"Garbage collection is deprecated in OS X Mountain Lion v10.8, and will be removed in a future version of OS X"

Edited by Karsten_, 07 September 2012 - 04:11 PM.

http://tinyurl.com/shewonyay - Thanks so much for those who voted on my GF's Competition Cosplay Entry for Cosplayzine. She won! I owe you all beers

Mutiny - Open-source C++ Unity re-implementation.
Defile of Eden 2 - FreeBSD and OpenBSD binaries of our latest game.

### #36SIC Games  Members

Posted 07 September 2012 - 04:59 PM

"INTERIOR DEBATE ROOM - NIGHT TIME"

AS THE SMOKE CLEARS, A BIT OF HAZE STILL DRIFTS JUST LINGERING ON THE FLOOR.

Is everyone finished? Cool, Group Hug Everyone! Com'on!
Game Engine's WIP Videos - http://www.youtube.com/sicgames88
SIC Games @ GitHub - https://github.com/SICGames?tab=repositories
Simple D2D1 Font Wrapper for D3D11 - https://github.com/SICGames/D2DFontX

### #37Cornstalks  Members

Posted 07 September 2012 - 05:04 PM

"INTERIOR DEBATE ROOM - NIGHT TIME"

AS THE SMOKE CLEARS, A BIT OF HAZE STILL DRIFTS JUST LINGERING ON THE FLOOR.

Is everyone finished? Cool, Group Hug Everyone! Com'on!

You act like we're fighting... So far, I'd say this has been a very civil discussion. People are just giving various opinions and technical facts, which will hopefully leave any future readers further enlightened, albeit perhaps no less undecided.
[ I was ninja'd 71 times before I stopped counting a long time ago ] [ f.k.a. MikeTacular ] [ My Blog ] [ SWFer: Gaplessly looped MP3s in your Flash games ]

### #38swiftcoder  Senior Moderators

Posted 07 September 2012 - 05:07 PM

I don't know if anyone else noticed that Apple has recently deprecated garbage collection in 10.8 for their Objective-C. Quite an interesting decision showing that perhaps they feel that manual memory management (or reference counting) isn't much harder than relying on a GC or at least that the performance will be superior etc....
http://developer.app...troduction.html

"Garbage collection is deprecated in OS X Mountain Lion v10.8, and will be removed in a future version of OS X"

Not quite.

Yes, Objective-C garbage collection is being removed (and was never that widely used to begin with, partly because of the lack of iOS support).

However, as your link indicates, Apple is replacing it with a system called 'ARC' (Automatic Reference Counting). Effectively, they have modified their compiler to spit out all those retain/release calls for you, and it does a much more reliable job of it than a human could.

I wouldn't really call that 'manual memory management'. It's still a fully automated garbage collector, just one based on internal reference counting (similar to Python's old garbage collector).

And sadly, it suffers from the age-old deficiency of reference-counting systems: the need to explicitly annotate weak references.
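The same deficiency exists in C++'s reference-counted `std::shared_ptr`, so it can be sketched without any Objective-C. This is an analogy, not ARC itself, and `Parent`, `Child` and `demoWeakBackReference` are made-up names. If the back-reference below were a `std::shared_ptr`, the two objects would keep each other alive forever; marking it as `std::weak_ptr` is the explicit annotation that breaks the cycle.

```cpp
#include <cassert>
#include <memory>

struct Child;

struct Parent {
    static int alive;
    std::shared_ptr<Child> child;  // strong reference: the parent owns the child
    Parent() { ++alive; }
    ~Parent() { --alive; }
};

struct Child {
    static int alive;
    std::weak_ptr<Parent> parent;  // weak back-reference: does not keep the parent alive
    Child() { ++alive; }
    ~Child() { --alive; }
};

int Parent::alive = 0;
int Child::alive = 0;

int demoWeakBackReference() {
    {
        std::shared_ptr<Parent> p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;
        assert(p.use_count() == 1);  // the weak_ptr adds no strong count
        assert(!p->child->parent.expired());
    }  // p is released, then its child; nothing is leaked
    return Parent::alive + Child::alive;
}
```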

Edited by swiftcoder, 07 September 2012 - 05:09 PM.

Tristam MacDonald - Software Engineer @ Amazon - [swiftcoding] [GitHub]

### #39SIC Games  Members

Posted 07 September 2012 - 05:40 PM

No, debate is healthy because it helps others learn. So, yes, I know there's no social warfare.
Game Engine's WIP Videos - http://www.youtube.com/sicgames88
SIC Games @ GitHub - https://github.com/SICGames?tab=repositories
Simple D2D1 Font Wrapper for D3D11 - https://github.com/SICGames/D2DFontX

### #40lawnjelly  Members

Posted 08 September 2012 - 01:18 AM

The other is that there is no question over the time taken over a deallocation / allocation. It is determined by your code and can be tightly determined - usually a constant very short time.

So, I mostly agree with the rest of your post, but this point isn't quite as straightforward as you suggest.

Malloc/new are not deterministic. The cost of an individual allocation is generally much higher than that of a garbage collector, and it is not a fixed cost. But you do get the (to my mind, dubious) benefit that the performance cost is incurred at the call site (whereas garbage collection incurs a performance cost at an indeterminate later date).

If you actually need deterministic allocation cost, then you have to go with other solutions (probably ahead-of-time allocation: pool allocators, SLAB allocators, etc.)

Ahha .. this may be where the confusion lies.

I didn't want to suggest 'using malloc / free at runtime is better than garbage collectors'. Far from it... they both have related downsides.

In C++, if you override new, you don't need to use OS calls for memory management. You can use whatever system you want for grabbing memory from wherever you want, then you have the opportunity to call the constructor yourself with placement new.
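A minimal sketch of both ideas, assuming a small fixed budget of objects. `Particle`, `g_slots` and `demoParticleNew` are hypothetical names. The class-specific `operator new` serves `new Particle` from a static buffer, and the last lines show placement new constructing an object into storage the caller already owns. The slot search is a linear scan for brevity; a free list would make it constant time.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical particle type whose storage never comes from the general heap.
struct Particle {
    float x, y;
    Particle(float px, float py) : x(px), y(py) {}

    // Class-specific allocation functions used by 'new Particle' / 'delete p'.
    static void* operator new(std::size_t size);
    static void operator delete(void* p) noexcept;

    static constexpr std::size_t kCapacity = 4;  // assumed budget for this sketch
    static std::size_t liveCount;
};

std::size_t Particle::liveCount = 0;

namespace {
// Storage reserved at program start: one suitably aligned slot per particle.
alignas(Particle) unsigned char g_slots[Particle::kCapacity][sizeof(Particle)];
bool g_used[Particle::kCapacity] = {};
}

void* Particle::operator new(std::size_t size) {
    assert(size == sizeof(Particle));
    for (std::size_t i = 0; i < kCapacity; ++i) {
        if (!g_used[i]) {
            g_used[i] = true;
            ++liveCount;
            return g_slots[i];
        }
    }
    throw std::bad_alloc();  // budget exceeded
}

void Particle::operator delete(void* p) noexcept {
    if (p == nullptr) return;
    for (std::size_t i = 0; i < kCapacity; ++i) {
        if (p == g_slots[i]) {
            g_used[i] = false;
            --liveCount;
            return;
        }
    }
}

std::size_t demoParticleNew() {
    Particle* a = new Particle(1.0f, 2.0f);  // served from g_slots, not malloc
    const std::size_t live = Particle::liveCount;
    delete a;

    // Placement new: construct into a buffer we own, then destroy explicitly.
    // '::new' is needed because the class-level operator new hides the global forms.
    alignas(Particle) unsigned char buffer[sizeof(Particle)];
    Particle* b = ::new (buffer) Particle(3.0f, 4.0f);
    b->~Particle();
    return live;
}
```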

In addition there is a distinction between one off allocation / allocations at startup, and their corresponding deletion at shutdown, and dynamic use (i.e. the kind of things you might use lots of times in a frame). The second case is what we are interested in here. For actually reserving your memory at startup, you could use whatever you want .. an OS heap, garbage collected system. Ultimately your memory has got to come from somewhere.

(There is also the slightly less stringent case of level load / unload, where you *could* if necessary be a bit more lenient / take some shortcuts on some platforms).

What we are after in games, in an ideal world, for dynamic allocation (things that happen a lot rather than just startup and shutdown) is stability (no failed calls) and constant time (and fast) allocation and deallocation.

Sorry I should have been more clear on this. I would on the whole use things like fixed size memory allocators (and potentially other constant time allocators) for things that need to be created / destroyed dynamically (see my first post on page 1). You can use this for constant time incredibly fast allocations / deallocations, suitable for things like nodes in algorithms, even particle type systems.

For things that are truly variable size (levels etc) the tradeoff can be to prereserve space at startup for worst case, and work with that. Alright you lose a bit from the theoretical maximum, but you gain in simplicity and stability. On levels with not much geometry, you can e.g. add more sound, or more textures, and vice versa. For your level file you can prepack into the best format possible, with zero fragmentation, and make use of the whole of your budget in megs. If you need to use more than this, then you need to support streaming of level data on the fly (this is a whole other topic with similar concerns, guess what, you can use fixed size bank slots for this too!).

You can do this for GPU resources too .. reserve e.g. 5000 verts for a character and then stick to that budget or lower for your artwork, and you can guarantee they will always fit in that 'slot'.

You can also pre-designate blank 'slots' for various items in the level data RAM allotment to give more flexibility, if it seems a better idea than deciding ahead of time the maximum number of item 'blah'. If you do this you get the benefit of zero fragmentation, and best use of memory for that level.

In short there are lots of handy 'helper' bits of functionality offered to programmers, like 'general purpose' heaps, variable size strings etc. There are whole languages dedicated to making things 'easier' for the programmer where these things are a given (BASIC, PHP, etc.). In most situations this is a real benefit because it makes you much more productive as a programmer - less code, simpler code, less potential for bugs, and the 'costs' are not going to be apparent to the user.

It's just that in some situations, particularly time critical applications, and those on limited memory devices, it can become worth it to not use some of the helper functionality. An extreme example would be missile control software. You might have limited memory. If your program crashes, people die. If your program takes too long to faff around restructuring the heap, people die. It's only if it works predictably and as per spec that the right people die.

Other examples where you have to be a bit more stringent include things like financial software, medical software, some engineering software.

Would you want the nuke heading towards your neighbour's house programmed in Java with garbage collection, or C++ with no external allocations? I know which one I'd rather have heading towards my neighbours.

(edit) Some good search terms to google in this area are : 'real time programming', and 'mission critical programming'. (/edit)

Favourite datatype: unsinged int
