What improves memory performance?

Started by
40 comments, last by lawnjelly 11 years, 8 months ago

I think the aversion against GC is a little too strong here and there. Especially from hard core C/C++ programmers.
I've just worked on a few console games and seen the fallout. GCs are absolutely great for developers; there are a lot of good reasons why we chose to code the game in Lua rather than C++ -- a big one being that it's simply a more productive language.
However, by the end of the project, we realised that we hadn't taken enough care to avoid creating unnecessary garbage, and so the programming team had to crunch for months to optimise the Lua code and re-write many systems in C++, in order to get the GC fast enough for real-time use. So my advice would be to make sure you keep an eye on your GC overheads. If possible, make the execution of the GC deterministic so you can accurately profile it.
Have to say I'm with Hodgman on this one.

Using OS calls for allocation and deallocation at runtime in a game is one of the cardinal sins, but GC too? Urggg!! Do you know what these calls do behind the scenes? I wasn't even going to mention it.


If you do look into memory pools, stuff can get quite complicated so sometimes it might be nice to leave it to the GC platform's memory pool.


Well it would 'be nice' to be lazy and leave everything to a GC, but unfortunately there are reasons why people don't tend to use this kind of thing for time dependent stuff. I understand looking after memory is 'an extra bother' and 'complicated' but it's necessary if you want to make fast, stable code. I've also had to spend weeks sorting out problems caused by 'programmers' who thought memory management was 'a bother', and delayed shipping products, and left them bug ridden messes.
This is what this heavy debate conversation reminds me of: :P

Game Engine's WIP Videos - http://www.youtube.com/sicgames88
SIC Games @ GitHub - https://github.com/SICGames?tab=repositories
Simple D2D1 Font Wrapper for D3D11 - https://github.com/SICGames/D2DFontX

Do you know what these calls do behind the scenes?

Yes, I do (and my guess would be so does Hodgman, and a vast number of other people on this forum).

And that's a key issue when it comes to performance in general, not just where garbage collectors are concerned - you need to understand what goes on under the hood.

Don't fear the garbage collector just because you don't understand it. Odds are that if you don't have the knowledge/skills to rewrite the garbage collector from the ground up, you also don't have the knowledge/skills to beat it at performance.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]


[quote name='lawnjelly' timestamp='1347018916' post='4977578']
Do you know what these calls do behind the scenes?

Yes, I do (and my guess would be so does Hodgman, and a vast number of other people on this forum).[/quote]
I took that sentence as a rhetorical question -- as in, these calls do god awful stuff behind the scenes!

I took that sentence as a rhetorical question -- as in, these calls do god awful stuff behind the scenes!

And it may have been intended in that light, but it is still incorrect. There is nothing 'god awful' going on in a garbage collector - it's a deterministic process, much like everything else in computer science.

There are, of course, performance tradeoffs involved, and various pitfalls you need to be aware of, but there is nothing intrinsically evil about garbage collection.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

There's a big difference between
List<Callback> toBeCalledAtLevelEndLeaveMeAloneUntilThen;
and
Graph<RootObject> traverseMeEveryFrameMarkingReachedNodes;
List<EveryObject> traverseMeEveryFrameLookingForNonMarkedNodes;

Yes, I've pitted the simplest of memory management against the dumbest non-generational mark-and-sweep GC, but the point is that for a lot of problems GCs or even smart-pointers are complete over-engineering. In my made-up straw-man, it's the difference between zero overhead per frame, and several milliseconds of cache-misses per frame, for no effect.

It's not evil, it may just be a lot more complex than is actually required.
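To make the straw-man concrete, here is a minimal sketch of that "dumbest" non-generational mark-and-sweep collector. The `Node` layout, the index-based references and the function names are all made up for illustration; no real GC is implemented this way. What it shows is that a collection has to walk the reachable graph and then every object, even on a frame where nothing is freed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object graph: each node lists the indices of nodes it references.
struct Node {
    std::vector<std::size_t> children;
    bool marked = false;
    bool alive = true;
};

// Mark phase: visit everything reachable from the roots (iterative DFS).
void mark(std::vector<Node>& heap, const std::vector<std::size_t>& roots) {
    std::vector<std::size_t> stack(roots.begin(), roots.end());
    while (!stack.empty()) {
        std::size_t i = stack.back();
        stack.pop_back();
        if (heap[i].marked) continue;
        heap[i].marked = true;
        for (std::size_t c : heap[i].children) stack.push_back(c);
    }
}

// Sweep phase: walk every object, free the unmarked ones, clear marks for next time.
std::size_t sweep(std::vector<Node>& heap) {
    std::size_t freed = 0;
    for (Node& n : heap) {
        if (n.alive && !n.marked) {
            n.alive = false;
            n.children.clear();
            ++freed;
        }
        n.marked = false;
    }
    return freed;
}

// One full collection; the work is proportional to heap size even if nothing dies.
std::size_t collect(std::vector<Node>& heap, const std::vector<std::size_t>& roots) {
    mark(heap, roots);
    return sweep(heap);
}

// Demo: 0 -> 1 -> 2 is reachable from root 0; 3 <-> 4 is an unreachable cycle.
std::size_t demo_collect() {
    std::vector<Node> heap(5);
    heap[0].children = {1};
    heap[1].children = {2};
    heap[3].children = {4};
    heap[4].children = {3};
    std::size_t first = collect(heap, {0});   // frees nodes 3 and 4
    std::size_t second = collect(heap, {0});  // frees nothing, still walks everything
    return first * 10 + second;
}
```

The second `collect` call is the "for no effect" case: it touches all five nodes and frees none. The callback list in the first example has no per-frame work at all.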

Being a systems programmer, when I see anything with random-access memory patterns, such as a GC traversing a graph of all of your objects, it does conjure up the description of "god awful". On my last game we used Lua, which has a very simple GC. We had to customize it quite a bit, and then also re-write a lot of the Lua code to minimize garbage generation, in order to avoid random 8ms spikes in frame-times. We also ran it on a hyper-threaded core during rendering to try and soak up the GC cost for free, which helped, but it also trashed the cache-lines that the renderer was using, slowing it down too.

I'd still use them in the right circumstances, but with guidelines and care!

Being a systems programmer

In my mind, this is really the crux of the matter. I don't disagree with any of your points, but I also am sure that you know enough to correctly implement an alternative to GC (i.e. careful manual resource disposal, or a robust scoped/smart_ptr solution).

A lot of programmers don't have that background, and a surprising amount of the time the garbage collector actually beats naive attempts to manually manage memory (i.e. for many small allocations, it performs much more like a pool allocator than does malloc/new).
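For anyone who hasn't seen one, a pool allocator for many small same-sized objects can be sketched roughly like this. `FixedPool` and `Particle` are hypothetical names, the pool is not thread-safe, and it simply returns `nullptr` when full; treat it as an illustration of the idea rather than something to ship.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-size block pool: one upfront allocation, then O(1)
// create/destroy through an intrusive free list. Not thread-safe.
template <typename T>
class FixedPool {
    union Slot {
        Slot* next;                                   // used while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while it holds a T
    };
    std::vector<Slot> slots_;
    Slot* free_ = nullptr;

public:
    explicit FixedPool(std::size_t capacity) : slots_(capacity) {
        // Thread every slot onto the free list.
        for (Slot& s : slots_) {
            s.next = free_;
            free_ = &s;
        }
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Returns nullptr when the pool is exhausted (no fallback to the heap).
    template <typename... Args>
    T* create(Args&&... args) {
        if (!free_) return nullptr;
        Slot* s = free_;
        free_ = s->next;
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    // Runs the destructor and pushes the slot back onto the free list.
    void destroy(T* p) {
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
    }
};

struct Particle { float x, y; };

// Demo: a freed block is handed straight back out on the next create().
bool demo_pool() {
    FixedPool<Particle> pool(2);
    Particle* a = pool.create(Particle{1.0f, 2.0f});
    Particle* b = pool.create(Particle{3.0f, 4.0f});
    Particle* none = pool.create(Particle{});          // pool is full
    pool.destroy(a);
    Particle* c = pool.create(Particle{5.0f, 6.0f});   // reuses a's slot
    bool ok = b != nullptr && none == nullptr && c == a && c->x == 5.0f;
    pool.destroy(b);
    pool.destroy(c);
    return ok;
}
```

Allocation here is a pointer pop and freeing is a pointer push, which is roughly the behaviour a GC's allocator gets for small objects and a general-purpose malloc/new typically does not.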

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]


A lot of programmers don't have that background, and a surprising amount of the time the garbage collector actually beats naive attempts to manually manage memory (i.e. for many small allocations, it performs much more like a pool allocator than does malloc/new).
Yep, and also avoids all the other nasties like leaks, dangling pointers and random memory corruption.

In some circumstances, Keep It Simple Stupid might mean "just use the damn GC and don't try and reinvent the wheel", and in other circumstances, KISS might mean "oh god why are you using another complex tree structure when you don't even need to be managing resources to solve this problem".

As usual, generalisations turn out not to be useful.
See also:
Fixed-Size Block Allocator (FSBAllocator): http://warp.povusers...ator/
Boost.Pool: http://www.boost.org/libs/pool

According to the FSBAllocator's benchmark using either the Boost.Pool allocator or the FSBAllocator might yield significant speed-ups over the default allocator for select cases (e.g., many small allocations that swiftcoder mentioned): http://warp.povusers...ator/#benchmark

IMHO, the fact that you can do that (just plug in a specialized allocator when you need it / when the defaults don't work for you) is a significant advantage over the one-GC-fits-all solutions (where, when the defaults don't work for you, the best you can do is to experiment with GC tuning).
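As a rough illustration of "plugging in" an allocator, the sketch below uses the C++17 standard library's `std::pmr` facilities rather than Boost.Pool or FSBAllocator (it assumes a toolchain that ships `<memory_resource>`). `CountingResource` and `upstream_calls` are hypothetical helpers that only count how often a container falls through to the default heap; the exact counts depend on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory_resource>

// Hypothetical helper: forwards to an upstream resource and counts allocations.
class CountingResource : public std::pmr::memory_resource {
    std::pmr::memory_resource* upstream_;
    std::size_t calls_ = 0;

    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++calls_;
        return upstream_->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        upstream_->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

public:
    explicit CountingResource(std::pmr::memory_resource* up) : upstream_(up) {}
    std::size_t calls() const { return calls_; }
};

// Fills a list with n ints and reports how many allocations reached the heap.
// With use_pool == true, a pool resource sits between the list and the heap.
std::size_t upstream_calls(bool use_pool, int n) {
    CountingResource counter(std::pmr::new_delete_resource());
    std::pmr::unsynchronized_pool_resource pool(&counter);

    std::pmr::memory_resource* res = &counter;
    if (use_pool) res = &pool;

    {
        std::pmr::list<int> items(res);  // same container type either way
        for (int i = 0; i < n; ++i) items.push_back(i);
    }  // list is destroyed before the resources it borrows from
    return counter.calls();
}
```

Calling `upstream_calls(false, 10000)` makes one heap allocation per list node, while `upstream_calls(true, 10000)` should make far fewer because the pool hands out nodes from larger chunks. The container code is identical in both cases; only the resource passed in changes.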

This topic is closed to new replies.
