C# Garbage Collection and performance/stalls

Started by
20 comments, last by DvDmanDT 8 years, 6 months ago

Not to hijack the thread but one concern I always had with C# in regards to game programming is stalls due to garbage collection. Is it really an issue? Are there workarounds? I mean general performance is most likely fine but if you have hiccups every now and again I would find it unacceptable.

-potential energy is easily made kinetic-

Advertisement

Splitting this off. Please start a new thread rather than hijacking another.

For garbage collection, it CAN be an issue. All memory management in games CAN be an issue.

There are many major games, including AAA games, that use C# either as a primary language or as a scripting language. Intelligent policies about memory management go a long way.

Are there workarounds?


The main workaround is to minimize the conditions that cause the GC to trigger. Usually the GC runs when you pass certain thresholds in the total amount of allocated memory, or if something manually calls GC.Collect.

C# makes it easy (too easy?) to allocate things and then throw them away later - making a list in a function, returning it, then discarding it, for example. You want to try to minimize the amount of actual "garbage" you generate. Anything that you keep reusing will minimize the number of allocations you make, which in turn will give you more time between GCs.

This may occasionally cause you to make your code more cumbersome and bug-prone. Making new collections on the fly allows you to prove that no other code could be accessing that collection at the same time, which is 100% safe, but it leads to accumulating more garbage. Immutable data structures are good for parallelism, but making any changes requires allocations (and if the data structure is hierarchical, it can cause ridiculous amounts of allocations just to replace a single leaf element). Reusing a mutable collection instead eliminates some of this safety. You can pool reusable objects, but it means you need to verify that your pooling code doesn't have any bugs.


Splitting this off. Please start a new thread rather than hijacking another.

Sorry it seemed related to his concerns with regards to performance so I figured it would complement his question.


The main workaround is to minimize the conditions that cause the GC to trigger. Usually the GC runs when you pass certain thresholds in the total amount of allocated memory, or if something manually calls GC.Collect.

But garbage collection when it happens is always proportional to the # of live objects right? C# uses mark and prune right? So basically when it is triggered if the data set of the program is large, collection will always cause a fairly large stall right?

-potential energy is easily made kinetic-

But garbage collection when it happens is always proportional to the # of live objects right? C# uses mark and prune right? So basically when it is triggered if the data set of the program is large, collection will always cause a fairly large stall right?


Yes - the more objects you have allocated, the longer the mark step will take, even if none of them get collected. If you can minimize the total amount of objects you have allocated, that will help, but that directly limits the scope of your game. Minimizing the number of temporary allocations doesn't.

Use value types like structs for temporary allocations, these are limited to their scope and are efficiently released when the program's flow exits their scope. Then you don't need to bother the GC with a million tiny 16-byte objects that are created and immediately destroyed every frame.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

Bacterius is right; when structs are used properly, they are a great way to eliminate heap allocations.

structs as local variables, parameters, and return values are allocated on the stack just like they would be in C or C++. These are basically not going to impact the GC at all unless they're currently on the stack when the GC triggers (which will be so rare that it can be ignored).

structs as class members are allocated as part of the class itself, so they aren't treated as a separate allocation node that the GC has to traverse and mark separately (the struct's members CAN be references, though, which will need to be traversed by the GC like any other reference).

structs as the element type of arrays are allocated contiguously in the array, like they are in C or C++. This is also the case for collections that use arrays internally, such as List<T> and Dictionary<K,V> if the generic types are structs. If everything in an array is a value type, the *entire array* becomes a single traversal step for the GC.

But garbage collection when it happens is always proportional to the # of live objects right? C# uses mark and prune right? So basically when it is triggered if the data set of the program is large, collection will always cause a fairly large stall right?

The C# language specification does not contain details how the VM should do garbage collection. https://www.microsoft.com/en-us/download/details.aspx?id=7029

An implementation of a language is free to choose any algorithm that it likes. For this reason, you cannot say "C# does mark and prune" without explicitly stating what implementation you refer to.

You can also not draw conclusions about the language itself.
Languages have no speed, a language allows you to express how to perform a computation.
Like math, it can be compact, elegant, or long, but it's not fast or slow. It's just a static description of the computation.
It has no performance at all, until you get yourself an implementation of the language to compile and run it.

You can talk about performance of an implementation (compiler implementations, run-time system implementations) of a languages, but then you're discussing single instances, which can vary from instance to instance (C#.net and Mono can be very different), and from version to version.
One nice thing about Mono is that you can choose which of several garbage collectors you use.


And about making garbage, there are different kinds of garbage based on the types of garbage collector used.

Compared with C++ and other systems, in those you generally clean up memory during destructors, and destructors are called explicitly when moving through the stack or cleaning up as part of work. In practice this means taking several moments during busy processing times. With GC systems it is possible -- if the system is used well -- for destructors and cleanup to run during idle times rather than run in the middle of your busy processing moments. Done well a GC system CAN improve your memory management performance, CAN be even more cache friendly than traditional heap allocators, CAN give benefits that in c++ require a carefully tuned multithreaded custom allocator.

Or, if you don't use it well, it CAN frequently trigger terrible nightmare scenarios, frequently stopping your entire application right at the moments you most need high performance, and CAN potentially stall for many milliseconds or even full seconds. This was originally the big fear and nightmare when Java gained popularity. You can use any tool badly.



With many of the common garbage collectors, what you really want is to first not make garbage, so make things on the stack if they belong on the stack. If you do create things on the heap, either make them extremely short lived (allocate, use, release immediately within in a function or as a return result) or make them longer lived (lasting across an entire play session or level or map).

Generally the GC can run quietly during idle moments in ways that you never notice. If your memory management is good the GC runs quietly as an idle-time thread that you never notice. GC systems are usually great at very short lived objects, since they can be passively cleaned up during idle times, and are great at very long lived objects since they can be quietly compacted during idle times. GC systems work well if your continuous allocations are kept to a minimum. Then the thread runs quietly in the background during your idle times. Since it is idle time processing rather than processing under load, it is even better performance than C++'s common designs.

Violate those rules, create tons of garbage so you consume your pools, or do lots of memory work continuously so you run faster than the idle-time background thread, or use intermediate-lifetime objects that require frequent compacting but then die soon after, and you'll quickly exceed the limits of your friendly idle-time garbage collector and trigger major 'stop the world' cleanups during your busy work times. That's when things get terrible.

Some people have mentioned that you use structs when possible to minimise heap impact.

If you do use structs in C# you should treat them like that value types they are and hence make them immutable. So readonly fields or get only properties and set these via a constructor.

There are a whole raft of subtle bugs that you can introduce into your code if you allow mutable structs all based around issues like, what are you changing, is it the copy you think it is or has it been boxed. Just far easier to avoid this and make immutable :)

This topic is closed to new replies.

Advertisement