
[.net] Garbage Collection Question



Hi guys, I've just stumbled upon some code from an MS training book and noticed that the new keyword is used multiple times on the same variable. This is certainly not something I would do, purely because of the effects this has from a C++ point of view. However, because it's C#, I'm unsure whether this would create a memory leak or, as my gut is telling me, the object that no longer has a reference to it is marked for garbage collection. Snippet of the code to illustrate:
using System;

namespace test
{
	public class Foo
	{

	}

	class TestClass
	{
		static void Main()
		{
			Foo bar = new Foo();
			bar = new Foo();	// What happens internally here?

			// ...
		}
	}
}


So which one is correct: memory leak or marked for garbage collection? Thanks in advance.

It'll get garbage collected. And since .NET uses a generational garbage collection mechanism, chances are it'll be collected quite quickly.
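
A minimal sketch of how one could observe this, assuming a WeakReference is used to watch the first Foo without keeping it alive. The Demo class and Reassign method are hypothetical names for illustration, and GC.Collect() is called here only to make the effect visible in a test program, not as something to do in real code.

```csharp
using System;
using System.Runtime.CompilerServices;

class Foo { }

static class Demo
{
    // Kept in its own non-inlined method so that no local variable in Main
    // can accidentally keep the first Foo reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Reassign()
    {
        Foo bar = new Foo();
        // A weak reference tracks the object without counting as a root.
        WeakReference first = new WeakReference(bar);
        bar = new Foo();      // the first Foo is now unreachable
        GC.KeepAlive(bar);    // the second Foo stays reachable up to this point
        return first;
    }

    static void Main()
    {
        WeakReference first = Reassign();
        GC.Collect();                  // demonstration only
        GC.WaitForPendingFinalizers();
        // Typically prints False: the first Foo was reclaimed, not leaked.
        Console.WriteLine("first Foo still alive: " + first.IsAlive);
    }
}
```

The exact moment of collection is up to the runtime, so the program reports what happened rather than asserting it.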

Damn that was quick! Cheers! Rate++ didn't seem to do anything to your rating. Thought that counts? [wink]

From experience, you should still be as careful as possible, as garbage collection can happen at an arbitrarily long interval.

When dealing with objects that don't have side-effects this is okay because as soon as memory reaches that arbitrary point it gets collected. The problems start occurring when side effects such as calling unmanaged code occur. For example with open connections to a database or file.

Also, garbage collection does run on an idle thread, but may take up cpu eventually if no idle time has occurred.

So, only create objects that you need and dispose and/or null out objects that you are done with.

This will take pressure off the garbage collector and will also free up unmanaged resources more quickly, which is almost always the desired outcome.

Yeah agreed mate. As I say, this code came from an MS training book, but I can't see myself using the same approach. Before I'd seen this snippet, I'd never even considered doing this as it gives me the feeling it's just asking for trouble at some point or another.

On the first line, the reference 'bar' is assigned to an object.
So the first Foo() instance has 1 reference pointing to it.

On the second line, bar is made to reference a new instance of Foo. The first instance of Foo no longer has any references.

At this point, generally, nothing will happen.

It is important to know that the GC generally only does a collection when it has to. It will also do it far more often if the CPU is idle. It's very dynamic and *very* smart.

So... assume that later on there are spare CPU cycles lying around, or the allocator is starting to run out of the memory that has been allocated to the application. In such a case, the GC will run a 'first generation' GC. In this case, it will basically look for objects with a reference count of zero (such as the first instance of Foo) and collect that memory. First generation GC is very fast.
You cannot predict when this will happen, and should not force it either (i.e., don't use GC.Collect()).
If you make an allocation, and not enough memory is free, *and* not enough memory is collected, then one of two things will happen:
more memory will be allocated
or the GC will do a higher generation collection.

Higher generation collections are *very* slow, as they have to take into account the following:

object A references object B.
object B references object A.

Nothing else references either object A or B.

Consider it... Each object instance has 1 reference pointing to it, yet both are useless (and need to be collected) because nothing important references them.
In such a case the GC will have to do a very expensive test on the memory allocations/reference graphs, to find groups of objects that are no longer being used.
I cannot stress how expensive this is.

Multigenerational GC can also crop up when dealing with boxed structs, and threading can complicate things.

In short, if you get performance problems, it is the first place to look (i.e., get the CLR Profiler, look for higher generation GCs in the memory allocation timeline, and see what objects are piling up just before them).

I've had cases where 1 line of code was causing a multi-generational GC on a structure that represented an object's 3D transformation. With a large number of these objects per frame (10,000 in a stress test), the performance hit was massive because there were literally many, many MBs of memory needing to be collected by higher generation GCs every second. Fixing this boosted performance by a factor of over 10x.

Quote:
Original post by BradSnobar
When dealing with objects that don't have side-effects this is okay because as soon as memory reaches that arbitrary point it gets collected. The problems start occurring when side effects such as calling unmanaged code occur. For example with open connections to a database or file.


True. The best solution to this is still to dispose such resources as soon as you don't need them anymore. Also, whenever possible, use the using idiom to avoid forgetting disposal (and it works well with exceptions, too).
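
A small sketch of the using idiom, assuming a hypothetical TrackedResource class that stands in for a real file or database connection. It only records whether Dispose was called, which is enough to show that disposal still happens when an exception escapes the block.

```csharp
using System;

// Hypothetical stand-in for a file handle or database connection.
sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        // A real resource would release its unmanaged handle here.
        Disposed = true;
    }
}

static class UsingDemo
{
    public static bool DisposedAfterException()
    {
        TrackedResource res = new TrackedResource();
        try
        {
            // using expands to try/finally, so Dispose runs on every exit path.
            using (res)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the demo can inspect the resource afterwards.
        }
        return res.Disposed;
    }

    static void Main()
    {
        Console.WriteLine(DisposedAfterException()); // prints True
    }
}
```

The resource is released at the closing brace rather than at some later collection, which is the point of disposing early.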

Quote:
Original post by RipTorn
In this case, it will basically look for objects with a reference count of zero (such as the first instance of Foo) and collect that memory.

No reference counts.

Quote:
Original post by Arild Fines
Quote:
Original post by RipTorn
In this case, it will basically look for objects with a reference count of zero (such as the first instance of Foo) and collect that memory.

No reference counts.


Well, it's more complex than that (object graph, and reachable references, etc.), but as far as us, the programmers are concerned, it's just reference counting. You can get jiggy with it and do some cool stuff with WeakReference, but at the end of the day, when there are no more references to an object, it is ready for garbage collection.

Here's a great read on the topic:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetgcbasics.asp

written by Rico Mariani, whose blog is brilliant:
http://blogs.msdn.com/ricom/

Quote:
Original post by joelmartinez
Well, it's more complex than that (object graph, and reachable references, etc.), but as far as us, the programmers are concerned, it's just reference counting.

No, as far as we're concerned it isn't just reference counting - it's a proper GC and so will correctly clean up circular references and other cases that a reference counting implementation can't do.

The term you're looking for is 'unreachable'.
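
To illustrate the difference, here is a rough sketch with two objects that reference only each other. Under pure reference counting each would keep a count of 1 forever; under a tracing collector both are unreachable. The Node and CycleDemo names are made up for this example, and GC.Collect() is used purely for demonstration.

```csharp
using System;
using System.Runtime.CompilerServices;

class Node
{
    public Node Other;
}

static class CycleDemo
{
    // Builds A <-> B and returns only a weak reference, so nothing outside
    // the cycle holds a strong reference once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeOrphanedCycle()
    {
        Node a = new Node();
        Node b = new Node();
        a.Other = b;
        b.Other = a;
        return new WeakReference(a);
    }

    static void Main()
    {
        WeakReference weakA = MakeOrphanedCycle();
        GC.Collect();                  // demonstration only
        GC.WaitForPendingFinalizers();
        // Typically prints False: the cycle is unreachable, so it is collected.
        Console.WriteLine("A still alive: " + weakA.IsAlive);
    }
}
```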

Quote:
Also, garbage collection does run on an idle thread, but may take up cpu eventually if no idle time has occurred.


No, the GC runs on the thread that tried to do the allocation that made the GC kick in. The only threading involved in the .NET GC is the finalizer thread.

The algorithm that .NET uses is generational mark and sweep; as someone else says, there is no reference counting at all. See the Wikipedia article on garbage collection for details on the algorithms.

There is also nothing special about circular references. Conceptually, every object starts out as 'available for collection'. The mark phase then starts at the root references and walks down the object graph from each of them, marking everything it reaches as 'in use'. It stops traversing when it finds an object already marked as in use, which means the walk doesn't get stuck on circular references. Then the GC starts at the beginning of the heap for that generation and, for each object still marked as 'available for collection', compacts memory over it.
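
The reachability walk described above can be sketched as a toy model. This is not the CLR's implementation, just an illustration under simple assumptions: HeapObject, ToyMark and the explicit root list are hypothetical names, and the 'heap' is an ordinary list.

```csharp
using System;
using System.Collections.Generic;

// Toy heap object: a name plus outgoing references.
class HeapObject
{
    public readonly string Name;
    public readonly List<HeapObject> Refs = new List<HeapObject>();
    public HeapObject(string name) { Name = name; }
}

static class ToyMark
{
    // Returns the set of objects reachable from the roots ("in use").
    public static HashSet<HeapObject> Mark(IEnumerable<HeapObject> roots)
    {
        HashSet<HeapObject> inUse = new HashSet<HeapObject>();
        Stack<HeapObject> pending = new Stack<HeapObject>(roots);
        while (pending.Count > 0)
        {
            HeapObject current = pending.Pop();
            // Add returns false if already marked, so cycles terminate.
            if (!inUse.Add(current)) continue;
            foreach (HeapObject next in current.Refs) pending.Push(next);
        }
        return inUse;
    }

    static void Main()
    {
        HeapObject root = new HeapObject("root");
        HeapObject kept = new HeapObject("kept");
        HeapObject a = new HeapObject("A");
        HeapObject b = new HeapObject("B");
        root.Refs.Add(kept);
        kept.Refs.Add(root);   // a cycle that IS reachable from a root
        a.Refs.Add(b);
        b.Refs.Add(a);         // a cycle that is NOT reachable from any root

        List<HeapObject> heap = new List<HeapObject> { root, kept, a, b };
        HashSet<HeapObject> inUse = Mark(new[] { root });
        foreach (HeapObject o in heap)
        {
            Console.WriteLine(o.Name + ": " +
                (inUse.Contains(o) ? "in use" : "available for collection"));
        }
    }
}
```

Here root and kept come out as in use, while A and B are left available for collection even though they reference each other.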

The main things to worry about with GC in general (and yes, I'm sure there are specific cases you can quote, but IN GENERAL):
#1 Using most of your available memory, as this will cause more frequent GCs and promotions. The algorithm for when a GC runs is apparently pretty complicated, but low memory is a big part of it.
#2 Mid-life crisis, when frequently used items get promoted into GC1 or GC2. GC0 is almost always stunningly fast. However, GC1 and GC2 are not and will affect your frame rate. So either make your objects stay around for a long time, e.g. the whole level, or go away very quickly, e.g. local variables. Always call Dispose or use using {} on objects that implement IDisposable, because if you don't, they will ALWAYS be promoted ready for the finalizer thread, which will mean many more GC2s.
#3 Don't call GC.Collect to try to outsmart the GC - in general you will mess up the algorithms that the GC uses to optimize itself and may well cause more promotions than without. The only time to do collects yourself is at a time when you know there is a huge amount of memory that needs tidying up, e.g. at the end of a level in a game where you dump out all the old meshes and data and load in the new stuff. In this case the GC couldn't possibly know that this was your intention and will probably wait and do the GC2 100 frames into the next level.
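
For anyone who wants to watch promotions and collection counts from inside a program rather than in a profiler, a rough sketch using the System.GC query methods follows. GenerationDemo and the allocation sizes are arbitrary choices for illustration, and the numbers printed will vary from machine to machine and run to run.

```csharp
using System;

static class GenerationDemo
{
    // Snapshot of how many collections each generation has had so far.
    public static int[] CollectionCounts()
    {
        int[] counts = new int[GC.MaxGeneration + 1];
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }
        return counts;
    }

    static void Main()
    {
        object longLived = new object();
        Console.WriteLine("generation at start: " + GC.GetGeneration(longLived));

        // Churn through many short-lived allocations; these should mostly
        // die young, while longLived may get promoted along the way.
        for (int i = 0; i < 1000000; i++)
        {
            byte[] temp = new byte[256];
            temp[0] = 1;
        }

        Console.WriteLine("generation after churn: " + GC.GetGeneration(longLived));

        int[] counts = CollectionCounts();
        for (int gen = 0; gen < counts.Length; gen++)
        {
            Console.WriteLine("gen " + gen + " collections: " + counts[gen]);
        }

        GC.KeepAlive(longLived); // keep the object reachable until here
    }
}
```

A healthy pattern is usually many gen 0 collections and very few in the highest generation; a high count at the top is a hint to go looking for objects that live just long enough to be promoted.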
