Jump to content
  • Advertisement
Sign in to follow this  
Jotaf

Improvements to reference counting

This topic is 4809 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I just read some stuff on reference counting vs. garbage collection, and couldn't help but wonder: If the major problem with reference counting is cyclic referencing, haven't there been any efforts to account for this? I thought about some sort of recursion, but after typing a whole post I realized there were some major flaws in my design (it still wouldn't work in some special cases). However, I still think this could be explored more. Does anyone else think that this is feasible?

Share this post


Link to post
Share on other sites
Advertisement
Yes, there has been much exploration of the topic, more people here seem to think: you're not the first ones to think about how to make reference counting "work". [smile]

If you know where the pointers are in the object, it is possible to walk down the chains and to detect whether there is a cycle or not. The problem is then - in which order do you start destroying the objects? Python, which uses reference counting, will detect cycles that way. But if the objects have destructors, it will not collect them, rather it will list them in an "uncollectable garbage" list. It is then up to the programmer to figure out what to do with it. If the objects don't have destructors, it will happily collect them.

I would also like to point out that there is a false dichotomy here. Reference counting is a form of garbage collection. Just like scripting languages are programming languages.

Share this post


Link to post
Share on other sites
You can do pretty well with reference counting by supporting weak references--references which do not affect the reference count. This gets you out of common situations like a tree structure where a parent refers to its children and the children refer back to the parent; make the child-to-parent references weak and the whole tree will be correctly destroyed when outside references to the root are destroyed. Of course, this means users have to do more thinking about which references should be strong and which should be weak. (In my experience, that isn't usually a difficult call.) It's also a little more expensive to do it safely, where weak references become nil references when their referenced objects die, but often that's useful functionality anyway.

Share this post


Link to post
Share on other sites
I've spent the last year working on an engine heavily based on a refcounted scripting lang(Squirrel). We have the main loop in script, level loading and 100% of game code is in script. How many times we had to deal wth reference cycles? let me think... never.
Our project is an MMOG, at full load our server will something like 20000 game objects(esimated) and probably 20-30Mb of memory allocated by the scripting lang. In this case for instance is the GC that just wont work.
Both systems have problems, the only memory model that has't any apparent problem is the linear model(linearity), but very few have researched in that direction and it forces the programmer to learn very different programming paradims than the commonly used in imperative languges.

Alberto

Share this post


Link to post
Share on other sites
Quote:
Original post by fagiano
How many times we had to deal wth reference cycles? let me think... never.


How do you handle mutual references between objects?

Share this post


Link to post
Share on other sites
There are many ways to do garbage collection. Maybe you should start reading the paper of wilsen here:
http://citeseer.ist.psu.edu/wilson92uniprocessor.html
http://citeseer.ist.psu.edu/348183.html
http://citeseer.ist.psu.edu/jones03garbage.html
http://citeseer.ist.psu.edu/2517.html
http://citeseer.ist.psu.edu/wilson95dynamic.html

A garbage collector is allowed to free memory that is not referenced by the application anymore. This way you can prevent dangling pointers, since the memory will not be deleted as there is a reference to it. But you still have to care about memory leaks. Even if Java does GC, doing "myObj = null;" will help the GC. It is similar to a free/delete in C++. Another option would be to use weak pointers.

Another kind of view is that a garbage collector provides a delayed free(). Memory allocation may be as simple as "return (memptr -= size);", which is even faster than what malloc/new provides (that's basically what you can do with generational GC algorithms) and possibly also faster than reference counting. Freeing the memory is done when the cpu is idle, which may result in an increased speed of your application in some cases. On the other hand tracking the references requires additional time (the required time is even bigger if garbage collection is not supported by the language).

Reference counting is a very easy but effective way to do GC. In every operating system reference counting is used, i.e. to keep track of open files, processes etc. A drawback of reference counting is the overhead that happens whenever a pointer is used/assigned. So generally its not a good idea to use it in conjunction with other GC algorithms. Also cycles are problematic, but this issue can be solved easily as already pointed out by Sneftel.

To keep track of the pointers you may either have language support (like Java has), tag them somehow (in C++ via smart pointers) or be conservative (like Boehm-Weiser). In the case of a conservative GC reference counting may provide a help, since there are less memory locations that may have been a pointer.

Fruny mentioned destructors, and that is an interesting point. The semantic behind a destructor is, that the object does not exist anymore after the destructor has been called. Now imagine two objects A and B that point to each other. If you call the destructor of one of those objects, say of A, then the pointer in the other object becomes invalid. So now B contains a dangling pointer, and calling its destructor may be crash the application. Such an error would really be hard to find. Maybe that's the reason why in java such functions are called finalize() with a different semantic. If you provide both, allocation of memory on the stack and on the heap, you should really think about this point.

Another thing when talking about GC is multithreading. You can either stop the memory allocator while doing the GC. This is a good solution for simulations etc. where a small break does not create problems, but it is a bad solution for interactive applications like games. Another way is to do incremental GC, that is each time memory is allocated, the GC goes a little step ahead. But this will effectively reduce the overall speed of the application, no 'delayed free' anymore. A third solution would be to use coloring. Whenever a pointer is read/written, it changes color. The GC runs as a low priority thread in the background a marks the memory from the roots (global data, stack, cpu registers!), then performs a delete on all untouched memory objects.

I hope this gave you some ideas.
Good luck.

Share this post


Link to post
Share on other sites
Ok, I see that after all there has been a lot of research on the topic before :)

The problem with recursion is that it's possible to have a cycle with some elements in the middle being referenced by various pointers. This way, you'd have to traverse the whole cycle to find out that it's still connected to a safe pointer somewhere -- and even then, how can you be sure that it's safe without checking a whole bunch of other pointers?

There's also the issue of deleting an element in the middle, how do you figure out which elements are now unreferenced? You might need to delete a whole chain, or not do anything at all! It's not exactly linear.

I'm sorry if I rambled a bit, this is hard to put into words. The bottomline is, it will make a lot of unnecessary searches just to find that everything is all right after all; and this basically kills the advantage of RC over brute-force GC. Maybe this is not the way to go.


I'm not too crazy about weak pointers. It forces you to think in terms of the limitations of the machine. Pointers can be quite a hassle sometimes, and maybe it's to much, having to think about how the GC works to chose between weak or strong pointers. It would be nice if the programmer's life was made easier.

I just had an idea. What about if an object was forced to always have a strong pointer, but only one, and all the others were weak pointers?
You could think of it as if the first pointer, defined at the time of creation, was the "physical location" of the object. All others would be simple "arrows" pointing to it.
This probably needs some critique and improvement :) It might be a terribly useful idea, or a terribly stupid one!


Fagiano, you make a great point that in the real world these issues might be rare. However, I don't understand your last paragraph about a "linear model".
Concerning mutual references between 2 objects, would it be too bad if each pointer always worked both ways? It wouldn't take much more memory, and it would make connecting 2 objects in any situation a no-brainer.
The issue of dangling pointers would be solved, since when deleting an object, you would follow the pointers back and set them to NULL.

Hmm... I'm getting the impression that I might be on a stupid ideas rampage (hangovers suck), so however good they might sound to me right now, maybe I'd better shut up ;) Comment please!!

[EDIT]
nmi: I was just interested in the basic core of GC, but you're right that there are lots of other problems with destructors and all. Thanks, I'll check out those links.

Share this post


Link to post
Share on other sites
Quote:
Original post by Kylotan
Quote:
Original post by fagiano
How many times we had to deal wth reference cycles? let me think... never.


How do you handle mutual references between objects?


Squirrel implements weak references. a weak reference behaves like a normal reference to an object but does not increment the refcount. When the referenced object get destroyed, the weak reference becomes 'null' automatically.

look at the following Squirrel code


local a = { stuff = null}
local b = {}

a.bref <- b
b.aref <- a.weakref();
//now you can use 'aref' as it would be a normal(strong) reference
b.aref.stuff = "I'm stuff"
//but...

a = null;
//from now on 'b.aref' is null


I must say we use very little(or none) weak references, I gotta ask the guy that does most of the gamecode.
The fact that in gamecode you write very high level code so mutal references are very rare. You'd never have stuff like a double linked list or similar.
However I plan to use them for some funky resource loading policy that I have in mind. [OT] I recently arrived to the conclusion that using scripts for managing your data lifetime is amazing.

I hope this helps
Alberto

Share this post


Link to post
Share on other sites
Quote:
Original post by fagiano
Both systems have problems, the only memory model that has't any apparent problem is the linear model(linearity), but very few have researched in that direction and it forces the programmer to learn very different programming paradims than the commonly used in imperative languges.

Would you mind providing any references at all? I couldn't find anything on google.

Share this post


Link to post
Share on other sites
Quote:
Original post by Jotaf
I'm not too crazy about weak pointers. It forces you to think in terms of the limitations of the machine. Pointers can be quite a hassle sometimes, and maybe it's to much, having to think about how the GC works to chose between weak or strong pointers. It would be nice if the programmer's life was made easier.

I don't agree. Even in the absence of a need to think about machine limitations, it's useful to have the concept of "ownership". Weak pointers can be seen as non-owning refernces. That might sound a little like the tail wagging the dog, but I've found myself annotating certain references as "non-owning" in languages that do not support weak pointers, purely as a concept check.

Quote:
I just had an idea. What about if an object was forced to always have a strong pointer, but only one, and all the others were weak pointers?
You could think of it as if the first pointer, defined at the time of creation, was the "physical location" of the object. All others would be simple "arrows" pointing to it.

Well, that in a nutshell is the idea behind primary ownership. I would say, though, that there are instances where you want more than one owner.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!