Timing my garbage collector

Started by
10 comments, last by Zotoaster 14 years ago
Hi all. I'm making myself a software platform - call it for practice, lol. Anyway, I've made a simple mark-and-sweep garbage collector, which works fine. My only concern is when it should be run. I don't have any particular heap reserved just for the platform. I use a vector to keep a list of all objects, and then just use 'new' and 'delete' to do the memory operations. I'm not sure when the ideal time is to run the GC. At first I thought it would be ok just to run it on certain intervals, but consider this problem: it allocates some memory, and it's *just* about to execute an 'assign' operation. But before that happens, it just happens to do a garbage collection. Well, in this case the object wont be marked, so it will be deleted, then it will try to do the assignment. Anyone know a solution to this problem? Thanks
Advertisement
Just to clarify, are you trying to make a collector that works for the C++ language, or for another language whose runtime you are implementing in C++?

Quote:
I'm not sure when the ideal time is to run the GC. At first I thought it would be ok just to run it on certain intervals, but consider this problem: it allocates some memory, and it's *just* about to execute an 'assign' operation. But before that happens, it just happens to do a garbage collection. Well, in this case the object wont be marked, so it will be deleted, then it will try to do the assignment.


The answer to this depends on a lot of factors. Can you show what your GC system looks like? Some example code, I mean.

Incidentally, there are other ways of doing resource management in C++ that are still automatic. For many things they're better than GC too, especially where deterministic destruction is desirable.
It's a collector for a runtime I'm making, using C++, yeah.

My code is extremely simple.
This is basically what it looks like for the object class:
void Mark(){    Marked = true;    for (int f = 0; f < Size; f++)        if (!Field[f]->Marked)            Field[f]->Mark();}



Then, for the list of objects:
// Mark from all rootsfor (size_t r = 0; r < roots.size(); r++)    roots[r]->Mark();// Loop through objects, remove unmarked onesfor (size_t o = 0; o < objects.size(); o++){    if (objects[o]->Marked)        objects[o]->Marked = false;    else    {        delete objects[o];        objects.erase(objects.begin() + o);        o--;    }}



It's that simple really.
Quote:Original post by Zotoaster
It's a collector for a runtime I'm making, using C++, yeah.


So in that case, can your runtime determine when an assignment is in progress (so to speak)?

Or can it be made to run only between statements, rather than between instructions?
Well it's just a bytecode processor, so yeah, it's possible to know when an assignment is happening, but there's no way of telling when a statement ends or not, and I don't want to waste valuable time by putting a NOP between where all statements should be.
I don't know what your bytecode or objects look like, but there are a few ways out.

* Your object shouldn't be in the object list until it's constructed and referenced for the first time.
* Assignment should happen as early as possible, before you even fully construct it, but that depends on what your bytecode looks like.
* Your GC can be cooperative; insert GC-yield bytecodes into your methods at appropriate places. Not my favourite strategy but it's actually used.
Thanks for the responses. I finally realised a way to solve it. Operations are done using stack-based operations (think RPN). As soon as an object's created it's pushed onto the operand stack, so that can count as a root. It's a little slower obviously having to scan the entire stack (for every method in the call stack as well), but it's worth it.
Quote:Original post by Zotoaster
Then, for the list of objects:
*** Source Snippet Removed ***


It's that simple really.


!

Is that container a std::vector or a std::list? If it's a list, you're paying for 2 pointers of overhead for each object-pointer in the objects list. If it's a vector, you're performing an O(N) operation for each erasure in the middle, needlessly (you could instead, for example, swap-and-pop; or null the pointer and perform a single erase-remove sweep for nulls at the end).

Either way, you don't appear to be doing any compacting afterwards - which involves moving the objects around in memory and thus reassigning pointers. Unless, I suppose, you do use a std::list for the object-pointer list, and the field pointers actually point into that list instead of directly to objects (i.e. are doubly-indirected handles). But then you still have to worry about how the list nodes themselves are allocated, or else you may just miss the point of doing the compacting if you aren't careful :)
Couple of possible suggestions:

  • Hint the GC to run whenever you exit a scope

  • Allow programmers to control when the GC collects (note that this is not mutually exclusive with automatic collection!)

  • Examine your loop logic very, very carefully. What happens if you try to free cell #0 in your current implementation, for example?

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

ApochPiQ, thanks very much for the useful tips. I will be thinking about those in that case!

Zahlman,

I'm using an std::vector. The difficulty with compacting is that I don't keep track of the references to the objects, so, if I move any memory around, I need to also re-point the pointers, which I don't obviously have access to.

This topic is closed to new replies.

Advertisement