Sign in to follow this  
Zotoaster

Timing my garbage collector

Recommended Posts

Hi all. I'm making myself a software platform - call it for practice, lol. Anyway, I've made a simple mark-and-sweep garbage collector, which works fine. My only concern is when it should be run. I don't have any particular heap reserved just for the platform. I use a vector to keep a list of all objects, and then just use 'new' and 'delete' to do the memory operations. I'm not sure when the ideal time is to run the GC. At first I thought it would be ok just to run it on certain intervals, but consider this problem: it allocates some memory, and it's *just* about to execute an 'assign' operation. But before that happens, it just happens to do a garbage collection. Well, in this case the object wont be marked, so it will be deleted, then it will try to do the assignment. Anyone know a solution to this problem? Thanks

Share this post


Link to post
Share on other sites
Just to clarify, are you trying to make a collector that works for the C++ language, or for another language whose runtime you are implementing in C++?

Quote:

I'm not sure when the ideal time is to run the GC. At first I thought it would be ok just to run it on certain intervals, but consider this problem: it allocates some memory, and it's *just* about to execute an 'assign' operation. But before that happens, it just happens to do a garbage collection. Well, in this case the object wont be marked, so it will be deleted, then it will try to do the assignment.


The answer to this depends on a lot of factors. Can you show what your GC system looks like? Some example code, I mean.

Incidentally, there are other ways of doing resource management in C++ that are still automatic. For many things they're better than GC too, especially where deterministic destruction is desirable.

Share this post


Link to post
Share on other sites
It's a collector for a runtime I'm making, using C++, yeah.

My code is extremely simple.
This is basically what it looks like for the object class:

void Mark()
{
Marked = true;
for (int f = 0; f < Size; f++)
if (!Field[f]->Marked)
Field[f]->Mark();
}




Then, for the list of objects:

// Mark from all roots
for (size_t r = 0; r < roots.size(); r++)
roots[r]->Mark();

// Loop through objects, remove unmarked ones
for (size_t o = 0; o < objects.size(); o++)
{
if (objects[o]->Marked)
objects[o]->Marked = false;
else
{
delete objects[o];
objects.erase(objects.begin() + o);
o--;
}
}




It's that simple really.

Share this post


Link to post
Share on other sites
Quote:
Original post by Zotoaster
It's a collector for a runtime I'm making, using C++, yeah.


So in that case, can your runtime determine when an assignment is in progress (so to speak)?

Or can it be made to run only between statements, rather than between instructions?

Share this post


Link to post
Share on other sites
Well it's just a bytecode processor, so yeah, it's possible to know when an assignment is happening, but there's no way of telling when a statement ends or not, and I don't want to waste valuable time by putting a NOP between where all statements should be.

Share this post


Link to post
Share on other sites
I don't know what your bytecode or objects look like, but there are a few ways out.

* Your object shouldn't be in the object list until it's constructed and referenced for the first time.
* Assignment should happen as early as possible, before you even fully construct it, but that depends on what your bytecode looks like.
* Your GC can be cooperative; insert GC-yield bytecodes into your methods at appropriate places. Not my favourite strategy but it's actually used.

Share this post


Link to post
Share on other sites
Thanks for the responses. I finally realised a way to solve it. Operations are done using stack-based operations (think RPN). As soon as an object's created it's pushed onto the operand stack, so that can count as a root. It's a little slower obviously having to scan the entire stack (for every method in the call stack as well), but it's worth it.

Share this post


Link to post
Share on other sites
Quote:
Original post by Zotoaster
Then, for the list of objects:
*** Source Snippet Removed ***


It's that simple really.


!

Is that container a std::vector or a std::list? If it's a list, you're paying for 2 pointers of overhead for each object-pointer in the objects list. If it's a vector, you're performing an O(N) operation for each erasure in the middle, needlessly (you could instead, for example, swap-and-pop; or null the pointer and perform a single erase-remove sweep for nulls at the end).

Either way, you don't appear to be doing any compacting afterwards - which involves moving the objects around in memory and thus reassigning pointers. Unless, I suppose, you do use a std::list for the object-pointer list, and the field pointers actually point into that list instead of directly to objects (i.e. are doubly-indirected handles). But then you still have to worry about how the list nodes themselves are allocated, or else you may just miss the point of doing the compacting if you aren't careful :)

Share this post


Link to post
Share on other sites
Couple of possible suggestions:

  • Hint the GC to run whenever you exit a scope

  • Allow programmers to control when the GC collects (note that this is not mutually exclusive with automatic collection!)

  • Examine your loop logic very, very carefully. What happens if you try to free cell #0 in your current implementation, for example?

Share this post


Link to post
Share on other sites
ApochPiQ, thanks very much for the useful tips. I will be thinking about those in that case!

Zahlman,

I'm using an std::vector. The difficulty with compacting is that I don't keep track of the references to the objects, so, if I move any memory around, I need to also re-point the pointers, which I don't obviously have access to.

Share this post


Link to post
Share on other sites
Nope. I don't keep track of free space. All the allocation and deallocation of objects is just done using 'new' and 'delete'.

Consider when an object is made:


case OP_NewObject:
{
CObject* obj = new CObject();
_objectList.push_back(obj);
break;
}


That's roughly how it happens.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this