Sign in to follow this  
Aardvajk

Mark and Sweep GC - convert recursion to iteration

Recommended Posts

I must be being a bit thick at the moment, but I'm thinking about implementing a very simple stop-the-world mark-and-sweep garbage collection system for my scripting language and I can't for the life of me figure out how to turn the mark phase from a recursive approach into an iteration.

Recursive is trivial. Something like (not real code):

class Entity
{
public:
    std::vector<Entity*> refs;
    bool marked;
};

void mark(Entity *entity)
{
    entity->marked = true;
    for(auto &r: entity->refs)
    {
        if(!r->marked) mark(r);
    }
}

void gc(std::vector<Entity*> roots)
{
    // all entities mark set to false
    for(auto &r: roots)
    {
        mark(r);        
    }
    
    for(auto &r: heap)
    {
        if(!r->marked) reclaim(r);
    }
}
Running out of stack space isn't really a concern for my little toy language, but I believe it is always possible to convert a recursion to an iteration so was wondering if anyone could help me stumble over this little mental obstacle?

By the way, I am aware there are better algorithms for GC than this naive approach. I'm not worried about that at the moment, just want to get a basic system working first so I have something to profile against when I improve it later.

Thanks.

Share this post


Link to post
Share on other sites

How about using a stack to replace the recursion? 

void gc(std::vector<Entity*> roots)

{



   // all entities mark set to false

  std::stack<Entity*> entities(roots);

   while (!entities.empty())

   {

      Entity* entity = entities.top ();

      entities.pop();



      if(!entity->marked)

      {

         entity->marked = true;

         // push entities->refs to the stack

      }

   }



   for(auto &r: heap)

   {

      if(!r->marked) reclaim(r);

   }

Share this post


Link to post
Share on other sites

No, you need something to store 'this node needs further exploration'. If you really want to get rid of the stack, you can of course make a single linked list through the nodes, but that's basically distributing your stack over the nodes.

In essence, you're doing breadth-first path-finding to all reachable locations :)

Share this post


Link to post
Share on other sites

Although not helpful to your work in a modern environment.... If you want to take a real head trip, look back at how Smalltalk-80 did its mark pass in constant stack and heap space, by temporarily rewriting objects' fields as it went to become pointers back to their "parent" object (along the discovery path, that is), thus requiring only a constant amount of space (on the stack, during garbage collection) to track the nearest traversal parent, current object under inspection, and the current object-field index under inspection.

http://www.mirandabanda.org/bluebook/bluebook_chapter30.html#GarbageCollection30

 

Followup: also, in my personal experience, for a typical toy scripting language it is easier to write a generational garbage collector (don't mark & sweep; instead, move survivors to a separate clean heap space, leaving behind a breadcrumb saying "I was moved to gen Y offset X") than to write a two-pass mark/sweep collector, and it produces a zero-fragmentation memory model as well (which can be useful in producing error dumps or other debug checkpointing).

Edited by Wyrframe

Share this post


Link to post
Share on other sites

Thanks Zipster. Of course then the memory would be on the heap rather than the stack which is better.

I was wondering though if this was possible without using a stack at all? Perhaps not.

 

You could also use an allocator which attempts to use stack memory before heap memory, e.g. : https://chromium.googlesource.com/chromium/chromium/+/master/base/stack_container.h

That way you get the faster behaviour as long as you don't have too many nodes to handle each time.

You could also get a similar effect by allocating a fixed size array and using that as your stack, but calling out to the old recursive version if you run out of stack space.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this