• Announcements

    • khawk

      Download the Game Design and Indie Game Marketing Freebook   07/19/17

      GameDev.net and CRC Press have teamed up to bring a free ebook of content curated from top titles published by CRC Press. The freebook, Practices of Game Design & Indie Game Marketing, includes chapters from The Art of Game Design: A Book of Lenses, A Practical Guide to Indie Game Marketing, and An Architectural Approach to Level Design. The GameDev.net FreeBook is relevant to game designers, developers, and those interested in learning more about the challenges in game development. We know game development can be a tough discipline and business, so we picked several chapters from CRC Press titles that we thought would be of interest to you, the GameDev.net audience, in your journey to design, develop, and market your next game. The free ebook is available through CRC Press by clicking here. The Curated Books The Art of Game Design: A Book of Lenses, Second Edition, by Jesse Schell Presents 100+ sets of questions, or different lenses, for viewing a game’s design, encompassing diverse fields such as psychology, architecture, music, film, software engineering, theme park design, mathematics, anthropology, and more. Written by one of the world's top game designers, this book describes the deepest and most fundamental principles of game design, demonstrating how tactics used in board, card, and athletic games also work in video games. It provides practical instruction on creating world-class games that will be played again and again. View it here. A Practical Guide to Indie Game Marketing, by Joel Dreskin Marketing is an essential but too frequently overlooked or minimized component of the release plan for indie games. A Practical Guide to Indie Game Marketing provides you with the tools needed to build visibility and sell your indie games. With special focus on those developers with small budgets and limited staff and resources, this book is packed with tangible recommendations and techniques that you can put to use immediately. As a seasoned professional of the indie game arena, author Joel Dreskin gives you insight into practical, real-world experiences of marketing numerous successful games and also provides stories of the failures. View it here. An Architectural Approach to Level Design This is one of the first books to integrate architectural and spatial design theory with the field of level design. The book presents architectural techniques and theories for level designers to use in their own work. It connects architecture and level design in different ways that address the practical elements of how designers construct space and the experiential elements of how and why humans interact with this space. Throughout the text, readers learn skills for spatial layout, evoking emotion through gamespaces, and creating better levels through architectural theory. View it here. Learn more and download the ebook by clicking here. Did you know? GameDev.net and CRC Press also recently teamed up to bring GDNet+ Members up to a 20% discount on all CRC Press books. Learn more about this and other benefits here.
  • entries
    625
  • comments
    1446
  • views
    1006398

More adventures in garbage collection

Sign in to follow this  
Followers 0
ApochPiQ

588 views

One of my least favorite types of bug to unravel is the Ball of Yarn.

A Ball of Yarn is not just one bug. It's a large number of bugs, all in close proximity, which influence and interact with each other. If you pull on one of the threads, one of two things will happen: either you will slowly gain insight into the mess, or (more likely), you'll inadvertently tighten a knot.


Garbage collectors are a prime source of Ball of Yarn bugfests. So many subtle factors can influence them going off the rails that when a GC blows up the cause is probably highly nontrivial. Here's a summary of the threads I've yanked on this past week.


The Heisenbug
These are always scary. You have a bug, that's bad enough. But then you go to investigate it, and the closer you peer, the more the behavior changes. This is a classic recipe for disaster and requires immense patience and focus to solve.

I had one earlier that was classic bizarre-o-world: if I ran my entire unit test suite against the compiler/runtime (and, by extension, the garbage collection system), certain GC-related tests would fail. Predictably. Every time.

If I ran the failing tests individually, they would pass. Predictably. Every time.

Wat.

After scratching my head for a little while it occurred to me that a previous test in the suite must be setting up some kind of condition that affected the GC behavior. This was supposed to be impossible since I've gone to great lengths to isolate each unit test from the others.

Or so I thought.

Long story short: turns out global variables are bad, mmkay? No, but seriously. It wound up being a single innocuous global that did, in fact, alter the behavior of the GC indirectly. Resetting its value between unit tests fixed the Heisenbug.


Which way is up?
Epoch is compiled to an abstract machine language by the Ahead-of-Time compiler tools. This machine language is no longer spoken by anything, since I killed the VM that ran it a few months ago. However, it does serve as a first-pass form that is mechanically translated into LLVM bitcode and then JIT-compiled to native machine code when the program is loaded.

(Eventually I want to just serialize LLVM bitcode into the program image and load/JIT it directly instead of recreating it at runtime, but that's just an optimization for later.)

LLVM has a nice feature which is integral to doing precise garbage collection, known as stack maps. A stack map is simply a data structure that describes where certain variables live on the machine stack at a given moment in time.

This is vital because stack mapping allows us to know what roots exist in the program. Roots are the variables that, via traversing pointers/references, lead us eventually to all the valid objects that are allocated and need to be kept alive instead of recycled by the garbage collector. Without roots, our option is to do conservative garbage collection (i.e. if it looks like it might be a pointer to a legitimate object, treat it as one, even if it's just a lucky-shaped int or something). Conservative collection is gross and I refuse to do it for Epoch, so stack mapping it is!

Unfortunately, LLVM's documentation around how stack maps work is... lacking, to be extremely gracious.

I spent a lot of time this week waffling on how to actually use the data in a stack map from LLVM. Turns out that if a certain number is positive, you interpret it one way; and if it's negative, you interpret it an entirely different way.

Of course none of this is written down anywhere and trying to unravel the LLVM source to understand it was less than helpful. (Bonus points: another addition to the "less than helpful" camp is the LLVM community itself, which seems to treat any work on garbage collection as some kind of plague.)

Bottom line, it was hard enough to figure out my first semi-working attempt at understanding stack map data. Second-guessing myself all week did not help, and it turns out I was probably right the first time.


Validation is important, kids
A couple other fun bugs arose from not validating data correctly. The GC formerly had a few places where, if it couldn't recognize the type of an object, would just punt and assume it was legit and had some arbitrary type. This is a terrible idea.

What would happen is the GC would start traversing memory, and come across an object it didn't recognize. It would then dutifully add that to a list of things to go back and check up on later. In some cases, the bogus data would just be ignored, and everything would continue on as if nothing had happened. In other cases, the result was silent failures to correctly mark live objects, which would then be swept by the GC and destroyed while the program was still using them. Oops.

This kind of nonsense accounted for not one but three semi-independent bugs in the GC that I fixed over the past week.


FPO is evil
I discussed this in a previous entry, but the bottom line is that some compiler optimizations aren't worth the trouble. FPO is one of them.

I managed to work around the problem by introducing the concept of a "bookmark." A bookmark is a simple list of all the times we exit an Epoch stack frame and enter a foreign stack frame, such as when calling Win32 API functions. During garbage collection, we traverse stack frames that have been bookmarked in addition to any frames we can find using standard frame-pointer walking. This covers the gaps very effectively, at a small performance penalty. I'm planning on revisiting this someday and cleaning up the cases where bookmarks are stored so I can reclaim some of that speed.


But wait, there's more!
As of now, there's still at least one bug left in the GC. It either relates to algebraic sum types, variadic functions, or both. I'm not entirely sure how to resolve this last one, aside from just hammering away at it for a few days and hoping for the best.


Despite all this, there's definite and clear forward progress being made. I hope to wrap this up soon-ish and return to the self-hosting compiler project, although I have to admit it's been an entertaining diversion.

4
Sign in to follow this  
Followers 0


1 Comment


I spent a couple of hours staring at those damn stack maps, and pretty much decided it wasn't worth my time to figure out how to use them. Good for you, figuring them out.

0

Share this comment


Link to comment

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now