Improving memory allocation patterns in the Epoch compiler


As part of my ongoing effort to make the Epoch compiler idiotically fast, I've turned my attention to one of the primary killers: dynamic memory allocations.

Part of this is unavoidable, since dynamic memory has to be allocated when constructing the parse tree. But for some reason, I kept seeing really large numbers of allocations in places that just didn't warrant them. My first hunch was that I just needed to replace the allocator with something faster; so I wrote a quick linear allocator that would just spam allocations into a massive, pre-sized buffer, and then throw the whole thing away after compilation finishes.
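For readers who haven't met the idea, a linear (bump) allocator can be sketched roughly as follows. This is a minimal illustration and not the Epoch compiler's actual allocator; the `LinearArena` name, its interface, and the throw-on-exhaustion policy are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical bump allocator: hands out slices of one pre-sized buffer and
// never frees individual allocations. Everything is released at once when the
// arena is reset or destroyed.
class LinearArena {
public:
    explicit LinearArena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), offset_(0) {}

    // Returns `size` bytes aligned to `align`, which must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.get());
        const std::uintptr_t current = base + offset_;
        // Round the current address up to the next multiple of `align`.
        const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;
        const std::size_t newOffset = static_cast<std::size_t>(aligned - base) + size;
        if (newOffset > capacity_) {
            throw std::bad_alloc();  // assumption: a fixed buffer that never grows
        }
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Freeing a single allocation is deliberately a no-op.
    void deallocate(void*) noexcept {}

    // Throws the whole thing away in one step.
    void reset() noexcept { offset_ = 0; }

    std::size_t used() const noexcept { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_;
};
```

Each allocation costs a little pointer arithmetic and a bounds check, which is why this looks like an easy win. The catch is that nothing is reclaimed until `reset()`, so peak memory use can only go up.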

After some hiccups, I got the whole thing to build (which takes forever now) and ran the first test. Paradoxically, it made no difference to execution speed, and it increased memory usage far more than it should have. Worse, I'm running only release-mode builds, because debug builds literally can't cope with my sample input in less than an hour or so. (That's how much excess cruft qi generates... eurgh.) This means that my profiler can't see true call stacks, but rather has to cope with the maze of semi-collapsed templated inline calls that fill the parser.

It took forever, therefore, to make a simple discovery that's had me kicking myself for several minutes now.

The AST structure for Epoch makes heavy use of boost::variant. The AST is also self-recursive, meaning that several nodes are defined in terms of themselves. This necessitated (at one point) the use of boost::recursive_wrapper<> to ensure that the compiler could figure out the circular definitions.

That requirement went away a long time ago when I started doing deferred construction of AST nodes using my own wrapper template; but I never updated all of the uses of recursive_wrapper, because the wrapper itself never showed up in profiling. All I could see were the allocation calls.

Out of random curiosity and frustration, I finally cracked open the code for recursive_wrapper to see how it does its magic.

Yep, you guessed it: it dynamically allocates the contained type.

That means that I'm not only allocating memory for the AST nodes, I'm allocating memory for the deferred variant that points to that AST node - and adding a level of pointless indirection to boot.
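To make the hidden cost concrete, here is a simplified stand-in for the pattern recursive_wrapper uses: it owns a heap-allocated copy of the wrapped type. This is not Boost's code; `HeapWrapper` and `Identifier` are hypothetical names for illustration, and the per-class `operator new` exists only to count allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Simplified stand-in for a recursive_wrapper-style holder: the wrapped value
// always lives on the heap, so every construction or copy allocates.
template <typename T>
class HeapWrapper {
public:
    HeapWrapper() : ptr_(new T()) {}
    explicit HeapWrapper(const T& value) : ptr_(new T(value)) {}
    HeapWrapper(const HeapWrapper& other) : ptr_(new T(*other.ptr_)) {}
    HeapWrapper& operator=(const HeapWrapper& other) {
        *ptr_ = *other.ptr_;  // reuse the existing allocation
        return *this;
    }
    ~HeapWrapper() { delete ptr_; }

    T& get() { return *ptr_; }              // one extra indirection per access
    const T& get() const { return *ptr_; }

private:
    T* ptr_;
};

// Hypothetical AST leaf that counts how often it is allocated on the heap.
struct Identifier {
    int symbolId;
    static std::size_t heapAllocations;

    explicit Identifier(int id = 0) : symbolId(id) {}

    static void* operator new(std::size_t size) {
        ++heapAllocations;
        return ::operator new(size);
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
};
std::size_t Identifier::heapAllocations = 0;
```

Constructing an `Identifier` on the stack leaves the counter untouched, while wrapping it in `HeapWrapper<Identifier>` bumps it once, and copying the wrapper bumps it again. In a parser that builds and copies variants constantly, those add up quickly.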

The good news is, removing recursive_wrapper isn't too hard, and speeds things up, although just a tiny bit.

The bad news is, now I have a rampant memory leak, and I'm utterly stumped as to why.

One step forward, thirty steps back, it seems.

Jul 20 2011 01:10 AM

Rule Number One of writing a linear allocator in C++: you still have to manually invoke destructors even if freeing the memory is a no-op.
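A minimal sketch of that rule, assuming objects are built with placement new into storage that is never individually freed. The `NamedNode` type and the `constructIn`/`destroyIn` helpers are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical node that owns memory outside the arena: the std::string may
// hold its own heap buffer, which only ~NamedNode() will release.
struct NamedNode {
    std::string name;
    static int liveCount;

    explicit NamedNode(const std::string& n) : name(n) { ++liveCount; }
    ~NamedNode() { --liveCount; }
};
int NamedNode::liveCount = 0;

// Constructs a T in caller-provided storage (for example, a slice handed out
// by a linear allocator).
template <typename T, typename Arg>
T* constructIn(void* storage, const Arg& arg) {
    return ::new (storage) T(arg);
}

// Runs the destructor without freeing the storage. Skipping this call leaks
// whatever the object owns, even though the arena memory itself is reclaimed
// wholesale later.
template <typename T>
void destroyIn(T* object) noexcept {
    object->~T();
}
```

Dropping the buffer without calling `destroyIn` on each live object returns the arena's bytes but strands anything those objects allocated elsewhere, which looks exactly like a rampant leak.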
Jul 20 2011 01:15 AM
So what are the parse times down to now?
And for the record, how long exactly does it take to compile the compiler?
Jul 20 2011 01:25 AM
Compiling the compiler from scratch takes about 25 minutes on my laptop, which is what I've been doing all my work on recently. I really need a beefier laptop... or I need to get off my lazy ass and unpack my Core i7 workstation that I used to do all of this on before I moved.

Anyways! Parse times are hovering around 15ms, which isn't really much of an improvement considering how much work I've sunk into the last few tweaks, so obviously I've run smack into the wall of diminishing returns and it's questionable how much more effort is justified.

But I'm a total addict, so I'll probably run a few more profiles on it and see if there's any stragglers of sluggishness that I can banish.
Jul 20 2011 01:31 AM
You know, I'm pretty sure that a few years ago you yelled at me for getting too caught up in premature optimizations... I think you should take that piece of advice! I want to see more features, dagummit! But geez, what's up with those compile times? Are you building Boost too??? That's crazy!

Anyways, can you please get back to adding awesome-ness-tic features to the language :) Pretty please!? KTHX.
Jul 20 2011 01:51 AM
Hey, this is hardly premature - if anything it's long overdue. The Epoch compiler has been painfully slow for way too long, and now that I'm dogfooding it pretty heavily with Era development, I want my iteration times to be faster. This is also groundwork for adding a realtime syntax highlighting component to Era that does things like color variable and function names automatically. (The current syntax highlighter is totally unaware of the semantics of Epoch programs.)

So yeah. This is all to enable the next round of juicy features. The faster I can compile Epoch programs, the faster I can churn out new features for those same programs :-)
