Yeah, I'm rewriting my compiler again.

Posted by ApochPiQ, 10 July 2011 · 270 views

Release 10 of Epoch was a big deal, because it represented a complete overhaul of the language implementation - pretty much everything was redone, from the parser grammars on up to the standard library implementation. It took a long time, dramatically cut back the number of features that still work, and generally destroyed more brain cells than I care to think about.

So, it may seem unusually stupid to be rewriting the Epoch compiler again as the core work behind R12. After all, R11 is barely out the door, and surely there's time to do something like, oh, say, make the language a bit more powerful?

Unfortunately, adding more features and richness to the language has hit a point of diminishing returns - not because of the language's completeness (which is not to say it's anywhere near complete), but rather because of iteration time. It's gotten to the point where compiling even the relatively trivial Era source code takes about 30 seconds. This is not acceptable.

R12 will, therefore, feature a rewritten compiler. The VM and remainder of the toolchain will remain in place; all that's really going away is the parser and the AST decoration layer.

Here's an executive summary of the changes coming in R12:

  • Replacing boost::spirit::classic with something faster
  • Replacing the extensively ad hoc code generation mechanism with a proper AST decoration layer
  • Optimizing certain elements of the compiler chain for speed and memory usage
  • Implementing separate compilation to further reduce program compile times

Right now I'm playing with boost::spirit::qi, the successor to spirit classic; it reports a roughly 20x speed increase in certain small test cases, but has some downsides. In particular, I can't seem to reliably compile the parser using qi, because it generates so much code that Visual C++ literally reports an out-of-memory error. Moreover, I don't know how well qi scales to nontrivial grammars like that used by Epoch.

So there's a good chance I may bite the bullet and go deploy a more traditional parser generation framework for the next release, because I can't afford to have my tools crapping themselves on relatively straightforward code. I really don't want to do this, because it'll make dynamic parsing a righteous pain in the ass, but what has to be done has to be done.

I hope to improve the ~30 second build time for Era to under a second. That ought to be enough for the next Epoch release, I think!




So I got qi working by decomposing the grammar into dozens of tiny little rules; this seems to make the compiler not barf, but build times are still measured in minutes for the Epoch compiler, which is kind of disappointing.

On the plus side, I'm a good bit down the road of generating a proper AST representation using qi and fusion, which is actually going fairly well and hasn't been much of a pain. So far function definitions are being parsed correctly and the AST is populated with them, but there's no support for statements or general expressions or entity calls yet. So obviously lots of work left to do.

I had dimly hoped, at one point, to get the compiler running again by the end of the weekend; but with the complete redesign of the AST generation mechanism, it'll take the rest of today just to reach a point where the AST is getting populated to begin with. After that I'll have to rewrite all the decoration logic, the semantic checking logic, and the code generation subsystem.

Blargh. Obviously this will end up being the multi-week ordeal that I dreaded it would become.

Statements are in, and I'm currently waiting on the last compile that should fix up the remaining issues with general expressions. Entity calls will probably have to wait until later this week, as I'm completely exhausted and really only forcing myself to stay awake until I can get expression parsing done.

In other news, my head is killing me.

I love the irony that build times for Epoch are measured in ... epochs. ;)

I am not exactly sure how you are doing things currently, as I have not been keeping close track of your project. By the sounds of it, if I understand correctly, you are going straight from source into an AST. If so, there is a lot of overhead there. One solution would be to use a recursive parser to walk the syntax and generate an intermediate form that is then assembled into the AST. I am not sure whether that is what qi is doing for you.

The issue is that Boost in general is very generic. You know your syntax better than Boost does, so if you want the compile time to go down, you will need to drop Boost and write your own parser by hand. That lets you make assumptions based on your syntax that qi will never be able to make. Those assumptions are what will drop your compile time dramatically. It may take a long time to get right, but it leaves more room for optimization down the road.
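
To make the hand-written approach suggested above a bit more concrete, here is a minimal sketch of a recursive descent parser in C++ that builds a small AST using only the standard library. The grammar is a toy arithmetic one, not Epoch's, and `Node`, `Parser`, `evaluate`, and `evaluateSource` are hypothetical names made up for illustration. Nothing here reflects how the Epoch compiler is actually structured.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical AST node for a toy expression grammar (not Epoch's real AST).
struct Node {
    char op;                         // '+', '-', '*', '/', or 'n' for a number literal
    long value;                      // only meaningful when op == 'n'
    std::unique_ptr<Node> lhs, rhs;  // only set for binary operators
};

// Toy grammar (an assumption for illustration):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class Parser {
public:
    explicit Parser(std::string text) : src_(std::move(text)) {}

    std::unique_ptr<Node> parse() {
        auto root = expr();
        skipSpace();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return root;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    bool atDigit() const {
        return pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_]));
    }

    void skipSpace() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Consume `c` if it is the next non-space character.
    bool accept(char c) {
        skipSpace();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    static std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
        auto n = std::make_unique<Node>();
        n->op = op;
        n->lhs = std::move(l);
        n->rhs = std::move(r);
        return n;
    }

    std::unique_ptr<Node> expr() {
        auto left = term();
        for (;;) {
            if (accept('+')) left = binary('+', std::move(left), term());
            else if (accept('-')) left = binary('-', std::move(left), term());
            else return left;
        }
    }

    std::unique_ptr<Node> term() {
        auto left = factor();
        for (;;) {
            if (accept('*')) left = binary('*', std::move(left), factor());
            else if (accept('/')) left = binary('/', std::move(left), factor());
            else return left;
        }
    }

    std::unique_ptr<Node> factor() {
        if (accept('(')) {
            auto inner = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return inner;
        }
        if (!atDigit()) throw std::runtime_error("expected number");
        auto n = std::make_unique<Node>();
        n->op = 'n';
        while (atDigit()) n->value = n->value * 10 + (src_[pos_++] - '0');
        return n;
    }
};

// Walk the finished AST; a real compiler would decorate it and emit code instead.
long evaluate(const Node& n) {
    if (n.op == 'n') return n.value;
    assert(n.lhs && n.rhs);
    const long l = evaluate(*n.lhs);
    const long r = evaluate(*n.rhs);
    switch (n.op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        case '/':
            if (r == 0) throw std::runtime_error("division by zero");
            return l / r;
    }
    throw std::logic_error("unknown node type");
}

// Convenience wrapper: parse a string and evaluate the resulting tree.
long evaluateSource(const std::string& text) {
    return evaluate(*Parser(text).parse());
}
```

With this sketch, `evaluateSource("1 + 2 * 3")` yields 7, since `term` binds tighter than `expr`. Because there is no template-heavy grammar machinery, a file like this should compile quickly, though the price is that every grammar change has to be made by hand.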