Self-hosting the Epoch Compiler: Day Two

Posted by ApochPiQ, 11 December 2013 · 409 views

Epoch language design
A large number of the errors emitted by attempting to self-host the compiler have turned out to be caused by a relatively small number of bugs.

Hex literals had no support at all in the compiler, so I added that and crushed a bunch of errors. I forgot to special-case 0, though, so anything that evaluated to 0 was not treated as a number (the compiler assumed it was a string, since it couldn't be "cast" into a non-zero number). That was easy enough to fix.
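For illustration, here is a rough sketch of that zero bug in Python rather than Epoch. The function names are made up for this example and are not taken from the compiler; the only assumption is that the original check treated a conversion result of 0 as "conversion failed".

```python
def to_number_or_zero(token: str) -> int:
    """Hypothetical C-style conversion: returns 0 when the token is not numeric."""
    try:
        return int(token, 0)  # base 0 accepts decimal ("42") and hex ("0x2A") forms
    except ValueError:
        return 0


def classify_literal_buggy(token: str) -> str:
    # Bug: a result of 0 is indistinguishable from "could not convert",
    # so a literal that really is zero gets classified as a string.
    return "number" if to_number_or_zero(token) != 0 else "string"


def classify_literal_fixed(token: str) -> str:
    # Fix: decide based on whether the conversion succeeded, not on its value.
    try:
        int(token, 0)
    except ValueError:
        return "string"
    return "number"
```

With these definitions, `classify_literal_buggy("0")` reports a string while `classify_literal_fixed("0")` reports a number; non-zero literals behave the same in both.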

The next big bug was a pretty painful face-palm moment. The type checker would permit certain combinations of types when algebraic sum types are involved (correctly) but it wouldn't consider any particular sum type to be compatible with itself. Derp. Another easy fix.
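A minimal sketch of what that kind of bug can look like, again in Python and with invented names (`SumType`, `compatible_buggy`, `compatible_fixed`); the real type checker is certainly structured differently. Base types are plain strings here, and a sum type is just a named set of member types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumType:
    name: str
    members: frozenset  # the types this sum type can hold


def compatible_buggy(expected, actual) -> bool:
    # Accepts any member of the sum type, but never checks whether
    # `actual` is the very same sum type as `expected`.
    if isinstance(expected, SumType):
        return actual in expected.members
    return expected == actual


def compatible_fixed(expected, actual) -> bool:
    # Identity check first: every type is compatible with itself.
    if expected == actual:
        return True
    if isinstance(expected, SumType):
        return actual in expected.members
    return False
```

Given `int_or_string = SumType("int_or_string", frozenset({"integer", "string"}))`, the buggy check accepts `"integer"` but rejects `int_or_string` itself; the fixed check accepts both.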


Running the compiler is a major chore. Parsing alone takes 40 seconds or so, and semantic analysis takes many minutes to run each time I want to attempt a self-hosting pass. Progress is slowed primarily by this fact, and also by my incessant distractibility that leads me to do other things while waiting on the compiler... and then I forget where I was... and so on. It's exactly the annoyance of long compile times that made me hate C++ in the first place.


I still suspect there's a bug in templated sum types, but it's hard to pin down. I may need to write some new test cases to figure out exactly what causes the type checker to barf on template instances.


Just for the hell of it, I attached a profiler to the running compiler. Turns out that garbage collection is still dominating execution time, by a huge margin. I suspect that each invocation of the garbage collector is taking many seconds to complete, and the semantic analyzer generates a lot of garbage, so it stands to reason that eventually the compilation process would become painfully slow. I remain hopeful that I can get the compiler significantly faster just by tuning the garbage collector implementation.


I borked the case where hex literals could be greater than INT_MAX (i.e. if they're unsigned), so that took another re-run to get fixed. Thankfully it's all easy stuff to tweak at this point.
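As a sketch of the range check involved, assuming 32-bit integers and using placeholder type names ("integer" and "unsigned" are stand-ins for illustration, not confirmed Epoch type names):

```python
INT_MAX = 2**31 - 1    # assumes a 32-bit signed integer type
UINT_MAX = 2**32 - 1   # assumes a 32-bit unsigned integer type


def hex_literal_type(token: str):
    """Pick a type for a hex literal such as "0xFFFFFFFF" (hypothetical helper)."""
    value = int(token, 16)  # base 16 accepts an optional "0x" prefix
    if value <= INT_MAX:
        return ("integer", value)
    if value <= UINT_MAX:
        # Too large for the signed type, but still a valid unsigned literal.
        return ("unsigned", value)
    raise ValueError(f"hex literal {token} does not fit in 32 bits")
```

Under these assumptions, `0x7FFFFFFF` stays a signed integer while `0x80000000` and above are typed as unsigned instead of being rejected or truncated.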

I feel like an idiot at the moment; a couple of times I've "mysteriously" lost work from the Era IDE. Originally I suspected some deep bug, but it turns out that I'm just stupid. I was trying to save a file that was open in another process, and the Era implementation doesn't report failed saves at all yet, so I blindly assumed that the save was OK and closed the IDE, clobbering all my work. This should be easy enough to avoid though, as long as I pay attention.

Or maybe I could just fix the damn IDE bug.


Getting impatient with the slow compiles, so I'm building a version of the runtime that totally disables the garbage collector, just to see how much difference it makes...

... aaaaand OUCH.

Without GC, the compiler parses in 2.8 seconds (versus 40!) and then goes on to do more semantic validation in a couple of seconds than it used to do in several minutes.
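The same kind of experiment can be sketched in Python, purely as an analogy: this uses CPython's `gc` module, not the Epoch runtime, and the workload (`build_garbage`) is a made-up stand-in for an allocation-heavy compiler pass. The numbers it produces say nothing about Epoch; it only shows the shape of the measurement.

```python
import gc
import time


def build_garbage(n: int) -> int:
    """Allocate many small container objects and keep them alive for the run."""
    nodes = []
    for i in range(n):
        nodes.append({"id": i, "children": [i, i + 1]})
    return sum(len(node["children"]) for node in nodes)


def timed(fn, *args, collect: bool):
    """Run fn with the cyclic collector on or off; return (result, seconds)."""
    was_enabled = gc.isenabled()
    if collect:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        result = fn(*args)
        return result, time.perf_counter() - start
    finally:
        # Restore whatever collector state the caller had.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()


if __name__ == "__main__":
    _, with_gc = timed(build_garbage, 1_000_000, collect=True)
    _, without_gc = timed(build_garbage, 1_000_000, collect=False)
    print(f"GC on: {with_gc:.3f}s, GC off: {without_gc:.3f}s")
```

Running both variants on the same input and comparing wall-clock time is enough to tell whether the collector is worth tuning before reaching for a full profiler.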


This is revealing, and important. I'll have to do more work on the GC soon.

For the meantime, though, it looks like there are still some miscompiles to deal with. More interestingly, there's a crash in the compiler when it gets far enough along; something doesn't like what it's being fed. The crash also occurs with GC enabled (it just takes 10 minutes to get there), so I'm confident that it's not a red herring. It'll take time to pin down though, because the sheer size of the compiler source makes it hard to accurately determine where things are blowing up.


I had a couple of leads on possible explanations for the crash, but they all turned out to be dead ends. So I'm pretty stumped, although prior error messages might be informative (there are some type checker failures that are bizarre).

And naturally it turns out that fixing the type checker causes the crash to stop. I suppose I should be happy.

Now I'm getting an assertion failure because the type "0" doesn't exist. Well, duh. But what's asking for something of type 0?


Setting that aside for a moment, I wanted to start at the top of the compilation error list and start looking for other bugs to fix. The first one was interesting: it's not actually a bug in the Epoch compiler. It's a bug in the C++ version of the compiler, which happens to break type safety, but it silently works by sheer accident!

So it seems that rewriting the compiler is already a win, because it caught some type checking bugs that I probably never would have found otherwise.


After a handful of other little fixes and improvements, I'm seeing a dramatically smaller set of error spam, which is encouraging. For now, though, I think it's time to call it a night.




So I totally fail at quitting things on time.

Managed to get a bunch of extra bugs worked out, and finally have the compiler completing semantic analysis on itself. It still fails, mind you, but it's at least analyzing the entire program. It takes about 48 seconds to do it, which isn't half bad considering the algorithmic stupidity of most of the implementation.

Optimizing this sucker is going to be entertaining...
