
The Bag of Holding

More self-hosting goodness

Posted 22 April 2013 · 576 views
Bytecode generation is done... more or less. I have a feeling there are a few instructions still missing, but the compiler test suite passes, so plenty of stuff evidently does work...

Instead of spamming my journal here with noise about this every couple of days, I'm going to keep a running thought log over on the Epoch wiki.

Code generation from the IR is next, and it's scary.

Self-Hosting Progress

Posted 20 April 2013 · 559 views
As I've discussed previously, my goal for Epoch Release 15 is to get the compiler self-hosting. In a nutshell, that means that an Epoch program will be used to compile all other Epoch programs, including itself.

To do this, I'm working backwards from the compiler back-end first to the lexer/parser last. This allows me to retain all of the language's features at all times, without worrying about anything falling through the cracks along the way - which would be a serious concern if I rewrote everything from the front-end out.

Tonight I finished a nice milestone in the process: bytecode generation is now handled (mostly) by an Epoch program instead of a C++ program. This is probably the simplest piece of code to move over to Epoch because all it really amounts to is a bunch of byte sequences streamed into a buffer and then spat back out.

A handful of instructions aren't supported yet, but I'll be working on that over the weekend, and ideally sometime soon I'll have the entire compiler test suite passing with the new bytecode emitter.

Once that's in place, it's time to work on the code generator itself, which traverses the marked-up semantic IR of the language and generates bytecode streams from that. That's a much bigger piece of work and will take a lot longer, but it should be entertaining either way.

Epoch Plans for the Future

Posted 17 April 2013 · 1,158 views
Release 14 of the Epoch programming language is now live!

That brings us to the pertinent and slightly bothersome question: what will be worked on for Release 15?

There are a number of features I'm interested in improving and/or implementing, ranging from object lifetime semantics to parallelism functionality. Strictly speaking, any and all of these features could be done in the existing compiler/runtime framework in a relatively straightforward manner.

However, given the recent success of my aggressive push to destroy the Epoch VM, I'm rather in the mood to continue decreasing the amount of C++ code I have to support in the project going forward. The next obvious candidate for removal from the C++ pool is the compiler itself. I want to rewrite the compiler in Epoch and use that as the foundation for future feature work.

In other words, it's finally time to start working on self-hosting Epoch.

I'm pondering which direction to attack this from. On the one hand, I could start by writing a basic parser, parsing simple Epoch files, and then build out the back end and slowly re-add functionality. However, I've done multiple compiler rewrites in this fashion, and every time I do one, I lose fragments of functionality for a long time.

In the interests of avoiding this sort of regression, I'm seriously considering a new tactic. Instead of starting with a new compiler front-end, I'm thinking about starting with the back-end instead. The parser already interops with Epoch nicely (I use it for syntax highlighting in the Era IDE prototype) so I could potentially leave the parsing in C++ for a while and build a new code generation layer in Epoch.

Here's roughly how that would work:
  • Replace the bytecode emitter with an Epoch program
  • Replace the code generation layer with an IR traversal that shells out to an Epoch program which in turn uses the new bytecode emitter
  • Replace the AST->IR conversion with Epoch code
  • Replace the parser->AST generation layer with Epoch code
  • Finally, replace the parser itself
The advantages to this are twofold. First and most importantly, I never lose any existing Epoch features. Second and almost as interesting, I can release at any time once one of these steps is finished. This gives me a perfect opportunity to ensure the language doesn't move backwards in terms of features, while simultaneously moving it forwards toward the eventual goal of self-hosting.

Another perk of this approach is that if I do decide to add language features to help make self-hosting more practical, I can do so during any phase with minimal disruption to whatever code currently exists. I can also progressively delete code from the C++ codebase as I finish various steps instead of having to retain two independent "forks" of development.

All in all, I think this approach makes the most sense, and I'm already excited about the possibilities.

Once Epoch is self-hosting, the next step is to start on development tools. Era will be an interesting project for sure, given that my goal is to more or less invent a new IDE paradigm in the process of creating it. It's tempting to work on a debugger and other facilities prior to self-hosting, but the advantage of waiting is that I can prove that the entire tool-chain is self-contained end to end, rather than a hideous Frankenstein blend of Epoch and C++ components.

In some ways, I feel like self-hosting and creating a decent IDE for Epoch are the last two steps to having a project that is ready for serious prime-time usage. There's an almost infinite amount of refinement and feature work that can go into any language, obviously, but a language that can compile itself and offers robust tools for creating new software is hard to argue with.

It'll be fun to see if the rest of the programming world decides this language is as fun as I think it is :-)

Release 14 imminent...

Posted 14 April 2013 · 549 views
So I've been looking over my past notes and decided that I'm pretty happy with the state of affairs over in Epoch land. The garbage collector is running, tuned for reasonable performance, and successfully keeps the native-code realtime raytracer clamped to a reasonably small amount of RAM.

The only thing that really bugs me is that the GC has imposed a serious performance hit on the raytracer - to the tune of halving its old performance. I still need to do a full release build of the runtime to make sure I can't improve on that any, but I suspect the GC root spilling in LLVM is taking a toll on optimization passes. I might need to work on improving the front-end of the compiler to generate more optimal code in the first place so that I don't have to spill so many registers onto the stack all the time.

It'll take some analysis and study of the emitted machine code, but I'm optimistic about being able to reclaim at least some of the performance.

If nothing else, this is a good motivation for introducing a "manually reclaimed" semantic to object allocations in Epoch, so programs that don't want to incur GC overhead can bypass it entirely.

[As I wrote this, I kept alt-tabbing over to Visual Studio and mucking around. I managed to suppress garbage collection invocation for a large percentage of functions, which got me back to 21FPS - compared to 24FPS previously. I'm happy.]

Victory is mine!

Posted 14 April 2013 · 604 views
It was an epic fight, but I finally managed to subdue the last few LLVM garbage collection bugs and get a full test suite pass.

I'm kind of tired (it's 5:30AM and I've been running all night) so I'll try to reword that a little more clearly:

Epoch is, as of right now, passing all compilation and runtime tests (all 62 of them) with full, aggressive, and accurate garbage collection, running as 100% native code.

This support includes correct handling of circular references, correct handling of algebraic sum types (aka discriminated unions), correct reclamation of resources besides pointer types (strings and buffers, in Epoch parlance), and even works across multiple-dispatch calls.

I have a feeling there are some edge cases missing from the test suite, but for now, I'm damn happy with the results of the last several days of work. I figured on GC being a nightmare (and don't get me wrong - it was) but overall the experience has been completely worth it.

I still want to ponder submitting an LLVM patch to make all this less hackish, but I also sort of don't really care because my changes are really minimal and doing them "right" would involve a lot of heavy lifting. Maybe I'll tackle it some other time.

For now, I'm going to just sit here and bask in the glow of all 62 tests passing, and maybe drool on myself a little bit as I fall asleep.

Real-world garbage collection with LLVM

Posted 10 April 2013 · 2,266 views
The last major task for release 14 of the Epoch programming language is integrating garbage collection with the LLVM-supported JIT native code generation layer.

Ostensibly, LLVM supports hooks for making garbage collection possible. It doesn't take much digging to find out that this is a complete lie, and LLVM is actually pretty terrible at interoperating with garbage collection schemes.

First of all, you can only mark stack allocations as holding GC roots. Registers must be spilled into stack values prior to running a GC pass or you can miss roots. More annoyingly, you can only declare GC root stack allocations in the prolog of a function - you can't mark them inside, say, the body of a loop. This means you either generate a lot of excess stack reads/writes, or you do something twisted to work around the problem.

I'm proud of Epoch's runtime performance right now, so I opted to go the twisted route.

The Epoch garbage collector will now only trigger on the exit from a function or after a certain number of iterations of a closed loop (to avoid excess memory consumption in tight loop code). This means that you will never trip a garbage collection in the middle of a complex expression, which in turn limits the number of stack hits I have to perform. So that much I can suffer through.

To do accurate garbage collection, we need three things:
  • A map of all stack slots that can contain GC roots
  • A way to get this map into the GC so it can use it
  • A way to crawl the machine stack based on this map and begin mark/sweep tracing from the discovered roots
So LLVM has a nifty @llvm.gcroot intrinsic which lets you accomplish the first goal; the code generation layer will output a stack map for you.

Except it doesn't.

What it does is create a stack map and then make it totally inaccessible to JIT frontends, which strikes me as more than a bit stupid. I quickly found the comment buried in the documentation suggesting that you can only access the stack map when outputting assembly language listings. WTF?! It was at this point that I started really feeling the truth that others have remarked about on the internet: LLVM just sucks for garbage collectors.

Me being stubborn and thick-headed, though, I decided to just plow forward and hope for the best.

I hacked around the unavailable stack map issue by implementing a custom GCStrategy class with CustomSafePoints set to true. When findCustomSafePoints() is called, I do the exact same logic as the LLVM default implementation of safe-point location, with two tweaks:
  • All safe points are cached in a location accessible to the frontend (and, later, the GC as running against the JITted code)
  • I actually implemented support for GC::Return safepoints so I can call garbage collection routines upon function exit. LLVM provides GC::Return and GC::Loop "kinds" of safepoints but then stupefyingly fails to implement support for either one.
The next annoying omission in LLVM is the lack of a stack walking function to actually use stack maps. I suppose this can be forgiven since LLVM is not (ostensibly) in the business of target-specific stack walking code. But it's still annoying.

Thankfully I've spent enough time doing reversing and assembly-level hacking that I'm comfortable with the idea of crawling the machine stack myself. It didn't take long to put together a function for this, although getting it to not clobber the machine state was a bit trickier.

There are two tricks to keep in mind when doing post-function-body garbage collection invocations:
  • Back up one instruction before the RET before you CALL the garbage collector. This lets you inject the CALL prior to the POP EBP that destroys the function's stack frame - calling after it would make the whole exercise pointless.
  • Don't forget that you were about to RET from a function; write some custom prolog/epilog code to ensure you don't clobber EAX during garbage collection. Or at least save it off someplace so you can restore it once you're done clobbering it.
Armed with a stack crawler, I found one final hurdle that is probably the most shining example of LLVM's complete lack of consideration for garbage collector authors: once you generate a safe point, you can't actually determine what machine address it's at in the emitted machine code! Derp.

So I wound up making a minor API tweak to the ExecutionEngine class in LLVM, adding in plumbing to get to the underlying JIT engine and ask for the address of the MCSymbol corresponding to the injected GC safe point. It took some minor arithmetic to adjust the reported address to line up with actual return addresses in each stack frame, but nothing too painful.

The results speak for themselves:

[Attached image]

So as of now I have a working way to crawl the stack and look for GC roots. I'm starting out only supporting GC on strings since that's easier than dealing with all the other types in the language; but that'll come soon enough. I want to perfect the actual mark/sweep logic first and then mess with extending it to handle the whole type system.

Next step is to go through and actually do mark/sweep and emit a list of live strings versus garbage strings. Should be entertaining.

Overall, although I've been cursing at LLVM a lot today, it's not been too terrible. It certainly beats having to build my own machine code emission layer. Part of the reason I've stuck with it is because it does do a great job of just generating fast machine code, so I'm not totally convinced it's worth abandoning and using some other JIT engine, just for the sake of GC implementation.

If anything, I'd rather find ways to actually contribute some useful changes to LLVM that make the whole GC song and dance a little less unpleasant for the next guy.

Epoch Release 14 Finalization Plan

Posted 08 April 2013 · 729 views
Last night at some unholy hour I finished a first pass at marshaling Epoch data back and forth across C-ABI boundaries. In less annoying terminology, this means that Epoch programs can do things like call Windows API functions, C-compatible APIs in other DLLs, and so on. More interestingly, those APIs can be given Epoch functions as callbacks, so there is seamless interoperability between C-compatible programs and Epoch programs.

This marks a great milestone in Epoch Release 14's development. There are now 58 compiler tests in the suite and all of them pass as fully native JIT compiled machine code.

As I mentioned previously, the last piece remaining to get Epoch fully-featured in native code is to implement the garbage collector. There are also some minor bugs related to Win32 interop that mean the raytracer demo is currently non-functional, so I probably have some work to do on the marshaling layer to get that all squared away.

My plan for Release 14 therefore looks something like this:

  • Fix whatever bugs remain in marshaling
  • Get the realtime raytracer benchmark running as 100% native code
  • Map out garbage collection implementation (will post on this later)
  • Implement garbage collection at least in basic form
  • Final compiler test suite and probably a test harness for ensuring garbage collection works properly
  • Package and ship R14

Of course, if you've followed the Epoch project at all in the past, you probably know as well as I do that I rarely stick to my release plans, so we'll see how it goes :-)

Once R14 ships, there's a few pieces I need to go back and hit. Here's the gist of what I'd like to do in the future with the language, in no particular order:

  • Better reference semantics for controlling object lifetime and interlinked data structures
  • Implementation of tasks (green threads)
  • Implementation of message passing model
  • Separate compilation and module support
  • Enhanced template support for better generic programming features
  • Namespace support
  • Native support for array types

Naturally one of my big goals for the language moving forward is to build a self-hosting compiler and a proper IDE. Most of the above features are geared around making my life easier so I can actually accomplish those two projects. Of course, it may not be worth writing too many compiler features in the current system before moving to the self-hosting model...

So (with the above caveat of not following my plans very well) I just might wind up working on self-hosting instead. It all kind of depends on how entertaining this stuff is and where my whimsy directs me.

Another major thing I need to do that I have lamented about repeatedly is build a proper debugging facility for the native code model. Right now any bug in the code leads to lots of hunting through cryptic disassembly listings and requires a lot of painstaking effort. A proper debugger would probably fall under the Era IDE project heading, though, and thereby might wait a while. Even though I really need it.

Anyways... lots of speculation but as always we'll just have to see what the future brings. I'm super excited about all this and really looking forward to getting the language to a point where I can write hardcore software in it.

After all, that was kind of the point from day one :-)

Almost there...

Posted 07 April 2013 · 512 views
Today I wrapped up the last Epoch compiler test suite failure besides C-API marshaling. That means 53 of 54 compiler tests are passing as pure native code - no VM to get in the way and slow things down.

Once C marshaling is in place, I'll need to rework the garbage collection implementation a bit, and then I'll have a complete Epoch implementation in JITted native code.

Needless to say, I'm pretty stoked.

Of course, C API marshaling is going to be fairly nasty to reimplement in the JITter, since I'll have to trampoline not just into the VM but into native code. This will require some calling convention wizardry and more than a little assembly language hacking, but at least that stuff is fun.

I'm not so much worried about the actual implementation details as the debugging of all the inevitable things that I'll get wrong on the first pass. Not having a first-class debugger makes this whole process rather painful, as it isn't always clear what Epoch code maps to what JITted machine code. I really need to work on improving that, possibly by writing a custom LLVM pass to emit debug information during the JIT process. In any case, it should be interesting.

I'm waffling on when to pull the trigger on Release 14 in the meantime. I at least want the compiler suite to pass, but having garbage collection is more or less mandatory, and getting a reasonable implementation of that will take time. Stack frame management is possible in LLVM but object lifetime determinacy is kind of a pain and a classic mark/sweep collector like the VM used might get messy.

Either I'll find some good papers on the subject and just copy someone else's approach, or I'll be doing a little pioneering in that department, in which case I'll be sure to write up my findings here.

Stay tuned... Epoch is really close to being officially a massively amazing language implementation. I can hardly wait to get back into some of the more esoteric parallelism features and other nifty stuff that's been on the back burner for so long.

Implementing Higher Order Functions on Top of LLVM

Posted 07 April 2013 · 1,154 views
One of my favorite features of the Epoch programming language is the inclusion of first-class higher order functions:
apply : string param, (thefunction : string -> string) -> string ret = thefunction(param)

mutate : string param -> string ret = param ; " foo"

entrypoint :
    string s = apply("test", mutate)
    assert(s == "test foo")
In the old VM model (which, as I've written about quite a bit recently, is going away) this is pretty easy to do. I just store the name of the function I want to invoke on the stack, and the VM knows how to jump into that function using an instruction called INVOKE_INDIRECT.

This works because the calling of functions is not checked at runtime - it's verified statically by the compiler, but once the bytecode is generated, there's no sanity checking to ensure that I've bound the correct number of parameters to the stack before invoking a function. Not having any checking allows higher-order functions to be invoked trivially in Epoch's calling convention: the callee simply reads off the expected parameters, and everything just works.

However, moving to JIT native code implementation on LLVM has proven tricky in this particular area. LLVM does not let you simply call arbitrary functions without knowing their signatures - which is a totally reasonable restriction.

Our saving grace is that we know the signature of the function already - it was checked at compile time, after all - so we just need to look at the Epoch bytecode metadata to figure out how to tell LLVM to invoke a function.

Remember how I mentioned that the VM used to call functions by name? This is a minor headache, because while the function signature is statically known, the mapping of the name to actual code can change dynamically at runtime. (If it couldn't, higher order functions would be kind of useless.) Suppose my example program selected a function to pass to apply() based on user input; the JIT layer wouldn't know which function call to embed into the LLVM bitcode.

The lazy solution would be to store the mapping in memory and use a thunk to translate the function name on the stack back into a code address. However, this would represent an annoying runtime cost, and might obliterate the possibility of doing certain inlining optimizations for simple uses of higher-order functions. So that option is out.

The proper fix is to change the compiler a bit. We'll still emit function names and push them onto the stack in the bytecode; however, we'll include a type annotation that mentions that this isn't just another string handle, but actually a function name handle.

Next, once the bytecode is loaded into the JIT layer, we'll detect these annotations and note each time a function's name is used as a higher order function parameter. Instead of emitting code to store the name on the machine stack, we'll look up the LLVM generated function for the target, and emit a function pointer instead. Remember, we know the signature statically, so we can get away with this without violating type safety in LLVM.

Last but not least, we implement INVOKE_INDIRECT in such a way that it looks at the function pointer it's being asked to call, and then fixes up the machine stack and calls it.

Unfortunately, it turns out that adding the metadata emission of function signatures is a nontrivial modification to the code. But it's the right thing to do, so I'm going to tackle it either way.

So far I'm an hour or so into this project and it looks like it might take a hefty couple of days to get it done properly. Such is life, I suppose.

I look forward to having 3 or 4 more of the compiler tests in the suite pass once this change is in!

[edit a few hours later] It's now 3:22 AM. 51 of 54 tests are passing. I am a happy, happy hacker.
