For comparison: at the end of my hacking marathon the other night, I had the VM running at about 950ms for the benchmark, and the JITted native code at around 720ms.
I'm now down to 920ms for the VM and 370ms for the native code.
The VM speed boost came from some minor optimizations to the instruction decoding loop, and the native code boost is from eliminating the interaction between native code and the VM's stack. In the first draft of the JITter, native code would actually read and write local variables to and from the VM's stack space as it ran; I've since modified this logic so that the native code will use the native machine stack instead for all its locals.
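To make the difference concrete, here's a hypothetical C sketch of the two strategies. None of this is Epoch's actual generated code (the real JIT emits LLVM IR), and the VM stack layout here is invented purely for illustration:

```c
/* Hypothetical sketch only: a simulated VM operand/locals area. */
static long vm_stack[256];
static int  vm_sp = 2;   /* pretend two locals live at slots 0 and 1 */

/* First draft: every read or write of a local touches VM stack memory. */
static long inner_first_draft(void) {
    long product = vm_stack[vm_sp - 2] * vm_stack[vm_sp - 1];
    long sum     = product + vm_stack[vm_sp - 2];
    vm_stack[vm_sp - 2] = sum;      /* result written back into VM space */
    return vm_stack[vm_sp - 2];
}

/* Current draft: a short prolog copies the parameters out of VM space
 * once; everything after that uses native locals (registers/stack). */
static long inner_current(void) {
    long a = vm_stack[vm_sp - 2];   /* prolog: copy from VM stack */
    long b = vm_stack[vm_sp - 1];
    long result = a * b + a;        /* a mul/add pair, all native */
    vm_stack[vm_sp - 2] = result;   /* epilog: push the return value */
    return result;
}
```

Both versions compute the same thing; the win is that the second one only touches VM memory at the boundaries, which is exactly what the LLVM optimizer can then boil down to a handful of instructions.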
The LLVM assembly code for the benchmark inner function has gone from about a page to a simple mul/add instruction pair wrapped in a brief prolog and epilog code sequence. The prolog simply copies parameters from the VM stack onto the native stack, and the epilog pushes the function's return value back onto the VM stack. Everything else is done in native code.
I'm impressed with LLVM at this point. The inner function compiles to (IIRC) 8 machine instructions: a couple of mov's to copy the stack data around, an imul, an add, and a couple more mov's to shuffle the return value back onto the VM stack.
Obviously there's still substantial overhead in marshaling in and out of the VM when invoking native code in a tight inner loop. The next phase of this project will be to allow flow control logic to JIT compile as well, meaning I can move the entire benchmark loop into native code instead of relying on the VM's execution model. Once the loop itself is native, I have no doubt whatsoever that the JITter will absolutely massacre the VM code path.
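The per-call marshaling cost is easiest to see side by side. Again, this is a hypothetical C sketch with an invented inner computation, not Epoch's real code:

```c
/* Hypothetical inner function standing in for the JITted benchmark
 * kernel: one multiply, one add. */
static long inner(long a, long b) { return a * b + a; }

/* Today: the loop lives in the VM, so every iteration marshals the
 * operands through a simulated VM stack and back off again. */
static long loop_via_vm(long n) {
    long vm_stack[4];
    int  sp  = 0;
    long acc = 0;
    for (long i = 0; i < n; ++i) {
        vm_stack[sp++] = i;            /* VM pushes the arguments...   */
        vm_stack[sp++] = acc;
        long b = vm_stack[--sp];       /* ...prolog copies them out... */
        long a = vm_stack[--sp];
        vm_stack[sp++] = inner(a, b);  /* ...epilog pushes the result  */
        acc = vm_stack[--sp];          /* ...and the VM pops it again. */
    }
    return acc;
}

/* Next phase: the loop itself is JITted, so the call is direct and
 * the VM stack never enters the picture. */
static long loop_native(long n) {
    long acc = 0;
    for (long i = 0; i < n; ++i)
        acc = inner(i, acc);
    return acc;
}
```

Same answer either way, but in the second version the compiler is free to inline `inner` and keep everything in registers, which is where the big win should come from.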
At this point, I'm honestly leaning towards scrapping any plans for further VM maintenance and just moving over to entirely native code. I probably won't do that for Release 12, since I still have a lot of bug fixing to do in the compiler itself, but certainly by R13 I expect the VM to be little more than a husk supporting whatever fragments of the language I haven't yet moved over to native code generation.
Frankly this is really, really exciting. Epoch is hurtling towards the milestone of being a language I'd actually want to use for production code.
My roadmap for future work is something like this:
- Release 12: finish compiler rewrite, get core language features all working and unit tested
- Release 13: extend native code generation as far as I can reasonably manage
- Release 14: new language features with the aim of enabling self-hosting compilation
- Release 15: self-hosted compiler for Epoch
- Release 16: official termination of the VM and conversion to purely native code with a skeletal runtime for supporting richer language features; improvements to garbage collection
- Release 17: start focusing on writing tools for Epoch and building libraries/new language features as needed
- Release 18: begin reintroducing parallelism features (threading, vectorization, GPGPU support, etc.)
So sometime around R18 I'd like to have a language implementation that I could practically use for, say, writing games or small applications.
Granted, that's a long way in the future still, and given how easily distracted and sidetracked I am on this project, chances are good that the actual unfolding of events won't look anything like this roadmap. But it's nice to have plans... even if I intend to ignore them :-P
Even if things go perfectly according to plan, we're looking at a long damn time before R18 ships. Now if only I could convince someone to pay me to work on this full-time...