After finally muddling through enough of the llvm-opt source code to figure out how to order all the optimization passes that LLVM offers, I got a lot of basic optimization and dead-code elimination working on the bitcode that the Epoch VM JITs out. The single biggest contributor to the new-found speed, though, is my addition of simple flow control to the native code emitter.
In a nutshell, this means you can write while() loops in Epoch code that is tagged as [native]. The practical upshot of this is that loop unrolling and constant propagation can now apply to Epoch code, thanks to LLVM. In turn, this means that my nice little brute-force benchmark from the last few days is basically now a couple of multiplications and nothing else.
To recap, here's the modified benchmark using native-code-generation of loops:
//
// JIT.EPOCH
//
// Just in time compilation test for Epoch
//
timeGetTime : -> integer ms = 0 [external("WinMM.dll", "timeGetTime")]

entrypoint :
{
    integer four = 1 + 3
    assert(four == 4)
    vmbench(2, 3)
    jitbench(2, 3)
}

vmbench : integer a, integer b
{
    integer begintime = timeGetTime()
    integer result = vmmath(a, b)
    integer duration = timeGetTime() - begintime
    string durationstr = cast(string, duration)
    print("VM benchmark lasted: " ; durationstr)
    string resultstr = cast(string, result)
    print("Result: " ; resultstr)
}

jitbench : integer a, integer b
{
    integer begintime = timeGetTime()
    integer result = jitmath(a, b)
    integer duration = timeGetTime() - begintime
    string durationstr = cast(string, duration)
    print("JIT benchmark lasted: " ; durationstr)
    string resultstr = cast(string, result)
    print("Result: " ; resultstr)
}

vmmath : integer a, integer b -> integer ret = 0
{
    integer counter = 0
    integer result = 0
    while(counter < 1000000)
    {
        result = a * b
        ret = ret + result
        counter = counter + 1
    }
}

jitmath : integer a, integer b -> integer ret = 0 [native]
{
    integer counter = 0
    integer result = 0
    while(counter < 1000000)
    {
        result = a * b
        ret = ret + result
        counter = counter + 1
    }
}
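To see why LLVM can chew through this so easily, here's a hypothetical C rendering of the benchmark loop (function names are mine, not Epoch's): the multiplication inside the loop is loop-invariant, so once it's hoisted out, the loop is just "add the same constant a million times," which folds into a single multiply.

```c
#include <stdint.h>

/* Hypothetical C rendering of the Epoch benchmark loop above. */
int32_t mathloop(int32_t a, int32_t b)
{
    int32_t counter = 0;
    int32_t ret = 0;
    while (counter < 1000000)
    {
        int32_t result = a * b;   /* loop-invariant: hoisted out of the loop */
        ret = ret + result;       /* summing a constant one million times */
        counter = counter + 1;
    }
    return ret;
}

/* The closed form LLVM reduces it to: a*b added 1000000 times is a*b*1000000. */
int32_t mathclosed(int32_t a, int32_t b)
{
    return a * b * 1000000;
}
```

With the seed values from the benchmark, both forms produce 2 * 3 * 1000000 = 6000000.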
Since the two seed values come from outside LLVM's scope of knowledge, it can't quite turn the benchmark into a no-op, but it can get it down to this:
; ModuleID = 'EpochJIT_4181'

define void @dostuff(i32, i32) nounwind {
entry:
  %2 = inttoptr i32 %0 to i32**
  %3 = load i32** %2, align 4
  %4 = load i32* %3, align 4
  %5 = getelementptr i32* %3, i32 1
  %6 = load i32* %5, align 4
  %7 = mul i32 %6, %4
  %8 = mul i32 %7, 1000000
  store i32 %8, i32* %5, align 4
  store i32* %5, i32** %2, align 4
  ret void
}
In the quite-likely scenario that you haven't memorized LLVM's assembly syntax, here's what's going on:
- We have a comment indicating that this is an Epoch-JIT module
- There's a function definition indicating that the JITted function takes two parameters and returns void
- There's an entry-point label for the function
- We take the 0th parameter to the function and cast it from a 32-bit integer to an int** (in C parlance)
- We then dereference this to get an int*, which is the stack pointer from the VM
- Next we load the value off the stack into a local register
- Then we increment the stack pointer by one int
- ...and grab another value off the VM stack into a local register
- Then multiply the two values
- Then multiply by 1 million to account for the loop that used to be there
- Then we store the result back onto the VM's stack, as it expects
- Lastly, we update the VM's stack pointer
- And return!
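Put back into C terms, the steps above are equivalent to the following sketch (the VM stack layout here is my reading of the IR, and the function name mirrors the dump; nothing below is verbatim Epoch runtime code):

```c
#include <stdint.h>

/* C equivalent of the JITted function above. The real code receives the
 * address of the VM's stack-pointer variable as a 32-bit integer and
 * casts it back to a pointer (the inttoptr step); here we just take the
 * pointer directly. */
void dostuff_c(int32_t **stackptrslot)
{
    int32_t *sp = *stackptrslot;        /* load the VM stack pointer     */
    int32_t a = sp[0];                  /* first operand off the stack   */
    int32_t b = sp[1];                  /* second operand, one slot down */
    int32_t product = b * a;            /* the surviving multiply        */
    int32_t result = product * 1000000; /* the collapsed million-pass loop */
    sp[1] = result;                     /* overwrite the deeper operand
                                           with the result, as the VM
                                           expects                       */
    *stackptrslot = sp + 1;             /* two values consumed, one
                                           pushed: bump sp by one slot   */
}
```

Two loads, two multiplies, two stores: that's the entire benchmark after optimization.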
Here's a dump of running the new JIT test:
Epoch Language Project
Command line tools interface
Executing: D:/Epoch/Programs/JIT Tests/jit.epoch
Parsing... finished in 7ms
Validating semantics... finished in 2ms with 0 error(s)
Generating code... finished in 0ms
DEBUG: VM benchmark lasted: 3780
DEBUG: Result: 6000000
DEBUG: JIT benchmark lasted: 0
DEBUG: Result: 6000000
That's right... counting to 6 million takes the VM 3.78 seconds, and the JITted native code is so fast it can't even be measured by the benchmark. Which stands to reason, considering it's just a couple of imul instructions.
I now stand by my tentative decision to completely deprecate the VM as an execution model. I might keep it as a convenient IR for code and as a jumping-off point for the JITter, but... going from over 3 seconds to unmeasurably fast is just too cool to pass up. There are really no major wins to be had in hacking on the VM anymore if I can get such excellent performance from JITted code and invest my time in making the JITter more powerful instead.
Next up, I need to do some major refactoring of the JITter and make it not... suck. Then I'll see about finishing up the remaining work items for R12, and kick that puppy out the door.
Woot!