I've also been working on the debugging tools for the emulator. The only one finished so far is the disassembler, which I completed about an hour ago. It has already helped me fix a few broken opcodes and get the above program running.
Edit: Just did a speed test. It manages around 18,000,000 instructions per second before dropping below 60 FPS at 3x resolution, which is more than enough.
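For a sense of scale, 18,000,000 instructions per second at 60 FPS works out to 300,000 instructions per frame. Below is a rough Python sketch of how a test like this could be timed. It is not the emulator's actual code: `step` is a hypothetical stand-in for whatever executes a single instruction, and `measure_ips` and `instructions_per_frame` are names made up for illustration.

```python
import time


def instructions_per_frame(ips, fps=60):
    """Instruction budget available in each frame at the given frame rate."""
    return ips // fps


def measure_ips(step, duration=0.05, batch=1000):
    """Call the hypothetical step() for roughly `duration` seconds.

    Returns the measured number of step() calls per second. Calls are
    made in batches so the clock is not read after every instruction.
    """
    executed = 0
    start = time.perf_counter()
    while True:
        for _ in range(batch):
            step()
        executed += batch
        elapsed = time.perf_counter() - start
        if elapsed >= duration:
            return executed / elapsed


# Per-frame budget implied by the figure quoted above.
print(instructions_per_frame(18_000_000))  # 300000
```

A real test would pass the emulator's own single-instruction function as `step` and keep rendering enabled, since the figure above is the point where the frame rate starts to drop, not the raw interpreter speed.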