(In the next 10 years I would go on to exceed both my dad and my friend when it came to code writing ability, so much so that the control program for my friend's BSc in electronics was written by me in the space of a few hours when he couldn't do it in the space of a few months ;) )
One of the best times I had however during this time was when I got an Atari STe and moved from using STOS Basic to 68K Assembly so that I could write extensions for STOS to speed operations up. I still haven't had a coding moment which has given me as much joy as the time I spent 2 days crowbaring a 50Khz mod replay routine into an extension, I lack words for the joy which came when that finally came to life and played back the song without crashing You think you've got it hard these days with compile-link-run debug cycles? Try doing a one of them on a 8Mhz computer with only floppy drives to load from and only being able to run one program at a time ;)
The point to all this rambling is that one of the things I miss on modern systems with our great big optimising compilers is the 'to the metal' development when was the norm back then; and I enjoyed working at that level.
In my last entry I was kicking around the idea of making a new scripting language which would natively know about certain threading constraints (only read remote, read/write private local) and I was planning to backend it onto LLVM for speed reasons.
During the course of my research into this idea I came across a newsgroup posting by the guy behind LuaJIT talking about why LuaJIT is so fast. The long and the short of it is this;
A modern C or C++ compiler suite is a mass of competting heuristics which are tuned towards the 'common' case and for that purpose generally work 'well enough' for most code. However an interpreter ISN'T most code, it has very perticular code which the compiler doesn't deal well with.
LuaJIT gets its speed from the main loop being hand written in assembler which allows the code to do clever things that a C or C++ compiler wouldn't be able to do (such as decide what variables are important enough to keep in registers even if the code logic says otherwise).
And that's when a little light went on in my head and I thought; hey.. you know what, that sounds like fun! A low level, to the metal, style of programming which some decent reason for doing it (aka the compier sucks at making this sort of code fast).
At some point in the plan I decided that I was going to do x64 support only. The reasons for this are two fold;
1) It makes the code easier to do. You can make assumptions about instructions and you don't have to deal with any crazy calling conventions as the x64 calling convention is set and pretty sane all things considered.
2) x86 is a slowly dying breed and frankly I have no desire to support it and contort what could be some nice code into some horrible mess to get around the lack of registers and crazy calling conventions.
I've spent the better part of today going over alot of x86/x64 stuff and I now know more about x86/x64 instruction decoding and x64 function calling/call stack setup then any sane person should... however it's been an intresting day
In fact x64 has some nice features which would aid the speed of the development such as callers seting up the stack for callees (so tail calls become easy to do) and passing things around in registers instead of via the stack. Granted, while inside the VM I can always do things 'my way' to keep values around as needed but it's worth considering the x64 convention to make interop that much easier.
The aim is to get a fully functional language out of this, once which can interop with C and C++ (calling non-virtual member functions might be the limit here) functions and have some 'safe' threading functionality as outlined in the previous entry.
Granted, having not written a single line of code yet that is some way off to say the least So, for now, my first aim is to get a decoding loop running which can execute the 4 'core' functions of any language;
After that I'll see about adding more functionality in; the key thing here is designing the ISA in such as way that extending it won't horribly mess up decode and dispatch times.
Oh, and for added fun as the MSVC x64 compiler doesn't allow inline assembly large parts are going to be fully hand coded... I like this idea