Sign in to follow this  
Max_Payne

Virtual Machine Design

Recommended Posts

Hello there. I decided to design a simple virtual machine for fun. This is my 4th attempt. This one is designed to be practical... So that it is possible to easily compile C/C++ like code to bytecode that will run in the VM. When it is complete, I will probably make it open-source, and perhaps make a tutorial about how it was made, and how this can be used for game-scripting. The purpose of this thread is to gain feedback. For this VM, I decided to go with a register-less approach. Essentially, all variables will be on the stack. I also decided to go with a unified Virtual RAM memory space. My previous VM had a heap, a stack and registers all in 3 separate areas, but this made it very tedious and impractical to operate on data that is beyond the level of complexity of integers and floats (ie: C++ objects). The previous VMs also couldn't have pointers to data on the stack, which seemed rather restrictive to me, as it would have forced me to implement my language in a java-like fashion, where all objects must always be on the heap. I don't know if any of you ever did MIPS assembly. This design was partly inspired from it. Except that bytecode instructions will not always be the same size. VRAM layout: ############# ************ (VRAM upper boundary) ------------ ... Heap ------------ (Stack base + stack size) ... Stack ------------ (Stack base) ... Data segment (globals) ------------ ************ (address 0 - not addressable) The data segment (globals) are directly addressable. The code segment is not stored into the VRAM to prevent exploits. The main reason for this addressability model is that a contiguous memory system greatly simplifies addressing. It will be possible, for example, to have pointers to data in the stack or in the heap. The upper boundary of the VRAM can also be extended if there is a need for more heap space. This VM design offers the possibility of operating on all local stack variables directly, of moving objects between the stack and the heap all at once, and should have a bytecode that is reasonably suitable for eventual JIT compilation. I have also chosen to make the stack management implicit. You can request stack space to store more variables, or shrink your stack frame without touching a "stack pointer". The VM will also do bounds checking on the stack. It will also handle memory allocation in the heap. For flexibility, the heap is at the top of the memory, so that if the VM ever runs out of heap space, the VRAM can be resized at runtime. By default, it might start with a VRAM size of 4MB or so. At the bottom of the VRAM, the data segment of compiled programs is copied. This is so that you can store global variables, text strings, all kinds of datas in compiled bytecode files, and they are readily accessible to you. So you can have pointers to them. They actually start at address 1, so they are always in the same location. This data can also be modified at runtime. For obvious reasons, the code segment is not copied into the VRAM and executed from there. Instructions: ############# Memory management: movss (move stack->stack) movsg (move stack->global) movgs (move global->stack) movgg (move global->global) alloc (allocate heap bloc) free (free heap block) push (grow stack frame) pop (shrink stack frame) gsr (get stack reference) Integer math: isft iadd isub imul idiv Floating-point math: fadd fsub fmul fdiv Logic: or (bitwise or) and (bitwise and) xor (bitwise xor) gtui (greater than - uint) gti (greater than - int) gtf (greater than - float) eq (equality testing) Flow-control: jnoz jisz jump Subroutines: call ret System calls: sysi (obtain system call index from name) sysc (perform system call from index) The system call instructions are to allow developers to "bind" their own functions inside the VM. Essentually, what this will do is allow you to register a name for a function, and then give you access to a part of the stack frame of the calling function in the VM when your function is called, essentially allowing you to pass parameters. Your functions will also be able to return parameters that will be placed on the stack of the VM. This will be used for things like the "standard library", which are utility functions (for things like string manipulation and console output), but would also allow you to interface the VM with a 3D engine and provide access to its functions for game scripting. It will not provide direct access to external data, however.

Share this post


Link to post
Share on other sites
Surely someone has some comment of some kind ;)

Oh well. I am thinking of also adding instructions to perform operations in "immediate mode", where the value is supplied in the instruction. This might be useful in making this faster since the compiler will most likely implement the instruction case switch as branching address table... So the VM would be advantaged to be designed for CISC, instead of trying to run a ton of small instructions.

Although I suppose I can start with a "plain" instruction set and then add on later. I could probably start by adding an instruction to set a 32-bit word value in a stack memory location.

Instructions like:
setws (set word - stack)
setwg (set word - global)

Share this post


Link to post
Share on other sites
It sounds like a good foundation, however I don't recommend having a dynamically sizable heap, or VRAM in general. Because if this VM running a program in the background of a game and all of a sudden the heap needs to be resized, you're looking at allocating a single large block of memory (which takes time), and then copying your existing VRAM to this new location. Even if you go with an allocation scheme were you double the size of the VRAM, you go from not having enough memory to having too much memory. A better idea might be to have a fixed VRAM size, but let the code determine the size of this space before execution. So if a piece of code needs more space to run it can get it, without putting a run-time burden on the VM to meet memory demands.

Also, a common way to manage two virtual memory pools in the same physical space is to have them grow in opposite directions of the address space. So given a 4MB VRAM, if you let your stack start at 0x00001001 (to account for a 4KB data segment and a single unaddressable location) and grow up, and have your heap start at 0x003FFFFF and grow down, then your stack isn't restricted by a heap sitting above it and both grow toward each other. If the heap and the stack happen to meet, then you're out of memory in general, and depending on the request that triggered the collision you either display "stack overflow" or "out of memory". This implies that you data segment is fixed from the beginning, which sounds like a good idea for static data embedded in the bytecode. I wouldn't make it modifiable either, since that's recipe for more headaches. If the user wants to take a value in the data segment and manipulate it, have them copy it to the stack/heap and work on it there.

Otherwise everything else looks like a good start. It's definitely worth having a few extra addressing modes available as well. Immediate mode is a must, and I think base-offset and indirect modes will be extremely useful too.

Share this post


Link to post
Share on other sites
If you can regester function Foo such that your bytecode can "call Foo", then will you be able to regester funtion Foo such that you can call a function running in your scripts? ie "VM->Call("Foo")"

On the same lines, it could be usefull to provide a similar interface for variables. So that you can regester a named variable inside the VM for access outside,
or to regester a variable outside the VM for use inside.

Share this post


Link to post
Share on other sites
Quote:
Original post by KulSeran
If you can regester function Foo such that your bytecode can "call Foo", then will you be able to regester funtion Foo such that you can call a function running in your scripts? ie "VM->Call("Foo")"
.

Well. The VM doesn't have any concept of functions because it works at the bytecode level. It only has subroutines... But it might be possible for the VM to pass "addresses" of its functions to engine functions, and have the engine control the VM to call them.

Quote:
On the same lines, it could be usefull to provide a similar interface for variables. So that you can regester a named variable inside the VM for access outside, or to regester a variable outside the VM for use inside.


Well, if you have functions you can call, you can access data through them. It's safer like that.

Quote:
It sounds like a good foundation, however I don't recommend having a dynamically sizable heap, or VRAM in general. Because if this VM running a program in the background of a game and all of a sudden the heap needs to be resized, you're looking at allocating a single large block of memory (which takes time), and then copying your existing VRAM to this new location. Even if you go with an allocation scheme were you double the size of the VRAM, you go from not having enough memory to having too much memory. A better idea might be to have a fixed VRAM size, but let the code determine the size of this space before execution. So if a piece of code needs more space to run it can get it, without putting a run-time burden on the VM to meet memory demands.


Well, I've thought about giving my bytecode program a header that specifies such parameters, and it's not such a bad idea, for the most part. However, I would much rather have the VM freeze for 0.01 second while it expands its memory from 4 to 8 MB than having it crash in the middle of something. I also don't really like the idea of programs specifying their memory requirements because then people could put any kind of crazy number that will just uselessly tax your system.

As for the addressing mode, I am planning on having offsets in them all, so it can be used to access objects (and the multiple variables on the stack)... Basically, it will be a pair 32bit address + a 16bit offset for the source and one for the destination, along with a 16bit size tag in "global addressing", and just two 16bit source and destination offsets with a 16bit size tag in "stack addressing", which will be used to manipulate local variables without knowledge of the stack frame pointer. But it will be possible to use the gsr instruction to get a pointer to data on the stack that will be usable in global addressing mode.

Share this post


Link to post
Share on other sites
One change I would make - instead of having conditional jumps, I would give each instruction a 3-bit 'condition' field that could be 0 for 'always', 1 for 'when equal flag set' etc. This would allow you to do conditional everything, and you would only need to make a 'jump' command that is controlled by the condition field.
In the assembly, I'd specify it like 'ceq' for 'when equal flag is set', 'cgt' for 'greater than' etc so in the end a conditional jump might become 'ceq jump Label1' and a conditional move could be 'ceq move A, B' etc

Because you only have 30 instructions (after eliminating all but 1 jump), you can fit that into 5 bits and the condition field into the remaining 3 for an 8 bit byte. Of course, if you were planning to pack the operands in there, it might end up messier.

Share this post


Link to post
Share on other sites
Quote:
Original post by Extrarius
One change I would make - instead of having conditional jumps, I would give each instruction a 3-bit 'condition' field that could be 0 for 'always', 1 for 'when equal flag set' etc. This would allow you to do conditional everything, and you would only need to make a 'jump' command that is controlled by the condition field.
In the assembly, I'd specify it like 'ceq' for 'when equal flag is set', 'cgt' for 'greater than' etc so in the end a conditional jump might become 'ceq jump Label1' and a conditional move could be 'ceq move A, B' etc

Because you only have 30 instructions (after eliminating all but 1 jump), you can fit that into 5 bits and the condition field into the remaining 3 for an 8 bit byte. Of course, if you were planning to pack the operands in there, it might end up messier.


I won't pack anything. Each instruction will have its own parameters depending on the opcode. Ie:

iadd (8bit) | index_srca (16bit) | index_srcb (16bit) | index_dst (16bit)
jump (8bit) | dest_index (32bit)
sysi (8bit) | name_string_addr (32bit) | index_dst(16bit)

I'm thinking of making the opcodes just 8 bit since I don't want too many instructions. As for the 16 bit stack indices, you might think that's big... But in the end, it means each stack frame can have 64KB of data in it, and it will eliminate alot of push/pop instructions. I could also have some arithmetic operations that operate directly on the operands on top of the stack, eventually, to make this even more lightweight.

As for everything being conditional... I think that would just slow things down. it's just doesn't work well with the hardware... And what people really want is branching.

Share this post


Link to post
Share on other sites
Quote:
Original post by Max_Payne
Well, I've thought about giving my bytecode program a header that specifies such parameters, and it's not such a bad idea, for the most part. However, I would much rather have the VM freeze for 0.01 second while it expands its memory from 4 to 8 MB than having it crash in the middle of something. I also don't really like the idea of programs specifying their memory requirements because then people could put any kind of crazy number that will just uselessly tax your system.

I wouldn't worry about developers misusing your virtual machine to tax the system, they can misuse the computer in about a million different ways before even touching your VM [wink] But I suppose a dynamically growing VRAM wouldn't hurt. My instincts just cringe at the thought of allocating so much memory at once and then copying 4MB from somewhere else, at run-time. But if you test it out and it meets your requirements, then by all means go for it.

Share this post


Link to post
Share on other sites
Quote:
Original post by Name_Unknown
personally I'd stick with a stack machine approach. I'd mix it in with lisp oriented cons cells and stop and copy gc. than I'd add a dash of salt and make a language like dylan.


What do you call a stack machine?

Quote:
My instincts just cringe at the thought of allocating so much memory at once and then copying 4MB from somewhere else, at run-time.


Copying memory, even 32MB, on a modern system, should actually be rather fast.

Share this post


Link to post
Share on other sites
Quote:
As for everything being conditional... I think that would just slow things down. it's just doesn't work well with the hardware... And what people really want is branching.

(1) It's how the ARM processor works. All operations are conditional. Several other RISC processors have this feature across a subset of their operations.

(2) A branch is a jump is a goto. There is no difference.

(3) No, not everything needs to be conditional. But you're desperately missing some jumps; in addition to "is zero", you need "is negative". Their converses would suit in their place, and having all four is normal practice. Without is-negative, you cannot perform less-than or greater-than comparison jumps.

(4) Standard procedure in a stack machine is to operate only on the top of the stack with all general operations (arithmetic, comparison, condition, etc), and then have a few opcodes like "push-to" (pops top and second-to-top, A and B; writes B to stack entry at #A entries below B) and "pop-from" (replaces top-of-stack, A, with stack entry #A entries below A). That ought to reduce the bloat in your program segment by, apparently, over 300%.

Share this post


Link to post
Share on other sites
Quote:
Original post by Wyrframe
(1) It's how the ARM processor works. All operations are conditional. Several other RISC processors have this feature across a subset of their operations.


However, this is not a RISC processor, and having a conditional form for each instruction would just slow things down. Furthermore, it's the kind of optimization that can easily be derived from simple bytecode, but is harder to "undo" when implemented on hardware without support for it.

Quote:
(3) No, not everything needs to be conditional. But you're desperately missing some jumps; in addition to "is zero", you need "is negative". Their converses would suit in their place, and having all four is normal practice. Without is-negative, you cannot perform less-than or greater-than comparison jumps.


My jumps will not actually perform any comparisons, they will operate on pre-computed true/false values.

Quote:
(4) Standard procedure in a stack machine is to operate only on the top of the stack with all general operations (arithmetic, comparison, condition, etc), and then have a few opcodes like "push-to" (pops top and second-to-top, A and B; writes B to stack entry at #A entries below B) and "pop-from" (replaces top-of-stack, A, with stack entry #A entries below A). That ought to reduce the bloat in your program segment by, apparently, over 300%.


I can see how there would be some validity in that for arithmetic operations, but then it adds the need to constantly move data from and to the top of the stack.

Share this post


Link to post
Share on other sites
You're worried about efficiency and matching hardware, but you're doing things like having 'jump yes' and 'jump no' and 'is greater than'? Are you targetting a different architecture than x86?

Why are you writing a VM if your main priority is efficiency and matching the hardware? It seems like you would be better served by merely writing an assembler or compiler for the target platform.

Share this post


Link to post
Share on other sites
Quote:
Original post by Extrarius
You're worried about efficiency and matching hardware, but you're doing things like having 'jump yes' and 'jump no' and 'is greater than'? Are you targetting a different architecture than x86?

Why are you writing a VM if your main priority is efficiency and matching the hardware? It seems like you would be better served by merely writing an assembler or compiler for the target platform.


Well, why in the hell should I choose a less efficient design such as making each instruction conditional when I can go with a simpler and more efficient design such as branching...

My reason for designing a VM is so I can design a simple compiler and my own scripting engine. I much prefer this kind of environment to x86 ASM.

Share this post


Link to post
Share on other sites
Efficiency is one thing, but it seems like you're really going overboard. Also, you're picking odd combinations like having a seperate instruction for testing equality and 'greater than' instead of either condition instructions or a single 'compare A to B' instruction that sets flags coupled with various conditional jumps that jump based on the flag states. Also, why have different mnemonics depending on the type of things being operated on? Different opcodes makes sense, but not really different mnemonics unless they're otherwise ambiguous.

Share this post


Link to post
Share on other sites
Quote:
Original post by Extrarius
Efficiency is one thing, but it seems like you're really going overboard. Also, you're picking odd combinations like having a seperate instruction for testing equality and 'greater than'


It's rather practical when you need more than one comparison to be performed when branching. And boolean algebra does not need to be coupled with branching.

Quote:
instead of either condition instructions or a single 'compare A to B' instruction that sets flags coupled with various conditional jumps that jump based on the flag states.


Well, I wanted to completely avoid flags because it's not really optimal to use them in a VM. When designing a processor, it's quite easy because you have direct access to all the bits in the registers, and you can just have any particular bit trigger some gates directly, but here, if I want flags, I have to test for each one of them individually.

Quote:
Also, why have different mnemonics depending on the type of things being operated on? Different opcodes makes sense, but not really different mnemonics unless they're otherwise ambiguous.


Because they are not just different opcodes, they are different instructions, with different parameters, but similar purposes.

Share this post


Link to post
Share on other sites
Quote:
Original post by Max_Payne
[...]Because they are not just different opcodes, they are different instructions, with different parameters, but similar purposes.
In the same way that "mov X,Y" is more than one instruction on the x86, but only has one mneumonic.

Share this post


Link to post
Share on other sites
Quote:
Original post by Extrarius
Quote:
Original post by Max_Payne
[...]Because they are not just different opcodes, they are different instructions, with different parameters, but similar purposes.
In the same way that "mov X,Y" is more than one instruction on the x86, but only has one mneumonic.


Well, that's purely a choice when making the assembler. However, in this case, there is no way to distinguish between the different movs... Some take the same number of parameters (which are all numbers), but aren't the same.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this