Om: OpCodes, Variadic Functions and Reference Types

Published January 16, 2016

OpCodes for the Om Virtual Machine

I've just finished writing up a full list of the [font='courier new']OpCode[/font] set that the Om virtual machine needs to support, along with a brief description of what each code does, so I thought I'd dump it here. There are surprisingly few really, considering some of the complexities of the language now.

This is basically what the Compiler module has to generate as byte-code.


    OpCode::Call // calls entity on top of stack e.g. FunctionEntity, ExternalFunction, store func.stackTop as current stack size
    OpCode::Ret // pops call stack and returns to previous position, exits if nothing left on stack
    OpCode::Jmp uint // unconditional branch to address in parameter
    OpCode::JmpT uint // branch to address in parameter if top of stack to bool is true
    OpCode::JmpF uint // branch to address in parameter if top of stack to bool is false
    OpCode::Push TypedValue // pushes the supplied TypedValue onto the value stack
    OpCode::Pop // pop the top of the value stack
    OpCode::PopN uint // pop N values from the top of the value stack
    OpCode::Peek uint // copy value at N from top of stack backwards and push it onto stack
    OpCode::GetLc uint // copy value at func.stackTop + N from bottom of stack upwards, push on stack
    OpCode::PutLc uint // copy value from top of stack to func.stackTop + N from bottom of stack upwards
    OpCode::GetMb uint // pop entity on top of stack, parameter is member ID, push copy of member to top of stack
    OpCode::PutMb uint // pop entity on top of stack, parameter is member ID, copy object on top of stack to member
    OpCode::GetSc // pop value, then entity from top of stack, copy entity[value] to top of stack
    OpCode::PutSc // pop value, then entity from top of stack, copy top of stack to entity[value]
    OpCode::GetNl uint uint uint // params are func.guid, local addr, name ID, search guidStack to find func.stackTop, add addr and push to stack
    OpCode::PutNl uint uint uint // params as for GetNL, search guidStack to find func.stackTop, add addr and copy top of stack to position
    OpCode::Add // pop stack twice, add together, push result
    OpCode::Sub // pop stack twice, subtract, push result
    OpCode::Mul // pop stack twice, multiply, push result
    OpCode::Div // pop stack twice, divide, push result
    OpCode::AddEq uint // if string, pop arg from top of stack, add arg to string on top of stack, else hand back to OpCode::Add
    OpCode::Eq // pop stack twice, push boolean to stack if equal
    OpCode::Neq // pop stack twice, push boolean to stack if not equal
    OpCode::Lt // pop stack twice (b, a), push boolean to stack if a < b
    OpCode::LtEq // pop stack twice (b, a), push boolean to stack if a <= b
    OpCode::Gt // pop stack twice (b, a), push boolean to stack if a > b
    OpCode::GtEq // pop stack twice (b, a), push boolean to stack if a >= b
    OpCode::Not // apply not to top of stack
    OpCode::Neg // apply negate to top of stack
    OpCode::Inc // increment top of stack (Int or Float) by 1
    OpCode::Dec // decrement top of stack (Int or Float) by 1
    OpCode::Cast Om::Type // pop top of stack, convert to supplied Type, push result to top of stack
    OpCode::FeChk uint // check peek(1) is array, if peek(0) >= array size jump to address in parameter
    OpCode::FeGet // push value at array at peek(1), position at peek(0) to top of stack
    OpCode::MkEnt TypedValue // create entity of given type, push to top of stack, TypedValue.v is used for array reserve and string id
    OpCode::AddCh uint // pop value from top of stack, add to entity (Object or Array) on top of stack, parameter is id for Object
    OpCode::Out // pop top of stack, output it to std::cout
    OpCode::OutNl // output a newline to std::cout
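To make the instruction list a little more concrete, here is a minimal sketch of how a dispatch loop over a handful of these codes might look. This is not Om's actual implementation: the `Instr` struct, the `int`-only value stack and the `run` function are all assumptions made for illustration, only seven of the codes are handled, and `JmpF` is assumed to pop the value it tests.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified subset of the instruction set. The real VM
// works on TypedValue, not plain ints.
enum class OpCode : std::uint8_t { Push, Pop, Add, Lt, Jmp, JmpF, Ret };

struct Instr {
    OpCode op;
    int arg; // immediate value for Push, target address for Jmp/JmpF
};

// Runs the program and returns the value on top of the stack at Ret
// (or 0 if the stack is empty at that point).
int run(const std::vector<Instr> &program) {
    std::vector<int> vs; // value stack
    std::size_t pc = 0;  // index of the next instruction

    auto pop = [&vs]() {
        int v = vs.back();
        vs.pop_back();
        return v;
    };

    while (pc < program.size()) {
        const Instr &in = program[pc++];
        switch (in.op) {
        case OpCode::Push: vs.push_back(in.arg); break;
        case OpCode::Pop:  vs.pop_back(); break;
        case OpCode::Add: { int b = pop(); int a = pop(); vs.push_back(a + b); break; }
        case OpCode::Lt:  { int b = pop(); int a = pop(); vs.push_back(a < b ? 1 : 0); break; }
        case OpCode::Jmp:  pc = static_cast<std::size_t>(in.arg); break;
        case OpCode::JmpF: if (pop() == 0) pc = static_cast<std::size_t>(in.arg); break;
        case OpCode::Ret:  return vs.empty() ? 0 : vs.back();
        }
    }
    return vs.empty() ? 0 : vs.back();
}
```

A program is then just a vector of `Instr`, for example `Push 2, Push 3, Add, Ret`, which is the kind of sequence the Compiler module would be emitting as byte-code.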

Variadic Functions and Reference Types

I've finished adding variadic script functions to Om now, and have just got the basics of reference types working, although that needs a lot more testing and checking of the implications. [EDIT] I have removed references now; the implications were just too horrendous to continue.

Variadic Functions
First up, I wanted script functions to be able to be variadic, mainly for balance, since the built-in methods and external C++ functions you can call are all essentially variadic. So I added the following syntax:


    var f = func(a, b, c...){};
    var g = func(a...){};

In each case, the ellipsis has to follow the last parameter in the function declaration, and the function must be called with at least the parameter count minus one arguments. For example, given the above:

    f(1); // error, minimum of two expected
    f(1, 2); // fine, no variadic params
    f(1, 2, 3, 4, 5, 6); // fine, four variadic params
    g(); // fine, no variadic params
    g(1, 2, 3, 4); // fine, four variadic params

When the function is called, the virtual machine takes all the variadic parameters and converts them internally into an [font='courier new']Om::Array[/font], which is accessible in the function via the variadic parameter's name. For example:

    var f = func(a, b, c...)
    {
        out "a: ", a;
        out "b: ", b;
        out c.length, " varadic parameters";
        for(var i: c)
        {
            out " ", i;
        }
    };

    f(1, 2, 3, 4, 5);

The call at the bottom would output:

    a: 1
    b: 2
    3 varadic parameters
     3
     4
     5

So now you can have a function that takes between zero and many parameters, or a forced number with as many following as you wish and so on.


Eventually the [font='courier new']out[/font] command is going to be removed once we're done, and the user would replace it with an external C++ function to dump the values to [font='courier new']std::cout[/font] if they wished (so the VM ends up not dependent on the [font='courier new']std::streams[/font]). At that point the syntax for this, if it is wrapped in a script function, will be much nicer.

This was actually quite easy to implement. I just stored a boolean on the [font='courier new']FunctionEntity[/font] to say whether it was variadic, kept the existing parameter count value, then added the following method, which is called by the [font='courier new']OpCode::Call[/font] instruction:

    bool Machine::convertStackToVaradic(uint &params, uint expected, Om::Value &res)
    {
        TRACE;

        // every non-variadic parameter must be supplied
        if(params < expected - 1)
        {
            res = CommonErrors::varadicParamCountError(state, params, expected - 1);
            return false;
        }

        uint varadics = (params - expected) + 1;
        ArrayEntity *array = state.em.allocate(varadics);

        if(varadics)
        {
            // copy from the deepest variadic value up to the top of the stack so call order is kept
            uint i = varadics - 1;
            while(true)
            {
                array->elements.push_back(vs.from_back(i));
                if(i-- == 0) break;
            }

            vs.trim(varadics);
            params -= (varadics - 1);
        }
        else
        {
            // no variadic values: the empty array still fills the variadic slot
            params = expected;
        }

        vs.push_back(TypedValue(Om::Type::Array, state.eh.add(array)));
        inc(state, vs.back());

        return true;
    }

Just a bit of stack-voodoo going on, popping the variadics from the stack and inserting them in the correct order into a newly created [font='courier new']Om::Array[/font]. The [font='courier new']Om::Array[/font] will go out of scope when the function returns and be released at that point.
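The counting rule and the packing step can also be looked at in isolation. The following is only a sketch under simplifying assumptions: `PackedCall` and `packVariadic` are hypothetical names, plain `int` values stand in for `TypedValue`, a `std::vector` stands in for `Om::Array`, and no reference counting is done.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type: the arguments split into fixed and variadic parts.
struct PackedCall {
    std::vector<int> fixed;    // the non-variadic arguments, in call order
    std::vector<int> variadic; // the trailing arguments, in call order
};

// `args` holds the call's arguments with the last argument at the back,
// like the top of the value stack. `expected` is the declared parameter
// count including the variadic parameter. Returns std::nullopt when fewer
// than expected - 1 arguments were passed.
std::optional<PackedCall> packVariadic(const std::vector<int> &args, std::size_t expected) {
    if (expected == 0 || args.size() + 1 < expected) return std::nullopt;

    const auto fixedCount = static_cast<std::ptrdiff_t>(expected - 1);
    PackedCall out;
    out.fixed.assign(args.begin(), args.begin() + fixedCount);
    out.variadic.assign(args.begin() + fixedCount, args.end());
    return out;
}
```

With `expected` set to 3, as for `f` above, passing one argument is rejected, passing two gives an empty variadic array, and passing six gives four variadic values.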


Reference Types [EDIT] - Left up for posterity, but this feature has now been removed

I felt it was a little bit clunky to have to add a simple [font='courier new']Om::Int[/font] or [font='courier new']Om::Float[/font] to an [font='courier new']Om::Object[/font] or [font='courier new']Om::Array[/font] in order to pass it by reference to a function, but was a bit scared at the thought of implementing references. Turns out I had it all back to front.

I started off trying to declare function parameters as references, like:

    var f = func(:a, b, :c){};

Here the colon defines a reference. I then quickly realised that at the point the [font='courier new']CallNode[/font] generates, it has no idea what it is generating the parameters for. This is resolved at run-time, so it was impossible to figure out in the [font='courier new']CallNode::generate()[/font] method what to do.


In fact, the solution was far simpler: add a "get reference" operator ([font='courier new']:[/font]) that can be applied to a subset of expressions:

    var a = 20;
    var b = :a;

This is just a case of adding a [font='courier new']RefNode[/font] class to the [font='courier new']Node[/font] system that just has a [font='courier new']NodePtr target[/font] member, then adding [font='courier new']bool canGenerateRef() const[/font] and [font='courier new']bool generateRef(Context &c)[/font] virtual methods to the [font='courier new']Node[/font] base class. Any [font='courier new']Node[/font] type that can generate a reference then implements these. [font='courier new']RefNode::generate()[/font] just checks its target's [font='courier new']canGenerateRef()[/font], then calls [font='courier new']generateRef()[/font] if it can.
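As a rough illustration of that arrangement, here is a minimal sketch of the class shape. Only the names `Node`, `NodePtr`, `RefNode`, `SymbolNode`, `Context`, `canGenerateRef` and `generateRef` come from the description above. Everything else is an assumption: the `Context` here just records strings, `LiteralNode` is a made-up node type that cannot be referenced, and the emitted text is a placeholder for real byte-code.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the compiler context: records what was emitted.
struct Context {
    std::vector<std::string> emitted;
    std::vector<std::string> errors;
    bool error(const std::string &msg) { errors.push_back(msg); return false; }
};

struct Node {
    virtual ~Node() = default;
    virtual bool generate(Context &c) = 0;
    // By default a node cannot produce a reference.
    virtual bool canGenerateRef() const { return false; }
    virtual bool generateRef(Context &c) { return c.error("cannot generate reference"); }
};
using NodePtr = std::unique_ptr<Node>;

// A node type that opts in to reference generation.
struct SymbolNode : Node {
    std::string name;
    explicit SymbolNode(std::string n) : name(std::move(n)) {}
    bool generate(Context &c) override { c.emitted.push_back("load " + name); return true; }
    bool canGenerateRef() const override { return true; }
    bool generateRef(Context &c) override { c.emitted.push_back("ref " + name); return true; }
};

// Made-up node type that keeps the defaults, so it cannot be referenced.
struct LiteralNode : Node {
    int value;
    explicit LiteralNode(int v) : value(v) {}
    bool generate(Context &c) override { c.emitted.push_back("push " + std::to_string(value)); return true; }
};

// The ":" operator: checks its target, then delegates to it.
struct RefNode : Node {
    NodePtr target;
    explicit RefNode(NodePtr t) : target(std::move(t)) {}
    bool generate(Context &c) override {
        if (!target->canGenerateRef()) return c.error("expression cannot be referenced");
        return target->generateRef(c);
    }
};
```

The appeal of this shape is that supporting a new referenceable expression only means overriding the two virtual methods on that one node type.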


At the moment, you can only generate a reference to a local variable in the current function scope, just while I iron out any issues.

We have a set of [font='courier new']ExtendedType[/font]s that are hidden from the user and augment the public [font='courier new']Om::Types[/font]. I added [font='courier new']ReferenceType[/font] to this set, and implemented [font='courier new']SymbolNode::generateRef()[/font] like this:

    bool SymbolNode::generateRef(Context &c)
    {
        c.update(pos);

        const Symbol *s = c.locals().find(name);
        if(s)
        {
            c.pm() << OpCode::Push << TypedValue(static_cast(ReferenceType), s->addr);
            return true;
        }

        return c.error(pos, "Cannot generate reference to non-local variable");
    }

So we find the address (on the stack) of the [font='courier new']Symbol[/font], then push a [font='courier new']TypedValue[/font] onto the value stack with this hidden type and the address on the stack of the actual value.


The only way the virtual machine can access local variables is via the [font='courier new']OpCode::GetLc[/font] and [font='courier new']OpCode::PutLc[/font] instructions, so we just had to modify these slightly to check for this new type:

    bool Machine::getLc(uint addr)
    {
        TypedValue t = vs[addr];
        if(t.realType() == static_cast(ReferenceType))
        {
            // follow the reference to the slot it points at
            t = vs[t.toUint()];
        }

        vs.push_back(t);
        inc(state, vs.back());

        return true;
    }

    bool Machine::putLc(uint addr, Om::Value &res)
    {
        TypedValue *v = &(vs[addr]);
        if(v->realType() == static_cast(ReferenceType))
        {
            v = &(vs[v->toUint()]);
        }

        // release whatever the slot currently holds before overwriting it
        if(!dec(state, *v, res)) return false;

        *v = vs.back();
        inc(state, *v);

        return true;
    }

And, amazingly, that was pretty much it. It automatically works for function parameters too, since they are just local variables as far as the machine is concerned.
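Stripped of reference counting and error handling, the idea of a hidden reference value that local reads and writes see through can be sketched on its own. The `Kind`, `Value` and `MiniStack` names below are assumptions for illustration, not Om types, and the payload is a bare `std::size_t`.

```cpp
#include <cstddef>
#include <vector>

enum class Kind { Int, Reference };

// Hypothetical simplified value: for Kind::Reference, `data` is the stack
// index of the slot being referred to.
struct Value {
    Kind kind;
    std::size_t data;
};

struct MiniStack {
    std::vector<Value> vs;

    // Follows at most one level of reference, like the check in getLc/putLc.
    Value &resolve(std::size_t addr) {
        Value &v = vs[addr];
        return v.kind == Kind::Reference ? vs[v.data] : v;
    }

    // Push a copy of the (resolved) local onto the top of the stack.
    void getLc(std::size_t addr) {
        Value copy = resolve(addr); // copy first, push_back may reallocate
        vs.push_back(copy);
    }

    // Copy the top of the stack into the (resolved) local.
    void putLc(std::size_t addr) {
        Value top = vs.back();
        resolve(addr) = top;
    }
};
```

Writing through slot 1 when it holds a reference to slot 0 changes slot 0 and leaves the reference itself in place, which is why the callee never needs to know whether it was handed a reference.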

    var f = func(a)
    {
        out "called f";
        a = 100;
    };

    var b = { name = "Paul"; destructor = func { out "bye ", this.name; }; };

    out b;
    f(:b);
    out b;

See, the function does not need to know either way if the parameter is a reference or not. The user decides this at the point of call. The above outputs:

    called f
    bye Paul
    100

When the [font='courier new']a = 100[/font] is executed, it resolves the reference and assigns [font='courier new']100[/font] to the [font='courier new']Om::Object[/font] that [font='courier new']b[/font] holds, decrementing and destroying the [font='courier new']Om::Object[/font] and replacing it with the [font='courier new']Om::Int[/font].


But on the plus side, you can also use this outside of function parameter calls now and, for a given subset of expressions, declare references wherever you like, as long as the compiler supports the generation of a reference. Should be able to extend this to non-function-local variables as well, since they also live on the stack. Sadly won't be able to extend to [font='courier new']Om::Object[/font] or [font='courier new']Om::Array[/font] members though, since that would require the reference to store two numbers, the [font='courier new']Entity[/font] ID and the position, and I'm not doubling the size of a [font='courier new']TypedValue[/font] just to support this.

But I'm pretty happy with how simple that was to implement and need to figure out what wider consequences we need to think about now e.g. references to temporaries and so on.

[EDIT] Next morning...

Actually, having investigated the implications (which are a nightmare), I just decided to revert to a backup and remove references from Om. I googled to see how JavaScript handled this, hoping for some inspiration, and saw that it does not: it relies on the workaround of packing a simple value into an array or object to achieve the same thing, which we already support. References were sadly just more trouble than they were worth, given that the easy workaround is possible.

I found that I was having to add support to more and more of the [font='courier new']Machine[/font] instruction implementations, and this means you pay a small cost on every variable access, even if you never use the reference operator, since we are having to do at least an [font='courier new']if(v.realType() == static_cast(ReferenceType))[/font] check in more and more places. Turns out it wasn't just [font='courier new']OpCode::GetLc[/font] and [font='courier new']OpCode::PutLc[/font] after all; I was peppering the check all over the place.

If we wanted to extend to support references to non-local variables (see last post) and [font='courier new']Om::Object[/font] and [font='courier new']Om::Array[/font] members, we would have to widen the value stack in the machine by another four bytes, since it takes two numbers to address all three of those (non-local: function ID and local address, object: entity ID and member ID, array: entity ID and index).

I've found that any time I start thinking about widening the value stack, it is time to stop and think very hard. This can spiral out of control pretty quickly.

It isn't a big deal in terms of performance I don't think, since the value stack is already probably 8 bytes wide due to compiler padding (has an [font='courier new']enum Om::Type[/font] value, very low maximum, and a [font='courier new']char d[4][/font] data area, and is copied via [font='courier new']std::memcpy[/font] et al in the [font='courier new']pod_vector[/font]), but the complexity tends to start to grow once we start down this road.
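To put rough numbers on the widening, here is a small sketch comparing two layouts. These are hypothetical structs, not Om's real `TypedValue`: the enum's underlying type is fixed at 32 bits purely so the sizes are predictable, whereas the real type's size depends on what the compiler picks for [font='courier new']enum Om::Type[/font].

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layouts. The enum's underlying type is fixed at 32 bits here
// so the sizes do not depend on the compiler's default choice.
enum class Type : std::uint32_t { Null, Int, Float, Object, Array };

struct CurrentValue {
    Type type;
    char d[4]; // one 32-bit payload
};

struct WidenedValue {
    Type type;
    char d[8]; // room for two 32-bit numbers, e.g. entity ID and member ID
};

constexpr std::size_t currentSize = sizeof(CurrentValue);
constexpr std::size_t widenedSize = sizeof(WidenedValue);
```

Under these assumptions `currentSize` is 8 and `widenedSize` is 12, so every slot on the value stack would grow by half just to carry the second number.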

Plus if I ever change to a 64 bit Om, we'll have an extra 8 bytes rather than 4 added. This all has implications for cache usage of course.

One workaround:

    var f = func(a)
    {
        a.value = 20;
    };

    var a = 10;
    var o = { value = a; };

    f(o);
    a = o.value;

Maybe not pretty, but since it works, all the complexity of adding a reference operator, which can only work on local variables anyway, just doesn't seem worthwhile to me.


Thanks for stopping by and hope this was of interest.
