Jump to content

  • Log In with Google      Sign In   
  • Create Account

Jemgine



DeyjaScript 12 : Closures

Posted by , 30 October 2011 - - - - - - · 382 views

The next step after lambda functions is obviously closures. Without closures, I couldn't write code like this.

var view = new View(world, new Rectangle(0,0,800,600));
bindKey("W", [void v] { view.panCamera(0.0, 1.0, 5.0); }, true);

Closures allow lambdas to use variables local to the function they are declared in. To implement this, first I need to detect when variables that need to be closed over are used. The Scope abstraction already performs variable name resolution. I hijack that, and when the scope in question is a lambda scope, and the variable found is from a higher scope, I wrap the variable up. I also prevent it from wrapping the same variable twice.

public UsableVariable resolveVariableName(bool searchUp, String name)
{
	foreach (var node in variables) if (node.Name == name) return node;
	if (baseScope != null)
	{
		var R = baseScope.resolveVariableName(false, name);
		if (R != null) return R;
	}

	if (isLambdaFunctionScope) foreach (var node in lambdaReferencedVariables) if (node.Name == name) return node;

	if (searchUp && parent != null)
	{
		var R = parent.resolveVariableName(true, name);
		if (R != null && !(R is MemberVariable) && isLambdaFunctionScope)
		{
			var newLambdaVar = new LambdaVariableWrapper(R);
			newLambdaVar.variableIndex = lambdaReferencedVariables.Count;
			lambdaReferencedVariables.Add(newLambdaVar);
			return newLambdaVar;
		}
		return R;
	}
	return null;
}

LambdaVariableWrapper disguises the variable wrapped as a member variable to some imaginary type. All of these variable references are detected in the first pass of compilation, so the list is ready to go for the second. As far as the code inside the lambda, the LambdaVariableWrapper emits bytecode to fetch the variable from an object passed on the stack. Func handles pushing. More interesting is the code that pushes the Func to the stack in the first place. It first has to create an object to hold all the closed variables, and then it needs to load all of them into this object, and then finally it passes the object to Func's constructor.

bytecode.AddBytecode(Instruction.pushNewImpleObject, (byte)localScope.lambdaReferencedVariables.Count);
bytecode.AddBytecode(Instruction.loadRegisterFromStack, (byte)3);
foreach (var boundVar in localScope.lambdaReferencedVariables)
{
	boundVar.originalVariable.emitFetchByteCode(bytecode, context);
	bytecode.AddBytecode(Instruction.pushRegisterToStack, (byte)3);
	bytecode.AddBytecode(Instruction.loadMemberFromStack, (byte)boundVar.variableIndex);
	bytecode.AddBytecode(Instruction.popN, (byte)2);
}

The lambda body will later assume this 'ImpleObject' is on the stack at position -4. It's actually a variable; localScope.addVariable(new LocalVariable("__bind", null, IndexType.Parameter)); But, even though the lambda body can resolve the name, it can't use the variable because it has no type.

This implementation closes over values, not variables. That means that whatever value a variable has when the lambda is declared, that's the value it will have in the body. Supporting closure-over-variables is simple. Notice that in the current implementation, something like

type A { int x; }
var a = new A();
a.x = 5;
var func = [void v] { doSomething(a.x); };
a.x = 2;
func();

closes over the variable x. When I call func() at the end, the value of the closed-over variable, 'a', has not changed. It still refers to the same A. And, even if you do a = new A();, the lambda is still going to use the original A. But, look at x. x has changed. And x wasn't closed over, a was. So when I call func at the end, x has changed, and the lambda sees the new value. This method of implementing closure over variables is so simple that I made it an official part of the language with this generic type.

var refCode = @"type Ref<T> {
	T value;
	void construct(T value) { this.value = value; }
}";
CompileType(globalScope, refCode);

And then I can write code like this

var ogRef = new Ref<ObjectGrid>();
ogRef.value = new ObjectGrid(new Rectangle(104, 4, 692, 592), [Definition d, void v] 
{ 
	ogRef.value.hide(); 
	mousePlaceObject(view, world, d, [Location l, Facing f, void v]
	{
		addLocalTask(new BuildTask(d,l,f));
	});
});

in which the lambda argument to the ObjectGrid constructor references the same ObjectGrid it is being passed to. It (unfortunately) compiles without the use of references, such as

var og = new ObjectGrid(new Rectangle(104, 4, 692, 592), [Definition d, void v] 
{ 
	og.hide(); 
	mousePlaceObject(view, world, d, [Location l, Facing f, void v]
	{
		addLocalTask(new BuildTask(d,l,f));
	});
});

but will fail at runtime because, at the time the closed values are gathered, 'og' is null. The lambda body will see null, not the new value of og. The reason the first version works is because ogRef already exists. I change the object ogRef references without creating a new Ref<> object. The lambda sees the original value of ogRef (the Ref<> instance) and gets the new value of Ref<>.value from it.

Closures are damn handy. I also implemented some helper functions to make binding things to the scripting language easier. C# reflection makes automatically wrapping arbitrary C# lambdas possible.

public void wrapSystemFunction(Scope into, String name, Delegate Func)
{
	var systemFunction = new CallableFunction(CallingConvention.System, FunctionType.Free);

	var invokeMethod = Func.GetType().GetMethod("Invoke");
	var Parameters = invokeMethod.GetParameters();

	systemFunction.ReturnType = findSystemType(invokeMethod.ReturnType);
	systemFunction.ParameterTypes = new List<Scope>();
	foreach (var parm in Parameters) systemFunction.ParameterTypes.Add(findSystemType(parm.ParameterType));

	systemFunction.ReturnTypeName = systemFunction.ReturnType.fullTypeName;
	systemFunction.Name = name;
	systemFunction.DecoratedName = Scope.getDecoratedName(systemFunction);
	systemFunction.ParameterCount = systemFunction.ParameterTypes.Count; // parameterTypes.Length;
	systemFunction.systemFunc = (vm, parms) =>
	{
		var arguments = new object[parms.Length];
		for (int i = 0; i < parms.Length; ++i)
		arguments[i] = parms[i].getMember(0);
		var result = Func.DynamicInvoke(arguments);
		var obj = vm.createScriptObject(systemFunction.ReturnType);
		obj.setMember(0, result);
		return obj;
	};
	systemFunction.DeclarationScope = into;

	into.addStaticFunction(systemFunction);
	if (globalScope.IsDuplicateFunction(systemFunction)) throw new CompileError("Duplicate system func");
}

Unfortunately I still have to explicitly declare the types of parameters to the delegate when I call wrapSystemFunction.

scriptEngine.wrapSystemFunction(scriptEngine.getGlobalScope(), "hostPeerToPeerBuildSession", new Action<World, String>((world, port) =>
	{ activeSession = new PeerToPeerBuildSession(world, port, this); }));

Attached Files




DeyjaScript 11 : Enough blather; time to actually use it.

Posted by , 18 October 2011 - - - - - - · 458 views

I mentioned last time that I might start integrating this scripting language with that other thing, and I did. Nothing evolves a library faster than actually using it. First, I designed the interface I wanted scripts to see. The first thing I want to use scripts for is key binding. I could enumerate all possible actions and end up with something like 'Game:bindKey(Key, Action)', but I really want something more. I want players to be able to bind complex scripts to keys. It complicates the binding of simple things a great deal, but I think it will be worth it. A representative line from the startup script is Game:bindKey("W", [void v] { Game:getActiveCamera().pan(0.0, 1.0, 5.0 * Game:deltaTime()); }, true); Eventually I want to find a way to simplify this, but first, I'll concentrate on just getting this working.

In that line I see a 'Game' type (or variable?) that has the functions bindKey, getActiveCamera, and deltaTime. I can approach this two ways. I can add global objects to the language, create a type, and create a global instance called 'Game'. Or, I can add static or 'free' functions. I decide to go with the latter as it doesn't involve me having to figure out where to store the globals. Each scope can have regular member functions or static functions. You can't actually declare them in script, there is simply no syntax for them, but you can bind them externally by calling ScriptEngine.addSystemFunction as in scriptEngine.addSystemFunction(gameType, "getActiveCamera", "Camera", (vm, parms) => { return Cam; }); Allowing scripts to call these functions is more complicated. There are three different nodes in the grammar that call functions. FunctionCallNode, MethodCallNode, and InvokeOperationNode, and FunctionCallNode is overloaded to behave like MethodCallNode in some cases. Where do I shove StaticCallNode into this? Well, I could, and then overload FunctionCallNode again, but instead I combine all these nodes into a single InvokationNode. It's... well, I don't think Apoch would like it.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Irony.Interpreter.Ast;

namespace DeyjaScript
{
	class FunctionCallExNode : CompilableNode
	{
    	CallableFunction function;
    	UsableVariable invokeOn;
    	CompilableNode emitThisFirst;
    	bool pushImplicitThis = false;

    	public override void Init(Irony.Parsing.ParsingContext context, Irony.Parsing.ParseTreeNode treeNode)
    	{
        	base.Init(context, treeNode);
        	AddChild("Expresion", treeNode.ChildNodes[0].FirstChild);
        	if (treeNode.ChildNodes.Count == 2)
            	foreach (var parameter in treeNode.ChildNodes[1].ChildNodes)
                	AddChild("Parameter", parameter);
        	this.AsString = "Function Invokation";
    	}

    	internal void lookupFunction(ScopeStack context)
    	{
        	if (function != null) return;

        	var argumentTypes = CompilableNode.getArgumentTypes(ChildNodes, context, 1);

        	if (ChildNodes[0] is MemberAccessNode)
        	{
            	//method call or member functor; This is a shortcut to implement methods more effeciently.
            	var memberAccessExpression = ChildNodes[0].ChildNodes[0];
            	var memberAccessIdentifier = ChildNodes[0].AsString;

            	var invokeOnType = (memberAccessExpression as CompilableNode).getResultType(context);
            	invokeOn = invokeOnType.resolveVariableName(false, memberAccessIdentifier);
            	if (invokeOn != null)
                	function = invokeOn.Type.resolveFunctionOverload(false, "__invoke", argumentTypes);
            	else
                	function = invokeOnType.resolveFunctionOverload(false, memberAccessIdentifier, argumentTypes);
            	emitThisFirst = memberAccessExpression as CompilableNode;
        	}
        	else if (ChildNodes[0] is TypenameNode)
        	{
            	var funcName = (ChildNodes[0] as TypenameNode).removeLastToken();
            	if (funcName == null) throw new CompileError("How did this match if it's empty?");
            	var invokeOnType = (ChildNodes[0] as TypenameNode).findType(context);
            	if (invokeOnType == null)
            	{
                	invokeOn = context.topScope.resolveVariableName(true, funcName.identifier);
                	if (invokeOn != null) //Functor
                	{
                    	function = invokeOn.Type.resolveFunctionOverload(false, "__invoke", argumentTypes);
                    	if (function == null) throw new CompileError("Cannot invoke this type.");
                    	pushImplicitThis = true;
                	}
                	else
                	{
                    	function = context.topScope.resolveFunctionOverload(true, funcName.identifier, argumentTypes);
                    	if (function != null) pushImplicitThis = true; //bare function call
                    	if (function == null)
                    	{
                        	function = context.topScope.resolveStaticFunctionOverload(true, funcName.identifier, argumentTypes);
                        	if (function == null) throw new CompileError("Could not find function " + funcName.identifier);
                    	}
                	}
            	}
            	else
            	{
                	function = invokeOnType.resolveStaticFunctionOverload(false, funcName.identifier, argumentTypes);
                	if (function == null) throw new CompileError("Could not find function " + funcName.identifier);
            	}
        	}
        	else if (ChildNodes[0] is CompilableNode)
        	{
            	//Expression - must be a functor of some kind.
            	var invokeOnType = (ChildNodes[0] as CompilableNode).getResultType(context);
            	function = invokeOnType.resolveFunctionOverload(true, "__invoke", argumentTypes);
            	emitThisFirst = ChildNodes[0] as CompilableNode;
        	}
    	}

    	internal override Scope getResultType(ScopeStack context)
    	{
        	lookupFunction(context);
        	if (function == null) throw new CompileError("Function not found (GRT)");
        	return function.ReturnType;
    	}

    	internal override void gatherInformation(ScopeStack context)
    	{
        	lookupFunction(context);
        	if (function == null) throw new CompileError("Function not found (GI)");

        	if (emitThisFirst != null) emitThisFirst.gatherInformation(context);
        	for (int i = 1; i < ChildNodes.Count; ++i)
            	(ChildNodes[i] as CompilableNode).gatherInformation(context);
    	}

    	internal override void emitByteCode(List<byte> bytecode, ScopeStack context, bool placeResultOnStack)
    	{
        	for (int i = 1; i < ChildNodes.Count; ++i)
            	(ChildNodes[i] as CompilableNode).emitByteCode(bytecode, context, true);

        	//Can't just emit the leading expression since member access nodes are short circuited.
        	if (emitThisFirst != null) emitThisFirst.emitByteCode(bytecode, context, true);
        	if (invokeOn != null)
        	{
            	if (invokeOn is MemberVariable && pushImplicitThis)
                	bytecode.AddBytecode(Instruction.pushVariableToStack, (byte)(125));
            	invokeOn.emitFetchByteCode(bytecode, context);
            	function.EmitCallBytecode(bytecode, context);
        	}
        	else
        	{
            	if (function.type == FunctionType.Method && pushImplicitThis)
                	bytecode.AddBytecode(Instruction.pushVariableToStack, (byte)(125));
            	function.EmitCallBytecode(bytecode, context);
        	}
        	bytecode.AddBytecode(Instruction.popN, (byte)function.ParameterCount);
        	if (placeResultOnStack) bytecode.AddBytecode(Instruction.pushRegisterToStack, (byte)Register.returnValue);
    	}
	}

	
}

It's big, it's ugly, and it handles every possible way of invoking a function.

Oh, I also added While loops and a floating-point type. It's called 'float' in script, it's implemented as a C# double.

Up until this point, I've had a 'test script' that I run to make sure everything works. But it's gotten too long, and I need to cut out old stuff to debug the new stuff. Suddenly I have no way to know if I've broken old functionality. Also, bits of the library are leaking out to support the testing. Bits that really shouldn't be public. So I move regression testing into the library itself, with a suite of tests. It is woefully incomplete, but as I discover bugs, I'll add tests to cover them.

executeTest("Empty Test", "pass();", Engine, VM);
executeTest("Global System Call", "pass();", Engine, VM);
executeTest("System Method Call", "var S = \"123\"; if (S.length() == 3) pass();", Engine, VM);
executeTest("Free Function Call", "void function() { pass(); } void TEST() { function(); }", Engine, VM, false);
executeTest("Method Call", "type A { void method() { pass(); }}; var a = new A(); a.method();", Engine, VM);
executeTest("Lambda Call", "var lambda = [void v] { pass(); }; lambda();", Engine, VM);
executeTest("Lambda Method", "type A { void method() { pass(); }}; var a = new A(); var lambda = a.method; lambda();", Engine, VM);
executeTest("Integer Ops", "int x = 4 * 3; int y = 16 / 8; if (x + y == 14) pass();", Engine, VM);
executeTest("Float Ops", "float x = 4.0 * 3.0; float y = 16.0 / 8.0; if (x + y == 14.0) pass();", Engine, VM);
executeTest("Order of Ops", "int x = 4 * 3 + 10 / 5 - (4 + 2) / 3; if (x == 12) pass();", Engine, VM);
executeTest("Chain Invoke", "RegressionTest:getRegTest().test();", Engine, VM);
executeTest("Expression Invoke A", "Func<void> A() { return RegressionTest:getRegTest().test; } void TEST() { A()(); }", Engine, VM, false);
executeTest("Expression Invoke B", "Func<void> A() { return [void v] { RegressionTest:getRegTest().test(); }; } void TEST() { A()(); }", Engine, VM, false);
executeTest("Negative Numbers", "int x = -5; if (x == -5) pass();", Engine, VM);
executeTest("Subtraction & Negate", "if (5 - 7 == 2 + -4) pass();", Engine, VM);

Most of these tests trivially test trivial bits of the language. Several of them declare variables or invoke other parts of the language they aren't directly testing. At least one of them is designed to fail - Expression Invoke A results in the compile error, 'Cannot take pointer to system function'. This represents a flaw I'll have to address at some point. You can see in the last two tests how parsing negative numbers broke everything.

It's time now to seriously consider the interface the scripting language will present to the world. So far I've just sort of written code. A lot of the types are public for no good reason. Some, like CodePointer, the outside world has no business knowing about ever. I hide them, and I resolve the inconsistent accessibility errors (An error that took me a long time to understand when I first started using C#. Now I wonder, why doesn't C++ have this same error?), and then I go through every class that's still public and make sure what's public actually should be, and everything else is internal or private. I'm left with a few core types. The engine, the virtual machine, Scope, ScriptObject, Function. The engine was pretty well buttoned up. Now Scope has public functions for looking up functions, and you can walk the owner scope tree but only up. Function exists only so you can call it. The important data is exposed, but immutable.

I decided that the most useful function the virtual machine could have is a 'run string' function. This will compile a module, execute it, and then discard it. The engine doesn't actually support discarding modules, but that's fine, I'm the library author. I can just hack it.

var wrappedCode = "void RUNSTRING(){" + code + "}";
var currentTypeCount = context.globalScope.SubordinateScopes.Count;
var module = context.CompileModule("RUNSTRING", new List<String> { wrappedCode });
var func = module.findStaticFunctionBySimpleName("RUNSTRING");

callFreeFunction(func, context);

context.globalScope.SubordinateScopes.RemoveRange(currentTypeCount, context.globalScope.SubordinateScopes.Count - currentTypeCount);

So I'm finally using the scripting language in my game, and I want to bind a function that takes a Func<void> as an argument. Except, I can't. Why not? Because addSystemFunction doesn't actually parse the argument types, it just looks for a type with that name. Fine, Func<void> is named Func<void>; except that it doesn't exist yet, because it hasn't be instantiated. This is a terrible flaw. AddSystemFunction should parse that typename and instantiate the generic. In the mean time, I provide a terrible work around, invoked like so

var FuncVoidType = scriptEngine.InstanceGenericType(scriptEngine.getGlobalScope().findGenericType("Func", false, 1),
	new List<DeyjaScript.Scope> { scriptEngine.getGlobalScope().findLocalType("void", false) });

I use it, it works. When I press W, scripts get run, the camera pans. The game does absolutely nothing it didn't do before, but it does it differently, and that makes all the difference.

Next time, I think I'll talk about type deduction. Not the sort the compiler uses now, which I'll term 'backward type deduction'. That is fairly trivial. Given A(B);, if I know the type of B, I can find A. Everything in the compiler currently uses backward type deduction. Instead, I'll talk about something more complicated, forward type deduction. That is, given something like A([B] { B.foo(); });, if we know the type of A, we can deduce the type of B. I call this 'forward' because we move forward from the deduced type, rather than backward from it. Implementing a method of forward type deduction will allow me to dispense with type arguments to Lambdas, including their return type; it's also necessary to properly support the 'null' token. See if you can figure out why.

Attached Files




DeyjaScript 10 : Lambdas

Posted by , 11 October 2011 - - - - - - · 440 views

I started thinking, okay, I've written this thing. Now how can I actually use it? Wasn't I supposed to be making a mud engine? Yeah, well, I already don't want to anymore. Instead, I'm going to use DeyjaScript in this thing, http://www.gamedev.net/blog/342/entry-2250747-gnome-colony-build-demo/ . I'll use it to implement an in-game console where the player can assign macros - bits of DeyjaScript code - to keys. This is also how I'll implement key binding and re-mapping. This decision led me to the next required feature in DeyjaScript - Lambdas. The goal is to be able to type something like 'bindkey [key] [lambda]' into the console.

The first thing I need to support lambdas is function pointers. A function pointer will be able to point to a method or a lambda function. To make this possible, lambdas will masquerade as methods. Like arrays, I'll first create a type-less raw system type, and then wrap it in a generic type. Since it will be impossible for scripts to do anything with this type (though they could create one, since the language lacks any sort of access control) it doesn't have any methods. It's just a data carrier. I add two instructions to create them. pushMemberFunctionPointer and pushLambdaPointer. The first looks up the member, wraps it's code pointer in the system type, and pushes it. The second takes the current code pointer, changes it's start index, wraps it and pushes it. Since lambdas will be smashed onto the end of the function they are declared in, and can't be referenced anywhere else, this works great. A final instruction, invokeFunctionPointer, takes the wrapped code pointer and jumps to it.

Next, the wrapper generic.

Func<T0,R> {
	funcPtr raw;
	object _this;
	void construct(object _this, funcPtr ptr) 
	{ 
		raw = ptr;
		this._this = _this;
	}
	R __invoke(T0 t0) { 
		//What goes here??
	}
}

There's another reserved method name. I'll use '__invoke' later to support operator(). Two places in the grammar create Func<...> objects. Accessing members, and lambda declarations. Both of these emit code to create the funcPtr type, then to create a Func<...> with the correct types, and call it's constructor, so that it appears to the script that the type of a method or lambda is 'Func<...>'.

//Inside LambdaExpressionNode.emitBytecode...
bytecode.AddBytecode(Instruction.pushVariableToStack, (byte)125); //push this
callIndex = bytecode.Count + 1;
bytecode.AddBytecode(Instruction.pushLambdaPointer, BitConverter.GetBytes((Int16)0));
bytecode.AddBytecode(Instruction.pushNewObject);
NewStatementNode.emitIndexStack(bytecode, funcType, 0, context);
var constructor = funcType.findFunctionBySimpleName("construct");
constructor.EmitCallBytecode(bytecode);

//replace the raw function pointer with the Func<..> object.
bytecode.AddBytecode(Instruction.loadRegisterFromStack, (byte)Register.returnValue);
bytecode.AddBytecode(Instruction.popN, 3); //pop Func<...>, funcPtr, this
bytecode.AddBytecode(Instruction.pushRegisterToStack, (byte)Register.returnValue);

The lambda has an address of 0 because the enclosing function needs to fill that value in later. The actual syntax of a lambda looks like this -

var func = [int a, int b, int c, int r] { return a + b + c; };

I'm not thrilled with having to specify the return value as the last type argument, or with having to name it. The name is never used. Maybe I'll find a way to deal with that later. I want to support calling Funcs with variable() syntax, so I have to throw a bunch of code all over. A new grammar construction 'expression + ( + parameter list + )' catches cases where the func is the result of an operation, but when it's just a variable, the parser mistakes it for an ordinary function call. The function call node gets to deal with it too. I could replace the function call node with the new version entirely, but only at the cost of highly ineffecient function calls (Create a function pointer, invoke it, throw it away - compared to just that middle step.)

To compile lambdas, the enclosing function keeps track of all the lambdas declared in itself, and after it compiles it's own code it compiles all of them and tacks them onto the end of it's bytecode.

I skipped over an important detail earlier, the actual implementation of Func<...>.__invoke. How, exactly, do I invoke the raw funcPtr object? I have an instruction to do it, but I need to prepare the stack and actually call it. The answer turns out to be inline bytecode. To support that, I need to build a table of all the instructions and the number and sizes of their arguments. Then I add it to the grammar and compile it. It does no checking of any sort, it just spits opcodes into the bytecode stream, so it's very dangerous. But it allows me to write this

R __invoke(T0 t0) { 
	inline {
		pushVariableToStack 126; //push return pointer
		loadRegisterFromStack 4; //Remember return pointer
		pop;
		
		pushVariableToStack 125; //push this
		pushMemberToStack 0; //push raw
		loadRegisterFromStack 5; //Remember func pointer
		pop;

		pushVariableToStack 125; //push this
		pushMemberToStack 1; //push _this
		loadRegisterFromStack 6; //Remember _this;
		pop;

		loadRegisterFromStack 0; //recover frame pointer
		popN 3; //remove frame pointer, return pointer, and this pointer

		pushRegisterToStack 6; //Push _this;                                    	
		pushRegisterToStack 4; //Push return pointer
		pushRegisterToStack 5; //Push func pointer                                    	
		
		invokeFunctionPointer;
	}
}

I re-write the top of the stack to remove the current frame, and then I invoke the function. When that function returns, it will go directly to the code that called __invoke, as if __invoke never existed at all. Since the parameters to __invoke are on the stack, I never have to touch them. I use a few magic constants (The this pointer is always at stack position 125, which is the same as -3 - it's encoded as 128 + offset, to support negative offsets), and I know where __invoke's frame pointer and return value are at on the stack. By time I invoke, the stack looks as if the code that called __invoke called the function pointed to directly, complete with it's this pointer.

So, lambdas are a go. Maybe next I'll start integrating it with that other thing. Or maybe I'll add loops. I'm not out of 'advanced' features I can implement before the basics at all, maybe this language will never have loops.

Aside : As it turns out, deducing the return type of a lambda from the contents of that lambda is impossible. Even in the trivial case of a lambda that returns an int, the system can't handle it. Does it return int? Does it return object? Either is valid. The only way to deduce the return type will be from what the lambda is assigned to, or from the type of the argument it is in a function call. Presumably, all types could be deduced this way, but it would mean type resolution becomes compilation order dependent.

Attached Files




Gnome Colony Build Demo

Posted by , 07 October 2011 - - - - - - · 749 views

So, I think this is finally ready to show people.

Here's a screenshot I took.
Attached Image

Here's something someone else made.
Attached Image

And here is the actual thing...!
Attached File  GCBuildDemo.zip (710.09KB)
downloads: 67


DeyjaScript 9 : Arrays and Generics

Posted by , 01 October 2011 - - - - - - · 602 views

For arrays in this language, I had two requirements. First, they had to be dynamically sized, and second, they had to be type-safe. But, when I want about trying to actually implement them, I discovered that my current implementation of the type system could handle only one or the other. I could create arrays, but like in C++, the array itself could have no members. Dynamic sizing requires member functions on the array for adding and removing elements, and there is no way to add these members to the array without implementing some sort of generic type. The second option was to create a dynamically sized array that wasn't type-safe. This didn't require any special syntax since it looks like an ordinary type (called 'array'), but it did require every type in the language to have a common base. For types defined in scripts, that was simple. Their base defaults to 'object'. For primitive types, it was a bit more complicated. Instead of being a C# primitive boxed into an object, they are a C# primitive boxed into an object boxed into a ScriptObject. It is nasty, but in the long run, worth it. For either of these array implementation options, the other behavior can be built on top using generics. I decided to provide a dynamically-sized array of anything, as it should be useful for many data structures I might want to represent in the scripts.

The array isn't technically part of the language. It's one of the system types that are always available, but the script engine uses the public interface to bind them. Any client code could bind a type that behaves exactly like it.

var arrayType = addSystemType("array", objectType);
addSystemMemberFunction(arrayType, "construct", "void", (parms) =>
{
	parms[0].members[0] = new List<scriptObject>();
	return null;
});
addSystemMemberFunction(arrayType, "add", "void", (parms) =>
{
	(parms[1].members[0] as List<scriptObject>).Add(parms[0]);
	return null;
}, "object");
addSystemMemberFunction(arrayType, "__indexGet", "object", (parms) =>
{
	return (parms[1].members[0] as List<scriptObject>)[(parms[0].members[0] as int?).Value];
}, "int");
addSystemMemberFunction(arrayType, "__indexSet", "object", (parms) =>
{
	var index = (parms[1].members[0] as int?).Value;
	var list = (parms[2].members[0] as List<scriptObject>);
	var obj = parms[0];
	list[index] = obj;
	return obj;
}, "object", "int");

There will also have to be some methods for removal, for retrieving the count, etc, but I haven't written them. Those named '__indexGet' and '__indexSet' are special methods called by operator[]. Any type can support operator[] by implementing these functions. Function names that begin with double underscores are from this point on reserved for implementation details like this. Also notice that when a function returns void, it doesn't need to create a special ScriptObject to represent void.

I mentioned generics, and I really wanted type-safe arrays, so it was time to implement them. First, I added generic parameters to the grammar for typenames. So Foo<Bar> is a valid typename, and so is Foo<ModuleName:Bar>. Typenames became a tree of nested typenames, and, frankly, became way more complex than I expected. Parsing is not so difficult. I add generic parameters to object declarations too. Now I need to resolve these new typenames to actual types. TypenameNode does the heavy lifting.

internal static Scope findType(bool searchUp, ScopeStack context, Scope currentScope, TypenameToken of)
{
	Scope nextScope = null;
	if (of.hasGenericParameters)
	{
		var templateName = of.singleLevelName;
		var instancedType = currentScope.findLocalType(templateName, searchUp);
		if (instancedType != null) nextScope = instancedType;
		else
		{
			var declarationScope = currentScope.findLocalType(of.identifier, searchUp);
			var genericParameters = new List<Scope>();
			foreach (var parameter in of.genericParameters)
				genericParameters.Add(findType(true, context, currentScope, parameter));
			var Engine = context.globalScope.tag as ScriptEngine;
			var Object = declarationScope.tag as ObjectDeclarationNode;
			nextScope = Engine.EnqueueGenericTypeForCompilation(Object, genericParameters);
		}
	}
	else
		nextScope = currentScope.findLocalType(of.identifier, searchUp);
	if (of.next == null || nextScope == null) return nextScope;

	return findType(false, context, nextScope, of.next);
}

Ow! TypenameToken is a linked list of tokens, where a token is an identifier plus an optional list of generic arguments. This function does one thing of particular note; it looks for an instance of the generic type that takes the same parameter types. If it finds one, great, it just returns that one (This little detail will allow me to implement typed arrays of primitive types far more efficiently, assuming I do something to prevent a script from deriving from a primitive type). If not, it tells the script engine to compile it. The script engine will delay compilation until the current module is finished, but it will immediately create a scope for the generic type, register all it's members, and resolve all it's typenames. It has to do this right away so that whatever called TypenameNode.findType can finish it's compilation.

internal Scope EnqueueGenericTypeForCompilation(ObjectDeclarationNode decl, List<Scope> genericParameters)
{
	var into = new Scope();
	var newDecl = decl.Clone() as ObjectDeclarationNode;
	genericTypesAwaitingCompilation.Enqueue(new pendingGeneric
	{
		decl = newDecl,
		genericParameters = genericParameters,
		into = into
	});

	if (genericParameters.Count != decl.genericParamters.Count)
	throw new CompileError("Wrong number of generic arguments.");
	var newTypeName = newDecl.localScope.thisType + "<";
	for (int i = 0; i < genericParameters.Count; ++i)
	{
		if (i != 0) newTypeName += ",";
		newTypeName += genericParameters[i].fullTypeName;
	}
	newTypeName += ">";

	into.thisType = newTypeName;
	into.tag = newDecl;
	newDecl.registerMembers(decl.localScope.ownerScope, into);
	
	for (int i = 0; i < genericParameters.Count; ++i)
		into.genericParameters[i].Type = genericParameters[i];

	newDecl.resolveTypes(new ScopeStack(globalScope, into), into);
	into.finalizeScope();
	into.parent = null;
	return into;
}

It is necessary to clone the entire abstract syntax tree for the generic type. This is because each node keeps some state that's set in the 'gather information' compile pass, and used in the 'emit bytecode' pass. Since we are just doing the first pass on the generic type, and saving the second pass for later, we need a copy of that information for each instantiation. If you tried to use TypedArray<int> and TypedArray<bool> in the same module, whichever one was mentioned second would trample the state of the first. Wait, why can't we just compile it completely right away? Because a generic argument might be a type in the module that's using the generic type, and it's first pass isn't finished yet. Actually cloning the syntax tree is straightforward. I don't need to worry about the extra information that Irony puts into AstNodes, because it won't be used.

public object Clone()
{
	var result = Activator.CreateInstance(this.GetType());
	if (result is CompilableNode) (result as CompilableNode).AsString = AsString;
	foreach (var child in this.ChildNodes)
	{
		var newInstance = (child as ICloneable).Clone();
		if (child is CompilableNode) (child as CompilableNode)._cloneImpl(newInstance);
		(result as AstNode).ChildNodes.Add(newInstance as AstNode);
	}
	return result;
}

A few types use _cloneImpl to do some extra state transfer, and a few override clone entirely. TypenameNode gets away with public object Clone() { return this; }!

After cloning the AST, I figure out the name of this instance so that later attempts to use this same generic with these same parameters won't repeat this effort. Finally, I get down to compiling. RegisterMembers, in addition to adding all the members and methods, populates a list of generic parameter names. I use this list to fill in the types with the generic parameters specified. Later, when asked for a type, Scope will look to see if the name asked for exists in this list. If it does, it won't search it's member types, it will just return whatever type is in the list. Now I resolve the generic's types. There's a terrible bug here I'll leave as an exercise to the reader, but here's a hint : Where is the module in that scope stack? I return the scope for TypenameNode to use as the generic type instance.

Finally, I can actually write TypedArray.

TypedArray<T> {
	array data;
	void construct() { data = new array(); }
	T __indexGet(int i) { return (data[i] as T); }
	T __indexSet(T t, int i) { return ((data[i] = t) as T); }
	void add(T t) { data.add(t); }
	//etc
}

It exists in the 'Core' module, so you can use it like 'Core:TypedArray<int>'.

Some closing thoughts
*This is probably the only scripting language to support generics before it supports for loops. But I'm glad for that, because I would have had to modify them anyway.
*When resolving the types of a function declaration, I had to reach into the function implementation and look for any generic types so they can be instanced before the function is compiled.
*You could technically use the type 'TypedArray' without any generic arguments. But none of it's methods will work, as it will never, and in fact can't, be compiled.
*These 'generics' are more like C++ templates than C# generics. However, the compilation model won't support a test-substitution without major changes so SFINAE will probably never be an option.
*The error messages produced by the compiler are entirely useless.

Attached Files








September 2016 »

S M T W T F S
    123
45678910
11121314151617
18192021222324
25262728 29 30 

Recent Comments

Recent Comments



PARTNERS