Deyja

Members
  • Content count

    2382
  • Joined

  • Last visited

Community Reputation

920 Good

About Deyja

  • Rank
    Not so Advanced Member
  1. Some Thoughts on Software Design

    If you never aim high, you'll miss the stars every time.
  2. Multithreading Help

    I understand you're trying to learn how this stuff works, but, .net already has this functionality built in. I'd suggest you look at the System.Threading.Tasks namespace, and at least study their interface before you roll your own. http://msdn.microsoft.com/en-us/library/ff963549.aspx
  3. DeyjaScript 12 : Closures

    The next step after lambda functions is obviously closures. Without closures, I couldn't write code like this. var view = new View(world, new Rectangle(0,0,800,600)); bindKey("W", [void v] { view.panCamera(0.0, 1.0, 5.0); }, true); Closures allow lambdas to use variables local to the function they are declared in. To implement this, first I need to detect when variables that need to be closed over are used. The Scope abstraction already performs variable name resolution. I hijack that, and when the scope in question is a lambda scope, and the variable found is from a higher scope, I wrap the variable up. I also prevent it from wrapping the same variable twice. public UsableVariable resolveVariableName(bool searchUp, String name) { foreach (var node in variables) if (node.Name == name) return node; if (baseScope != null) { var R = baseScope.resolveVariableName(false, name); if (R != null) return R; } if (isLambdaFunctionScope) foreach (var node in lambdaReferencedVariables) if (node.Name == name) return node; if (searchUp && parent != null) { var R = parent.resolveVariableName(true, name); if (R != null && !(R is MemberVariable) && isLambdaFunctionScope) { var newLambdaVar = new LambdaVariableWrapper(R); newLambdaVar.variableIndex = lambdaReferencedVariables.Count; lambdaReferencedVariables.Add(newLambdaVar); return newLambdaVar; } return R; } return null; } LambdaVariableWrapper disguises the variable wrapped as a member variable to some imaginary type. All of these variable references are detected in the first pass of compilation, so the list is ready to go for the second. As far as the code inside the lambda, the LambdaVariableWrapper emits bytecode to fetch the variable from an object passed on the stack. Func handles pushing. More interesting is the code that pushes the Func to the stack in the first place. It first has to create an object to hold all the closed variables, and then it needs to load all of them into this object, and then finally it passes the object to Func's constructor. bytecode.AddBytecode(Instruction.pushNewImpleObject, (byte)localScope.lambdaReferencedVariables.Count); bytecode.AddBytecode(Instruction.loadRegisterFromStack, (byte)3); foreach (var boundVar in localScope.lambdaReferencedVariables) { boundVar.originalVariable.emitFetchByteCode(bytecode, context); bytecode.AddBytecode(Instruction.pushRegisterToStack, (byte)3); bytecode.AddBytecode(Instruction.loadMemberFromStack, (byte)boundVar.variableIndex); bytecode.AddBytecode(Instruction.popN, (byte)2); } The lambda body will later assume this 'ImpleObject' is on the stack at position -4. It's actually a variable; localScope.addVariable(new LocalVariable("__bind", null, IndexType.Parameter)); But, even though the lambda body can resolve the name, it can't use the variable because it has no type. This implementation closes over values, not variables. That means that whatever value a variable has when the lambda is declared, that's the value it will have in the body. Supporting closure-over-variables is simple. Notice that in the current implementation, something like type A { int x; } var a = new A(); a.x = 5; var func = [void v] { doSomething(a.x); }; a.x = 2; func(); closes over the variable x. When I call func() at the end, the value of the closed-over variable, 'a', has not changed. It still refers to the same A. And, even if you do a = new A();, the lambda is still going to use the original A. But, look at x. x has changed. And x wasn't closed over, a was. So when I call func at the end, x has changed, and the lambda sees the new value. This method of implementing closure over variables is so simple that I made it an official part of the language with this generic type. var refCode = @"type Ref { T value; void construct(T value) { this.value = value; } }"; CompileType(globalScope, refCode); And then I can write code like this var ogRef = new Ref(); ogRef.value = new ObjectGrid(new Rectangle(104, 4, 692, 592), [Definition d, void v] { ogRef.value.hide(); mousePlaceObject(view, world, d, [Location l, Facing f, void v] { addLocalTask(new BuildTask(d,l,f)); }); }); in which the lambda argument to the ObjectGrid constructor references the same ObjectGrid it is being passed to. It (unfortunately) compiles without the use of references, such as var og = new ObjectGrid(new Rectangle(104, 4, 692, 592), [Definition d, void v] { og.hide(); mousePlaceObject(view, world, d, [Location l, Facing f, void v] { addLocalTask(new BuildTask(d,l,f)); }); }); but will fail at runtime because, at the time the closed values are gathered, 'og' is null. The lambda body will see null, not the new value of og. The reason the first version works is because ogRef already exists. I change the object ogRef references without creating a new Ref object. The lambda sees the original value of ogRef (the Ref instance) and gets the new value of Ref.value from it. Closures are damn handy. I also implemented some helper functions to make binding things to the scripting language easier. C# reflection makes automatically wrapping arbitrary C# lambdas possible. public void wrapSystemFunction(Scope into, String name, Delegate Func) { var systemFunction = new CallableFunction(CallingConvention.System, FunctionType.Free); var invokeMethod = Func.GetType().GetMethod("Invoke"); var Parameters = invokeMethod.GetParameters(); systemFunction.ReturnType = findSystemType(invokeMethod.ReturnType); systemFunction.ParameterTypes = new List(); foreach (var parm in Parameters) systemFunction.ParameterTypes.Add(findSystemType(parm.ParameterType)); systemFunction.ReturnTypeName = systemFunction.ReturnType.fullTypeName; systemFunction.Name = name; systemFunction.DecoratedName = Scope.getDecoratedName(systemFunction); systemFunction.ParameterCount = systemFunction.ParameterTypes.Count; // parameterTypes.Length; systemFunction.systemFunc = (vm, parms) => { var arguments = new object[parms.Length]; for (int i = 0; i < parms.Length; ++i) arguments = parms.getMember(0); var result = Func.DynamicInvoke(arguments); var obj = vm.createScriptObject(systemFunction.ReturnType); obj.setMember(0, result); return obj; }; systemFunction.DeclarationScope = into; into.addStaticFunction(systemFunction); if (globalScope.IsDuplicateFunction(systemFunction)) throw new CompileError("Duplicate system func"); } Unfortunately I still have to explicitly declare the types of parameters to the delegate when I call wrapSystemFunction. scriptEngine.wrapSystemFunction(scriptEngine.getGlobalScope(), "hostPeerToPeerBuildSession", new Action((world, port) => { activeSession = new PeerToPeerBuildSession(world, port, this); }));
  4. Brief Update

    When I click a link on the 'home' page of noter's default wiki, it shows the name of that page, but the tab still says 'home', and when I click edit, I get the home page.
  5. I mentioned last time that I might start integrating this scripting language with that other thing, and I did. Nothing evolves a library faster than actually using it. First, I designed the interface I wanted scripts to see. The first thing I want to use scripts for is key binding. I could enumerate all possible actions and end up with something like 'Game:bindKey(Key, Action)', but I really want something more. I want players to be able to bind complex scripts to keys. It complicates the binding of simple things a great deal, but I think it will be worth it. A representative line from the startup script is Game:bindKey("W", [void v] { Game:getActiveCamera().pan(0.0, 1.0, 5.0 * Game:deltaTime()); }, true); Eventually I want to find a way to simplify this, but first, I'll concentrate on just getting this working. In that line I see a 'Game' type (or variable?) that has the functions bindKey, getActiveCamera, and deltaTime. I can approach this two ways. I can add global objects to the language, create a type, and create a global instance called 'Game'. Or, I can add static or 'free' functions. I decide to go with the latter as it doesn't involve me having to figure out where to store the globals. Each scope can have regular member functions or static functions. You can't actually declare them in script, there is simply no syntax for them, but you can bind them externally by calling ScriptEngine.addSystemFunction as in scriptEngine.addSystemFunction(gameType, "getActiveCamera", "Camera", (vm, parms) => { return Cam; }); Allowing scripts to call these functions is more complicated. There are three different nodes in the grammar that call functions. FunctionCallNode, MethodCallNode, and InvokeOperationNode, and FunctionCallNode is overloaded to behave like MethodCallNode in some cases. Where do I shove StaticCallNode into this? Well, I could, and then overload FunctionCallNode again, but instead I combine all these nodes into a single InvokationNode. It's... well, I don't think Apoch would like it. using System; using System.Collections.Generic; using System.Linq; using System.Text; using Irony.Interpreter.Ast; namespace DeyjaScript { class FunctionCallExNode : CompilableNode { CallableFunction function; UsableVariable invokeOn; CompilableNode emitThisFirst; bool pushImplicitThis = false; public override void Init(Irony.Parsing.ParsingContext context, Irony.Parsing.ParseTreeNode treeNode) { base.Init(context, treeNode); AddChild("Expresion", treeNode.ChildNodes[0].FirstChild); if (treeNode.ChildNodes.Count == 2) foreach (var parameter in treeNode.ChildNodes[1].ChildNodes) AddChild("Parameter", parameter); this.AsString = "Function Invokation"; } internal void lookupFunction(ScopeStack context) { if (function != null) return; var argumentTypes = CompilableNode.getArgumentTypes(ChildNodes, context, 1); if (ChildNodes[0] is MemberAccessNode) { //method call or member functor; This is a shortcut to implement methods more effeciently. var memberAccessExpression = ChildNodes[0].ChildNodes[0]; var memberAccessIdentifier = ChildNodes[0].AsString; var invokeOnType = (memberAccessExpression as CompilableNode).getResultType(context); invokeOn = invokeOnType.resolveVariableName(false, memberAccessIdentifier); if (invokeOn != null) function = invokeOn.Type.resolveFunctionOverload(false, "__invoke", argumentTypes); else function = invokeOnType.resolveFunctionOverload(false, memberAccessIdentifier, argumentTypes); emitThisFirst = memberAccessExpression as CompilableNode; } else if (ChildNodes[0] is TypenameNode) { var funcName = (ChildNodes[0] as TypenameNode).removeLastToken(); if (funcName == null) throw new CompileError("How did this match if it's empty?"); var invokeOnType = (ChildNodes[0] as TypenameNode).findType(context); if (invokeOnType == null) { invokeOn = context.topScope.resolveVariableName(true, funcName.identifier); if (invokeOn != null) //Functor { function = invokeOn.Type.resolveFunctionOverload(false, "__invoke", argumentTypes); if (function == null) throw new CompileError("Cannot invoke this type."); pushImplicitThis = true; } else { function = context.topScope.resolveFunctionOverload(true, funcName.identifier, argumentTypes); if (function != null) pushImplicitThis = true; //bare function call if (function == null) { function = context.topScope.resolveStaticFunctionOverload(true, funcName.identifier, argumentTypes); if (function == null) throw new CompileError("Could not find function " + funcName.identifier); } } } else { function = invokeOnType.resolveStaticFunctionOverload(false, funcName.identifier, argumentTypes); if (function == null) throw new CompileError("Could not find function " + funcName.identifier); } } else if (ChildNodes[0] is CompilableNode) { //Expression - must be a functor of some kind. var invokeOnType = (ChildNodes[0] as CompilableNode).getResultType(context); function = invokeOnType.resolveFunctionOverload(true, "__invoke", argumentTypes); emitThisFirst = ChildNodes[0] as CompilableNode; } } internal override Scope getResultType(ScopeStack context) { lookupFunction(context); if (function == null) throw new CompileError("Function not found (GRT)"); return function.ReturnType; } internal override void gatherInformation(ScopeStack context) { lookupFunction(context); if (function == null) throw new CompileError("Function not found (GI)"); if (emitThisFirst != null) emitThisFirst.gatherInformation(context); for (int i = 1; i < ChildNodes.Count; ++i) (ChildNodes as CompilableNode).gatherInformation(context); } internal override void emitByteCode(List bytecode, ScopeStack context, bool placeResultOnStack) { for (int i = 1; i < ChildNodes.Count; ++i) (ChildNodes as CompilableNode).emitByteCode(bytecode, context, true); //Can't just emit the leading expression since member access nodes are short circuited. if (emitThisFirst != null) emitThisFirst.emitByteCode(bytecode, context, true); if (invokeOn != null) { if (invokeOn is MemberVariable && pushImplicitThis) bytecode.AddBytecode(Instruction.pushVariableToStack, (byte)(125)); invokeOn.emitFetchByteCode(bytecode, context); function.EmitCallBytecode(bytecode, context); } else { if (function.type == FunctionType.Method && pushImplicitThis) bytecode.AddBytecode(Instruction.pushVariableToStack, (byte)(125)); function.EmitCallBytecode(bytecode, context); } bytecode.AddBytecode(Instruction.popN, (byte)function.ParameterCount); if (placeResultOnStack) bytecode.AddBytecode(Instruction.pushRegisterToStack, (byte)Register.returnValue); } } } It's big, it's ugly, and it handles every possible way of invoking a function. Oh, I also added While loops and a floating-point type. It's called 'float' in script, it's implemented as a C# double. Up until this point, I've had a 'test script' that I run to make sure everything works. But it's gotten too long, and I need to cut out old stuff to debug the new stuff. Suddenly I have no way to know if I've broken old functionality. Also, bits of the library are leaking out to support the testing. Bits that really shouldn't be public. So I move regression testing into the library itself, with a suite of tests. It is woefully incomplete, but as I discover bugs, I'll add tests to cover them. executeTest("Empty Test", "pass();", Engine, VM); executeTest("Global System Call", "pass();", Engine, VM); executeTest("System Method Call", "var S = \"123\"; if (S.length() == 3) pass();", Engine, VM); executeTest("Free Function Call", "void function() { pass(); } void TEST() { function(); }", Engine, VM, false); executeTest("Method Call", "type A { void method() { pass(); }}; var a = new A(); a.method();", Engine, VM); executeTest("Lambda Call", "var lambda = [void v] { pass(); }; lambda();", Engine, VM); executeTest("Lambda Method", "type A { void method() { pass(); }}; var a = new A(); var lambda = a.method; lambda();", Engine, VM); executeTest("Integer Ops", "int x = 4 * 3; int y = 16 / 8; if (x + y == 14) pass();", Engine, VM); executeTest("Float Ops", "float x = 4.0 * 3.0; float y = 16.0 / 8.0; if (x + y == 14.0) pass();", Engine, VM); executeTest("Order of Ops", "int x = 4 * 3 + 10 / 5 - (4 + 2) / 3; if (x == 12) pass();", Engine, VM); executeTest("Chain Invoke", "RegressionTest:getRegTest().test();", Engine, VM); executeTest("Expression Invoke A", "Func A() { return RegressionTest:getRegTest().test; } void TEST() { A()(); }", Engine, VM, false); executeTest("Expression Invoke B", "Func A() { return [void v] { RegressionTest:getRegTest().test(); }; } void TEST() { A()(); }", Engine, VM, false); executeTest("Negative Numbers", "int x = -5; if (x == -5) pass();", Engine, VM); executeTest("Subtraction & Negate", "if (5 - 7 == 2 + -4) pass();", Engine, VM); Most of these tests trivially test trivial bits of the language. Several of them declare variables or invoke other parts of the language they aren't directly testing. At least one of them is designed to fail - Expression Invoke A results in the compile error, 'Cannot take pointer to system function'. This represents a flaw I'll have to address at some point. You can see in the last two tests how parsing negative numbers broke everything. It's time now to seriously consider the interface the scripting language will present to the world. So far I've just sort of written code. A lot of the types are public for no good reason. Some, like CodePointer, the outside world has no business knowing about ever. I hide them, and I resolve the inconsistent accessibility errors (An error that took me a long time to understand when I first started using C#. Now I wonder, why doesn't C++ have this same error?), and then I go through every class that's still public and make sure what's public actually should be, and everything else is internal or private. I'm left with a few core types. The engine, the virtual machine, Scope, ScriptObject, Function. The engine was pretty well buttoned up. Now Scope has public functions for looking up functions, and you can walk the owner scope tree but only up. Function exists only so you can call it. The important data is exposed, but immutable. I decided that the most useful function the virtual machine could have is a 'run string' function. This will compile a module, execute it, and then discard it. The engine doesn't actually support discarding modules, but that's fine, I'm the library author. I can just hack it. var wrappedCode = "void RUNSTRING(){" + code + "}"; var currentTypeCount = context.globalScope.SubordinateScopes.Count; var module = context.CompileModule("RUNSTRING", new List { wrappedCode }); var func = module.findStaticFunctionBySimpleName("RUNSTRING"); callFreeFunction(func, context); context.globalScope.SubordinateScopes.RemoveRange(currentTypeCount, context.globalScope.SubordinateScopes.Count - currentTypeCount); So I'm finally using the scripting language in my game, and I want to bind a function that takes a Func as an argument. Except, I can't. Why not? Because addSystemFunction doesn't actually parse the argument types, it just looks for a type with that name. Fine, Func is named Func; except that it doesn't exist yet, because it hasn't be instantiated. This is a terrible flaw. AddSystemFunction should parse that typename and instantiate the generic. In the mean time, I provide a terrible work around, invoked like so var FuncVoidType = scriptEngine.InstanceGenericType(scriptEngine.getGlobalScope().findGenericType("Func", false, 1), new List { scriptEngine.getGlobalScope().findLocalType("void", false) }); I use it, it works. When I press W, scripts get run, the camera pans. The game does absolutely nothing it didn't do before, but it does it differently, and that makes all the difference. Next time, I think I'll talk about type deduction. Not the sort the compiler uses now, which I'll term 'backward type deduction'. That is fairly trivial. Given A(B);, if I know the type of B, I can find A. Everything in the compiler currently uses backward type deduction. Instead, I'll talk about something more complicated, forward type deduction. That is, given something like A( { B.foo(); });, if we know the type of A, we can deduce the type of B. I call this 'forward' because we move forward from the deduced type, rather than backward from it. Implementing a method of forward type deduction will allow me to dispense with type arguments to Lambdas, including their return type; it's also necessary to properly support the 'null' token. See if you can figure out why.
  6. DeyjaScript 10 : Lambdas

    That works for primitives, but how does it work when an object hierarchy is involved? I suppose I'll end up having the make sure whatever the lambda is being passed to is fully resolved first. Thankfully type resolution isn't order dependent because of my 4-pass compilation...
  7. DeyjaScript 10 : Lambdas

    I started thinking, okay, I've written this thing. Now how can I actually use it? Wasn't I supposed to be making a mud engine? Yeah, well, I already don't want to anymore. Instead, I'm going to use DeyjaScript in this thing, https://www.gamedev.net/blog/342/entry-2250747-gnome-colony-build-demo/ . I'll use it to implement an in-game console where the player can assign macros - bits of DeyjaScript code - to keys. This is also how I'll implement key binding and re-mapping. This decision led me to the next required feature in DeyjaScript - Lambdas. The goal is to be able to type something like 'bindkey [key] [lambda]' into the console. The first thing I need to support lambdas is function pointers. A function pointer will be able to point to a method or a lambda function. To make this possible, lambdas will masquerade as methods. Like arrays, I'll first create a type-less raw system type, and then wrap it in a generic type. Since it will be impossible for scripts to do anything with this type (though they could create one, since the language lacks any sort of access control) it doesn't have any methods. It's just a data carrier. I add two instructions to create them. pushMemberFunctionPointer and pushLambdaPointer. The first looks up the member, wraps it's code pointer in the system type, and pushes it. The second takes the current code pointer, changes it's start index, wraps it and pushes it. Since lambdas will be smashed onto the end of the function they are declared in, and can't be referenced anywhere else, this works great. A final instruction, invokeFunctionPointer, takes the wrapped code pointer and jumps to it. Next, the wrapper generic. Func { funcPtr raw; object _this; void construct(object _this, funcPtr ptr) { raw = ptr; this._this = _this; } R __invoke(T0 t0) { //What goes here?? } } There's another reserved method name. I'll use '__invoke' later to support operator(). Two places in the grammar create Func objects. Accessing members, and lambda declarations. Both of these emit code to create the funcPtr type, then to create a Func with the correct types, and call it's constructor, so that it appears to the script that the type of a method or lambda is 'Func'. //Inside LambdaExpressionNode.emitBytecode... bytecode.AddBytecode(Instruction.pushVariableToStack, (byte)125); //push this callIndex = bytecode.Count + 1; bytecode.AddBytecode(Instruction.pushLambdaPointer, BitConverter.GetBytes((Int16)0)); bytecode.AddBytecode(Instruction.pushNewObject); NewStatementNode.emitIndexStack(bytecode, funcType, 0, context); var constructor = funcType.findFunctionBySimpleName("construct"); constructor.EmitCallBytecode(bytecode); //replace the raw function pointer with the Func object. bytecode.AddBytecode(Instruction.loadRegisterFromStack, (byte)Register.returnValue); bytecode.AddBytecode(Instruction.popN, 3); //pop Func, funcPtr, this bytecode.AddBytecode(Instruction.pushRegisterToStack, (byte)Register.returnValue); The lambda has an address of 0 because the enclosing function needs to fill that value in later. The actual syntax of a lambda looks like this - var func = [int a, int b, int c, int r] { return a + b + c; }; I'm not thrilled with having to specify the return value as the last type argument, or with having to name it. The name is never used. Maybe I'll find a way to deal with that later. I want to support calling Funcs with variable() syntax, so I have to throw a bunch of code all over. A new grammar construction 'expression + ( + parameter list + )' catches cases where the func is the result of an operation, but when it's just a variable, the parser mistakes it for an ordinary function call. The function call node gets to deal with it too. I could replace the function call node with the new version entirely, but only at the cost of highly ineffecient function calls (Create a function pointer, invoke it, throw it away - compared to just that middle step.) To compile lambdas, the enclosing function keeps track of all the lambdas declared in itself, and after it compiles it's own code it compiles all of them and tacks them onto the end of it's bytecode. I skipped over an important detail earlier, the actual implementation of Func.__invoke. How, exactly, do I invoke the raw funcPtr object? I have an instruction to do it, but I need to prepare the stack and actually call it. The answer turns out to be inline bytecode. To support that, I need to build a table of all the instructions and the number and sizes of their arguments. Then I add it to the grammar and compile it. It does no checking of any sort, it just spits opcodes into the bytecode stream, so it's very dangerous. But it allows me to write this R __invoke(T0 t0) { inline { pushVariableToStack 126; //push return pointer loadRegisterFromStack 4; //Remember return pointer pop; pushVariableToStack 125; //push this pushMemberToStack 0; //push raw loadRegisterFromStack 5; //Remember func pointer pop; pushVariableToStack 125; //push this pushMemberToStack 1; //push _this loadRegisterFromStack 6; //Remember _this; pop; loadRegisterFromStack 0; //recover frame pointer popN 3; //remove frame pointer, return pointer, and this pointer pushRegisterToStack 6; //Push _this; pushRegisterToStack 4; //Push return pointer pushRegisterToStack 5; //Push func pointer invokeFunctionPointer; } } I re-write the top of the stack to remove the current frame, and then I invoke the function. When that function returns, it will go directly to the code that called __invoke, as if __invoke never existed at all. Since the parameters to __invoke are on the stack, I never have to touch them. I use a few magic constants (The this pointer is always at stack position 125, which is the same as -3 - it's encoded as 128 + offset, to support negative offsets), and I know where __invoke's frame pointer and return value are at on the stack. By time I invoke, the stack looks as if the code that called __invoke called the function pointed to directly, complete with it's this pointer. So, lambdas are a go. Maybe next I'll start integrating it with that other thing. Or maybe I'll add loops. I'm not out of 'advanced' features I can implement before the basics at all, maybe this language will never have loops. Aside : As it turns out, deducing the return type of a lambda from the contents of that lambda is impossible. Even in the trivial case of a lambda that returns an int, the system can't handle it. Does it return int? Does it return object? Either is valid. The only way to deduce the return type will be from what the lambda is assigned to, or from the type of the argument it is in a function call. Presumably, all types could be deduced this way, but it would mean type resolution becomes compilation order dependent.
  8. New graphics

    Osan always has the best art.
  9. Gnome Colony Build Demo

    So, I think this is finally ready to show people. Here's a screenshot I took. Here's something someone else made. And here is the actual thing...!
  10. This is a really simple thing to deal with. While an optimization pass could get rid of var2 entirely, I'm going to assume, since you haven't done the first pass yet, that you aren't ready for optimization yet. Yes, you need to push the entire frame. You could keep track of how many variables have been declared and only push them as you need them, but it makes returning from a function more complicated. Instead, just count the number of variable declarations in the function and push enough space for all of them. As a simple optimization, keep track of the maximum ever used in any scope, and recycle IDs. That is, in [code] int x; if (...) { int y; } if (...) { int z; } [/code] Push two, and y and z are stored in the same stack position.
  11. Complex grammar is a pain

    Consider your grammar a series of pattern-matching rules instead, and things get much simpler. http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form may help. Also, you might consider changing the syntax of your casts. They look exactly like a member function call.
  12. DeyjaScript 9 : Arrays and Generics

    For arrays in this language, I had two requirements. First, they had to be dynamically sized, and second, they had to be type-safe. But, when I want about trying to actually implement them, I discovered that my current implementation of the type system could handle only one or the other. I could create arrays, but like in C++, the array itself could have no members. Dynamic sizing requires member functions on the array for adding and removing elements, and there is no way to add these members to the array without implementing some sort of generic type. The second option was to create a dynamically sized array that wasn't type-safe. This didn't require any special syntax since it looks like an ordinary type (called 'array'), but it did require every type in the language to have a common base. For types defined in scripts, that was simple. Their base defaults to 'object'. For primitive types, it was a bit more complicated. Instead of being a C# primitive boxed into an object, they are a C# primitive boxed into an object boxed into a ScriptObject. It is nasty, but in the long run, worth it. For either of these array implementation options, the other behavior can be built on top using generics. I decided to provide a dynamically-sized array of anything, as it should be useful for many data structures I might want to represent in the scripts. The array isn't technically part of the language. It's one of the system types that are always available, but the script engine uses the public interface to bind them. Any client code could bind a type that behaves exactly like it. var arrayType = addSystemType("array", objectType); addSystemMemberFunction(arrayType, "construct", "void", (parms) => { parms[0].members[0] = new List(); return null; }); addSystemMemberFunction(arrayType, "add", "void", (parms) => { (parms[1].members[0] as List).Add(parms[0]); return null; }, "object"); addSystemMemberFunction(arrayType, "__indexGet", "object", (parms) => { return (parms[1].members[0] as List)[(parms[0].members[0] as int?).Value]; }, "int"); addSystemMemberFunction(arrayType, "__indexSet", "object", (parms) => { var index = (parms[1].members[0] as int?).Value; var list = (parms[2].members[0] as List); var obj = parms[0]; list[index] = obj; return obj; }, "object", "int"); There will also have to be some methods for removal, for retrieving the count, etc, but I haven't written them. Those named '__indexGet' and '__indexSet' are special methods called by operator[]. Any type can support operator[] by implementing these functions. Function names that begin with double underscores are from this point on reserved for implementation details like this. Also notice that when a function returns void, it doesn't need to create a special ScriptObject to represent void. I mentioned generics, and I really wanted type-safe arrays, so it was time to implement them. First, I added generic parameters to the grammar for typenames. So Foo is a valid typename, and so is Foo. Typenames became a tree of nested typenames, and, frankly, became way more complex than I expected. Parsing is not so difficult. I add generic parameters to object declarations too. Now I need to resolve these new typenames to actual types. TypenameNode does the heavy lifting. internal static Scope findType(bool searchUp, ScopeStack context, Scope currentScope, TypenameToken of) { Scope nextScope = null; if (of.hasGenericParameters) { var templateName = of.singleLevelName; var instancedType = currentScope.findLocalType(templateName, searchUp); if (instancedType != null) nextScope = instancedType; else { var declarationScope = currentScope.findLocalType(of.identifier, searchUp); var genericParameters = new List(); foreach (var parameter in of.genericParameters) genericParameters.Add(findType(true, context, currentScope, parameter)); var Engine = context.globalScope.tag as ScriptEngine; var Object = declarationScope.tag as ObjectDeclarationNode; nextScope = Engine.EnqueueGenericTypeForCompilation(Object, genericParameters); } } else nextScope = currentScope.findLocalType(of.identifier, searchUp); if (of.next == null || nextScope == null) return nextScope; return findType(false, context, nextScope, of.next); } Ow! TypenameToken is a linked list of tokens, where a token is an identifier plus an optional list of generic arguments. This function does one thing of particular note; it looks for an instance of the generic type that takes the same parameter types. If it finds one, great, it just returns that one (This little detail will allow me to implement typed arrays of primitive types far more efficiently, assuming I do something to prevent a script from deriving from a primitive type). If not, it tells the script engine to compile it. The script engine will delay compilation until the current module is finished, but it will immediately create a scope for the generic type, register all it's members, and resolve all it's typenames. It has to do this right away so that whatever called TypenameNode.findType can finish it's compilation. internal Scope EnqueueGenericTypeForCompilation(ObjectDeclarationNode decl, List genericParameters) { var into = new Scope(); var newDecl = decl.Clone() as ObjectDeclarationNode; genericTypesAwaitingCompilation.Enqueue(new pendingGeneric { decl = newDecl, genericParameters = genericParameters, into = into }); if (genericParameters.Count != decl.genericParamters.Count) throw new CompileError("Wrong number of generic arguments."); var newTypeName = newDecl.localScope.thisType + ""; into.thisType = newTypeName; into.tag = newDecl; newDecl.registerMembers(decl.localScope.ownerScope, into); for (int i = 0; i < genericParameters.Count; ++i) into.genericParameters.Type = genericParameters; newDecl.resolveTypes(new ScopeStack(globalScope, into), into); into.finalizeScope(); into.parent = null; return into; } It is necessary to clone the entire abstract syntax tree for the generic type. This is because each node keeps some state that's set in the 'gather information' compile pass, and used in the 'emit bytecode' pass. Since we are just doing the first pass on the generic type, and saving the second pass for later, we need a copy of that information for each instantiation. If you tried to use TypedArray and TypedArray in the same module, whichever one was mentioned second would trample the state of the first. Wait, why can't we just compile it completely right away? Because a generic argument might be a type in the module that's using the generic type, and it's first pass isn't finished yet. Actually cloning the syntax tree is straightforward. I don't need to worry about the extra information that Irony puts into AstNodes, because it won't be used. public object Clone() { var result = Activator.CreateInstance(this.GetType()); if (result is CompilableNode) (result as CompilableNode).AsString = AsString; foreach (var child in this.ChildNodes) { var newInstance = (child as ICloneable).Clone(); if (child is CompilableNode) (child as CompilableNode)._cloneImpl(newInstance); (result as AstNode).ChildNodes.Add(newInstance as AstNode); } return result; } A few types use _cloneImpl to do some extra state transfer, and a few override clone entirely. TypenameNode gets away with public object Clone() { return this; }! After cloning the AST, I figure out the name of this instance so that later attempts to use this same generic with these same parameters won't repeat this effort. Finally, I get down to compiling. RegisterMembers, in addition to adding all the members and methods, populates a list of generic parameter names. I use this list to fill in the types with the generic parameters specified. Later, when asked for a type, Scope will look to see if the name asked for exists in this list. If it does, it won't search it's member types, it will just return whatever type is in the list. Now I resolve the generic's types. There's a terrible bug here I'll leave as an exercise to the reader, but here's a hint : Where is the module in that scope stack? I return the scope for TypenameNode to use as the generic type instance. Finally, I can actually write TypedArray. TypedArray { array data; void construct() { data = new array(); } T __indexGet(int i) { return (data as T); } T __indexSet(T t, int i) { return ((data = t) as T); } void add(T t) { data.add(t); } //etc } It exists in the 'Core' module, so you can use it like 'Core:TypedArray'. Some closing thoughts *This is probably the only scripting language to support generics before it supports for loops. But I'm glad for that, because I would have had to modify them anyway. *When resolving the types of a function declaration, I had to reach into the function implementation and look for any generic types so they can be instanced before the function is compiled. *You could technically use the type 'TypedArray' without any generic arguments. But none of it's methods will work, as it will never, and in fact can't, be compiled. *These 'generics' are more like C++ templates than C# generics. However, the compilation model won't support a test-substitution without major changes so SFINAE will probably never be an option. *The error messages produced by the compiler are entirely useless.
  13. DeyjaScript 8 : Modules

    I try to keep the ultimate goal of this project, to be used in a mud engine, in mind when I decide what features to implement next. So I decide that modules are next. Actually compiling a self-contained module is straightforward. I changed the 'ExecutionContext' into a 'ScriptEngine'. I gave it a global scope. Each module will get it's own scope, that is contained within the global scope. They can co-exist just fine, but, they can't actually refer to each other in anyway. I need some way for types to be visible to other modules. The syntax for referring to a type in another module looks similar to C++'s namespace resolution operator. ModuleName:TypeName. I decided not to overload the . operator the way C# does because it would complicate the syntax (There are cases where it's impossible for the parser to distinguish between member access and namespace access). I added the 'typename' to the grammar, and everywhere the grammar currently used an identifier as a typename, I replace it. And then I need to actually use the additional information. class TypenameNode : AstNode { List Tokens = new List(); public override void Init(Irony.Parsing.ParsingContext context, Irony.Parsing.ParseTreeNode treeNode) { base.Init(context, treeNode); foreach (var child in treeNode.ChildNodes) Tokens.Add(child.FindTokenAndGetText()); AsString = String.Join(":", Tokens.ToArray()); } internal Scope findType(ScopeStack context) { var type = context.topScope.searchForType(Tokens.ToArray()); if (type == null) throw new CompileError("Could not find type " + Tokens[0]); return type; } } //Inside Scope public Scope searchForType(String[] tokens) { var type = findType(tokens[0], true); if (type == null) return null; if (tokens.Length > 1) { for (int i = 1; i < tokens.Length; ++i) { type = type.findType(tokens, false); if (type == null) return null; } } return type; } The node is not compilable. In all the cases where identifier nodes were used as typenames before, there was never any attempt to compile them. The typename node will remember the typename as specified in the source. FindType first searches up through the scopes for the first token, and then from that type down until all tokens are matched. Right now there's no way to declare a type more than one level deep, so I don't need to bother looking any further, but the functionality is there. And then I have to change everything to use this new method. Thankfully, only a few nodes use typenames. Declarations, new statements, and the as operation. Several other nodes compare typenames, and there I run into a problem. They need to know the full typename, with the module specified. I give Scopes a 'fullTypeName' and keep track of it. Since only module and object level scopes survive the compile phase, those are the only ones I worry about. To get the built-in types to work again (Name resolution can't find them) I add some 'fake' modules. globalScope.addType(new Scope { thisType = "int", moduleIndex = 0 }); globalScope.addType(new Scope { thisType = "string", moduleIndex = 1 }); globalScope.addType(new Scope { thisType = "void", moduleIndex = 2 }); globalScope.addType(new Scope { thisType = "bool", moduleIndex = 3 }); I should be able to add some member functions to built in types using this same technique. There's an unrelated problem with the language as is. I can't call the base implementation of virtual functions. For virtual functions to work, the derived type's implementation must replace the base implementation in the function table. So I stick the base implementation on the end of the function table, and the derived implementation entry keeps a reference to it. I create a special syntax for calling the base implementation. It looks like a function called 'base' is being called, but it's actually a special node. Listing the base call rule before function calls keeps them from conflicting, though it does mean it's impossible to call a function named 'base'. A bit of special-casing allows constructors to call any base constructor. public override void gatherInformation(ScopeStack context) { var parameterTypes = new String[ChildNodes.Count]; for (int i = 0; i < ChildNodes.Count; ++i) parameterTypes = (ChildNodes as CompilableNode).getResultType(context); var containingFunction = context.topScope.searchForFunction(); var decoratedName = ScriptEngine.decorateFunctionName(containingFunction.AsString, parameterTypes); if (containingFunction.AsString == "construct") { if (containingFunction.decoratedName == decoratedName) function = containingFunction.thisFunc.BaseImplementation; else { function = context.topScope.findFunctionByDecoratedName(true, true, decoratedName); if (function != null && function.BaseImplementation != null) function = function.BaseImplementation; } } else function = containingFunction.thisFunc.BaseImplementation; if (function == null) throw new CompileError("Function does not exist in base"); if (ChildNodes.Count != function.ParameterCount - 1) throw new CompileError("Wrong number of arguments to base"); if (decoratedName != function.DecoratedName) throw new CompileError("Argument type mismatch"); foreach (var child in ChildNodes) (child as CompilableNode).gatherInformation(context); } It's about time I implemented arrays. I think those will be next. I'm not exactly sure how I'll implement them. I want them to be dynamically sized, so they will need to have member functions. Which means they'll need to be a type. But I want to be able to create arrays of anything. I'll either have to drop a load of special case code into type resolution or implement some sort of generic type.
  14. bcScript - A simple and small scripting language

    I had that book, but I could have sworn it used a recursive-descent parser. The lexers produced by tools like lexx and irony are state machines; but I wouldn't want to write one of them by hand.
  15. Reference counted garbage collection

    Have you implemented virtual functions yet?