Useful scripting language features?

25 comments, last by Republicanist 15 years, 8 months ago
I've never really understood 'closures' but I guess I now understand.

To cure boredom, I'm currently designing a Java-ish language.

Similar to ToohrVyk's example, one problem logistically I've tried to solve is how something like the following would work internally (due to the use of first class functions):
# lambda/closure like example
# create a function called foo (lack of 'var' makes it const)
foo = func (ref obj)
{
	var ctr = 0;
	obj.counter_callback = func ()
	{
		++ctr;
	}
}

The problem I'm having in my head (before even trying to work out the code implications) is how [ctr] would be referenced when foo.ctr only exists during the foo() call.

I guess local variables shouldn't be allowed to be used in sub-functions/lambdas.

Alternatively, I could allow C-like static declarations or allow it to be used on the ref (pointer as in your argument).

# lambda/closure like example
# create a function called foo (lack of 'var' makes it const)
foo = func (ref obj, ref ctr)
{
	obj.counter_callback = func ()
	{
		++ctr;
	}
}
Quote:Original post by thre3dee
The problem I'm having in my head (before even trying to work out the code implications) is how [ctr] would be referenced when foo.ctr only exists during the foo() call.

The solution is that while the name foo.ctr should not exist after the foo() call, var ctr = 0; creates a new value in memory, and the anonymous function assigned as a callback references this value, keeping it in memory once foo has returned. This is a trivial thing to accomplish in a garbage-collected language (simply have the closure reference the variable).
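
As a rough illustration, here is the same idea in Python, which is garbage-collected and stands in for the toy language above. The `Obj` class is a hypothetical placeholder for whatever `obj` would be; it is not part of the original example.

```python
class Obj:
    """Hypothetical stand-in for the object that receives the callback."""
    counter_callback = None


def foo(obj):
    ctr = 0  # local to this particular call of foo

    def callback():
        nonlocal ctr  # refer to foo's local instead of creating a new one
        ctr += 1
        return ctr

    # The closure holds a reference to ctr's storage, so the garbage
    # collector keeps that storage alive after foo has returned.
    obj.counter_callback = callback


obj = Obj()
foo(obj)
first = obj.counter_callback()   # foo has already returned here
second = obj.counter_callback()
```

Each call to `foo` creates a fresh `ctr`, so two objects passed through `foo` would count independently.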

Quote:I guess local variables shouldn't be allowed to be used in sub-functions/lambdas.


The entire point of anonymous functions (and nested functions) is to allow referencing a local variable! If you never reference local variables, then your anonymous function is nothing more than a syntax trick to avoid giving it a name in the global scope, which makes the feature quite useless.

Quote:Alternatively, I could (...) allow it to be used on the ref (pointer as in your argument).

Nope, that would cause the same problems as any other local variable:

foo = func (ref obj, ref ctr)
{
  obj.counter_callback = func ()
  {
    ++ctr;
  }
}
bar = func (ref obj)
{
  var ctr = 0;
  foo(obj, ctr); // Boom, same issue.
}


The way I understand it is... the hierarchy relationships displayed in your code are translated into a similarly structured series of classes and methods, as necessary.

In fixed systems, such as the .NET framework, the lambda expression itself would be lifted into its own method, so that it has a place in the CLI's VTable for the class that defines the method using the lambda expression. This also makes more sense than creating a completely new portion of application code every time you call the method.

Things get trickier when you deal in closures. If a condition occurs where a local from the lambda's declaring method is referenced inside the lambda (the ctr example you're giving), the compiler needs to perform extra work to handle the reference, so that it's still valid, but doesn't require an extra parameter to be added to the lambda's signature (which would break using the lambda on cases where a method call requires a pointer to a method of a given signature).

The extra work involves creating a completely new class (within the scope of the method that used the lambda, so private members of the class can still be referenced inside the lambda). The class would contain the variable the lambda expression references. The original method that defined the local would now refer to the value defined within the newly generated class.

Here's an example (written in C#, since it uses lambdas):
public void TestMethod()
{
    var i = 0;
    FuncV inc = () => { i++; };
    //I believe the equivalent everyone else's been using is:
    //var inc = func(){i++;}
    for (int k = 0; k < 10; k++)
    {
        inc();
        Console.WriteLine(i);
    }
}


Here's the example of how the compiler/interpreter would restructure the code:
private class LambdaGenerated_TestMethod
{
    internal int tm_i;
    internal void LambdaGeneratedMethod0001()
    {
        this.tm_i++;
    }
}
public void TestMethod()
{
    var incSrc = new LambdaGenerated_TestMethod();
    incSrc.tm_i = 0; //var i = 0;
    for (int k = 0; k < 10; k++)
    {
        incSrc.LambdaGeneratedMethod0001();
        Console.WriteLine(incSrc.tm_i);
    }
}


Depending on your language's implementation you can ignore the public/private/internal scoping.

There's also one important thing to note about lambda expressions. In the above example, the variable 'i' was replaced with a reference to the generated class that holds the lambda and its storage for 'i'. If you were to make an array of func() instances all from the same lambda, how many copies of 'i' exist depends on the scope 'i' was declared in. If 'i' came from the scope of the method, there would be one instance of LambdaGenerated_TestMethod and every func() would refer to the same 'i'. If 'i' was instead declared inside a loop, there would be multiple instances, one per iteration, because the value came from a sub-scope; each instance of LambdaGenerated_TestMethod would go out of scope along with the values it references.
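
The shared versus per-scope behaviour can be sketched in Python. This is only an analogy: Python has no block scope inside loops, so a helper function (`make_one`, a made-up name) plays the role of the sub-scope.

```python
def make_counters_shared(n):
    """Every closure refers to the single 'i' in this function's scope."""
    i = 0

    def inc():
        nonlocal i
        i += 1
        return i

    return [inc for _ in range(n)]


def make_counters_separate(n):
    """Each closure gets its own 'i', because 'i' lives in a sub-scope."""

    def make_one():
        i = 0  # a fresh variable for every call of make_one

        def inc():
            nonlocal i
            i += 1
            return i

        return inc

    return [make_one() for _ in range(n)]


shared_results = [f() for f in make_counters_shared(3)]
separate_results = [f() for f in make_counters_separate(3)]
```

With the shared version each call bumps the same counter, while the separate version starts every closure from its own zero.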

Edit:
I forgot to mention that the example I gave is slightly different from how C# actually generates its solution for lambdas/closures. The only real difference is that C# invokes the method through the original delegate type (FuncV) instead of calling it directly. I wrote the example this way to emphasize that, in the examples here, there's no real type implied; that's why it calls the class member function directly instead of piping it through a function pointer (delegate).

[Edited by - AlexanderMorou on July 26, 2008 11:11:01 PM]
Well, in my language the class implementation would probably work out like this:

Lambda-ish code
foo = func (ref obj)
{
	var ctr = 0;
	obj.increment = func ()
	{
		++ctr;
	}
}


In-language implementation
lam0 = class
{
	var ctr := 0;
	f := func () {
		++ctr;
	}
}
foo = func (ref obj)
{
	# while lam0.f() is referenced the lam0 instance isn't destroyed
	obj.increment = (new lam0).f;
}
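
For comparison, a hand-written Python version of the same lowering might look like the sketch below. `Lam0` and `Obj` are illustrative names only, and a real compiler would generate the class rather than have you write it.

```python
class Lam0:
    """Hand-written stand-in for the class a compiler might generate."""

    def __init__(self):
        self.ctr = 0  # the captured local now lives on the instance

    def f(self):
        self.ctr += 1
        return self.ctr


class Obj:
    """Hypothetical placeholder for the object passed to foo."""
    increment = None


def foo(obj):
    # The bound method Lam0().f holds a reference to its instance, so the
    # instance (and its ctr) stays alive for as long as obj.increment does.
    obj.increment = Lam0().f


obj = Obj()
foo(obj)
obj.increment()
count = obj.increment()
```

The bound method carrying its instance around is what plays the part of "while lam0.f() is referenced the lam0 instance isn't destroyed".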
Quote:For inspiration watch this lecture by the author of Urbi, his language has a very elegant syntax for writing concurrent scripts.


I have watched this a couple of times now and it really is super inspiring. The guy also has a strong sense of how concurrency concepts and FSMs cross over with games. Please share if you know of any other videos like this on DSLs.
Some features I think should be in every language:

Strictness. I know, some people don't like it, but you should at least give the user this option if he wants it, similar to VB's Option Explicit.

Detect errors as early as possible. This helps a lot with debugging, and yes, some strict languages get in the way of writing code, but this surely helps when debugging it.

Have an editor in the game itself. This is especially important if loading or switching times are an issue (or if the game runs in full-screen mode). And make sure that you don't have to restart the game to load a script; that really slows the user down. (What I mean is that you shouldn't need to restart the entire game: the scenario or level can be reloaded, and you can optimize this by only reloading what changed.)

Some examples:
No strictness in language (this is why I don't use JavaScript):

foo = "Hello!";
MsgBox(fou); // oops

Python-type semantics (if a variable was not already assigned, don't try and receive a value from it)
foo = "Hello!";
MsgBox(foo);
fou = "Hello Two!"; // oops
MsgBox(foo);

Strict-semantics (without types... it doesn't need specific types, though it can be nice sometimes)
var foo = "Hello!";
MsgBox(fou); // compiler error
fou = "Hello!"; // compiler error
foo = 4; // allowed because foo doesn't have a specific type
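
A minimal sketch of how an interpreter could enforce these strict semantics, written in Python. `StrictEnv` and its method names are invented for illustration, and the check happens at run time here; a compiler would perform the same lookup while compiling.

```python
class StrictEnv:
    """Toy variable environment: names must be declared before use."""

    def __init__(self):
        self._vars = {}

    def declare(self, name, value=None):
        # Corresponds to `var foo = ...`
        if name in self._vars:
            raise NameError(f"'{name}' is already declared")
        self._vars[name] = value

    def assign(self, name, value):
        # Corresponds to `foo = ...`; fails for names never declared
        if name not in self._vars:
            raise NameError(f"assignment to undeclared variable '{name}'")
        self._vars[name] = value  # any type is allowed, no type check

    def lookup(self, name):
        if name not in self._vars:
            raise NameError(f"use of undeclared variable '{name}'")
        return self._vars[name]


env = StrictEnv()
env.declare("foo", "Hello!")
env.assign("foo", 4)  # allowed: foo has no fixed type

errors = []
for action in (lambda: env.lookup("fou"), lambda: env.assign("fou", "Hello!")):
    try:
        action()
    except NameError as exc:
        errors.append(str(exc))  # both misspellings are reported
```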

Do not allow pointers! Bad idea! (Especially if you're going to allow your users to share scripts with other players. Not only should the program be proven safe, but if you write the language so that it is always safe, then it shouldn't matter.)

Have you read the dragon book? It's a good book on writing a compiler, and a classic in that area of study. (It's called "Compilers: Principles, Techniques, and Tools" by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, and it has a dragon on the cover.)

I'd also recommend Jack Crenshaw's "Let's Build a Compiler" - you might have seen it already.

I'd recommend compiling to some kind of bytecode... not only is it faster, I think it's actually easier to write...
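
To give a feel for what "some kind of bytecode" can mean, here is a very small sketch in Python: a compiler that turns an arithmetic expression into stack-machine instructions, and a loop that runs them. The opcode names are made up, and Python's own `ast` module is borrowed as the parser purely to keep the example short; a real scripting language would have its own parser.

```python
import ast


def compile_expr(source):
    """Compile an arithmetic expression into a list of (opcode, arg) pairs."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in ops:
            emit(node.left)   # operands first ...
            emit(node.right)
            code.append((ops[type(node.op)], None))  # ... then the operator
        else:
            raise SyntaxError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code


def run(code):
    """Execute the instruction list on a simple operand stack."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            elif op == "MUL":
                stack.append(left * right)
    return stack.pop()


program = compile_expr("2 + 3 * 4")
result = run(program)
```

The interpreter loop stays a flat dispatch over opcodes, which is a large part of why this approach tends to be easy to extend.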

I like the garbage collection idea. And it might be kind of nice to allow references, kind of like this:
ref int a = 4;
ref int b = a;
b = 6; // a == 6 now
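
One common way to implement references like this in a garbage-collected runtime is a small mutable cell that several names share. The Python sketch below assumes that approach; `Ref` is a made-up name, not something from the thread.

```python
class Ref:
    """A mutable cell; several names can share one Ref to alias a value."""

    def __init__(self, value):
        self.value = value


a = Ref(4)    # ref int a = 4;
b = a         # ref int b = a;  (b aliases the same cell)
b.value = 6   # b = 6;
```

Because `a` and `b` are the same cell, the write through `b` is visible through `a`, and the cell cannot dangle while either name still refers to it.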

Or even nullables (like C#)
bool (true false)
bool? (true false null(unknown) )

Some structure type would be awesome...
Have you ever thought of OOP? Doesn't need to be too complex...

Oh yeah. Don't get caught in feature bloat.

And if you think of the most awesome language feature, write it down, and try implementing it. See how well it helps the language.
Quote:Original post by Republicanist
I like the garbage collection idea. And it might be kind of nice to allow references, kind of like this:
ref int a = 4;
ref int b = a;
b = 6; // a == 6 now

The way my 'toy' language was going to implement constants and references was with the two keywords var and ref, in a slightly different way than your example.

# const/variable example
sin = func(x) { }; # const
sin = 2; # error
var cos = func(x) { }; # variable
cos = "hello"; # ok
# references
a = 2;
var b = a.ref;
a = 1; # error, a is const
b = 1; # error, reference is a constant
b.var = a; # assigns a's value (2) to b; .var explicitly assigns the variable not the referenced variable if it is a reference
# x parameter is by-ref, and y is a variable
foo = func(ref x, var y) { };


Also, it was going to have explicit type semantics like ActionScript (ECMAScript?).

var:number,ref:number a = 2; # correct
b = 2;
a = b.ref; # correct, b is a number
var:ref c = a; # c is now referencing b through a's reference (if a was not referencing anything, this would fail)
# c can only be assigned a reference to any type of variable


This is one way of having a pointer system. You can explicitly declare a variable to be a reference to a particular type.

# game example
# import game module, containing game classes, functions and globals etc
import game;
# the 'e' parameter is both a by-ref const (e is read-only) and also only allowed to be a game.entity reference (game.entity would have to be a class type at run-time)
getModelName = func(ref:game.entity e)
{
	return game.resources.getModelById(e.getModelId()).name();
}
# in use
script = func()
{
	var e = new game.entity; # create instance of class object 'game.entity'
	e.setModel(game.resources.getModel("monster.md2").id());
	if (getModelName(e) == "monster.md2") {
		print ("Success!");
	}
}


[Edited by - thre3dee on August 25, 2008 7:54:04 PM]
I've just started writing a game with Lua, and right now I feel a good module system is quite important. It seems that the module() function is a bit of an afterthought. This page talks a bit about some of the problems with its current design, and maybe some other ways it can be redone. I'm also trying to work out a good way to load and unload modules for a seamless world, so a language that helps me with that would be nice :)
Quote:Original post by Republicanist
Python-type semantics (if a variable was not already assigned, don't try and receive a value from it)
foo = "Hello!";
MsgBox(foo);
fou = "Hello Two!"; // oops
MsgBox(foo);

I'm not sure what your point is here: there's no error here in Python, and a strongly-and-statically-typed language would not see an error either. Some things the programmer just has to take responsibility for!
Quote:Original post by Kylotan
Quote:Original post by Republicanist
Python-type semantics (if a variable was not already assigned, don't try and receive a value from it)
foo = "Hello!";
MsgBox(foo);
fou = "Hello Two!"; // oops
MsgBox(foo);

I'm not sure what your point is here: there's no error here in Python, and a strongly-and-statically-typed language would not see an error either. Some things the programmer just has to take responsibility for!


The point is that the programmer tried to assign something to foo, but accidentally created a new variable fou. In a language such as C# or C++, you would be forced to tell the compiler "Yes, I'm creating a new variable."

string foo = "Hello!"; // forced to say "new variable"
MsgBox(foo);
fou = "Hello Two"; // error caught - tried to assign to non-existent variable
foo = "Hello Three"; // allowed

Yeah, the code was not meant to be in error, except in its logic. Maybe it's because I like languages which allow you to do stuff, but explicitly.

From above: references are not pointers, so to speak. A reference must be valid (or point to NULL), so dereferencing a reference does not allow bad things to happen. A pointer, however, can point anywhere, including areas that you would not want it to point. And references are probably easier for the user to understand, since they can never be invalidated :)

This topic is closed to new replies.