Scripting Language: Informal Request For Comments

8 comments, last by flangazor 18 years, 10 months ago
I recently found the need to include a scripting language in one of my non-gaming related projects. The intent is to port certain sections of existing C++ code over to the scripting language so that users can change and extend existing behavior. I started looking around for languages based on C or C++ in the hopes of finding one that met a good portion of my requirements. While a couple of languages (AngelScript, C#, and CSL) came pretty close, there were requirements that could not be met.

Since the use of the scripting language is long term I have the choice of either writing a compiler from scratch or enhancing an existing package to meet the requirements. Luckily there are a considerable number of languages already based on the C grammar. This significantly increases the possibility of finding a language that requires minimal modification to meet the requirements. Interestingly enough the one language that came closest was ECMA-334 (C#).

The only problem I can foresee in extending an existing language or VM is that the performance of the VM may not be efficient enough for a game environment. This of course is one of the primary motivations in making this post: to get feedback and possibly find something that has been overlooked.

Language Extensions:
------------------------------------------------------------

The scripting language described is based primarily on C. Because the C grammar is well documented most of the language is not covered here. It is assumed that common language features such as primitives, syntax, keywords, and identifiers all follow the same rules and behavior as in C and/or its object oriented derivatives (Java, C#, C++, etc.). Most of the features are based on specific requirements for the runtime environment. While many of these features could easily be implemented with commonly used design patterns, such implementations may not perform well when executed in a virtual machine.

Since the requirements are still in draft I expect several items to be streamlined over the next few weeks in both the language and the runtime environment. The following is a summary of certain aspects of the language which reflect the requirements of the project and the desired level of functionality beyond existing C based grammars.

1. Object Oriented. While I see nothing wrong with plain procedural programming, using an object oriented language generally provides a much cleaner approach and lends additional support for strongly typed constructs. By using explicit declarations for abstract interfaces and class implementations the language becomes cleaner. Includes support for the interface, class, extends, implements, abstract, and final keywords from Java and similar languages. As with Java and C# all classes inherit (at the very least) from a base class named Object.

2. Lightweight runtime. Unlike Java, C#, etc. the requirements of the runtime environment are minimal and it only provides those items that are required by the language to function. Classes and interfaces for strings, reflection, threading, etc. are recommended and may be required in the final draft of the language and runtime spec.

3. Strongly typed. I know that a lot of people prefer dynamically typed languages but there are far too many pitfalls and performance bottlenecks associated with dynamic typing. Support for typecasting should be significantly reduced and in some cases eliminated altogether.

4. No pointers. In this type of environment pointers are not necessary. Instead they are replaced with built in iterators and object references.

5. Exception handling (try/catch/throw). Unlike most languages the exception handling accepts only a single typed enumeration as its argument. This eliminates the creation of objects when throwing an exception and forces the explicit use of constant values. This is a performance over flexibility decision.

6. Primitive types including boolean, char, byte, short, int, and float. As with Java primitives, corresponding classes can be supported but are not necessary. I'm still tossing around whether to include auto 'boxing' of primitives or not. Includes modifiers such as static, const, and volatile.

7. Typed Enumerations. One of the original but minor annoyances (for me) of Java was its lack of enumerations. I was amazed that a language designed to be strongly typed would overlook such a simple and easily supported construct. Luckily this is no longer the case for Java and most other languages provide support for enumerations in one form or another.

8. Expressions in switch/case statements. Traditionally case statements in C, C++, and Java require a constant value. The use of constant values in switch/case statements allows the compiler to detect duplicate case statements using the same value. This removes the possibility that a case statement will be skipped if one before it uses the same value. Once expressions are allowed that check is no longer possible, so in the case where two or more case statements are triggered by the same value, all will be called in the order in which they are declared.
	int a;
	int b;

	a = 0;
	b = 0;

	/*
	 * In the following example case statements for a+b
	 * and a-b are both called. A performance hit is also
	 * incurred due to the expression.
	 */
	switch(value) {
	case 1:		// No performance hit
		... do something ...
		break;

	case a + b:	// Performance hit for expression
		... do something ...
		break;

	case a - b:	// Performance hit for expression
		... do something ...
		break;
	}
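The "every matching case runs, in declaration order" rule described above can be approximated in an existing language. The following is only a rough Java sketch of those semantics, not how the proposed compiler would implement them; the names MultiMatchSwitch, Arm, dispatch, and demo are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

final class MultiMatchSwitch {
    /** One case arm: a label expression evaluated at run time, plus a name standing in for its body. */
    static final class Arm {
        final IntSupplier label;
        final String name;

        Arm(IntSupplier label, String name) {
            this.label = label;
            this.name = name;
        }
    }

    /** Returns the names of every arm whose label equals value, in declaration order. */
    static List<String> dispatch(int value, List<Arm> arms) {
        List<String> fired = new ArrayList<>();
        for (Arm arm : arms) {
            // Each expression label costs an evaluation, which is the "performance hit".
            if (arm.label.getAsInt() == value) {
                fired.add(arm.name);
            }
        }
        return fired;
    }

    /** Mirrors the example above: case 1, case a + b, case a - b. */
    static List<String> demo(int value, int a, int b) {
        List<Arm> arms = List.of(
            new Arm(() -> 1, "case 1"),
            new Arm(() -> a + b, "case a + b"),
            new Arm(() -> a - b, "case a - b"));
        return dispatch(value, arms);
    }
}
```

With a and b both 0 and a switch value of 0, both expression arms fire and the constant arm does not, which matches the behavior the comment in the example describes.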
9. Data structures. While this may not seem like a worthy addition it can play a very significant role. In a gaming environment it can be extremely beneficial to allow the script direct access to an internal data structure rather than going through the pain of creating and binding native accessors. While the use of bit fields might be useful in a scripting environment providing support would be candy rather than a requirement.
	struct align(4) MyStruct {
		byte	flags1;		// Offset 0
		byte	flags2;		// Offset 1
		long	cookie;		// Offset 4
	};
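For comparison, this is roughly what a script author has to do today in Java to reach into a native structure with that layout. It is a sketch under assumptions: the 'long' field is taken to be 4 bytes (as the offsets suggest), little-endian byte order is assumed, and MyStructView is a hypothetical wrapper name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MyStructView {
    static final int FLAGS1_OFFSET = 0;
    static final int FLAGS2_OFFSET = 1;
    static final int COOKIE_OFFSET = 4; // bytes 2-3 are padding under align(4)
    static final int SIZE = 8;          // assumes a 4-byte 'long'

    private final ByteBuffer buf;

    MyStructView(ByteBuffer buf) {
        // Byte order is an assumption; a real binding would match the host.
        this.buf = buf.order(ByteOrder.LITTLE_ENDIAN);
    }

    byte flags1() { return buf.get(FLAGS1_OFFSET); }
    byte flags2() { return buf.get(FLAGS2_OFFSET); }
    int cookie() { return buf.getInt(COOKIE_OFFSET); }
    void cookie(int value) { buf.putInt(COOKIE_OFFSET, value); }
}
```

The hand-written offsets are exactly the kind of accessor boilerplate that a built-in struct declaration would make unnecessary.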
10. Support for “finally” code blocks. This is borrowed from Java and allows the programmer to declare a block of code that is always called when returning from within a method call. This eliminates the need for supporting 'goto' which is commonly used to jump to code blocks responsible for cleaning up after a method has completed. Only one “finally” block may be included in any given method.
	int test(int a, int b, int c) {

		// The following code is ONLY called before returning from
		// the call to test() and not during the normal flow of execution
		finally {
			a = (b * c) + (c / a) * (c - a);
		}

		if(0 == a) {
			// the code in the finally block is called BEFORE
			// the return from calls to test(), causing the
			// return value (a) to be recalculated before the
			// return
			return a;
		}

		if(a == b) {
			b++;
		}

		if(c == a || c == b) {
			c--;
		}

		return a;
	}
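One point worth checking against Java, where the idea is borrowed from: there a finally block is attached to a try rather than to the method, and the return value is captured before the block runs. The sketch below (FinallyDemo and cleanups are illustrative names) shows that assigning to the local inside finally does not change what is returned, so the "recalculated before the return" behavior described in the example above would be a deliberate departure from Java.

```java
final class FinallyDemo {
    static int cleanups = 0; // counts how many times the finally block ran

    static int test(int a, int b, int c) {
        try {
            return a;      // the value of 'a' is captured here
        } finally {
            a = b * c;     // runs before the method exits, but the captured value is unchanged
            cleanups++;
        }
    }
}
```

Calling test(0, 2, 3) returns 0 in Java even though the finally block sets a to 6 on the way out.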
11. Delegates. This is similar to delegates in C# which addresses the lack of function pointers. Unlike C# however delegates come in two flavors. The first is the single declaration as in C# meant to represent a single function pointer. The second comes in a form similar to interfaces. This provides a form of dynamic polymorphism in that an object reference can be assigned to a delegate and only the required function calls are propagated. This could be extended to include the same functionality as UnrealScript's 'state' mechanism. This would also eliminate the restriction in UnrealScript in which only one state can be active at a time. An example where this can be useful would be in weapons control such as a gatling gun. One set of delegates would control the trigger mechanism while the other would control the visuals for the spinning barrels. This extends the concept of both events and delegates (in regards to 'function pointers') into a more acceptable OO paradigm with named types and clearly defined interface requirements.
	/*
	 * Standard delegates per C#
	*/
	delegate void SomeFunc(int val);

	void Test(int val) {
		... do something ...
	}

	main() {
		SomeFunc = Test;
	}

	/*
	 * Delegate interfaces and implementations for state machines
	*/
	delegate MyDelegate {
		void Func1();
		void Func2();
		bool Func3(IActor actor);
	}

	MyDelegate state1 default {
		void Func1() {
			... do something ...
		}

		void Func2() {
			... do something ...
		}

		bool Func3(IActor actor) {
			return false;
		}
	}

	void main() {
		SetState(state1);
	}
Alternatively the following could be used instead in order to present a cleaner implementation:
	void main() {
		this = state1;
	}
In the above examples no performance hit is incurred as the methods can be resolved during compile time. This allows a base class or interface to define a known set of requirements for object states or general delegation while providing an optimal approach for binding. Dynamic binding on the other hand can also be supported when nothing more than an object (of any type) is known:
	class Test {

		delegate MyDelegate {
			void TriggerEvent(int eventID);
			void EnterState(int stateID);
		}

		void Test(Object parent) {
			MyDelegate  = parent;
		}

		void Begin() {
			EnterState(0);
			TriggerEvent(2);
			EnterState(4);
		}
	}
In the above example dynamic binding is used when passing 'parent' in the constructor. This allows the base implementation to change over time without requiring any subclasses to be recompiled. Typically this could be addressed by forcing all subclasses to inherit from a predefined base class or interface and providing a default implementation for each method. However for situations where dynamic binding is necessary or preferred this provides an explicit mechanism for doing so.

There are still several issues that need to be addressed such as using multiple delegates of different types, multiple delegates of the same type, ambiguity, and visibility. This could be expanded even further by including modifiers such as required, optional, ignore (as in UnrealScript) to further enhance event handling.

12. Events. Although the dynamic nature of delegates could aid in the handling of specific events it may be desirable to support a built in event handling mechanism. Unfortunately the requirements for event handling can be extremely broad and may be left up to the runtime for implementation.

13. Support for intrinsic functions. Due to the functional differences between virtual machines some may support operations that others do not. This allows the compiler (based on the target VM) to generate bytecode for a specific operation rather than making a call to native code. This can eliminate certain performance hits and allows functions such as min(), max(), abs(), etc. to be executed 'inline' by the virtual machine.

14. Restricted classes and methods. This introduces the 'restricted' modifier for classes, member and instance variables, and methods. Restricted members can only be accessed by other classes found in the same package (similar to Java packages).

15. Type impersonation. This is something that I tossed around while doing a couple of quick mods for Unreal Tournament. While developing the mod I found that I wanted to integrate it with some mods by other people. Doing this would have created a dependency on a new version of the other mod. At that point I thought it would be nice to have some way to tell the runtime to use a different class instead of the original. The syntax for impersonation is similar to that of a class declaration except that it cannot have any abstract methods:
	class Base {
		public void DoSomething() {}
	}

	class NewBase impersonates Base {
		public void DoSomething() {}
	}
In the above example the runtime is instructed to replace all uses of Base with that of NewBase. NewBase automatically inherits all methods and attributes of Base. The only restriction is that the root class Object cannot be impersonated.

Those are the primary points so far. The rest of the language is assumed to functionally mimic C in regards to identifiers, operators, expressions, and keywords (if/else/do/while). Regardless of the complexity involved in including these features in a new compiler, there are significant benefits as well.

There are also several features that I am considering for inclusion in the language specs but I haven't hammered out the details yet. Two of those items are support for multiple grammars and external grammar handlers. By supporting multiple grammars a mixed set of code can be written to provide support for features not included in the standard language. The external grammar handlers would call an external compiler for a certain block of code and during compile time replace the block with the result. This provides the ability to attach an alternative language such as Prolog, SMC (State machine compiler) or any one of the shader meta languages.

One of the things I am also batting around is how to handle the conditional compiling of code. The use of any type of preprocessor is yet another one of those religious arguments. Although there are preprocessors for languages such as Java, their use is generally discouraged. While developers could certainly choose to use any of the available preprocessors it would be nice to include it as part of the language, even if in a limited capacity.

There are certainly other items which need to be addressed for inclusion, such as packages, namespaces, operator and method overloading, threading, properties, and whether or not parameterized classes (templates) should be supported.
No comments on these ones because I like them:

1. Object Oriented.
2. Lightweight runtime.
3. Strongly typed.
4. No pointers.
14. Restricted classes and methods.

Just a few comments and questions.
- turned into more than a few but hey


5. Exception handling (try/catch/throw).

I'm guessing that you mean something like 1 = divide by zero, 2 = null object etc?

My idea is to have a section of the method that handles the exception (similar to your finally block). There is only one handler for each method. It's not as useful, but if you aren't passing any other information to the handler anyway it shouldn't matter as much. It's simpler.

Or/and have a code block after a method call; if an exception occurs while that method is executing, and isn't caught until this method, the code straight afterwards would be called.

object.dangerousMethod(a) {
    // code to execute if the above method call threw an exception
}


These two features would handle all cases that I use exceptions and exception handling for.


6. Primitive types including boolean, char, byte, short, int, and float.

I'm a fan of pure object oriented languages, this is just a vote for that direction.


7. Typed Enumerations.

In java you can create an enumeration like this (the code may not be correct but hopefully you can get the gist of it):
// an action in rock paper scissors
public class RCPAction
{
    static public RCPAction rock = new RCPAction(0);
    static public RCPAction paper = new RCPAction(1);
    static public RCPAction scissors = new RCPAction(2);

    // named 'value' because 'enum' is a reserved word in newer Java
    private int value;

    // important: the constructor is private
    private RCPAction(int startValue)
    {
        value = startValue;
    }
}


You reference RCPAction.rock to get rock etc.

This is the way I'm planning on doing things.
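For reference, Java 5 added an enum keyword that generates essentially this pattern (a fixed set of instances behind a private constructor). A minimal sketch, with RcpAction and beats as illustrative names:

```java
enum RcpAction {
    ROCK, PAPER, SCISSORS;

    /** True when this action defeats the other one. */
    boolean beats(RcpAction other) {
        return (this == ROCK && other == SCISSORS)
            || (this == PAPER && other == ROCK)
            || (this == SCISSORS && other == PAPER);
    }
}
```

Because each constant is a real object, the type can carry methods like beats() while still being usable in a switch statement.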


8. Expressions in switch/case statements.

Isn't this the same as having a bunch of if statements? If you are using boolean expressions in the case statements, don't you have to evaluate all of them just to see if they are going to be called? Or are you somehow going to incorporate this with the traditional "jump" method, grouping all the single value case statements into a normal switch statement then your boolean if statements? (I'm talking about implementation here). In any case, it seems that the normal switch method and if statements provide the same functionality and you're just making a new way of writing it. That's cool btw, I think for commonly used commands that making new syntax for it isn't bad, you just need to know where to stop and if the new syntax is worth it in terms of language simplicity.


9. Data structures.

I'm mostly with you on this one, but I think it might complicate things and lead to errors. If it was up to me I wouldn't include it. I think that taking the time to nicely wrap and bind it would be more beneficial than having the ability to jump straight in down and dirty.

But then again I'm a believer in a language that almost forces you to code in a particular way, and I don't like more than one way to do something.


10. Support for “finally” code blocks.

I'm guessing that this isn't just a compiler thing, that the runtime actually jumps to a bit of code that is executed after it finds a return instruction. Otherwise it would still be good, but you would have to insert the code every time there is a return statement.

My opinion is that this would make the code confusing to read but I can see where it comes from. I would try to put some sort of restriction on it so it can only be used to clean up objects and other things that are used in the method. Many times I have duplicate code in different if blocks just because I need to clean up whatever object I'm working with before returning.

It's a good feature but for me it's a toss up, it could be abused and lead to very nasty code. Considering what I said earlier (that I like strict languages) I would probably leave it out. Having said that, if I use your language in the future, I would use finally code blocks.


11. Delegates.

I read this as two things: function pointers, and delegates.

Function pointers are good, but not often used with object oriented programming (not sure if this is because they often aren't supported in such a language, or some sort of design pattern can do the same thing).

To me delegates look like they could be implemented with a simple class hierarchy, unless I'm missing something?

Something like this would do the same (using my pseudo code, hope you get the gist - it's tabbed blocked):
interface MyDelegate
    void Func1()
    void Func2()
    bool Func3(IActor actor)

// state1 implements the MyDelegate interface
class state1 : MyDelegate
    void Func1()
        ... do something ...

    void Func2()
        ... do something ...

    bool Func3(IActor actor)
        ... do something ...

// to use it you go
MyDelegate a = new state1()


The only difference I can see is if your delegates can have properties, and that each state acts on these properties. This couldn't be done easily with a class system.

delegate IntegerAction {
    void action();
    int get();
    int value = 0;
}

IntegerAction increment default {
    void action() {
        value++;
    }
    int get() {
        return value;
    }
}

IntegerAction decrement default {
    void action() {
        value--;
    }
    int get() {
        return value;
    }
}

IntegerAction half default {
    void action() {
        value /= 2;
    }
    int get() {
        return value;
    }
}


Is this how you envisioned it working? If so, that would be cool, very cool.

Also it looks like a state change is global, is this intentional? I think that would be a bad idea, each MyDelegate object should have a different current state.
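To make the question concrete, here is one way the IntegerAction idea could be approximated in Java, with the shared property living on the owning object and the current state stored per object rather than globally. This is only a sketch of the behavior being discussed; Counter, setState, and the constant names are invented for the example.

```java
final class Counter {
    /** Rough stand-in for the IntegerAction delegate: each state supplies action(). */
    interface IntegerAction {
        void action(Counter self);
    }

    // Each state operates on the same 'value' property of whichever object it is attached to.
    static final IntegerAction INCREMENT = self -> self.value++;
    static final IntegerAction DECREMENT = self -> self.value--;
    static final IntegerAction HALF = self -> self.value /= 2;

    private int value = 0;
    private IntegerAction state = INCREMENT; // per-object current state, not global

    void setState(IntegerAction next) { state = next; }
    void action() { state.action(this); }
    int get() { return value; }
}
```

Two Counter objects can be in different states at the same time, and switching the state of one has no effect on the other.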


12. Events.

I'm planning on having a callback type system: you register an object's method and trigger conditions, and that method gets called when the event is triggered. You can also tell the event system what parameters to call the method with; they can be static or based on the trigger condition (i.e. what object was involved with the trigger). Most, if not all, of the events won't be polling type ones. They will be related to some other part of the engine (e.g. the collision system) and so won't affect performance quite as much.
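A callback registry of that kind might look something like the following in Java. It is a minimal sketch, assuming events are identified by name and handlers receive the object involved in the trigger; EventBus, register, and trigger are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class EventBus {
    private final Map<String, List<Consumer<Object>>> handlers = new HashMap<>();

    /** Registers a handler to be called whenever the named event is triggered. */
    void register(String event, Consumer<Object> handler) {
        handlers.computeIfAbsent(event, key -> new ArrayList<>()).add(handler);
    }

    /** Calls every handler for the event with the object involved; returns how many ran. */
    int trigger(String event, Object source) {
        List<Consumer<Object>> list = handlers.getOrDefault(event, List.of());
        for (Consumer<Object> handler : list) {
            handler.accept(source);
        }
        return list.size();
    }
}
```

Nothing here polls: handlers run only when another subsystem, such as collision detection, calls trigger().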


13. Support for intrinsic functions.

Good idea.

My plan is to only have one virtual machine and to write it using standard C++ code so it will compile on any ANSI compiler. This means that the virtual machine is always going to be the same. No need for intrinsic functions.


15. Type impersonation.

I can't believe I've come this far without spouting about Objective-C's class system thing. Basically each class has a 'class' object. I use the quoted class here because things tend to get a bit icky trying to describe it. Basically this object can have methods and properties just like other objects, but there is always only one instance of it. You actually ask it to create new instances of its class for you. When you define a class, a method preceded by a '+' is a class method and one preceded by a '-' is an object method.

So to talk to this class object you just use the class's name
Object o = Object.alloc()


The above code creates a new object of type Object (Objective-C doesn't actually use dot notation but you get the picture).

Anyway getting back to Type impersonation. Basically what I would do is replace the object's create methods (the ones that create new objects of that type) to return objects of the type that you want.
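In a language without class objects, the same effect can be approximated by routing creation through a registry of factories. The Java sketch below assumes all construction goes through TypeRegistry.create(), which is an invented helper; Base and NewBase mirror the example in the original post.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class TypeRegistry {
    private static final Map<Class<?>, Supplier<?>> factories = new HashMap<>();

    /** Installs (or replaces) the factory used whenever 'type' is requested. */
    static <T> void register(Class<T> type, Supplier<? extends T> factory) {
        factories.put(type, factory);
    }

    /** Creates an instance via the currently registered factory for 'type'. */
    static <T> T create(Class<T> type) {
        Supplier<?> factory = factories.get(type);
        if (factory == null) {
            throw new IllegalArgumentException("no factory for " + type.getName());
        }
        return type.cast(factory.get());
    }
}

class Base {
    String doSomething() { return "Base"; }
}

class NewBase extends Base {
    @Override
    String doSomething() { return "NewBase"; }
}
```

Registering NewBase::new under Base.class makes every later create(Base.class) hand back a NewBase, which is the substitution that impersonation is after, minus any language support.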

I really really like your idea of external grammar handlers, full points for that one.

Multiple grammars I'm not so keen on. I like to be able to look at some code and work out what's happening quickly, if there is more than one dialect of a language I think it would get complicated and messy.

Conditional compilation isn't necessary in my book. Compilation is fast, and being able to subclass almost makes it redundant. As an example, if you want a version of your class that logs everything all the time, just subclass it and make the class object return that instance (or use your impersonate thing).

Templates are good for lists. Even if you don't support creating your own templates I would think it mandatory to build in the common datastructures with them. Don't go the java way and make me cast everything that comes out of my ArrayList even though I'm only ever storing one type of object.
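The casting complaint is easy to see side by side. In this small Java sketch (TypedListDemo is an illustrative name), the parameterized version needs no cast, while the raw-list version, which is how pre-generics Java collections worked, needs one on every element.

```java
import java.util.List;

final class TypedListDemo {
    /** Parameterized list: the element type is known statically, so no cast is needed. */
    static int totalLength(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    /** Raw list: every element comes out as Object and must be cast back. */
    @SuppressWarnings("rawtypes")
    static int totalLengthRaw(List names) {
        int total = 0;
        for (Object element : names) {
            total += ((String) element).length(); // fails at run time if a non-String sneaks in
        }
        return total;
    }
}
```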

Not sure if you have seen it, but I had a big discussion a year or so ago about my language and what I wanted to have in it. I think I wasn't too kosher towards one of the guys that posted, so disregard that. But the rest of the thread might be useful to you. Unfortunately it's quite long.
Quote:5. Exception handling (try/catch/throw).

I'm guessing that you mean something like 1 = divide by zero, 2 = null object etc?

[--snip--]

object.dangerousMethod(a) {
    // code to execute if the above method call threw an exception
}


Since posting this I have amended the specs to allow try/catch/throw on arbitrary types (primitives, typed enums, classes) with optimizations and compiler flags to handle specific items such as typed enumerations. The reason being that it provides as much flexibility as possible while still providing optimal execution.
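As a point of comparison, the original "throw only a typed enumeration, with no object creation" idea can be approximated in Java by preallocating one exception per code. This is a sketch of that trade-off rather than anything from the spec; ErrorCode, ScriptError, and EnumThrowDemo are made-up names.

```java
enum ErrorCode { DIVIDE_BY_ZERO, NULL_OBJECT }

final class ScriptError extends RuntimeException {
    final ErrorCode code;

    ScriptError(ErrorCode code) {
        // Disable suppression and stack-trace capture so the instance can be reused cheaply.
        super(code.name(), null, false, false);
        this.code = code;
    }
}

final class EnumThrowDemo {
    // Preallocated once: nothing is created at the throw site.
    private static final ScriptError DIV_ZERO = new ScriptError(ErrorCode.DIVIDE_BY_ZERO);

    static int divide(int a, int b) {
        if (b == 0) {
            throw DIV_ZERO;
        }
        return a / b;
    }

    static String safeDivide(int a, int b) {
        try {
            return Integer.toString(divide(a, b));
        } catch (ScriptError e) {
            switch (e.code) {
                case DIVIDE_BY_ZERO: return "div0";
                default: return "other";
            }
        }
    }
}
```

The cost is that the handler learns only the code, with no message or stack trace, which is the flexibility that allowing arbitrary types buys back.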

I had considered using an additional construct similar to the example you provided. In the end I decided there wasn't a compelling reason to include it; it was too limited in use. I like the idea, it just doesn't translate well to conditional or expression heavy code.



Quote:8. Expressions in switch/case statements.

Isn't this the same as having a bunch of if statements?

[--snip--]

That's cool btw, I think for commonly used commands that making new syntax for it isn't bad, you just need to know where to stop and if the new syntax is worth it in terms of language simplicity.


I actually beat myself up over whether expressions in switch/case statements should be allowed. The most compelling reason is that it can be used to reduce or eliminate if/else statements that include similar code for different actions. Example:

int z = 0;

if(1 == a && (2 == b && 3 == c)) {
	z = 3;

	// Common code
	b = 6;
	c++;
}
else if(2 == a && (2 == b && 3 == c)) {
	z = 1;

	// Common code
	b = 6;
	c++;
}
else if(3 == a && (2 == b && 3 == c)) {
	z = 99;
	b = 3;
	c--;
}

switch(a) {
	case 1:
		z = 3;
		break;

	case 2:
		z = 1;
		break;

	case 3:
		z = 99;
		b = 3;
		c--;
		break;

	case (a < 3 && 2 == b && 3 == c):
		b = 6;
		c++;
}


While there are certainly other ways to implement the logic shown it does show the flexibility of case expressions. If any of the conditions change (say 2 == b && 3 == c becomes 2 == b || 3 == c) you only need to change the expression once.

Worst case scenario it won't be any slower than a group of if/else statements. Best case scenario duplicate expressions are reduced to 1.

Right now I am considering a change in semantics. The change would revert switch/case back to allowing only constants in case statements and adding a 'match' construct that allows full expressions.

Quote:9. Data structures.

I think that taking the time to nicely wrap and bind it would be more beneficial than having the ability to jump straight in down and dirty.


I agree. This will be replaced with a cleaner approach that reflects the overall grammar of the language. This should retain the OO nature of the language while providing near native speed.


Quote:10. Support for “finally” code blocks.

[--snip--]

...that the runtime actually jumps to a bit of code that is executed after it finds a return instruction.

...I would try to put some sort of restriction on it so it can only be used to clean up objects and other things that are used in the method.


Actually the finally code blocks are the compiler's domain, not the runtime's. C++ compilers generate prolog and epilog code (depending on compiler settings). This is done to adjust the stack and destroy local objects. Each time a return statement is encountered it just jumps to the generated epilog code. This just allows you to include additional code in the epilog.

The finally clause is restricted to use in methods only.


Quote:11. Delegates.

To me delegates look like they could be implemented with a simple class hierarchy, unless I'm missing something?

[--snip--]

Is this how you envisioned it working? If so, that would be cool, very cool.


Yes, that's exactly how it works. Delegates can have their own properties and methods AND can access the properties and methods of the class they are defined in. This means that you can change the active delegate collection IN one of the collection's methods.

If this was done in C++ you could consider it a runtime polymorphic virtual table.


Quote:15. Type impersonation.

Anyway getting back to Type impersonation. Basically what I would do is replace the object's create methods (the ones that create new objects of that type) to return objects of the type that you want.


That's what type impersonation essentially does. Initially the idea was to apply an 'impersonation' modifier to class definitions. Unfortunately that approach is not very flexible and impersonation is now left up to the runtime reflection api.

Quote:Multiple grammars I'm not so keen on. I like to be able to look at some code and work out what's happening quickly, if there is more than one dialect of a language I think it would get complicated and messy.


I agree. It's not very intuitive and will probably make things a lot more difficult than it should be. The only exceptions I've been able to come up with are support for some of the shader languages that are based on C. I think that if something like Cg or GLSL were extended to OO you could expand existing shaders through inheritance. Perhaps an explicit shader modifier for class declarations:

public shader class Myshader extends Shader {

	struct pixel_in {
		float3 color : COLOR0;
		float3 texcoord : TEXCOORD0;
		float3 lightdist : TEXCOORD1;
	};

	struct pixel_out {
		float3 color : COLOR;
	};

	pixel_out main(pixel_in IN,
		       uniform sampler2D texture : TEXUNIT0) {
		pixel_out OUT;
		float d = clamp(1.0 - pow(dot(IN.lightdist, IN.lightdist), 0.5), 0.0, 1.0);
		float3 color = tex2D(texture, IN.texcoord).rgb;
		OUT.color = color * (d + 0.4);
		return OUT;
	}
}


Since most of the language is already handled by the compiler not much extra work needs to be done, at least for parsing and syntax checking. But then that's not really multiple grammars, and whether it would be beneficial to include this is still a coin toss.



Quote:Templates are good for lists. Even if you don't support creating your own templates I would think it mandatory to build in the common data structures with them. Don't go the java way and make me cast everything that comes out of my ArrayList even though I'm only ever storing one type of object.


I'm still deciding whether to include parameterized classes or not. From a syntactical view it's not very difficult to implement. On the other hand code generation presents some interesting issues that need to be dealt with - primarily duplicate code elimination and changes to the implementation.

My opinion is that casting should rarely if ever be done. Usually this can be avoided by a few design changes but I think there's always going to be at least one scenario where it needs to be done. With that said it's very likely that the language will include parameterized classes at some point during development.

Quote:Not sure if you have seen it, but I had a big discussion a year or so ago about my language and what I wanted to have in it.....


I've gotten through the first page of the thread and will check the rest out a bit later. Looks like an interesting conversation. I really appreciate the comments you've made. I've made note of several items so I should have a new design spec soon. Given its size I will probably move the main specs to my website to make for easier reading.

I'm a bit under the weather so if I missed something or misinterpreted anything let me know.
5. Exception handling (try/catch/throw).

Quote:Original post by Helter Skelter
I had considered using an additional construct similar to the example you provided. In the end I decided there wasn't a compelling reason to include it; it was too limited in use. I like the idea, it just doesn't translate well to conditional or expression heavy code.


Fair enough.

I came up with it when thinking about the uses of exceptions. I really didn't want to support them because I don't really like them. I think that in java at least, they make very messy code. It's very agitating to have to wrap a method call with a try catch just because it's been defined so it can throw exceptions even when you don't care at all what it throws.

Anyway, basically I wanted to make things simpler, and the two situations that I saw that need exceptions is when you call a method, and within a method. Either you want the method that called your method to handle the exception or you want to handle your own exceptions (and / or let some pass through). That's where those two exception handling characteristics came from.

Of course if you're going to implement try/catch/throw you won't need them.


10. Support for “finally” code blocks.
Quote:Actually the finally code blocks are the compiler's domain, not the runtime's. C++ compilers generate prolog and epilog code (depending on compiler settings). This is done to adjust the stack and destroy local objects. Each time a return statement is encountered it just jumps to the generated epilog code. This just allows you to include additional code in the epilog.


That's what I was trying to get at, that when the code gets to a return statement it jumps to a different part of the code and executes that before leaving that method. I guess I was confusing a return statement and a return opcode or instruction. Of course there doesn't have to be a 1:1 relationship between these two. Also it would be hard with the way that I have built my virtual machine to do that, so I'm probably not thinking broadly enough.

11. Delegates.
Quote:Yes, that's exactly how it works. Delegates can have their own properties and methods AND can access the properties and methods of the class they are defined in. This means that you can change the active delegate collection IN one of the collection's methods.

If this was done in C++ you could consider it a runtime polymorphic virtual table.
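
A rough sketch of that "runtime polymorphic virtual table" idea, written in Python purely for illustration. The names (`Door`, `ClosedBehaviour`, `OpenBehaviour`) are invented and this is not the proposed delegate syntax; it only shows a host object dispatching through a method table that one of the table's own methods can swap out.

```python
class Door:
    """Host object whose behaviour is supplied by a swappable method table."""

    def __init__(self):
        self.opened_count = 0
        self._active = ClosedBehaviour   # the currently active "delegate collection"

    def push(self):
        # Dispatch through the active table, passing the host so the
        # delegate can reach the host's own properties and methods.
        return self._active.push(self)


class ClosedBehaviour:
    @staticmethod
    def push(door):
        door.opened_count += 1
        door._active = OpenBehaviour     # a delegate method swaps the active table
        return "opened"


class OpenBehaviour:
    @staticmethod
    def push(door):
        door._active = ClosedBehaviour
        return "closed"


door = Door()
results = [door.push(), door.push(), door.push()]
```

The caller only ever calls `door.push()`; which implementation runs depends on the table that is active at that moment.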


Not fair. Now I want to make delegates in my language, but the virtual machine has already been started and it would take a bit of a restructure to make it work.

hmmmm.


Quote:My opinion is that casting should rarely if ever be done. Usually this can be avoided by a few design changes, but I think there's always going to be at least one scenario where it needs to be done. With that said, it's very likely that the language will include parameterized classes at some point during development.


Totally agree with you there.


Quote:I'm a bit under the weather so if I missed something or misinterpreted anything let me know.


Looks fine and crystal clear to me.
Strong Typing and Dynamic Typing are not opposites:

Dynamic Typing means that variables aren't stuck with a specific type; instead a variable can hold a value of any type. Types are assigned to VALUES, so VarA can contain (int: 5) or (string: "Testing") or (Object: @454E8304) or whatever.
The opposite of this is Static Typing: each variable has one type and can only hold values of that type, so you have (string: VarA) that holds "A" or "B" or (int: ValB) that holds 5 or 13.

Strong typing means that the correct type must be used in all situations, so if a function wants a string you must pass it a string.
Weak typing means that types are automatically massaged into the correct types if at all possible, so passing 65 to a function that wants a string might act like calling func("65").

C++, for example, is statically typed and falls somewhere between strong and weak typing. There are many automatic conversions but also you must pass the correct types in some situations. In many cases, the strong/weak aspects conflict so you have to manually specify which type you're passing in (templates for example).
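
As a small illustration that the two axes are independent, Python is dynamically typed but fairly strongly typed. The sketch below uses only built-in behaviour; the variable names are arbitrary.

```python
var_a = 5                          # the name is bound to an int value
type_before = type(var_a).__name__
var_a = "Testing"                  # same name, now bound to a str: dynamic typing
type_after = type(var_a).__name__

try:
    "count: " + 65                 # strong typing: no silent int -> str conversion
    mixed_ok = True
except TypeError:
    mixed_ok = False

explicit = "count: " + str(65)     # the conversion has to be spelled out
```

A weakly typed language would accept the mixed expression and massage 65 into "65" on its own.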
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Quote:Original post by Extrarius
Strong Typing and Dynamic Typing are not opposites:


Correct. It should have read "Statically typed", with typecasting split into its own section.
Quote:11. Delegates.

Not fair. Now I want to make delegates in my language, but the virtual machine has already been started and it would take a bit of a restructure to make it work.


Depending on how your compiler/code generator is set up I can't imagine it would be difficult to implement. Just consider delegate declarations as a mutated type of class (i.e. class vs. interface vs. delegate) and for each usage add another vtable specific to that instance.



Now I just have to decide if I'm going to include operator overloading. It's already in the grammar file so I just need to decide if it's worth keeping :D
Quote:Original post by Helter Skelter
Quote:11. Delegates.

Not fair. Now I want to make delegates in my language, but the virtual machine has already been started and it would take a bit of a restructure to make it work.


Depending on how your compiler/code generator is set up I can't imagine it would be difficult to implement. Just consider delegate declarations as a mutated type of class (i.e. class vs. interface vs. delegate) and for each usage add another vtable specific to that instance.


I had another think about it and it's going to be quite easy.

I don't use vtables and things like that. My language is pure OOP, so there is no such thing as a global method. The virtual machine is really specific and has object-specific opcodes as well as other things.


Quote:Now I just have to decide if I'm going to include operator overloading. It's already in the grammar file so I just need to decide if it's worth keeping :D


My language is nothing but operator overloading: being pure OOP, every operator has to translate to a method. In fact, believe it or not, if statements are just methods on a boolean object. The normal if code structure works, but it can be thought of as:

// normal look at an if statement
if (a < b)
    print a

// method look at an if statement
// is a half closure type thing
(a.less_than(b)).if
    print a


So the if statement is really a method that accepts a half closure thing and only calls it if it's true.
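
Here is a hedged sketch of that idea in Python, only to show the shape of it. `Bool`, `Int`, and `if_` are invented names (`if` is reserved in Python), and the host language's own `if` is still used inside `if_`, so this mimics the surface behaviour rather than how the VM actually does it.

```python
class Bool:
    def __init__(self, value):
        self.value = bool(value)

    def if_(self, block):
        """Call `block` (the half-closure) only when this boolean is true."""
        if self.value:
            return block()
        return None


class Int:
    def __init__(self, value):
        self.value = value

    def less_than(self, other):
        # The "<" operator expressed as an ordinary method returning a Bool object.
        return Bool(self.value < other.value)


a, b = Int(2), Int(5)
printed = []
a.less_than(b).if_(lambda: printed.append(a.value))   # condition true: block runs
b.less_than(a).if_(lambda: printed.append(b.value))   # condition false: block skipped
```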
Time to add a couple of new things to the list. Hopefully I'll have a website set up with a forum so primary discussion can take place there (with weekly highlights possibly cross-posted here). This will be useful since I'll be able to categorize various things. So here's the new stuff:



1. Constructs for meta data have been added. Meta blocks come in the form of a new object type called 'metablock' and are declared in the same manner as classes and interfaces. The metablock type can contain only data and can only inherit from other metablock declarations.

Data contained in metablocks are typeless and do not require compile time scope resolution. The following example shows the various ways data may be declared:

public metablock MyData {
    simpleVar = 1;
    nested.Qualifier.Var = block;
    "var with space" = something;
    arrayVar[0] = 1;
    arrayVar[2] = 2;
    anotherArrayVar[5] = {0, 1, 2, 3, 4};
    associativeArrayVar["key"] = whatever;

    // point to a default class for use in object creation
    defaultFactory = Runtime::Patterns::Factory;
}


This eliminates the need to include domain specific details into the compiler. I need to work on the specs for this a bit more to include limitations and requirements of using the metablocks. For the most part it's fairly flexible.
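
One way to picture what a host program might see at runtime is a typeless key/value store with fallback to the parent metablock. The Python sketch below is an assumption about the runtime representation, not part of the spec; `base_data` is a hypothetical parent metablock, and unresolved names are simply kept as strings.

```python
from collections import ChainMap

# Hypothetical parent metablock that MyData inherits from.
base_data = {
    "simpleVar": 1,
    "defaultFactory": "Runtime::Patterns::Factory",  # kept as text, resolved later
}

my_data = ChainMap(
    {
        "simpleVar": 2,                               # overrides the inherited value
        "nested.Qualifier.Var": "block",
        "var with space": "something",
        "arrayVar": {0: 1, 2: 2},                     # sparse: index 1 is never set
        "associativeArrayVar": {"key": "whatever"},
    },
    base_data,                                        # consulted when a key is missing
)
```

Looking up `my_data["defaultFactory"]` falls through to the parent, while `my_data["simpleVar"]` finds the override first.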



2. Parameterized classes. Since one of the goals is to create a strongly typed language, parameterized classes (generics or templates if you prefer) are highly desirable. Key points for capabilities and constraints:

a) Parameterized classes will only accept classes, interfaces, and constant values as parameters.
b) By default a copy of all static variables declared within a parameterized class will exist for each unique set of parameters. This allows test<A, B> and test<B, A> to each have their own instance of static variables while allowing test<A, B> and more<A, B> to share the same instance of static variables.
c) Will use latent typing.
d) May be declared as inner classes.
e) May not be typecast (regardless of what kind of typecasting is provided).

Example:
public class Data<_STREAMIN, _STREAMOUT> {
    public void Process(_STREAMIN in, _STREAMOUT out) {
        while(in) {
            out.put(in.get());
        }
    }
}
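
Points b) and c) can be approximated in Python as a sketch. Everything here is illustrative: `Data` is a hypothetical factory function standing in for the parameterized class, and `processed` plays the role of a static variable. Each unique, ordered parameter set gets its own generated class, and therefore its own copy of the static.

```python
_instantiations = {}

def Data(stream_in, stream_out):
    """Return the class for Data<stream_in, stream_out>, creating it on first use."""
    key = (stream_in, stream_out)
    if key not in _instantiations:
        class _Data:
            processed = 0                      # "static" variable, one copy per key

            def process(self, source, sink):
                # Latent typing: only requires that source can be iterated
                # and that sink has append(); no declared interface is checked.
                for item in source:
                    sink.append(item)
                    type(self).processed += 1

        _Data.__name__ = f"Data<{stream_in.__name__}, {stream_out.__name__}>"
        _instantiations[key] = _Data
    return _instantiations[key]


out = []
Data(tuple, list)().process((1, 2, 3), out)
Data(list, tuple)   # different parameter order, so a distinct class is created
```

After this runs, `Data(tuple, list).processed` has been incremented while `Data(list, tuple).processed` is untouched, which mirrors test<A, B> and test<B, A> keeping separate statics.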




3. Inheritance model. The approach being taken resembles the model used in Java. Classes may inherit from a single class and one or more interfaces. In order to address ambiguities that may occur between methods of base classes and/or interfaces, additional syntax will be provided to allow either a) a single implementation or b) an individual implementation for each interface. This approach provides greater flexibility and is used by several existing languages.

public interface A {
    public void doit();
}

public interface B {
    public void doit();
}

public class C implements A, implements B {
    int a = 0;

    void A::doit() {
        a--;
    }

    void B::doit() {
        a++;
    }
}
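
For anyone wondering how a call would pick between the two doit() bodies, here is a rough Python sketch of one possible dispatch scheme. It is an assumption, not the planned implementation: `as_interface` and `InterfaceView` are invented names, and the per-interface table stands in for whatever the compiler would generate.

```python
class InterfaceView:
    """What a caller holds when it refers to the object through one interface."""

    def __init__(self, doit):
        self.doit = doit


class C:
    """Implements doit() separately for interfaces 'A' and 'B'."""

    def __init__(self):
        self.a = 0
        # One method table entry per interface, all sharing the same object state.
        self._impl = {"A": self._a_doit, "B": self._b_doit}

    def _a_doit(self):
        self.a -= 1

    def _b_doit(self):
        self.a += 1

    def as_interface(self, name):
        return InterfaceView(self._impl[name])


c = C()
c.as_interface("A").doit()
after_a = c.a
c.as_interface("B").doit()
c.as_interface("B").doit()
after_b = c.a
```

The implementation that runs is selected by the interface the caller is using, while both implementations operate on the same field `a`.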




4. Will use "::" as the scope resolution operator.

5. Several new operators have been added and are intended to allow custom behavior. These operators are: ":#", ":<", ":>", ":=", "@<", "@>", "@<=", "@>=", "@==", and "@!=".



Not all of the remaining additions have more than a few basic notes yet. Hopefully I'll have a more detailed list in a day or so.

1. Support for signed and unsigned type modifiers.
2. Support for 64-bit integers using "long long".
3. Static class initializers.
4. Limited operator overloading. Presently the only operators that can be overloaded are: the instance function operator (treats an object like a function), the static function operator (treats a class like a function), and all of the new operators mentioned above.
5. Methods can be declared as being non-recursive and can optionally throw an exception if recursion is detected.
6. Support for concurrent method invocation. When a method is invoked as concurrent the call returns immediately and the method runs in a different thread. Several methods can be invoked at the same time.
7. Dynamic classes. This allows you to a) create a new subclass from an existing class and 0 or more interfaces b) change the base class and 0 or more interfaces of an existing class.
8. Anonymous classes and functions.

    // Anonymous function
    int a = @(b, c) {
        return b + c;
    }

    // Anonymous class
    a = new @(a, b) {
        int test;

        @(a, b) {
            test = a + b;
        }

        int getVal() {
            return test;
        }
    }

    int z = a.getVal();
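
Items 5 and 6 in the list above (non-recursive methods and concurrent invocation) can be sketched in Python as follows. This is only an illustration under stated assumptions: `non_recursive`, `RecursionDetected`, `countdown`, and `square` are made-up names, and the check is per thread, which may or may not match what the language ends up doing.

```python
import functools
import threading
from concurrent.futures import ThreadPoolExecutor


class RecursionDetected(Exception):
    pass


def non_recursive(func):
    """Raise RecursionDetected if func is re-entered on the same thread."""
    state = threading.local()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if getattr(state, "active", False):
            raise RecursionDetected(func.__name__)
        state.active = True
        try:
            return func(*args, **kwargs)
        finally:
            state.active = False    # always cleared, even when an exception escapes
    return wrapper


@non_recursive
def countdown(n):
    return 0 if n == 0 else countdown(n - 1)   # recurses for any n > 0


@non_recursive
def square(n):
    return n * n


try:
    countdown(3)
    recursion_caught = False
except RecursionDetected:
    recursion_caught = True

# Concurrent invocation: submit() returns immediately with a future while the
# call runs on a worker thread; result() waits for the value.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(square, n) for n in (2, 3, 4)]
    squares = [f.result() for f in futures]
```

Because the guard state is thread-local, several concurrent invocations of `square` do not trip the recursion check against each other.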

If you want users to change the existing behaviour, may I suggest moving to a declarative, non-Turing-complete language? The up-front cost is the design: designing a non-Turing-complete language is difficult. The benefit, however, is that users will be severely limited in their mess-making possibilities.

If this isn't possible, perhaps the portion of the project/system you are trying to farm off is too large. Rather than removing work from the system, it is only going to move work from your group's development budget to your group's support budget (you *will* have to support this new language).

This is assuming your users are content providers to the system and not programmers.

EDIT: For what it's worth, non-Turing-complete languages could often be called type systems. To learn more about type systems check out this book.

[Edited by - flangazor on June 30, 2005 7:40:58 PM]

This topic is closed to new replies.
