Scripting Language Design (Not Implementation)

70 comments, last by Extrarius 18 years, 2 months ago

I've recently resumed work on creating a scripting language, and while I've learned a lot about implementation details (especially compiler creation) since I last worked on it, I haven't learned much at all about language (syntax and semantics) design. I've looked at existing scripting languages, and I've come up with a few features I want in mine and a larger list of those I don't. The problem now is mainly the syntax of the features I want, but I also need help finding ways to further scrutinize which features should be included or excluded. So far, my judgement has just been based on what I, personally, would find helpful (or not), but I want my scripting system to be more general than that. What I'm looking for is ideas about what I should take into consideration when designing a scripting language, explanations of existing languages and what considerations led to their design, and any related resources.

"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Well, you could go with something similar to an existing syntax (such as Lisp or C). What sort of explanations of existing languages are you looking for? Any specific languages or language features?
I just want to learn about what 'should' be thought about when designing a language, to avoid common mistakes and to take advantage of what has already worked elsewhere.

For example, in Blizzard's language 'Jass', Warcraft 3 map development would be extremely crippled if they hadn't made mistakes in the implementation, because the design didn't include a way to associate data (of any type) with 'handles' (used for all in-game objects), which was the most needed type of associative container (based on the comments of some of the most proficient scripters). Adding a generic associative container that can use any type as the key and any type as the value would mean either functions such as AddIntInt, AddBoolInt, etc. (Add, then key type, then value type) or a radical change to the language's design. Luckily, a bug in their interpreter provides a way to convert from a 'handle' to a large 'integer', which can then be converted to a string using the provided library and used to index the only provided associative container (which has several Get[Type] functions).

The radical change to the language design would be my preferred solution, since I'm creating something new and don't have to worry about backwards compatibility =-)
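
To make the "any type as the key and any type as the value" idea concrete, here is a minimal sketch in Python of what such a container looks like when the language supports it directly. The names `Handle`, `attached`, and `get_data` are made up for illustration and are not part of Jass or any Blizzard API.

```python
# Hypothetical sketch: Handle, attached and get_data are illustrative names only.

class Handle:
    """Stand-in for an opaque reference to an in-game object."""

    def __init__(self, kind):
        self.kind = kind


# One generic container: any hashable key, any value type.
# No AddIntInt / AddBoolInt family of functions is needed.
attached = {}

footman = Handle("unit")
attached[footman] = {"kills": 3, "owner": "player1"}   # handle -> record-like data
attached[("footman", "spawn_time")] = 12.5             # tuple -> float
attached[42] = True                                    # int -> bool


def get_data(key, default=None):
    """Look up data attached to any key, returning default when absent."""
    return attached.get(key, default)
```

The handle is used as a key directly, so no handle-to-integer-to-string conversion is needed.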
Quote:Original post by Extrarius
The radical change to the language design would be my preferred solution, since I'm creating something new and don't have to worry about backwards compatibility =-)

I suppose that's up to you. Experimenting with different designs is probably the best way to go.
Quote:Original post by Extrarius
I just want to learn about what 'should' be thought about when designing a language, to avoid common mistakes and to take advantage of what has already worked elsewhere.


How many programming languages do you know? Learning a few very different languages is a good way to get an overview of what works and what doesn't.
Quote:Original post by marijnh
[...]How many programming languages do you know? Learning a few very different languages is a good way to get an overview of what works and what doesn't.
Part of the problem is knowing Assembly (for a few different platforms), C++, and Common Lisp. Each one gets a few things right (in my opinion, of course), but they don't combine so well =-)

Quote:Original post by bytecoder
[...]I suppose that's up to you. Experimenting with different designs is probably the best way to go.
Of course my personal preferences will play a large role in my final decisions, but if I rush into it and create something without any real knowledge on the topic (and I know nothing about language design), I'm likely to make many mistakes that could easily be avoided by simply looking to see what mistakes other people made and how they fixed their problems.
I would expand your knowledge to take in a couple of scripting languages as well. Lua and Python spring to mind; see what they bring to the table in terms of both faults and features.
A few things I have found useful:

First class lexical closures - This alone is the biggest time saver of all (for me at least); the ability to use higher order functions greatly decreases the size of code, and closures + mutable local state = objects (although the syntax may vary, the semantics are remarkably similar).
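
As a rough illustration of "closures + mutable local state = objects", here is a small Python sketch. The names `make_counter`, `increment`, and `current` are invented for the example; the point is only that the captured variable `count` behaves like a private field.

```python
def make_counter(start=0):
    """Return two closures that share one piece of hidden mutable state."""
    count = start  # captured by both inner functions, invisible from outside

    def increment(step=1):
        nonlocal count  # rebind the captured variable instead of a new local
        count += step
        return count

    def current():
        return count

    return increment, current


increment, current = make_counter(10)
increment()      # count is now 11
increment(5)     # count is now 16
```

Each call to `make_counter` creates a fresh `count`, so two counters never interfere with each other, much like two instances of a class.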

Record Types - The ability to define a structure that is essentially the aggregation of a few named parts. For example, if we had some information about a unit in a game (let's go with "name", "health", "faction" for now), then we could capture all of this state with a single data structure: a record with a name component, a health component, and a faction component. There is certainly some overlap between records and objects; the choice between the two is one of personal taste. I'm doing my best to remain neutral here ;)
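
The unit record described above could look something like this in Python, using the standard `dataclasses` module. The type name `Unit` and the sample values are assumptions made for the sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Unit:
    """A record: nothing more than an aggregation of named parts."""
    name: str
    health: int
    faction: str


grunt = Unit(name="Grunt", health=100, faction="Horde")

# Records are plain data, so an "update" can simply build a new record.
damaged = replace(grunt, health=grunt.health - 30)
```

Making the record immutable (`frozen=True`) is one design choice among several; a mutable record works just as well for the purpose of grouping named fields.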

Streams - To be honest, I don't find myself using these all that much, but there are times when you want the elegance of monadic structure without the overhead of performing unnecessary computations just to maintain the sense of orthogonality. I do often compare them in spirit to Python's generators, although I personally feel that generators are a cluttered and confused approach.
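
Since the comparison is to Python's generators, here is a small sketch of the lazy behaviour in question, written with generators as a stand-in for streams (not as the construct preferred above). The function names `naturals` and `squares_of_evens` are invented for the example.

```python
from itertools import islice


def naturals(start=0):
    """An infinite stream of integers; each value is computed only on demand."""
    n = start
    while True:
        yield n
        n += 1


def squares_of_evens(stream):
    """Transform a stream lazily; no element is touched until it is requested."""
    return (n * n for n in stream if n % 2 == 0)


# Only as many elements of the infinite stream are computed as are needed
# to produce five results.
first_five = list(islice(squares_of_evens(naturals()), 5))
```

The pipeline is described once over an infinite sequence, and the consumer decides how much of it is ever evaluated.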

Continuations - These become necessary when you are dealing with higher-order functions, because the ordinary control flow (such as RETURN in C/C++) does not let us actually escape from the whole computation. I can't say that I find myself using continuations routinely, but I must pay homage to the fact that they are the-right-thing when it comes to abstracting over control.
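
To show the escape problem concretely, here is a hedged Python sketch. Python has no first-class continuations, so this emulates only a one-shot escaping continuation using an exception; `call_with_escape`, `for_each`, and `find_first` are hypothetical helpers written for this example.

```python
def call_with_escape(body):
    """Call body(escape). Calling escape(v) abandons body and returns v."""

    class Escape(Exception):
        """Private to this call, so nested escapes cannot catch each other."""

    def escape(value):
        raise Escape(value)

    try:
        return body(escape)
    except Escape as exc:
        return exc.args[0]


def for_each(items, visit):
    """A higher-order function that offers no way to stop early."""
    for item in items:
        visit(item)


def find_first(items, predicate):
    def body(escape):
        def visit(item):
            if predicate(item):
                # A plain return here would only leave visit, not for_each.
                escape(item)

        for_each(items, visit)
        return None  # reached only when nothing matched

    return call_with_escape(body)
```

A real continuation can also be re-entered or invoked more than once, which this exception-based sketch cannot do.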

Tail Call Optimization - At the least, a tail call should not require an activation record on the stack. In principle, a tail call is faster than a non-tail call, because we don't need a new activation record at all; we just jump into the next computation.
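
CPython does not perform tail call optimization, so the effect can only be imitated there. The sketch below uses a trampoline, which is a swapped-in technique rather than true tail call elimination: the tail call is returned as a thunk and run in a loop, so the stack does not grow. The names `trampoline` and `sum_to` are made up for the example.

```python
def trampoline(func, *args):
    """Run func, then keep calling any returned thunk until a value appears."""
    result = func(*args)
    while callable(result):
        result = result()  # each "tail call" runs in a fresh, shallow frame
    return result


def sum_to(n, acc=0):
    """Sum 1..n in tail-recursive style, returning the tail call as a thunk."""
    if n == 0:
        return acc
    return lambda: sum_to(n - 1, acc + n)  # deferred tail call, not a nested one


# Far deeper than Python's default recursion limit, yet the stack stays flat.
total = trampoline(sum_to, 100_000)
```

In a language with real tail call optimization the lambda and the loop disappear, and the recursive call compiles to a jump.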

Unfortunately, these features (except records) tend to have significant implications for the implementation of the language, and for the hurdles that must be overcome to safely and easily interact with C/C++.
Quote:
Streams - To be honest, I don't find myself using these all that much, but there are times when you want the elegance of monadic structure without the overhead of performing unnecessary computations just to maintain the sense of orthogonality. I do often compare them in spirit to Python's generators, although I personally feel that generators are a cluttered and confused approach.

What is it about generators that you find "cluttered and confused"? Personally, I think they're great, especially in terms of functional-style routines and iteration.
Quote:Original post by bytecoder
What is it about generators that you find "cluttered and confused"? Personally, I think they're great, especially in terms of functional-style routines and iteration.

What's happening with generators on a line-by-line basis isn't very apparent. They add an execution path and complicate things in much the same way that threads do. I'm not denying their usefulness, but they do add clutter and confusion. My opinion is that they aren't needed if your language has other, more elegant constructs.

Quote:Original post by Extrarius
What I'm looking for is ideas about what I should take into consideration when designing a scripting language, explanations of existing languages and what considerations led to their design, and any related resources.

The planned features of the scripting language that I'm working on are here. I discuss very lightly why I've chosen each feature, but more discussion in relation to why it should / could be in your language is welcome.

State what sort of language you want, and what you want to do with it. Languages range from the esoteric to the blindingly simple, and all can be used as a scripting language, so a little direction would be good. If you could explain your current idea then we could comment on it; most design issues will arise in the details, not the overall concept.

At the moment you could potentially make every conceivable mistake :P.

If you like the sound of the language I'm working on, and don't have too many groundbreaking modifications that you want to make, you're welcome to help work on it.

This topic is closed to new replies.
