Creating a Scripting Language

dangerdan9631    100
Hey, I am planning on making a scripting language in my free time. Below is the BNF I came up with. I just wanted to get some feedback on things I might be missing, things I could do better, any problems I'm setting myself up for, etc. Thanks!
***** DEFINITIONS 
definition	=> <functionDef> | <objDef> | <varDef>
functionDef	=> (<dataType> | "void") <identifier> "(" (<dataType> <identifier> ("," <dataType> <identifier>)*)? ")" <block>
objDef		=> "object" <identifier> "{" <varDef>* "}"
varDef		=> <dataType> <identifier> ("=" <expression>)? ("," <identifier> ("=" <expression>)?)* ";"
dataType        => ("string" | "number" | <identifier>) ("[]")?

***** STATEMENTS 
statement	=>(("case" (<string>|<number>) | "default") ":")*
			| <expression> ";"
			| <block>
			| "if" "(" <expression> ")" <statement> ("else" <statement>)?
			| "switch" "(" <expression> ")" <statement>
			| "while" "(" <expression> ")" <statement>
			| "do" <statement> "until" "(" <expression> ")"
			| "repeat" "(" <expression> ")" <statement>
			| "for" "(" <expression>? ";" <expression>? ";" <expression>? ")" <statement>
			| "continue" ";"
			| "break" ";"
			| "return" <expression> ";"

block		=> "{" (<varDef> | <statement>)* "}"

***** EXPRESSIONS 
expression	=> (<unary> (<assignOp>))* <ternary>
ternary		=> <or> ("?" <expression> ":" <expression>)?
or		=> <and> ("||" <and>)*
and		=> <equality> ("&&" <equality>)*
equality	=> <relational> (<equalOp> <relational>)*
relational	=> <additive> (<relatOp> <additive>)*
additive	=> <multiplicative> (<addOp> <multiplicative>)*
multiplicative	=> <unary> (<multOp> <unary>)*
unary		=> <prefixOp>? (<identifier> | <string> | <number> | <funcCall> | "(" <expression> ")")
							("[" <expression> "]" | "." (<identifier> | <funcCall>) | <postfixOp>)*
funcCall	=> <identifier> "(" (<expression> ("," <expression>)*)? ")"

***** TERMINALS 
string		=> '"' ([!'"'] | <escapeSeq>)* '"'
escapeSeq	=> [!"\"]("\\" | '\"')
number		=> [0-9]* "."? [0-9]+
assignOp	=> "="  | "+=" | "-=" | "*=" | "/="
equalOp		=> "==" | "!="
relatOp		=> "<"  | "<=" | ">"  | ">="
addOp		=> "+"  | "-"
multOp		=> "*"  | "/"  | "%"  | "div"
prefixOp	=> "-"  | "!"
postfixOp	=> "++" | "--"
identifier	=> [a-zA-Z_][a-zA-Z_0-9]*
whitespace	=> " " | "\n" | "\t" | "//" [!"\n"]* "\n" | "/*" [!"*/"]* "*/"
***** Changes
<3/30/10>
- Changed dataType to: ("string" | "number" | <identifier>) ("[]")?
			from ("string" | "number") ("[]")?
- Renamed "struct" to "object" and <structDef> to <objDef>
- Changed block to: "{" (<varDef> | <statement>)* "}"
			from "{" <varDef>* <statement>* "}"
- Renamed <conditional> to <ternary>
- Changed ternary to: <or> ("?" <expression> ":" <expression>)?
			from <or> ("?" <expression> ":" <ternary>)?
- Changed functionDef to: (<dataType> | "void") <identifier> ...
			from <dataType> <identifier> ...
- Changed string to: '"' ([!'"'] | <escapeSeq>)* '"'
			from '"' [!'"']* '"'
- Added escapeSeq => [!"\"]("\\" | '\"')
- Changed prefixOp to: "-"  | "!"
			from "++" | "--" | "-"  | "!"
[Edited by - dangerdan9631 on March 30, 2010 7:38:11 PM]
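
One way to sanity-check the layered expression rules above is to run them through a small recursive-descent parser. The following is only a sketch in Python (the thread's actual implementation language is C#), and it covers a subset: the binary levels from <or> down to <multiplicative>, a single optional prefix operator, numbers, identifiers, and parentheses. Assignment, the ternary, strings, function calls, and postfix operators are left out, and the names `tokenize`, `Parser`, and `parse` are made up for this illustration.

```python
import re

# Tokens loosely follow the TERMINALS section: numbers, identifiers,
# two-character operators first, then single-character operators.
TOKEN_RE = re.compile(
    r"\s*(\d*\.?\d+|[A-Za-z_][A-Za-z_0-9]*|\|\||&&|[=!<>]=|[-+*/%()<>!])"
)
OPERAND_RE = re.compile(r"\d*\.?\d+|[A-Za-z_][A-Za-z_0-9]*")

# Binary operator levels from loosest to tightest binding, mirroring
# or / and / equality / relational / additive / multiplicative.
LEVELS = [
    {"||"}, {"&&"}, {"==", "!="}, {"<", "<=", ">", ">="},
    {"+", "-"}, {"*", "/", "%", "div"},
]

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def binary(self, level=0):
        # Each level parses the next-tighter level, then folds in any
        # operators of its own, which makes them left-associative.
        if level == len(LEVELS):
            return self.unary()
        node = self.binary(level + 1)
        while self.peek() in LEVELS[level]:
            op = self.advance()
            node = (op, node, self.binary(level + 1))
        return node

    def unary(self):
        # A single optional prefix operator, as in the <unary> rule.
        if self.peek() in ("-", "!"):
            op = self.advance()
            return (op, self.primary())
        return self.primary()

    def primary(self):
        token = self.advance()
        if token == "(":
            node = self.binary()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        # "div" is an operator keyword, so it cannot be used as an operand.
        if token is None or token == "div" or not OPERAND_RE.fullmatch(token):
            raise SyntaxError(f"expected an operand, got {token!r}")
        return token

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.binary()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Calling `parse("1 + 2 * 3")` yields a nested tuple in which the multiplication sits below the addition, which is the precedence the grammar's layering is meant to encode.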

haegarr    7372
I mostly have criticism, but don't get discouraged ;)

BTW: I think a syntax description is only one part of the picture.

From the syntax I can see that
* functions are not 1st class values,
* function nesting isn't supported,
* hence I assume that closures are not possible,
* co-routines are not supported.

Scripters typically love the missing features listed above.
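
For readers who haven't met these terms, here is a short illustration in Python, chosen only because it supports all four features; it is not the language proposed in this thread, and the names `make_counter` and `countdown` are invented for the example.

```python
def make_counter(start):
    count = start

    # Nested function: 'step' is defined inside make_counter.
    def step():
        # Closure: 'step' keeps access to 'count' even after
        # make_counter has returned.
        nonlocal count
        count += 1
        return count

    # First-class value: the function itself is returned like any other value.
    return step

def countdown(n):
    # A generator used as a simple coroutine: execution pauses at each
    # yield and resumes from the same spot on the next request.
    while n > 0:
        yield n
        n -= 1

counter = make_counter(10)
first, second = counter(), counter()
ticks = list(countdown(3))
```

Each call to `counter()` continues from the previous count, and `countdown` hands back one value at a time instead of running to completion in one go.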

I can't examine the exact type system. I see the types string, number, arrays of the former types, and structs. Is the type system weak or strict? Does type coercion happen in some way?

It seems to me that structs cannot be nested. Is this correct?

Does call-by-reference or call-by-value happen for strings, arrays, and/or structs? I'm not sure, but it seems to me that functions must return a string or number or array; no "void" functions allowed?

Nothing is said about the integration with the application.

Nothing is said about execution (although that need not necessarily be part of a language specification): E.g. source interpretation, bytecode interpretation, or what?

Nothing is said about supporting packages/libraries (although again not necessarily part of a language specification).


Apart from that (and any issues I haven't spotted yet), a C-like syntax will probably be well accepted by programmers, and non-programmers may already have seen it once or twice, too.

[Edited by - haegarr on March 30, 2010 9:44:23 AM]

dangerdan9631    100
Thanks a lot! After looking at what you wrote, I have changed some things. I redefined dataType as:
     dataType => ("string" | "number" | <identifier>) ("[]")?

where <identifier> is used to look up the name of a struct.

I also renamed "struct" to "object", as that seems a little less C-like and sounds a little more high-level. Purely an aesthetic choice.


Quote:
* functions are not 1st class values,
* function nesting isn't supported,
* hence I assume that closures are not possible,
* co-routines are not supported.


I chose not to do function nesting and co-routines because I felt that those might be too complicated for the people who might use this. (Yes, this is just for fun, but if it does get used, it will be by people who have never programmed in their lives and thus wouldn't know when to use a nested function or coroutine.) Also, this is my first attempt at this, so I figured I could keep it a little simpler by not including those. But maybe in a later iteration.

As for making functions 1st class values, that might actually be useful. I'll have to think about that and how I can add it in.

Quote:
I can't examine the exact type system. I see the types string, number, arrays of the former types, and structs. Is the type system weak or strict? Does type coercion happen in some way?


The three types would be string, number, and object. All type checking would be done at compile time. The only conversion that would be allowed is an implicit conversion of number to string for the purpose of concatenation.

Ex.
number a = 1, b = 2, c = 3;
string position = "(" + a + ", " + b + ", " + c + ")";

would assign "(1, 2, 3)" to position
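
The rule described above (compile-time checking, with number-to-string as the only implicit conversion and only for concatenation) could be expressed as a small type-checking function. This is a sketch in Python under that reading; `type_of_binary` and `type_of_chain` are hypothetical names, and the treatment of operators other than "+" is an assumption.

```python
def type_of_binary(op, left, right):
    """Return the static result type of `left op right`, or raise TypeError."""
    if op == "+":
        if left == "number" and right == "number":
            return "number"
        if {left, right} <= {"string", "number"}:
            # At least one side is a string here, so any number operand
            # is implicitly converted and the result is a string.
            return "string"
    elif op in ("-", "*", "/", "%", "div"):
        # Assumption: arithmetic other than "+" is defined for numbers only.
        if left == right == "number":
            return "number"
    raise TypeError(f"operator {op!r} is not defined for {left} and {right}")

def type_of_chain(types):
    """Type of t0 + t1 + ... evaluated left to right."""
    result = types[0]
    for operand in types[1:]:
        result = type_of_binary("+", result, operand)
    return result

# The 'position' example: "(" + a + ", " + b + ", " + c + ")"
position_type = type_of_chain(
    ["string", "number", "string", "number", "string", "number", "string"]
)
```

Because evaluation is left to right and the first operand is a string, every later number is absorbed into a string, so the whole expression checks as a string.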

Quote:
It seems to me that structs cannot be nested. Is this correct?


Again, for the sake of simplicity, no, they cannot. However, I did change my dataType definition to allow for recursive struct definitions, which were not possible before.
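
Allowing an <identifier> as a dataType means a field can name an object type that is still being defined, or one defined later in the file. One common way to handle that, sketched here in Python as an assumption rather than anything stated in the thread, is to collect all object names first and only then check the fields. `check_object_defs` and its input shape are invented for this example.

```python
BUILTIN_TYPES = {"string", "number"}

def check_object_defs(defs):
    """defs maps an object name to a list of (field_type, field_name) pairs.

    Returns a list of error messages; an empty list means every field
    type resolves to a built-in type or a declared object.
    """
    # Pass 1: collect every object name up front, so a field may refer
    # to any object, including the one currently being defined.
    known = BUILTIN_TYPES | set(defs)

    # Pass 2: check each field, ignoring an optional "[]" array suffix.
    errors = []
    for obj_name, fields in defs.items():
        for field_type, field_name in fields:
            base = field_type[:-2] if field_type.endswith("[]") else field_type
            if base not in known:
                errors.append(f"{obj_name}.{field_name}: unknown type {base!r}")
    return errors

# object Node { number value; Node next; }
linked_list = {"Node": [("number", "value"), ("Node", "next")]}
```

With this input, `check_object_defs(linked_list)` reports no errors, since `Node` is already known when its own `next` field is checked.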

Quote:
Nothing is said about the integration with the application.

Nothing is said about execution (although that need not necessarily be part of a language specification): E.g. source interpretation, bytecode interpretation, or what?

Nothing is said about supporting packages/libraries (although again not necessarily part of a language specification).


I guess I probably should go over some of that. I am programming it in C#, and I am working on a Windows Forms editor where you can write, save, and run a program and inspect its result. When you run it there, it uses source interpretation. However, to integrate it with other things, the plan is to translate it into C# and then use MSBuild to compile it into a .dll.

Maybe this isn't the best way to do things, but both of those are things that I have wanted to mess around with for a while, so I figured I'd try to kill two birds with one stone.
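
The two execution paths mentioned here (interpreting directly, and translating to C# source) can share one syntax tree. The sketch below, in Python and with invented names `evaluate` and `to_csharp`, shows both walking the same tuple-based tree for a tiny arithmetic subset. The emitted text is only a C#-style expression string, not a complete program, and the MSBuild step is not shown.

```python
import operator

# Arithmetic subset only; a real interpreter would cover every operator.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node, env):
    """Interpret the tree directly; `env` maps variable names to values."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]  # identifier lookup
    return node  # numeric literal

def to_csharp(node):
    """Translate the same tree into C#-style expression text."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({to_csharp(left)} {op} {to_csharp(right)})"
    return str(node)

tree = ("+", "a", ("*", 2, "b"))
value = evaluate(tree, {"a": 1, "b": 3})
source = to_csharp(tree)
```

Keeping the parser output independent of how it is executed means the editor's interpreter and the translator cannot drift apart on what a program means syntactically.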

For supporting packages/libraries, I was planning on just using a #include "filename" directive. That would just insert the text of the file at that spot in the code.
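
Plain textual inclusion is simple to implement, but it needs a guard against files that include each other. A minimal sketch in Python follows; the function name `expand_includes`, the `read_file` callback, and the in-memory `files` dictionary are all assumptions made so the example is self-contained.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')

def expand_includes(name, read_file, active=()):
    """Return the text of `name` with each #include line replaced by the
    text of the named file. `read_file(name)` must return file contents.
    `active` tracks the chain of files currently being expanded."""
    if name in active:
        chain = " -> ".join(active + (name,))
        raise ValueError("circular include: " + chain)
    output = []
    for line in read_file(name).splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            # Insert the included file's (recursively expanded) text here.
            output.append(
                expand_includes(match.group(1), read_file, active + (name,))
            )
        else:
            output.append(line)
    return "\n".join(output)

# In-memory stand-in for the file system, for illustration only.
files = {
    "main.txt": '#include "math.txt"\nnumber x = square(3);',
    "math.txt": "number square(number n) { return n * n; }",
}
expanded = expand_includes("main.txt", files.__getitem__)
```

Note that this version happily includes the same file twice if two different files ask for it, which would produce duplicate definitions; tracking already-included names is one way around that.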


Thanks for the comments! That was exactly what I was looking for!

dangerdan9631    100
Whoops, forgot to reply to this part:

Quote:
Does call-by-reference or call-by-value happen for strings, arrays, and/or structs? I'm not sure, but it seems to me that functions must return a string or number or array; no "void" functions allowed?


Good point, perhaps I could redefine functionDef as

functionDef => (<dataType> | "void") <identifier> ...


Hmmm... That seems a little hacky. The other option would be to just add "void" to dataType and then check that no "void" variables are declared. Hmmm... I'll have to think about that some more when I get a chance.
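
The second option (treat "void" as an ordinary data type in the grammar and reject it afterwards) amounts to a small semantic check over declarations. A sketch in Python, where `check_var_defs` and the (data_type, name) input shape are hypothetical:

```python
def check_var_defs(var_defs):
    """var_defs is a list of (data_type, name) pairs taken from variable
    definitions or function parameters. Returns error messages for any
    declaration whose type is "void" or "void[]"."""
    errors = []
    for data_type, name in var_defs:
        # Strip an optional array suffix so "void[]" is rejected as well.
        base = data_type[:-2] if data_type.endswith("[]") else data_type
        if base == "void":
            errors.append(f"variable {name!r} cannot have type {data_type!r}")
    return errors

problems = check_var_defs([("number", "x"), ("void", "y"), ("void[]", "z")])
```

The trade-off is that the grammar stays uniform while the parser accepts some programs that are only rejected in this later pass.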

DracoLacertae    518
Can you give examples of short programs written in your language? Looking at BNF is OK I guess, but whenever I'm trying to come up with a language, I just start writing stuff in it and see how it looks.

In lieu of 1st class functions, function pointers/references might be useful so you can at least do callbacks. Or put them in a struct and make a lightweight object.

Atrix256    539
Making a scripting language is fun, and cool, and a great learning process.

However!

It's a lot of work, and in the end you (probably!) won't have something as good as the scripting languages already out there.

If you get tired of making your language and decide you want to focus more on your GAME, check out Lua. It's easy to use, open source, and benchmarks show it's one of the fastest scripting languages out there.

I have it integrated into my current project and it's great.

World of Warcraft also uses Lua for its UI system, so you can rely on it being a stable, mature language as well.

My 2 cents for ya!
