pinacolada

Your ideal syntax


So I'm making a programming language and I'd like it to have a really nice and readable syntax. But I can't decide on some things. First off, what do you think would be an ideal syntax for a language that is primarily used for coding games? Second, take a look at what I have and tell me if you think it sucks.
-- Comments start with two dashes

-- Function calls look like this
function(arg0, arg1)

-- Instead of declaring variables, you attach names to expressions
a = function(arg0, arg1)

-- There's infix operators
a = 1 - 2

-- Blocks start with keywords and end with 'end' (like Ruby)
if something
  do(something)
end

-- But I can't decide if blocks should have significant indentation like Python. What do you think?
if something:
  do(something)

-- Or possibly curly brackets
if something {
  do(something)
}

-- This is a function definition.
-- The language is statically typed, so each argument has a type name.
def my_func(int a, int b)
  do(something)
end

-- This is a function that returns something, the return type is declared with ->
def my_func(int a, int b) -> int
  return a - b
end

-- But maybe the 'return' should be optional, and the function should just return the last expression?
def my_func(int a, int b) -> int
  a - b
end

-- Also there are different possible ways to specify the return type
def my_func(int a, int b) { a - b }  -- implicit return type
def my_func(int a, int b) :: int { return a - b } -- two colons like Haskell
int my_func(int a, int b) { return a - b }  -- return type name in front, C style

-- Here's a real function which draws a filled rectangle
-- The 'gl' part at the beginning of gl:triangles is a namespace
-- The square brackets indicate a list
def fill_rect(Rect rect, Color color)
    "Draw a filled rectangle."
    nw = [rect.x1 rect.y1]
    ne = [rect.x2 rect.y1]
    sw = [rect.x1 rect.y2]
    se = [rect.x2 rect.y2]
    gl:triangles([nw ne se se sw nw] color)
end

-- The -> operator is used to feed the input on the left to the function on the right.
-- It's there so that the writer can list the sequence of calls in left-to-right order
-- instead of right-to-left.
some_function() -> print
-- is the same as:
print(some_function())

-- The @ operator is shorthand that takes a name, and rebinds the overall result to
-- that name. It's there to make it easier to write a chain of expressions that
-- all operate on a certain name. Does this suck?
add(@a, 2)
-- is the same as:
a = add(a, 2)


This post is getting long so I'll stop there. TIA for feedback.

It looks pretty standard, but could you elaborate on the notion of "named expressions"? How does a named expression behave or work?
Is the language typed?
Can you foresee a roadmap for the future?
What is the goal of this language? I understand it is for "coding games", but how would it win over competitors? Can you compare, at least in theory, what you're going to build with some alternatives?
Are you sure you want to use "--" for comments, considering this could be the decrement operator?

For me, the ideal language for writing games is UnrealScript... States do rock, and replication leaves me in total awe. It's probably unnecessarily complicated for everything but the highest-end games.

What's it called? YAPL? Yet Another Programming Language? xD

Personally I'm really not fond of the Basic style of if statements; I'd rather have brackets instead of 'end', since brackets can be aligned under each other (matching brackets in the same column). Didn't scroll down...
Furthermore, a semicolon at the end of the line feels more natural. It gives a good marker for the end of the statement (and also allows multiple statements per line, not that you want that in most cases). But it might be my C influences preferring this ^^
I like how you treat a function as a variable, since it is! It's merely a pointer to data in the code segment.

Besides that, will it be a programming language or a scripting language? I mean, will you compile to machine code at compile time, will you compile to bytecode at compile time and interpret the bytecode at run time, or will you interpret the code entirely at run time?

The syntax looks much like Lua btw, are you sure Lua will not fit your needs? Some big boys use it for their games too, so it's suited for that. Or is this more of a learning project?

Quote:
Original post by jwezorek
I'd have multiple assignment and multiple returns.
so you could have like:

a, b = 5, 6;

and

foo, bar, mumble = froboz();


I like that. I always wondered why this was not possible in C, I think it can be done on the stack much like the input variables for functions (which is actually a tuple too). I do admit it is not very common in mathematics...
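
A minimal Python sketch of what multiple assignment and multiple returns look like in a language that already has them. The body of `froboz` is made up for illustration (the quote only gives its name). Python packs the returned values into a tuple and unpacks it on the left-hand side, which is close to the tuple idea above.

```python
def froboz():
    # Hypothetical body: the thread only names froboz(), so these values are invented.
    # The comma-separated values are packed into a single tuple.
    return 1, "two", 3.0

# Multiple assignment: the right-hand side is evaluated first, then unpacked.
a, b = 5, 6

# Because the right-hand side is built first, swapping needs no temporary.
a, b = b, a

# Multiple returns: the returned tuple is unpacked into three names.
foo, bar, mumble = froboz()
```

Whether a new language exposes the tuple to the programmer or only allows the unpacking form is a separate design choice; this sketch shows the first option.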

[Edited by - Decrius on February 16, 2010 2:48:22 AM]

I think one should always go new ways when designing a new language.
I'm working on my own programming language ATM and going through some of the same issues.

I think the new language designer should take extra care not to just extend existing languages. For instance, see how different a functional programming language is from, e.g., C++.

I think the language looks OK, but as mentioned there's a lot of Lua/Python in it.
But it's a great learning experience (not to mention great fun), so you DON'T have to invent anything truly groundbreaking :)

Feel free to PM me for my MSN as I'd love to talk about language design and implementation ;)

Also, you should not call a bytecode-compiled language a scripting language. Java and .NET are bytecode compiled. (With JIT, I know, but still.)
Scripting languages are used to extend something, but it's still a programming language that's being used. C# can be used for scripting too :)

And writing byte code or ASM is pretty close to each other depending on how you do it. If you can do one, you can (most likely) do the other.

First and foremost, I demand strong typing for making real games in a language. Python and Ruby are excellent for small, simple games, but the type system makes them much more prone to critical (runtime!) errors.

Another thing to consider is OO and encapsulation. While both Ruby and Python are OO languages in a sense, they lack many features that I consider critical. For example, Python's encapsulation is... Well, there really isn't any to speak of. Ruby allows for new methods and data to be added to a class from anywhere, rendering the encapsulation that actually exists useless against a determined "attack".

Furthermore, both of these languages are "duck typed", and, as the Ruby folks like to put it: "if it looks like a duck and quacks like a duck, it's probably a duck". This discourages the use of inheritance for flexibility and instead encourages its use as a means of code reuse. As a C++ programmer I consider this bad.

As for the actual syntax, I agree with Krohm on the use of the '--' operator as comment symbol. There are much better (and more widely used) comment symbols, and you get the extra benefit of not having your language mixed up with Lua all the time. ;)

I like the omission of the semicolon, which I quite honestly find rather pointless. Ending blocks with a keyword rather than a bracket is something I can deal with, though I really prefer the good old {} pair. :)

Perhaps you could show us some more? A function declaration? A class?

Also, will your language be compiled to native code or some intermediate bytecode?

I think any new programming language can't simply be a syntactic wrapper over the familiar constructs and methodologies supported by common languages. Even combining language elements of different progenitors under a new umbrella language is not terribly compelling, unless the integration is unique or very elegant.

And certainly targeting a 'niche' market such as game programming, which does not actually have any hard distinction from other types of programming in the way that, say, critical/verifiable systems have, is a bit of a fool's errand in practical terms.

There has to be some specific goal, preferably one which meets a need that is unfulfilled in current programming languages, in order to have any hope of being anything more than a toy language all of a dozen people might try out. In other words, try to do something that's more than superficially different -- try to do something novel.

Take Lua, for instance, which was probably the first scripting language which placed a high degree of importance on being embeddable -- that is, integrating nicely into another application.

Or take SPECS, which is explicitly a new syntax for C++, but its specific goal is to make the syntax more consistent (from a human perspective) and also completely unambiguous (from the compiler's perspective) -- which largely go hand-in-hand, but which attempts to address very significant issues with C++ from a usability, maintainability and evolutionary standpoint.

Or take Epoch, created by a member of this forum whose identity escapes me this moment, but is meant to unify heterogeneous (meaning non-similar) computing resources under a single language and run-time system. The ideal of this language is to write one piece of code which can run on the CPU, GPU, a DSP, or any other type of execution resource -- possibly even broken down and assigned to the most appropriate execution resource intelligently.

One of the most tempting pitfalls of starting a new language project is getting caught up in designing the syntax. It is incredibly easy to focus on this, as it seems to be a legitimately significant question.

What I've learned over the course of building several languages, however, is that syntax is virtually irrelevant. Of all the things you need to consider when starting a new language, syntax is basically the bottom of the list.

Syntax is a means to an end, not a goal in itself (unless you truly have some novel idea for how the syntax should work, which it appears that you do not in this case). All of the other decisions you need to make will influence the syntactical design. You will need to mold the syntax itself to suit the features and capabilities of the language.

Placing syntax first is, again, very tempting, and easy to get caught up with. But sticking too closely to a rigid syntax design will damage your ability to include other language features cleanly. For instance, by choosing "--" as your comment delimiter, you have completely eliminated the possibility of including the post-/pre-decrement operators (a-- and --a, respectively).


There is a great series of questions that you should be able to answer about your language. Some of them you can safely postpone until later, but none of them can be totally ignored. I would recommend getting to a point where you can answer the majority of those questions before you get too deep into coming up with syntax ideas.

Last but not least: remember that for every syntactic decision you make that is different from existing languages, you will introduce a point of confusion for newcomers. Everything that is unique will actually make your syntax less readable and intuitive, because by definition, what your syntax does is different from what the reader will expect from past experience. Achieving readability is much harder than it sounds.


In any case, best of luck with your project, and enjoy it - it's a very rewarding ride [smile]

That list only needs one question: Do you want to bore yourself to death or have fun writing the language? :)

Seriously. It's a great list, but just designing the syntax can also have its merits :)

The language should be your own dream language IMHO. And if the idea is good, others might follow. If not. Well. At least you have your own dream language :)

But: it's important to think about types, functions, program flow and collections before or during the design of the syntax.

It's not nearly that bad [wink] I've been through that list several times and I'm still having a hell of a lot of fun on my own project, so it isn't impossible to both analyze heavily and still enjoy the task.

Quote:
Seriously. It's a great list, but just designing the syntax can also have its merits :)


Easy enough to say, but I'm not convinced... what merits do you see, specifically?


Quote:
The language should be your own dream language IMHO. And if the idea is good, others might follow. If not. Well. At least you have your own dream language :)


Creating a "dream language" is far more involved than just making a pretty syntax. If you aren't careful, you'll end up with a nice syntax for 5% of the features, and a hideous mass for everything else. It's critical to think of what the syntax "means" under the surface, and how the internals of the language come together. Sometimes moving a single character around can spell the difference between a nice, simple parser, and a horribly complicated bog of nastiness (like C++).


In any case, I don't think that this is really that sort of situation. The OP posted asking for input on the syntax - which indicates to me that the OP doesn't already have a clearly defined "dream syntax" to be working towards.

Quote:
Original post by Windryder
First and foremost, I demand strong typing for making real games in a language. Python and Ruby are excellent for small, simple games, but the type system makes them much more prone to critical (runtime!) errors.


That's why it would be awesome if someone would make a statically typed, properly compiled Ruby clone. Of course, that would require stripping some features, but I think the end result would be worthwhile. C/C++'s speed fused with Ruby's syntax, that's my ideal language. (And maybe add Python-like indentation treatment, since that contributes to overall readability.) Now that there is LLVM, the bar to create a full-blown optimizing compiler for your language is significantly lowered. (But it's still a pretty complex project.)

Basically a mildly revised version of Tangent syntax I used before.

Standard C# comments
Basic literals (no octal/hex, no float specifiers, no sci-notation)
Standard C# identifiers (expanded to include more unicode categories)
Expanded symbol/operator list (again, including more unicode categories)
If/while/do-while/foreach as in C#
Method invocation via juxtaposition
Explicit returns required

and pseudo-BNF

expression ::= (identifier|symbol|literal)+ // and a few special things like lambdas
parameter-definition ::= (expression:expression)|(comma-delim-var-list)
base-class ::= expression
return-type ::= expression
class-def ::= class-modifiers? class (identifier|symbol|parameter-definition)+ (: base-class)? class-def-block
method-def ::= method-modifiers? (identifier|symbol|parameter-definition)+ => return-type (block|;)


Thus you end up with stuff like:

public add(a:int, b:int) => int {...} // add(1,2);
public (a:int) plus (b:int) => int {...} // 1 plus 2;
public (a:int) + (b:int) => int {...} // 1+2;
public shoot (target:Actor) with (weapon:Item) => void {...}

I would try to design a set of language semantics that can be transferred to different syntactic front-ends. That should help with putting syntax in the right place on the priority list. FWIW, typing discipline is not fundamentally a syntactic issue.

Thanks all for the excellent feedback. I heart Gamedev.

Multiple return values: I was strongly considering these so now I'll definitely add them.

I'm surprised the comment syntax got so much attention :) I'll most likely change that.

Here's some requested context information. The language's name is Circa (feel free to let me know if you hate the name).

The goal is a scripting / rapid prototyping language, along the same lines as Processing.

It is primarily executed by an interpreter, although in the future I'd like to add JIT or C++ cross-compilation.

It's statically typed with type inference. It does support dynamic typing (using a type called "any").

It does have some interesting features. One that I have in mind is that compiled code should be really easy to introspect and modify. Outside of the Lisps, not many languages have this as a priority. An example of what this could mean is that you could have an IDE which allows you to visualize a piece of code as a dataflow diagram, and even make changes to the diagram, and then save those changes back to text format.

Another interesting feature is that it has really good support for live code reloading, which is pretty fun for prototyping stuff.

Quote:

It looks pretty standard, but could you elaborate on the notion of "named expressions"? How does a named expression behave or work?


In a C-like language, if you had code like this:

a = 1
a = 2

it means that 'a' refers to some piece of memory, the first statement sets that memory to 1, and the second statement sets it to 2.

In a more functional language, you don't think of it like that. Instead you have an expression '1' which has the name 'a' attached to it. Then you have another expression '2' which also has the name 'a' attached. The name doesn't necessarily refer to a specific location in memory.
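
Python is not Circa, but it makes the same distinction visible, so here is a small sketch under that assumption. Rebinding a name attaches it to a new value and leaves anything else that refers to the old value alone; mutating in place is the "piece of memory" model. All names are made up for illustration.

```python
# Rebinding: the name 'a' is attached to a new value each time.
a = (1, 2)
snapshot = a          # a second name for the same value
a = a + (3,)          # builds a new tuple and rebinds 'a'; the old value is untouched

# In-place mutation: both names see the change because they share one object.
xs = [1, 2]
alias = xs
xs.append(3)          # modifies the object that both names refer to
```

After this runs, `snapshot` is still `(1, 2)` while `alias` has become `[1, 2, 3]`.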

Quote:
In any case, I don't think that this is really that sort of situation. The OP posted asking for input on the syntax - which indicates to me that the OP doesn't already have a clearly defined "dream syntax" to be working towards.

Yeah that is the case. The syntax is probably the least interesting of Circa's features; if I'm able to come up with a syntax that is relatively clean and readable and not hated by future generations, then I'll call that a success.

That looks pretty cool. But I don't see any top-level game loop. How does that work?

On a less important note, I don't really like having to use the state keyword to instantiate a type. It seems like I should just be able to say Ship ship = [...] and be done. Is it required for all mutable state?

I think I would say to stick with one style for function definitions and such. Seeing Ruby-style, C-style, Python-style, and Haskell-style syntax elements all in one program would be very confusing. Leave open the possibility of one-liners, but other than that cut it down.

Anyway, I certainly don't think it will be hated by future generations, at least not for its syntax [wink].

If you haven't already, you might want to take a look at Go. It has some interesting features.

Quote:
Original post by theOcelot
That looks pretty cool. But I don't see any top-level game loop. How does that work?


It loops over the whole script for every frame, which I figure is OK default behavior for a rapid-prototyping language.

Quote:

On a less important note, I don't really like having to use the state keyword to instantiate a type. It seems like I should just be able to say Ship ship = [...] and be done. Is it required for all mutable state?


Yeah it's required; in practice I don't think this is too bad. The language encourages you to use pure functional expressions, so you don't need as much mutable state as you would with a C-like language. Also, "state" variables in Circa behave in slightly weird ways.

Quote:

I think I would say to stick with one style for function definitions and such. Seeing Ruby-style, C-style, Python-style, and Haskell-style syntax elements all in one program would be very confusing. Leave open the possibility of one-liners, but other than that cut it down.

Yeah, you are probably right about that.

Quote:

If you haven't already, you might want to take a look at Go. It has some interesting features.


I really like their emphasis on interface types; that feels like the right way to do OO.

Quote:
Original post by DevFred
The syntax should be easy to parse for automatic refactoring (and other) tools.

In my opinion, absolutely not!!! You should provide the source code for the scanner and the parser for people to use in tools. I don't care if the syntax is hard for the computer to understand if humans find it very natural. Take, for example, an idea I have:

The if statement looks something like "if( expression) instruction", where `instruction` can be a block of instructions: "{ instruction ... }". Using {} is good, but say you don't want to use them. The same language would also accept the following as valid syntax:
"if( exp) instruction... instruction endif". This is a context-dependent grammar and a pain in the ass to parse. There is a small trick I'm thinking of for using an LL(1) parser to parse this particular case of the `if` statement.


The language should be strongly typed, with C++-like templates (even more powerful with `static if`, something like the D programming language):
example:

struct foo
{
    int a, c, b;
};
template <attribute A, operator OP, class T> // or template <attribute T::A, class T>
bool compare(T & left, T & right)
{ return OP(left.A, right.A); }
// you could spawn any compare function for a struct; ex: compare<foo::a, int::operator <=, foo>


I would add type inference with a nice construct, e.g.: int a; typeof(a) b;
and `auto` types: "auto a=..." or "a <- ...", where the type of a is deduced from the value assigned.

Comments should not be started with `--` (instead use the classic // and /*) because you might want to use that as an operator: int a = 10; a--;

The syntax has to be very, very simple to write! I find pressing Shift (lots of times) on my keyboard very frustrating when writing something like this in C++: foo<int>::method(my_instance,1 + 2 + 3,"adfgh");

Address the issue of strings:
For the sake of this world please use ' or ` not " (because you don't have to press shift) when starting them.
When you have something like:
 message = 'Error : line:' + i + ' row:' + j + ' message'; 

create a system that allocates the memory space necessary to fit that in only one go. Take C++-like strings, for example: every time you call the + operator, more space is allocated, the contents are copied into it, and then the old space is released (I find this very stupid and slow).
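
One common way to get the "one go" behaviour is to collect the pieces first and concatenate once at the end. A rough Python sketch of the idea: `build_message` and its parameters are invented for illustration, and the claim that the join sizes the result up front before copying describes how implementations typically handle it, not something the language guarantees.

```python
def build_message(line, row, text):
    # Gather every piece first; nothing is concatenated yet.
    parts = ["Error : line:", str(line), " row:", str(row), " ", text]
    # A single join can work out the total length and copy each piece once,
    # instead of creating a new intermediate string for every '+'.
    return "".join(parts)

message = build_message(3, 7, "unexpected token")
```

A compiler for a new language could do the same rewrite automatically whenever it sees a chain of + on strings.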


Parallelism is the future, so you need to have it. I would dump classic OOP and trade it for a new concept that can be parallelized more easily. Do a Google search on OpenCL.
In my opinion the only synchronization mechanisms you would ever need are semaphores and barriers (no mutex, no Critical_Section, no Events).

This is all the stuff I can think of right now.
Hope it helped.

Raxvan.

Quote:
For the sake of this world please use ' or ` not " (because you don't have to press shift)
when starting them.


That only applies to American QWERTY keyboards :)

Quote:
Original post by Makaan
Quote:
Original post by DevFred
The syntax should be easy to parse for automatic refactoring (and other) tools.

In my opinion, absolutely not!!! You should provide the source code for the scanner and the parser for people to use in tools. I don't care if the syntax is hard for the computer to understand if humans find it very natural.

While releasing the source is a good idea, ease of parsing for computers and humans are not mutually exclusive.
Quote:
Original post by pinacolada
It loops over the whole script for every frame, which I figure is OK default behavior for a rapid-prototyping language.

That should be optional, at least. It might be convenient in simple cases, but I think I'd prefer just having an easy API for loop timing and control and handling it myself. Eventually, a user is going to want more control over their main loop.

I don't understand how @ is used, especially in the for loops. Could you explain that more?

Quote:
Original post by Makaan
For the sake of this world please use ' or ` not " (because you don't have to press shift)


Or, allow both? Many people have their preferences, just allow whatever is possible and more people will like it. Must admit ' instead of " ain't a bad idea.

Quote:
Original post by theOcelot
I don't understand how @ is used, especially in the for loops. Could you explain that more?

In Circa, the @ symbol means in general that "this name is rebound to the result of this expression". In a for loop, it means that the list can be "modified" by the code inside the loop (although the list isn't really modified; instead it makes a new list and rebinds the name to that result). Example code:

list = [1, 2, 3]
for i in @list
  i += 1
end
print('list = ', list)
// prints: list = [2, 3, 4]

It's kind of like the map() function in Lisp/Ruby/Python/etc, but with ordinary for-loop syntax instead of passing a function or lambda.

The motivation is that I'm trying to minimize side effects; code shouldn't modify values in-place, instead it should return new values (and possibly rebind existing names to the new values). So the different usages of the @ symbol are all about making it easier to write code under those restrictions.
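
For readers who don't know Circa, here is roughly how the two uses of @ could be spelled out in Python. This is only an analogy under the description above, not Circa's actual implementation; `add`, `items` and `original` are illustrative names.

```python
def add(a, n):
    # Pure function: returns a new value and changes nothing in place.
    return a + n

# add(@a, 2) is described as shorthand for rebinding 'a' to the result.
a = 1
a = add(a, 2)

# 'for i in @list ... i += 1 ... end' builds a new list and rebinds the name.
items = [1, 2, 3]
original = items                    # keep a second reference to the old list
items = [i + 1 for i in items]      # new list; the old one is not modified
```

The old list stays intact for anyone still holding it, which is the side-effect-free behaviour the @ shorthand is meant to make convenient.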
Quote:
Original post by theOcelot
That should be optional, at least. It might be convenient in simple cases, but I think I'd prefer just having an easy API for loop timing and control and handling it myself. Eventually, a user is going to want more control over their main loop.

Yeah that is true, this will probably be supported later on.
Quote:
Original post by Decrius
Or, allow both? Many people have their preferences, just allow whatever is possible and more people will like it. Must admit ' instead of " ain't a bad idea.

One nice thing about allowing both is that you don't need to put slashes in front of quote marks as often.

str1 = 'This "string" has unescaped double quotes in it'
str2 = "This string has unescaped 'single quotes' in it"

Quote:
Original post by Makaan
Comments should not be started with `--` (instead use the classic // and /*) because you might want to use that as an operator: int a = 10; a--;

Heh, a few people have pointed this out. The other side is that I wanted to use // to mean something else. In Python 3, // means integer (floor) division and / means floating-point division (it's not based on whether the arguments are ints or floats). I was thinking of copying that. I guess that leaves # for the comment syntax.
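
A quick Python 3 illustration of the two operators, for anyone who hasn't used them. The variable names are just labels for this sketch.

```python
true_div = 7 / 2         # 3.5: '/' always performs true division
exact_div = 6 / 3        # 2.0: still a float, even though it divides evenly
floor_div = 7 // 2       # 3: '//' rounds down to a whole number
negative = -7 // 2       # -4: it floors toward negative infinity, not toward zero
float_floor = 7.0 // 2   # 3.0: with a float operand the floored result is a float
```

The last two lines are the details worth deciding on deliberately if the operator gets copied into a new language.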

I think I'm starting to get it. I'll probably understand it after it percolates in my brain for a while. [smile]

I'd say go ahead and use '#' for comments, since it's sort of a standard. Python, shell scripts, Ruby, Perl, and heaven knows how many other languages use it.

Quote:
Original post by Windryder
First and foremost, I demand strong typing for making real games in a language. Python and Ruby are excellent for small, simple games, but the type system makes them much more prone to critical (runtime!) errors.


I object: Python and Ruby are strongly typed. You are confusing "strong vs. weak" with "dynamic vs. static".
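
A small Python sketch of the difference. `try_add` is a helper invented for this example. Dynamic typing means a name can be rebound to values of different types at run time; strong typing means the language refuses to silently coerce unrelated types.

```python
def try_add(x, y):
    # Returns the sum, or a description of the TypeError if the types don't mix.
    try:
        return x + y
    except TypeError as exc:
        return "TypeError: " + str(exc)

# Dynamic: the same name can refer to an int, then to a str.
value = 1
value = "one"

# Strong: adding a str and an int is rejected instead of being coerced.
ok = try_add(1, 2)
mixed = try_add("1", 1)
```

In a weakly typed language the second call might quietly produce "11" or 2; here it is an error, just one that shows up at run time.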

Quote:
Another thing to consider is OO and encapsulation. While both Ruby and Python are OO languages in a sense, they lack many features that I consider critical.


Ah, yes, well. :)

Quote:
For example, Python's encapsulation is... Well, there really isn't any to speak of. Ruby allows for new methods and data to be added to a class from anywhere,


Python allows this too. The only real difference is that the Python community sort-of-officially discourages it, whereas Ruby has standard frameworks that use it to do wondrous things.
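
For the curious, this is what adding a method to a class "from anywhere" looks like in Python. `Ship` and `describe` are hypothetical names for this sketch.

```python
class Ship:
    def __init__(self, name):
        self.name = name

# Defined outside the class, possibly in a completely different module.
def describe(self):
    return "Ship " + self.name

existing = Ship("Rocinante")

# Attach the function to the class after the fact.
Ship.describe = describe

# Instances created before the patch pick up the new method too.
description = existing.describe()
```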

Quote:
rendering the encapsulation that actually exists useless against a determined "attack".


Repeat after me: Encapsulation is not in any way, shape or form a security measure. It cannot protect your code from malicious attack, in any language, because it's only relevant to developers. If you can't trust the people on your own team, you have bigger problems than your choice of language. It is there to prevent people on your team from getting the wrong impression about what's part of the interface to a module or not. But you have documentation for that, right?

Quote:
Furthermore, both of these languages are "duck typed"


You're restating yourself, but more accurately this time.

Quote:
This discourages the use of inheritance for flexibility


As it should; composition is usually what you want instead.

But more importantly, it enables you to program to an interface and not worry about boilerplate. If Foo and Bar are conceptually Spammable, but get spammed() in completely different ways, is it useful or necessary to have a base Spammable definition (which might not provide any useful shared code) - or even to name the concept?
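
A minimal Python sketch of that Foo/Bar situation. The method bodies and `spam_all` are made up; the point is only that no shared Spammable base class is declared anywhere.

```python
class Foo:
    def spammed(self):
        return "Foo got spammed by email"

class Bar:
    def spammed(self):
        return "Bar got spammed by comments"

def spam_all(targets):
    # Works on anything that has a spammed() method; no common base type needed.
    return [target.spammed() for target in targets]

results = spam_all([Foo(), Bar()])
```

A statically typed language can get a similar effect with structural interfaces, which is the approach Go takes.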

Quote:
and instead encourages its use as a means of code reuse.


Not really. In my experience, "code reuse" is a red herring anyway; the real goal is code use - i.e. actually calling the functionality you already have rather than rewriting it. There are a zillion ways to do this. Inheritance is rarely the right way, but it's one of the few things it's actually good for. Especially when you have higher-order functions and can therefore customize behaviour with all kinds of neat template-and-hook patterns.

For what it's worth, about the only time I ever (manually) use inheritance in Python at all is when an API tells me it expects me to.

The short version of all that: you don't have to like it, but there are real reasons why others do.

Quote:
As for the actual syntax, I agree with Krohm on the use of the '--' operator as comment symbol.


The overall impression of a language's syntax is important, yes. But details like this are getting into bikeshed-colour-choice territory.
