What do you think of my scripting language?

17 comments, last by ouraqt 17 years, 9 months ago
All of this ASM is making my brain hurt. I'm not actually developing a compiler/assembler that generates machine code, just one that produces bytecode - which is then interpreted. The bytecode is fairly high-level.

"And, yes, I have to agree that repeat() is a nice feature. I'd like a way to access which number it's on, though."

That's exactly what for loops are for. I have nothing against the concept, but what kind of things would you use a repeat() loop for?

Anyway, I'm still deciding how I'm going to implement strings. Should I use character arrays or a std::string-like class? I'm probably just going to do C-style character arrays and I can write a class to handle all the gritty work (like std::string but implemented in Anaphase).

EDIT: Actually, I'm curious about something. How low-level should I make my bytecode? Should I try to convert all the Anaphase source code into assembly? Or should I try a simpler approach like just removing all the whitespace/comments? Which would be faster (ASM probably)? Also, I guess the bytecode would be simple and language-independent (like MSIL for .NET) if it was just assembly. But it would be a lot of work...
Quote:Original post by ouraqt
And to answer more questions...
Functions are not virtual. Everything is by value, not by reference. (references and pointers can get yucky, I'm trying to keep it simple) Variables can be declared anywhere, but will only last until the scope ends (as documented). And by GC, you mean garbage collection, right? Any memory allocated with the new keyword will automatically be freed when the scope that contains the declaration of the variable (not the allocation) ends.


In other words, there's no GC needed at all.
Honestly, without references (or pointers, whatever), this will be too simple. If it's just for learning, and then you're planning on moving on to implement more complicated features, then that's ok. But if you get down to it, implementing the language, as it is, is getting really simple. You should, IMO, consider at least passing function arguments by reference.

String can be implemented as an intrinsic type. That's how it's done in most cases (in most scripting languages) anyway, so worry not about that.

Oh, and about the repeat keyword: it's almost purely syntactic, so it can be added anytime in the future, once you get the basic compiler working. But I think it's a neat feature. [smile]
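
To illustrate the "purely syntactic" point, here is a minimal Python sketch that rewrites a repeat block into an ordinary counted loop before compilation. The C-like syntax it emits, the function name `desugar_repeat`, and the hidden counter name are all assumptions made up for illustration; none of them come from the Anaphase design document.

```python
def desugar_repeat(count_expr, body_lines, counter="__i"):
    """Rewrite `repeat (count_expr) { body }` as an equivalent for loop.

    The C-like output syntax is hypothetical; the point is only that the
    compiler needs no machinery beyond what a for loop already uses.
    """
    header = f"for (int {counter} = 0; {counter} < {count_expr}; {counter}++) {{"
    indented = ["    " + line for line in body_lines]
    return "\n".join([header, *indented, "}"])


print(desugar_repeat("3", ['print("hello");']))
```

Exposing the counter under a user-visible name would also cover the earlier request for a way to access which iteration the loop is on.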

-----

Quote:Original post by __many_people__
[about repeat, and for, and assembly output]


People, people, don't turn it into optimization wars - again...
It sounds like repeat is a good idea.

"String can be implemented as an intrinsic type. That's how it's done in most cases (in most scripting languages) anyway, so worry not about that."

I was considering that, but then I realized that the memory for strings would need to be dynamically allocated...which is fine, I guess, but all the other intrinsic variables are created on the heap. It just doesn't seem consistent to me. ...Would I still need to create a 'char' type, then? I suppose chars could just be strings with a length of 1.

"In other words, there's no GC needed at all.
Honestly, without references (or pointers, whatever), this will be too simple. If it's just for learning, and then you're planning on moving on to implement more complicated features, then that's ok. But if you get down to it, implementing the language, as it is, is getting really simple. You should, IMO, consider at least passing function arguments by reference."

Well GC isn't needed, but it is handy! :) When exactly is GC really needed? I guess it's when the programmer is too lazy to free up his memory (ie. me) or when the language doesn't allow you to free it manually. ...Anyway, why should function arguments be passed by reference?

Also note that this language will be used for rapid application development, not intricately optimised programs. Kind of a C# thing...

PS: How do I quote someone, like in one of those cool quote boxes? I don't see a button anywhere.
Quote:Original post by ouraqt
I was considering that, but then I realized that the memory for strings would need to be dynamically allocated...which is fine, I guess, but all the other intrinsic variables are created on the heap. It just doesn't seem consistent to me. ...Would I still need to create a 'char' type, then? I suppose chars could just be strings with a length of 1.


Using the heap in a language whose variables are all limited to the scope they were created in is (simply put) not needed at all. A stack is completely sufficient. Hence the irrelevance of GC.
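
A rough sketch of why a stack is enough under those rules: each scope gets a frame, and leaving the scope discards the frame together with everything declared in it. The `ScopeStack` class and its method names are hypothetical, and Python is only standing in for whatever language the interpreter is written in.

```python
class ScopeStack:
    """Variables live in the frame of the scope that declared them."""

    def __init__(self):
        self.frames = [{}]          # frame 0 is the global scope

    def enter(self):
        self.frames.append({})      # a new block starts

    def leave(self):
        self.frames.pop()           # everything declared in the block dies here

    def declare(self, name, value):
        self.frames[-1][name] = value

    def lookup(self, name):
        # Search from the innermost scope outward.
        for frame in reversed(self.frames):
            if name in frame:
                return frame[name]
        raise NameError(name)


scopes = ScopeStack()
scopes.declare("x", 1)
scopes.enter()
scopes.declare("y", 2)
print(scopes.lookup("x") + scopes.lookup("y"))  # both are visible inside the block
scopes.leave()
# "y" no longer exists here; scopes.lookup("y") would raise NameError
```

Nothing ever has to be tracked or collected, because the order in which variables die is exactly the reverse of the order in which their scopes were entered.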

You could implement string as (std::string*), seriously! Managing the internals would be implemented in your virtual machine. String variables could be kept as std::string* type. I doubt you'd need more than string addition and comparison (no, the [] operator is not a must).
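
As a minimal sketch of that idea, the toy VM below lets the host language's string type do all the work. Python's `str` stands in for std::string here, and the opcode names `PUSH_STR`, `CONCAT`, and `EQ` are invented for the example rather than taken from any real bytecode.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs on a value stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH_STR":
            stack.append(arg)               # the host string is the script string
        elif op == "CONCAT":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)      # string addition done by the host
        elif op == "EQ":
            right = stack.pop()
            left = stack.pop()
            stack.append(left == right)     # comparison done by the host
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


program = [
    ("PUSH_STR", "Ana"),
    ("PUSH_STR", "phase"),
    ("CONCAT", None),
    ("PUSH_STR", "Anaphase"),
    ("EQ", None),
]
print(run(program))
```

The script never sees how the characters are stored or resized; that stays inside the VM.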

Quote:Original post by ouraqt
When exactly is GC really needed? I guess it's when the programmer is too lazy to free up his memory (ie. me) or when the language doesn't allow you to free it manually.


When the programmer (or the compiler) is not capable of directly controlling the scope of some variable (who's seeing it, how many times it is referenced in the program, what objects keep a reference to a specific variable/object, and when the last reference vanishes so that it is safe to release (delete) the variable/object). These issues arise all the time when some objects are referenced by other objects created on the heap (their lifetime is independent of direct program flow or how the code is structured). But that doesn't happen if objects in the language cannot hold references to other objects and cannot be created on the heap (both references between objects and heap allocation have to be present for GC to be useful).
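
A small sketch of that situation: one heap object is referenced from two places, so no single scope can decide when to free it. Reference counting is used here only as the simplest stand-in for automatic memory management (it is not a full collector), and the `Heap` class with its `alloc`/`retain`/`release` methods is hypothetical.

```python
class Heap:
    """Toy heap that frees an object when its last reference vanishes."""

    def __init__(self):
        self.objects = {}    # handle -> value
        self.refcount = {}   # handle -> number of live references
        self.next_handle = 0

    def alloc(self, value):
        handle = self.next_handle
        self.next_handle += 1
        self.objects[handle] = value
        self.refcount[handle] = 1          # the allocating variable holds it
        return handle

    def retain(self, handle):
        self.refcount[handle] += 1         # another object now refers to it

    def release(self, handle):
        self.refcount[handle] -= 1
        if self.refcount[handle] == 0:     # last reference gone: safe to free
            del self.objects[handle]
            del self.refcount[handle]


heap = Heap()
texture = heap.alloc("texture data")   # created inside some function scope
heap.retain(texture)                   # a sprite object stores a reference
heap.release(texture)                  # the creating scope ends...
print(texture in heap.objects)         # ...but the sprite still needs it
heap.release(texture)                  # the sprite is destroyed
print(texture in heap.objects)
```

With by-value semantics and scope-bound lifetimes, the `retain` case never occurs, which is why nothing like this is needed for the language as described.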

Quote:Original post by ouraqt
...Anyway, why should function arguments be passed by reference?


To save space.
To speed up the execution.
To allow the script writer to split the code chunks without changing the script logic (eg. function that can modify an object, and object is passed by reference).
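
The third point can be sketched in Python, with `copy.deepcopy` standing in for by-value passing. The `heal` helper, the `player` record, and the two `call_by_*` wrappers are made-up names for illustration only.

```python
import copy


def heal(player):
    """Helper split out of a larger routine; it modifies its argument."""
    player["hp"] += 10


def call_by_value(func, arg):
    func(copy.deepcopy(arg))   # callee works on a full copy (extra space and time)


def call_by_reference(func, arg):
    func(arg)                  # callee works on the caller's own object


player = {"hp": 50}
call_by_value(heal, player)
print(player["hp"])            # unchanged: the copy was modified and discarded
call_by_reference(heal, player)
print(player["hp"])            # changed: the helper's effect reaches the caller
```

With by-value only, `heal` would have to return the modified copy and every caller would have to assign it back, which changes how the script is written.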

Quote:Original post by ouraqt
Also note that this language will be used for rapid application development, not intricately optimised programs.


By intrinsic I didn't mean optimised, but that the operations will be handled directly by the virtual machine, so they can be implemented directly in the underlying language of the VM implementation (C++ ?). Thus they do not pose any additional requirements on the scripting language's design or its implementation. You said you'd have some additional issues with dynamic char arrays, but that is not something the script user should be concerned about; it should be built into the language itself.


Quote:Original post by ouraqt
PS: How do I quote someone, like in one of those cool quote boxes? I don't see a button anywhere.

Use a button in the upper-right corner of each post. Or just use [_quote_]quoted text[_/quote_] tags (without the "_"'s).
I think you may be right. Despite the fact that strings are dynamic and could potentially require different amounts of memory during their lifetime, they should be built into the language as intrinsic types. But should I still implement the 'char' type?
Quote:To allow the script writer to split the code chunks without changing the script logic (eg. function that can modify an object, and object is passed by reference).
That may save a little speed, but it's less object oriented than simply using the functions to modify variables indirectly. Although I guess I should allow it because I want it to be a multi-paradigm language.
Quote:Original post by F-Kop
Funny..this is what I got with VC6:
*snip*



My bad - I've got the VC7 compiler hooked up to my VC6 install (long story). So that was VC7's output.

OK, I'm done hijacking, I promise [smile]

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

Is it better for a scripting language to be compiled into a list of assembly-like instructions (bytecode)? My original design was NOT like this, but now I realize it might be a lot faster (and harder to implement).
There's no "best" that works for everyone. What is best for you depends entirely on your goals.

What are you designing this to be used for? How fast does it need to be? How much programming language/compiler technology do you want to learn?


On one end of the scale are purely interpreted languages, where every time the program runs, the interpreter reads the original source code and figures out how to run it. There's nothing at all wrong with this - it's pretty simple to implement, although it has a bit of storage overhead and is usually one of the slower ways to implement a language. However, if you don't need huge amounts of speed (as you would for, e.g., writing an entire game or something) then it's fine. Purely interpreted languages like QuickBASIC were the bread and butter of many a game programmer back in the early/mid 90's.

The opposite extreme is compiling directly to machine code. More realistically, you'd usually compile to a dialect of assembly, and use an existing assembler program to get machine code. Another popular approach is to compile to C, and use a C compiler to generate the final machine code. (C++ was actually started that way.)


A reasonable middle-of-the-road approach is to compile to bytecode and write a virtual machine. This is a fantastic exercise as it really helps you understand what's going on under the hood of any of your code, and it'll be a good programming experience. However, it's not a small job - be prepared to spend a bit of time and study a lot of things you probably don't work with on a regular basis.

How abstract your bytecode gets is entirely up to you. Personally, I say look for a medium that balances easy implementation of the compiler with easy implementation of the VM; easy compiler implementation is probably the side to favor, because writing a VM is quite simple by comparison.
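
For a feel of the two halves, here is a deliberately tiny sketch: a compiler that turns an arithmetic expression into a flat instruction list, and a stack VM that runs it. It borrows Python's own `ast` module as a stand-in for your language's parser, and the opcode names (`PUSH`, `ADD`, `SUB`, `MUL`) are assumptions chosen for the example.

```python
import ast

# Map Python AST operator nodes to hypothetical opcode names.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source):
    """Compile an arithmetic expression to a flat list of instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)                 # operands first (postfix order)
            emit(node.right)
            code.append((BINARY_OPS[type(node.op)], None))
        else:
            raise SyntaxError("unsupported construct")

    emit(ast.parse(source, mode="eval").body)
    return code


def execute(code):
    """Run the instruction list on a value stack; no parsing happens here."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            elif op == "MUL":
                stack.append(left * right)
            else:
                raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


bytecode = compile_expr("2 + 3 * 4")    # parse once
print(bytecode)
print(execute(bytecode))                # run as many times as needed
```

Note how little the VM has to know: all the structure of the source (precedence, nesting) was resolved by the compiler, which is why the VM side tends to be the easier of the two.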


I'm sticking with the compile-to-bytecode method, and surprisingly, it's going very well. I'm not looking for a tremendous amount of speed; however, I don't want the interpreter/virtual machine to be spending most of its time trying to parse a file.

The bytecode is very language-specific, however. That's not much of a problem though, because I'm only planning to make one scripting/programming language anyway.

Also, if I didn't say it already, I want this to be a general purpose scripting language because I believe in reusable code. Right now I favor rapid application development (it's more fun that way) and so I need a fairly simple language that is still somewhat flexible (but I'm too lazy to go out and learn a scripting language like Python/Lua!?! I must be crazy. I would rather go out and implement my own compiler/interpreter for a language than spend a few hours learning one. w-o-w. Oh well...it's a great learning experience.).

I'm uploading a new version of the design document right now. Tell me if you think this language is suitable for scripting NPC behavior in games as well as developing full-GUI applications such as level editors (rapidly, not really for efficiency).

Oh, and this language will have a bunch of intrinsic functions when I complete it, sort of like a standard library (but intrinsic to the language itself). If you've ever used Game Maker before, it will be like GML.

This topic is closed to new replies.
