
ECMAScript Interpreter from Scratch Tutorials

This topic is 2836 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.


Hi guys,

I am writing a game engine for a Free Software game called Humm and Strumm, a 3D, cartoony adventure game in which the two main characters, Humm and Strumm, need to stop the evil Dr. Geoff from taking over the world. The engine will be written completely from scratch, partly because I am not fully satisfied with other Free/Open Source Software game engines as a platform to build on, and partly to gain more development experience. Because of this, I need to write every subsystem myself, including the scripting interpreter.

This is not exactly a trivial task, but I think it would be useful for others to see how to go about it. So, over the next month or so, I am writing a series of entries on my blog, freeSoftwareHacker();, about the process. I will outline both the formal design and specification of the module and its implementation. If you are interested in learning how to write a scripting interpreter (for a dialect of ECMAScript that will integrate with the rest of an engine), a just-in-time compiler for x86 systems, or a dual-language video game with complete integration between languages, please head over to my blog. The first post is basically an introduction, but you can subscribe to the RSS feed or check back periodically over the next month for new posts in the series.

If you have any comments, criticism, or thoughts on the entry, I'd love to hear them, either in the posts' comment sections or in this forum thread.

Cheers,
Patrick

Quote:
Original post by bubu LV
Why reinvent the bicycle (again) when you already have one ready: V8?


V8 does not integrate with C++ code in the way a game engine requires. It is simply a JavaScript interpreter, general-purpose at that. That is why I have chosen not to use V8 or SpiderMonkey, even with something like Flusspferd.

I need something that will natively integrate with my game engine. Because of this, it will not be entirely ECMA-compliant, especially with basic data types. I could not use V8, because it doesn't work with C++ code in the way I need it to; I could not use SpiderMonkey+Flusspferd, because it is JavaScript, not a custom-built language for my engine's data types. You can download the source code for my game engine (not much at the moment) from the project's downloads page, and check back later on the engine documentation site; later today I will be posting more information about the design, which arguably should have been done long ago.

I hope this cleared it up.

Cheers,
Patrick

No, V8 integrates with C++ code very well; it is meant to be used in Chrome, a C++ project, after all.
It also has a JIT compiler, not only an interpreter (I'm not sure whether it has an interpreter at all).

Quote:
general-purpose at that

Exactly! General-purpose. That is why it is also suited for embedding into a game engine, not only browsers.

See here for example: http://www.garry.tv/?p=661
or V8 public interface here.

The post you sent me shows that it does not integrate well with C++, even though it is written in it:

Quote:
I use binding to describe binding c functions to the scripting world. The binding is similar to basic Lua. You define proxy c functions, ...


By my design, there is no need to define separate proxy functions. You will see that in a later post.

V8, again, is a JavaScript interpreter. It uses its own String implementation, Date implementation, and so on. My game engine already has classes for these, deriving from an Object class which gives runtime type information, runtime object creation, multithreaded heap support, and reference counting to every Object that extends it. Integrating both V8 and my custom data types would be a nightmare, but I will not give up the data types, because they allow all integers, fixed- and floating-point decimals, and strings to have the same functionality as every other Object in the engine.

This is my main problem with FOSS game engines: they aren't really game engines. They are really just 3D engines with other things like sound, networking, and scripting stuck on as an afterthought; both Ogre and Irrlicht take this approach. Professional game engines integrate support for all of these things into their design. My engine is designed from the ground up as a game engine, not a 3D engine.

General purpose does not necessarily mean suited for a game engine. General-purpose things are not especially adapted or optimised for anything. Take the C++ language as an example: it is a general-purpose language, but you wouldn't think of writing a web application with it, would you? Yes, it has been done, but languages like JavaScript and PHP are much better suited to this task. C++ can do these things, but that doesn't mean there are not more efficient, better ways.

Also, in a game, you want flexibility: you can't just have a JIT compiler. Take the case of a Quake-style console. You don't want to compile every single line that it feeds you and then call the compiled code; it is more efficient and easier to just use an interpreter. However, if you have a long script that runs every time the game does, with no modification during runtime, then a compiler is a great idea. GNU Emacs takes this approach with Emacs Lisp, allowing both normal Lisp code and byte-compiled Lisp code.
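
The split described above (interpret one-off console lines, compile long-lived scripts once and reuse the result) can be pictured with a toy sketch. Everything here is an assumption made for illustration: the "language" is just whitespace-separated integers that get summed, and Interpret, Compile, Run and ScriptCache are hypothetical names, not part of the engine or of any real ECMAScript implementation.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Interpreter path: parse and evaluate in a single pass and keep nothing.
// This suits a console line that is typed once and thrown away.
long Interpret (const std::string &source)
{
  std::istringstream in (source);
  long total = 0;
  long value = 0;
  while (in >> value)
    total += value;
  return total;
}

// "Compiled" form for the toy language: the operands, parsed ahead of time.
struct CompiledScript
{
  std::vector<long> operands;
};

// Compiler path, step 1: do the parsing work once.
CompiledScript Compile (const std::string &source)
{
  CompiledScript script;
  std::istringstream in (source);
  long value = 0;
  while (in >> value)
    script.operands.push_back (value);
  return script;
}

// Compiler path, step 2: run the prepared form as often as needed.
long Run (const CompiledScript &script)
{
  long total = 0;
  for (std::size_t i = 0; i < script.operands.size (); ++i)
    total += script.operands[i];
  return total;
}

// Hypothetical cache: a named script is compiled on first use only.
class ScriptCache
{
  public:
    ScriptCache () : compileCount (0) {}

    const CompiledScript &Get (const std::string &name,
                               const std::string &source)
    {
      std::map<std::string, CompiledScript>::iterator found = cache.find (name);
      if (found == cache.end ())
        {
          found = cache.insert (std::make_pair (name, Compile (source))).first;
          ++compileCount;
        }
      return found->second;
    }

    std::size_t CompileCount () const { return compileCount; }

  private:
    std::map<std::string, CompiledScript> cache;
    std::size_t compileCount;
};
```

A console would call Interpret on each line it receives, while a per-frame script would go through ScriptCache::Get followed by Run, so the parsing cost is paid once no matter how many frames execute it.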

Again, I will post the design goals, requirements documentation, and initial specification of my engine in a few hours on the Project Documentation Site.

Cheers,
Patrick

I have used SpiderMonkey and the QtScript module, which is based on JavaScriptCore, and I think both integrate very well with C/C++. I have not used V8 because it is not 64-bit compatible, but looking at the documentation it seems to provide a nice interface.
What exactly is your problem with these libraries?
You can of course write your own interpreter, but it would take time, and it wouldn't be as well tested as those libraries and probably not as optimized.

Qt can generate bindings automatically for C++ classes, and I have never felt that I needed to replace the internal data types; they are converted on the fly. Also, one reason for having scripting capabilities is to let non-programmers tweak the game, and many people are already familiar with JavaScript, so making things incompatible seems to me to be a bad idea.

Also, JIT compilation does not mean that every single line of code gets compiled (I think). On the other hand, have you noticed any lag when using JIT compilation?

[Edited by - Kambiz on February 27, 2010 12:09:01 PM]

My game engine has a top level class, Object, which provides certain functionality that almost all classes in my game engine need. For example,


  • Runtime Type Information: The system I have is far more powerful than that of C++, and allows me to pass Objects on a network with just a name ("hummstrumm::engine::type::String", for instance), and have it be created on a different computer.

  • Runtime Type Creation: From the above type information, I can dynamically create an Object. This, as I have mentioned above, is nice in a multiplayer, network game, but it also lends itself nicely to a scripting language.

  • Reference Counting: A lot of simple reference counting systems put reference counting information in the smart pointer itself. An Object's reference count is not an attribute of the pointer, but rather of the Object. Take this simple example, in which the reference count is in the Pointer<T>:
    Pointer<Object> p1 (new Object); //< p1 Reference count = 1
    Pointer<Object> p2 (p1); //< p2 Reference count = 2
    Pointer<Object> p3 (p1); //< p3 Reference count = 2

    What happens if p1 and p2 go out of scope, but p3 does not, and is still pointing to the Object? Well, it becomes invalid with a reference count of 1, which is not acceptable in a game. What if we put the reference count in the Object?
    Pointer<Object> p1 (new Object); //< object Reference count = 1
    Pointer<Object> p2 (p1); //< object Reference count = 2
    Pointer<Object> p3 (p1); //< object Reference count = 3

    This effectively solves our problem. In fact, it is the method that the Java interpreter and Python interpreter use, too.

  • Multithreaded Custom Heap: The game engine is multithreaded, so from the start, memory management is a problem. First of all, always using the system memory allocator is pretty slow: it can involve many context switches to and from the kernel. Instead, many programs use a custom allocator, in which a big chunk of memory is allocated up front, and then the game engine gives you blocks from that, avoiding context switches. This is faster, and by design, my game engine uses a lot of dynamic memory allocations, so it helps. Also, to avoid locks with multiple processors on one heap, the heap is partitioned into as many parts as there are processors. The currently running processor allocates from its own part, and the Object remembers which part that was, so it can later free itself, regardless of the processor it is running on at that point. This will, for the most part, avoid having to lock one processor out while another is allocating. (It won't always make the allocation faster; the processor running a thread could switch during the allocation.) The heap is still locked, because multiple threads could be allocating from the same partition.
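
The reference-counting item in the list above, with the count stored in the Object rather than in the pointer, could look roughly like the following. This is only a minimal sketch written in C++11: the member names (AddRef, Release, GetReferenceCount, IsValid) are assumptions made for illustration and are not taken from the engine's source.

```cpp
#include <atomic>
#include <utility>

// Hypothetical base class: the count lives in the object itself.
class Object
{
  public:
    Object () : referenceCount (0) {}
    virtual ~Object () {}

    Object (const Object &) = delete;
    Object &operator= (const Object &) = delete;

    void AddRef () { referenceCount.fetch_add (1, std::memory_order_relaxed); }

    // Returns true when the caller just dropped the last reference.
    bool Release ()
    {
      return referenceCount.fetch_sub (1, std::memory_order_acq_rel) == 1;
    }

    int GetReferenceCount () const { return referenceCount.load (); }

  private:
    std::atomic<int> referenceCount;
};

// Hypothetical smart pointer: it stores no count of its own and only
// forwards to the Object it points at.
template <typename T>
class Pointer
{
  public:
    explicit Pointer (T *raw = nullptr) : object (raw) { Acquire (); }
    Pointer (const Pointer &other) : object (other.object) { Acquire (); }
    ~Pointer () { Drop (); }

    // Copy-and-swap: the by-value parameter takes the new reference, and
    // its destructor releases the old one.
    Pointer &operator= (Pointer other)
    {
      std::swap (object, other.object);
      return *this;
    }

    T *operator-> () const { return object; }
    T &operator* () const { return *object; }
    bool IsValid () const { return object != nullptr; }

  private:
    void Acquire ()
    {
      if (object)
        object->AddRef ();
    }

    void Drop ()
    {
      if (object && object->Release ())
        delete object;
      object = nullptr;
    }

    T *object;
};
```

With this layout, three Pointer<Object> copies of one Object all report a count of 3, and the last surviving copy reports 1 once the other two have gone out of scope.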



Normal C++ data types don't provide these functions: I can't have an "int" with a reference count, for instance. So, I need to derive wrapper classes from the Object class. Some of these classes also provide additional functionality, like FixedPoint<size>, which is essentially an integer shifted by a certain number of decimal places, and String, which is a Unicode UTF-16 string class.
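
To make the fixed-point idea concrete, here is a rough sketch of a type along the lines of FixedPoint<size>. It rests on several assumptions: the template parameter is taken to be the number of binary fraction bits, the storage is a 32-bit integer, and the Object base class is left out to keep the example short. The engine's real class may differ on all three points.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point number: a 32-bit integer whose lowest
// FractionBits bits hold the fractional part.
template <unsigned FractionBits>
class FixedPoint
{
    static_assert (FractionBits < 31, "fraction must fit in 32-bit storage");

  public:
    FixedPoint () : raw (0) {}

    // The value 1.0 expressed in raw units.
    static constexpr std::int32_t Scale ()
    {
      return std::int32_t (1) << FractionBits;
    }

    static FixedPoint FromInt (std::int32_t value)
    {
      return FixedPoint (value * Scale ());
    }

    static FixedPoint FromDouble (double value)
    {
      return FixedPoint (static_cast<std::int32_t> (std::lround (value * Scale ())));
    }

    double ToDouble () const { return static_cast<double> (raw) / Scale (); }
    std::int32_t Raw () const { return raw; }

    // Addition works directly on the raw integers.
    FixedPoint operator+ (FixedPoint other) const
    {
      return FixedPoint (raw + other.raw);
    }

    // Multiplication widens to 64 bits so the intermediate product does not
    // overflow, then divides out the extra scale factor.
    FixedPoint operator* (FixedPoint other) const
    {
      std::int64_t product = static_cast<std::int64_t> (raw) * other.raw;
      return FixedPoint (static_cast<std::int32_t> (product / Scale ()));
    }

    bool operator== (FixedPoint other) const { return raw == other.raw; }

  private:
    explicit FixedPoint (std::int32_t rawValue) : raw (rawValue) {}

    std::int32_t raw;
};
```

With 8 fraction bits, 1.5 is stored as the raw integer 384, and adding FromInt (2) to it gives a value that converts back to 3.5.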

With an existing JavaScript interpreter, I could not use these as JavaScript types themselves; instead, I'd have to write a layer that bridges the V8 or SpiderMonkey data types with my own custom ones. This is basically what QtScript does for you, but I do not want that extra layer in there. My Object class is already well suited to dynamic scripting languages, so I think that using it natively would be better. I have looked into both V8 and SpiderMonkey, but I have found both of them to be extra layers that I do not want.

No, the JIT compiler would not be as optimised, but it would be optimised. I am worried less about the efficiency and more about the homogeneity of the engine. With the scripting engine built-in, I can access all other parts of the engine just like I could in C++. As I say in my post, I want to use ECMAScript for its advantages, and C++ for its advantages, but I want them to easily mesh together. In creating my own, I can make this possible.

Cheers,
Patrick

Your post is inaccurate when talking about reference counting: a reference-counted shared pointer is possible without storing the count in the object; see boost::shared_ptr. In fact, Boost can be a "best of both" solution, as it provides both boost::intrusive_ptr for when a type already has a reference count field, and boost::make_shared, which avoids an additional dynamic allocation for the reference count for shared_ptr.
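
As a small illustration of the non-intrusive case, here is a sketch using std::shared_ptr and std::make_shared from the C++11 standard library, which were modelled on the Boost versions named above. The Widget type and the Demonstrate function are made up for the example. The count lives in a control block shared by every copy of the pointer, so the p1/p2/p3 scenario behaves correctly without any count inside the object.

```cpp
#include <memory>

// Hypothetical payload type with no reference-count field of its own.
struct Widget
{
  int value = 42;
};

// What the sketch observes at each step.
struct Counts
{
  long withThreePointers;
  long afterTwoReleased;
  bool survivorStillUsable;
};

Counts Demonstrate ()
{
  // make_shared places the object and its control block in one allocation.
  std::shared_ptr<Widget> p1 = std::make_shared<Widget> ();
  std::shared_ptr<Widget> p2 (p1);
  std::shared_ptr<Widget> p3 (p1);

  Counts result;
  // All three pointers share one control block, so each reports 3.
  result.withThreePointers = p3.use_count ();

  // Release two of them; the shared count drops and p3 keeps the Widget alive.
  p1.reset ();
  p2.reset ();
  result.afterTwoReleased = p3.use_count ();
  result.survivorStillUsable = (p3 && p3->value == 42);
  return result;
}
```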

Also, Java does not use reference counting, nor has it been interpreted for a good number of years.

That said, I'd be interested to read more about how you went about writing your scripting language.

Quote:
Your post is inaccurate when talking about reference counting: a reference-counted shared pointer is possible without storing the count in the object; see boost::shared_ptr. In fact, Boost can be a "best of both" solution, as it provides both boost::intrusive_ptr for when a type already has a reference count field, and boost::make_shared, which avoids an additional dynamic allocation for the reference count for shared_ptr.


I haven't really looked into Boost that much, though this is quite interesting.

Quote:
Also, Java does not use reference counting, nor has it been interpreted for a good number of years.


Ah, yes, I'm sorry. Java itself does not use reference counting, but a particular virtual machine may.

Java is still interpreted, just not in the way Python is. There are Java JIT compilers (like Sun's, I think), but others (and maybe even Sun's) interpret the compiled byte code.

Quote:
That said, I'd be interested to read more about how you went about writing your scripting language.


Thank you! I will be posting more entries over the next month.

Cheers,
Patrick

You are wrong on some points.

Quote:
Original post by Patrick Niedzielski
By my design, there is no need to define separate proxy functions. You will see that in a later post.

I thought you meant that it will not work well with C++, not that it will be harder to interface with native C++ functions.

Anyway, it is pretty simple to write a boost::python- or luabind-like wrapper over V8. And it will definitely involve less work than writing an ECMAScript parser/interpreter or JIT from scratch.

Quote:
V8, again, is a JavaScript interpreter.

No, it is a JIT, not an interpreter. And you can always disable, remove, or delete the String, Date, or whatever other built-in classes.

Quote:
Anyway, it is pretty simple to write a boost::python- or luabind-like wrapper over V8. And it will definitely involve less work than writing an ECMAScript parser/interpreter or JIT from scratch.


Quote:
No, it is a JIT, not an interpreter. And you can always disable, remove, or delete the String, Date, or whatever other built-in classes.


As I said above with the QtScript wrapper, I can't use JavaScript. It isn't that V8 is an interpreter or a compiler (though I need both, as I have explained); it is that I can't use a pre-made interpreter, by the nature and requirements of my engine. I already have a String class which I am using. I can't just disable the JavaScript String class; too many things in the standard depend on it. String literals are of the ECMAScript String type, but I need them to natively be of my own String type. In short, this isn't JavaScript, so I can't use a JavaScript interpreter.

Again, my goal is to have a completely integrated game engine, not just a 3D engine with certain things thrown in. Everything in the engine is designed to work well together, and that is not something I can do when using V8. Don't get me wrong, V8 is awesome at what it does. I am using many of the ideas from it and various other script interpreters and compilers (GCC, Python, and so on), but I am not using any of them, simply because I cannot.

Sorry for the misunderstanding,
Patrick

Quote:
Original post by Patrick Niedzielski
I already have a String class which I am using.

One can use qScriptRegisterMetaType to define functions that convert script values to and from C++ types implicitly. This means that your engine in C++ can use MyString, and when a function like
void load(MyString filename)
gets called from the script, the string is converted automatically to MyString. Writing such wrappers takes just a few minutes, since the conversion is trivial in most cases (is there anything besides strings for which you have a custom implementation?).
Is there any case where this would not solve your problems? Are you going to write an interpreter just because you have written a string class?
Seriously, how long do you think it takes to write an interpreter? Don't you have any deadlines? Have you already finished the C++ side of your engine, such that you can't make it get along better with wrappers?
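
For readers unfamiliar with this style of binding, the general shape is a pair of conversion functions plus a thin forwarding layer. The sketch below does not use Qt at all: ScriptValue, MyString, ToScriptValue, FromScriptValue and CallLoadFromScript are hypothetical stand-ins chosen so the example is self-contained, and load returns the string length only so that the result can be checked.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a script engine's own value type.
struct ScriptValue
{
  std::u16string text;
};

// Hypothetical engine-side string class.
class MyString
{
  public:
    MyString () {}
    explicit MyString (const std::u16string &source) : data (source) {}

    const std::u16string &Data () const { return data; }
    std::size_t Length () const { return data.size (); }

  private:
    std::u16string data;
};

// Conversion pair: one function per direction.
ScriptValue ToScriptValue (const MyString &in)
{
  return ScriptValue { in.Data () };
}

void FromScriptValue (const ScriptValue &in, MyString &out)
{
  out = MyString (in.text);
}

// Engine function that only knows about MyString.
std::size_t load (MyString filename)
{
  return filename.Length ();
}

// What a binding layer does for each call coming from script code:
// convert the argument, then forward to the engine function.
std::size_t CallLoadFromScript (const ScriptValue &argument)
{
  MyString converted;
  FromScriptValue (argument, converted);
  return load (converted);
}
```

Note that the conversion copies the string on every call; the script side and the engine side never share the same object.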

Quote:
Original post by Patrick Niedzielski
Again, my goal is to have a completely integrated game engine, not just a 3D engine with certain things thrown in. Everything in the engine is designed to work well together, and that is not something I can do when using V8.

Have you tried? Why don't you give us an example class that cannot be used with V8 while being compatible with ECMAScript?

There are good reasons not to write the scripting engine yourself:
* Very good scripting engines are available. All are well tested and under active development. They all come with good documentation and are suitable for use in a game engine.
* Doing it yourself will take a lot of time. Just imagine how much time you will have to spend testing and debugging. When there is a bug, you have to check not only the script and the relevant C++ code but also the interpreter.
* Those libraries come with some useful utilities like QScriptEngineDebugger.

[Edited by - Kambiz on February 27, 2010 4:09:33 PM]

Quote:
One can use qScriptRegisterMetaType to define functions that convert script values to and from C++ types implicitly. This means that your engine in C++ can use MyString, and when a function like
void load(MyString filename)
gets called from the script, the string is converted automatically to MyString. Writing such wrappers takes just a few minutes, since the conversion is trivial in most cases.
Is there any case where this would not solve your problems?


Yes; these will convert the value, but will not preserve attributes of the data type in the JavaScript code.

As I have said, I cannot use JavaScript. JavaScript is a general-purpose language. Instead, my ECMAScript interpreter will use only my engine data types, which means it will have the same access to classes as the C++ part of the engine will. The engine should appear as native in the scripting environment as it does in C++.

Quote:
Seriously, how long do you think it takes to write an interpreter? Don't you have any deadlines?


Actually, no. The game is Free Software. It has "deadlines" that I set when the release gets close to being completed.

Quote:
Have you already finished the C++ side of your engine, such that you can't make it get along better with wrappers?


The game engine is designed thoroughly. One of its main design points is interoperability and consistency in modules. Taking out the conflicting parts would require a complete redesign of the engine, which would take away the benefits of the engine, compared to other Open Source engines.

Quote:
Very good scripting engines are available. All are well tested and under active development. They all come with good documentation and are suitable for use in a game engine.
Doing it yourself will take a lot of time. Just imagine how much time you will have to spend testing and debugging. When there is a bug, you have to check not only the script and the relevant C++ code but also the interpreter.


I do know how long this will take. JavaScript is not a terribly difficult language to parse, even compared to C. I have written C compilers before, with optimizations based on a lexical tree. I do understand this.

In using a wrapper, I would be sacrificing some of the benefits of my engine for my ease of development. This is not something I want to do.

Thank you for your suggestions, though,
Patrick

I'm interested in reading your blog posts on developing an ECMAScript interpreter (though I would also be interested in seeing a V8 luabind-like library come about, too).

On a side note, I had a look at your engine thus far. You seem to have put the implementation of your pointer type into a .cpp file. You generally can't put a templated class's implementation in a translation unit (i.e., a .cpp file). It has to be in a header, so that every file that instantiates the template can see the definitions.
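
For completeness, there is one standard way to keep template member definitions in a .cpp file: explicitly instantiate the template there for every type that other files will use. The single-file sketch below marks which part would go where; Holder, holder.hpp and holder.cpp are hypothetical names for the example and have nothing to do with the engine's Pointer class.

```cpp
// ---- would live in holder.hpp: declaration only ----
template <typename T>
class Holder
{
  public:
    explicit Holder (T initial);
    T Get () const;

  private:
    T value;
};

// ---- would live in holder.cpp: member definitions ----
template <typename T>
Holder<T>::Holder (T initial) : value (initial)
{
}

template <typename T>
T Holder<T>::Get () const
{
  return value;
}

// Explicit instantiations: these make the compiler emit code for exactly
// these types in this translation unit, so other files that include only
// the header can link against them. Any other type, say Holder<char>,
// would fail at link time when used from a different file.
template class Holder<int>;
template class Holder<double>;
```

The trade-off is that the list of supported types is fixed in the .cpp file, which rarely suits a general-purpose smart pointer; for that kind of class, keeping the definitions in the header remains the usual choice.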

Thank you for catching that. I just looked it up; apparently g++ accepts this, but other compilers will not accept it. I'm glad you have been looking through the code, though there is not much. It certainly does simplify some header file troubles, anyway.

Thanks,
Patrick

Google V8 doesn't fit well in multi-threaded environments; it isn't designed for concurrency.

To execute V8 scripts in parallel you have to spawn separate processes and handle the communication between them yourself. That's a discouraging problem for anyone who wants to make a scalable game engine for online games.

And that's why I think making a new ECMAScript engine is necessary: for integrating scripts efficiently with engine objects, for better memory management, and for better support for concurrency.

Most of the available scripting engines are designed for single-threaded applications. Only a few languages, like Erlang, Scala, or Haskell, can support many concurrent instances of script execution.

In practice, Lua and Python depend on global state within a scripting process, and they have to lock some structures with critical sections, especially when creating new objects in memory and when managing the garbage collector, which is a single collector shared by all execution contexts.

Wouldn't it be better if they could support an independent garbage collector per thread? Too many concurrent locks seriously harm multi-threaded performance.
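
A full per-thread garbage collector is beyond a forum post, but the core idea of giving each thread its own memory so that allocation needs no lock can be sketched with C++11 thread_local storage. ThreadArena, LocalArena and RunWorkers are hypothetical names for this illustration, the arena is a simple bump allocator that never frees individual blocks, and nothing here is taken from V8, Lua, Python, or the engine discussed in this thread.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical bump allocator. One instance exists per thread, so Allocate
// needs no lock: no other thread ever touches this arena.
class ThreadArena
{
  public:
    ThreadArena () : storage (64 * 1024), used (0) {}

    void *Allocate (std::size_t bytes)
    {
      // Round the request up so every block stays suitably aligned.
      const std::size_t alignment = alignof (std::max_align_t);
      const std::size_t rounded = (bytes + alignment - 1) / alignment * alignment;
      if (used + rounded > storage.size ())
        return nullptr;  // arena exhausted; a real allocator would grow
      void *block = storage.data () + used;
      used += rounded;
      return block;
    }

    std::size_t Used () const { return used; }

  private:
    std::vector<unsigned char> storage;
    std::size_t used;
};

// Each thread that calls this gets its own arena, created on first use.
ThreadArena &LocalArena ()
{
  thread_local ThreadArena arena;
  return arena;
}

// Starts threadCount workers; each makes `allocations` 16-byte allocations
// from its own arena and reports how many bytes that arena handed out.
std::vector<std::size_t> RunWorkers (unsigned threadCount, unsigned allocations)
{
  std::vector<std::size_t> usedPerThread (threadCount, 0);
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < threadCount; ++i)
    workers.emplace_back ([&usedPerThread, i, allocations] {
      for (unsigned n = 0; n < allocations; ++n)
        LocalArena ().Allocate (16);
      usedPerThread[i] = LocalArena ().Used ();
    });
  for (std::thread &worker : workers)
    worker.join ();
  return usedPerThread;
}
```

Because the arenas are independent, every worker reports the same byte count no matter how the threads interleave. The hard part that this sketch leaves out is what happens when an object allocated by one thread is referenced or freed by another.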
