ECMAScript Interpreter from Scratch Tutorials

Started by
14 comments, last by superoptimo 14 years, 1 month ago
Hi guys, I am writing a game engine for a Free Software game called Humm and Strumm, a 3D, cartoony adventure game in which the two main characters, Humm and Strumm, need to stop the evil Dr. Geoff from taking over the world. The game engine will be written completely from scratch, partly because I am not fully satisfied with other Free/Open Source Software game engines as a platform to build this, and partly for more development experience. Because of this, I need to write every subsystem myself, including the scripting interpreter.

This is not exactly a trivial task, but I think it would be very useful for others to see how to go about it. So, over the next month or so, I am writing a series of entries about the process on my blog, freeSoftwareHacker();. I will outline both the formal design and specification of the module and its implementation. If you are interested in learning how to write a scripting interpreter (for a dialect of ECMAScript that will integrate with the rest of an engine), a just-in-time compiler for x86 systems, or a dual-language video game with complete integration between languages, please head over to my blog. The first post is basically an Introduction, but you can subscribe to the RSS feed or periodically check back over the next month to watch for new posts in the series.

If you have any comments, criticism, or thoughts on the entry, please, I'd love to hear them, either in the posts' comment sections or in this forum thread.

Cheers,
Patrick
Why reinvent the bicycle when you already have one ready-made: V8?
Quote:Original post by bubu LV
Why reinvent the bicycle when you already have one ready-made: V8?


V8 does not integrate with C++ code as well as a game engine requires. It is simply a JavaScript interpreter, general-purpose at that. That is why I have chosen not to use V8 or SpiderMonkey, even with something like Flusspferd.

I need something that will natively integrate with my game engine. Because of this, it will not be entirely ECMA-compliant, especially with basic data types. I could not use V8, because it doesn't like working with C++ code in the way I need it to; I could not use SpiderMonkey+Flusspferd, because it is JavaScript, not a custom-built language for my engine's data types. Download the source code for my game engine (not much at the moment) from the project's downloads page (http://code.google.com/p/hummstrumm/downloads/list), and check back later on the engine documentation site (http://sites.google.com/site/hummstrummdoc/home/humm-and-strumm-engine); I will be posting more information about the design, which arguably should have been done long ago, later today.

I hope this cleared it up.

Cheers,
Patrick
No, V8 integrates with C++ code very well; after all, it is meant to be used in Chrome, which is a C++ project.
It also has a JIT compiler, not only an interpreter (I'm not sure whether it has an interpreter at all).

Quote:general-purpose at that
Exactly! General-purpose. That is why it is also suited for embedding into a game engine, not only browsers.

See here for example: http://www.garry.tv/?p=661
or the V8 public interface here.
The post you sent me shows that it does not integrate well with C++, even though it is written in it:

Quote:I use binding to describe binding c functions to the scripting world. The binding is similar to basic Lua. You define proxy c functions, ...


By my design, there is no need to define separate proxy functions. You will see that in a later post.

V8, again, is a JavaScript interpreter. It uses its own String implementation, Date implementation, and so on. My game engine already has classes for these, deriving from an Object class which gives runtime type information, runtime object creation, multithreaded heap support, and reference counting to every Object that extends it. Integrating both V8 and my custom data types would be a nightmare, but I will not give up the data types, because they allow all integers, fixed- and floating-point decimals, and strings to have the same functionality as every other Object in the engine.

This is my main problem with FOSS game engines: they aren't really game engines. They are really just 3D engines with other things like sound, networking, and scripting stuck on as an afterthought; both Ogre and Irrlicht take this approach. Professional game engines provide support for all these things integrated into their design. My engine is designed from the ground up as a game engine, not a 3D engine.

General purpose does not necessarily mean suited for a game engine. General purpose things are not extremely adapted or optimised for anything. Take the C++ language as an example: it is a general purpose language, but you wouldn't think of writing a web application with it, would you? Yes, it has been done, but languages like JavaScript and PHP are much more suited for this task. C++ can do these things, but it doesn't mean there are not more efficient, better ways.

Also, in a game, you want flexibility: you can't just have a JIT compiler. Take the case of a Quake-style console. You don't want to compile every single line that it feeds you, and then call the compiled code. It is more efficient and easier to just use an interpreter. However, if you have a long script that runs every time the game does with no modification during runtime, then a compiler is a great idea. GNU Emacs takes this approach with elisp, allowing both normal LISP code and byte-code compiled LISP code.

Again, I will post the design goals, requirements documentation, and initial specification of my engine in a few hours on the Project Documentation Site.

Cheers,
Patrick
I have used SpiderMonkey and the QtScript module, which is based on JavaScriptCore, and I think both integrate very well with C/C++. I have not used V8 because it is not 64-bit compatible, but looking at the documentation, it seems to provide a nice interface.
What exactly is your problem with these libraries?
You can of course write your own interpreter but it would take time and wouldn't be well tested like those libraries and probably not as optimized.

Qt can generate bindings automatically for C++ classes, and I have never felt like I need to replace the internal data types; they are converted on the fly. Also, one reason for having scripting capabilities is to let non-programmers tweak the game, and there are many people who are already familiar with JavaScript, so making things incompatible seems to me to be a bad idea.

Also, JIT compilation does not mean that every single line of code gets compiled (I think). On the other hand, have you noticed any lag when using JIT compilation?

[Edited by - Kambiz on February 27, 2010 12:09:01 PM]
My game engine has a top level class, Object, which provides certain functionality that almost all classes in my game engine need. For example,


  • Runtime Type Information: The system I have is far more powerful than that of C++, and allows me to pass Objects on a network with just a name ("hummstrumm::engine::type::String", for instance), and have it be created on a different computer.

  • Runtime Type Creation: From the above type information, I can dynamically create an Object. This, as I have mentioned above, is nice in a multiplayer, network game, but it also lends itself nicely to a scripting language.

  • Reference Counting: A lot of simple reference counting systems put reference counting information in the smart pointer itself. An Object's reference count is not an attribute of the pointer, but rather of the Object. Take this simple example, in which the reference count is in the Pointer<T>:
    Pointer<Object> p1 (new Object); //< p1 Reference count = 1
    Pointer<Object> p2 (p1); //< p2 Reference count = 2
    Pointer<Object> p3 (p1); //< p3 Reference count = 2

    What happens if p1 and p2 go out of scope, but p3 does not, and is still pointing to the Object? Well, it becomes invalid with a reference count of 1, which is not acceptable in a game. What if we put the reference count in the Object?
    Pointer<Object> p1 (new Object); //< object Reference count = 1
    Pointer<Object> p2 (p1); //< object Reference count = 2
    Pointer<Object> p3 (p1); //< object Reference count = 3

    This effectively solves our problem. In fact, it is the method that the Java interpreter and Python interpreter use, too.

  • Multithreaded Custom Heap: The game engine is multithreaded, so from the start, memory management is a problem. First of all, always using the system memory allocator is pretty slow--it involves many context switches to and from the kernel. Instead, many programs use a custom allocator, in which a big chunk of memory is allocated, and then the game engine gives you blocks from that, avoiding context switches. This is faster, and by design, my game engine uses a lot of dynamic memory allocations, so it helps. Also, to avoid locks with multiple processors on one heap, the heap is partitioned into the same number of parts as the number of processors. The currently running processor allocates from its own part, and the Object remembers which part that was, so it can later free itself, regardless of the processor it would then be running on. This will, for the most part, avoid having to lock one processor out while another is allocating. (It won't always work to make the allocation faster; the processor running a thread could switch during the allocation.) The heap still is locked, because multiple threads could still be allocating from the same partition.
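The Reference Counting point in the list above can be sketched roughly as follows. This is only a minimal illustration, not the engine's actual code: the member names AddRef, Release, and RefCount are hypothetical, and the use of std::atomic for the counter is an assumption made so the sketch behaves sensibly with multiple threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the engine's Object base class: the reference
// count lives in the object itself, not in the smart pointer.
class Object {
public:
    Object() : refCount_(0) {}
    virtual ~Object() = default;
    Object(const Object&) = delete;
    Object& operator=(const Object&) = delete;

    void AddRef() { refCount_.fetch_add(1, std::memory_order_relaxed); }
    // Returns true when the reference just released was the last one.
    bool Release() {
        return refCount_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
    std::size_t RefCount() const {
        return refCount_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::size_t> refCount_;
};

// Smart pointer that only forwards to the count stored in the Object.
template <typename T>
class Pointer {
public:
    explicit Pointer(T* raw = nullptr) : raw_(raw) { Acquire(raw_); }
    Pointer(const Pointer& other) : raw_(other.raw_) { Acquire(raw_); }
    Pointer& operator=(const Pointer& other) {
        T* incoming = other.raw_;
        Acquire(incoming);  // take the new reference first (safe on self-assignment)
        Drop();
        raw_ = incoming;
        return *this;
    }
    ~Pointer() { Drop(); }

    T* operator->() const { return raw_; }
    T& operator*() const { return *raw_; }
    T* Get() const { return raw_; }

private:
    static void Acquire(T* p) { if (p) p->AddRef(); }
    // Gives up this pointer's reference and deletes the object if it was the last.
    void Drop() {
        if (raw_ && raw_->Release()) delete raw_;
        raw_ = nullptr;
    }
    T* raw_;
};
```

With this layout, every Pointer made from p1 sees the same count, because there is only one counter and it belongs to the Object; when the last Pointer goes away, the Object is deleted.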



Normal C++ data types don't provide these functions: I can't have an ``int'' with a reference count, for instance. So, I need to derive wrapper classes from the Object class. They also provide additional functionality, like FixedPoint<size>, which is essentially an integer shifted by a certain number of decimal places, and String, which is a Unicode UTF-16 string class.
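To give a rough idea of the fixed-point wrapper, here is a minimal sketch. It assumes the template parameter is the number of fractional bits (binary places) and that the value is stored in a 32-bit integer; neither detail is spelled out above, and the member names are made up for illustration. The real class would also derive from Object, which is left out here to keep the example self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: a value stored as an integer scaled by 2^FractionBits.
template <unsigned FractionBits>
class FixedPoint {
    static_assert(FractionBits < 31, "fraction must fit in a 32-bit integer");

public:
    // The raw representation of the number 1.
    static constexpr std::int32_t One() {
        return static_cast<std::int32_t>(1) << FractionBits;
    }

    static FixedPoint FromInt(std::int32_t v) { return FixedPoint(v * One()); }
    // Truncates toward zero; no overflow checking in this sketch.
    static FixedPoint FromDouble(double v) {
        return FixedPoint(static_cast<std::int32_t>(v * One()));
    }

    std::int32_t Raw() const { return raw_; }
    double ToDouble() const { return static_cast<double>(raw_) / One(); }

    // Addition and subtraction work directly on the scaled integers.
    FixedPoint operator+(FixedPoint o) const { return FixedPoint(raw_ + o.raw_); }
    FixedPoint operator-(FixedPoint o) const { return FixedPoint(raw_ - o.raw_); }

    // A product of two scaled values is scaled twice, so shift once back.
    // (Right-shifting a negative value is implementation-defined before C++20.)
    FixedPoint operator*(FixedPoint o) const {
        const std::int64_t wide = static_cast<std::int64_t>(raw_) * o.raw_;
        return FixedPoint(static_cast<std::int32_t>(wide >> FractionBits));
    }

    bool operator==(FixedPoint o) const { return raw_ == o.raw_; }

private:
    explicit FixedPoint(std::int32_t raw) : raw_(raw) {}
    std::int32_t raw_;
};
```

For example, with 8 fractional bits the value 1.5 is stored as the integer 384, and multiplying it by 2 (stored as 512) gives 768, which reads back as 3.0.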

With an existing JavaScript interpreter, I could not use these as JavaScript types themselves; instead, I'd have to write a layer that bridges the V8 or SpiderMonkey data types with my own, custom ones. This is basically what QtScript is doing for you, but I do not want to have that extra layer in there. My Object class itself is pretty efficiently suited for dynamic scripting languages, so I think that using it natively would be better. I have looked into both V8 and SpiderMonkey, but I have found both of them to be extra layers that I do not want.

No, the JIT compiler would not be as optimised, but it would be optimised. I am worried less about the efficiency and more about the homogeneity of the engine. With the scripting engine built-in, I can access all other parts of the engine just like I could in C++. As I say in my post, I want to use ECMAScript for its advantages, and C++ for its advantages, but I want them to easily mesh together. In creating my own, I can make this possible.

Cheers,
Patrick
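As a postscript on Runtime Type Creation from the list above: creating an Object from nothing but its type name could look roughly like the sketch below. The TypeRegistry class, its Register and Create methods, and the TypeName member are all hypothetical names chosen for this example, and std::unique_ptr is used only to keep the sketch short (the engine itself uses its own reference-counted pointers).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal base class: every type can report its own name.
class Object {
public:
    virtual ~Object() = default;
    virtual std::string TypeName() const = 0;
};

// Stand-in for the engine's string type (contents omitted).
class String : public Object {
public:
    std::string TypeName() const override {
        return "hummstrumm::engine::type::String";
    }
};

// Maps a type name to a function that creates a fresh instance of that type.
class TypeRegistry {
public:
    using Factory = std::function<std::unique_ptr<Object>()>;

    void Register(const std::string& name, Factory factory) {
        factories_[name] = std::move(factory);
    }

    // Returns a new Object of the named type, or nullptr for unknown names.
    std::unique_ptr<Object> Create(const std::string& name) const {
        const auto it = factories_.find(name);
        if (it == factories_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};
```

A receiving machine (or a script) would then only need the name: after registering a factory for "hummstrumm::engine::type::String", calling Create with that string yields a new String without the caller ever naming the C++ class.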
Your post is inaccurate when talking about reference counting, a reference counted shared pointer is possible without storing the count in the object, see boost::shared_ptr. In fact, boost can be a "best of both" solution, as it provides both boost::intrusive_ptr for when a type already has a reference count field, and boost::make_shared, which avoids an additional dynamic allocation for the reference count for shared_ptr.
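To illustrate the point, here is a small sketch using std::shared_ptr and std::make_shared, the C++11 standard-library counterparts of the Boost facilities, substituted here only so the example builds without Boost. The Widget type and both function names are made up for the example.

```cpp
#include <cassert>
#include <memory>

// Any ordinary type works; it needs no reference count field of its own.
struct Widget {
    int value = 42;
};

// Three pointers to one object all report the same, shared count.
long PeakCount() {
    std::shared_ptr<Widget> p1 = std::make_shared<Widget>();
    std::shared_ptr<Widget> p2 = p1;
    std::shared_ptr<Widget> p3 = p1;
    return p3.use_count();
}

// p1 and p2 go out of scope while p3 keeps the object alive.
long CountAfterEarlyCopiesDie() {
    std::shared_ptr<Widget> p3;
    {
        std::shared_ptr<Widget> p1 = std::make_shared<Widget>();
        std::shared_ptr<Widget> p2 = p1;
        p3 = p1;
    }
    // The object is still valid here and p3 is its only owner.
    return p3 ? p3.use_count() : 0;
}
```

The count is kept in a control block shared by all copies rather than in each pointer, so the p1/p2/p3 scenario stays consistent; make_shared typically places that control block in the same allocation as the object.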

Also, Java does not use reference counting, nor has it been interpreted for a good number of years.

That said, I'd be interested to read more about how you went about writing your scripting language.
Quote:Your post is inaccurate when talking about reference counting, a reference counted shared pointer is possible without storing the count in the object, see boost::shared_ptr. In fact, boost can be a "best of both" solution, as it provides both boost::intrusive_ptr for when a type already has a reference count field, and boost::make_shared, which avoids an additional dynamic allocation for the reference count for shared_ptr.


I haven't really looked into Boost that much, though this is quite interesting.

Quote:Also, Java does not use reference counting, nor has it been interpreted for a good number of years.


Ah, yes, I'm sorry. Java itself does not use reference counting, but a particular virtual machine may.

Java still is interpreted, just not as Python is. There are Java JIT compilers (like Sun's, I think), but others (and maybe even Sun's), interpret the compiled byte code.

Quote:That said, I'd be interested to read more about how you went about writing your scripting language.


Thank you! I will be posting more entries over the next month.

Cheers,
Patrick
You are wrong on some points.

Quote:Original post by Patrick Niedzielski
By my design, there is no need to define separate proxy functions. You will see that in a later post.

I thought you meant that it will not work well with C++, not that it will be harder to interface with native C++ functions.

Anyway, it is pretty simple to write a Boost.Python- or luabind-like wrapper over V8. And it will definitely involve less work than writing an ECMAScript parser/interpreter or JIT from scratch.

Quote:V8, again, is a JavaScript interpreter.

No, it is a JIT, not an interpreter. And you can always disable or remove the String, Date, or whatever other built-in classes.

This topic is closed to new replies.
