How does scripting work

3 comments, last by cr88192 11 years, 2 months ago

Hello guys,

so for a while now I have been programming in C++ with Direct3D 11, and I know that soon I might reach a point where scripting will be required, but it raises some big performance questions. Most of them are below; you don't need to answer them all if you don't want to, but it would be nice.

  • How does the program understand the code? Does it read the script line by line as strings (wouldn't this be slow?), does it semi-compile it (for example, turning all the commands into an array up front), or something else?
  • Should I just skip scripting and use C++ directly? (The problem is that I would like the user not to have to compile anything.)
  • How (if you imagine you had to compile it, but not in C++) should I handle the script code; in other words, how should my program try to understand it (e.g. now there's a bracket, so go to...)?

Maybe scripting is too hard (not writing the scripts, but getting the program to understand them), as I have no experience with it, but I'm willing to give it a try.

Thank You




In general, unless you roll your own scripting language, you won't have to deal with the text of the script at all. You just pass it off to an interpreter/runtime, and the script performs whatever tasks it needs to. This is a little tricky to get your head around until you play with it, but in general you will just write some bindings for your C++ classes and functions which map them over so they are available in your scripting environment.
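To make the "bindings" idea a bit more concrete, here is a minimal sketch using the Lua C API (Lua 5.x). The function name spawn_enemy and the inline script are made-up examples, and in practice helper libraries such as LuaBridge, sol2, or tolua++ can generate this kind of glue for you:

// Minimal sketch of binding a C++ function to Lua (Lua 5.x C API).
// The function name "spawn_enemy" and the script text are made-up examples.
#include <lua.hpp>
#include <cstdio>

// A C++ function exposed to scripts: takes (x, y), returns an id.
static int l_spawn_enemy(lua_State* L)
{
    double x = luaL_checknumber(L, 1);   // read arguments from the Lua stack
    double y = luaL_checknumber(L, 2);
    std::printf("spawning enemy at %f, %f\n", x, y);
    lua_pushinteger(L, 42);              // push a return value
    return 1;                            // number of return values
}

int main()
{
    lua_State* L = luaL_newstate();      // create a script environment
    luaL_openlibs(L);                    // load the standard Lua libraries
    lua_register(L, "spawn_enemy", l_spawn_enemy);   // expose the binding

    // The script just calls back into C++; the user never compiles anything.
    if (luaL_dostring(L, "local id = spawn_enemy(10, 20)\nprint('got id', id)") != 0)
        std::printf("script error: %s\n", lua_tostring(L, -1));

    lua_close(L);
    return 0;
}

The key point is that your C++ code never inspects the script text itself; it only registers entry points and hands the text to the runtime.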

For game and graphics development, Lua is a fairly common choice. It is fairly flexible and lets you decide how you want to use it (which can make it even more confusing when you are just getting started...). However, after you see a few examples it will make sense what you can and can't do. If you want to try out Lua, I would recommend reading through the Programming in Lua text, which is available on the Lua website. Then you can take a look at the basic scripting sample in my engine, Hieroglyph 3.

Lua does indeed compile its code before it is executed. However, the compilation can be performed at different granularities, and it can be done on the fly at runtime without much performance impact. Typically you will compile a bunch of functions and/or classes, and then just use them in your scripting environment (i.e. you only need to compile them once). In addition, you can compile from a file, from a string, or anything in between (including dynamically generated code if you like...).
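As a rough sketch of the "compile once, run many times" point (again assuming the Lua C API; the chunk text and file name are made up for illustration):

// Sketch: compile a chunk once, then run it repeatedly (Lua 5.x C API).
// Loading from a string or from a file yields the same kind of compiled chunk.
#include <lua.hpp>
#include <cstdio>

int main()
{
    lua_State* L = luaL_newstate();
    luaL_openlibs(L);

    // Compile (but don't run) a chunk from a string...
    // ...the equivalent from a file would be: luaL_loadfile(L, "scripts/tick.lua")
    if (luaL_loadstring(L, "print('tick')") != 0)
    {
        std::printf("compile error: %s\n", lua_tostring(L, -1));
        return 1;
    }

    // The compiled chunk now sits on the Lua stack; copy and call it as often as you like.
    for (int i = 0; i < 3; ++i)
    {
        lua_pushvalue(L, -1);      // duplicate the compiled chunk
        lua_pcall(L, 0, 0, 0);     // run it; no recompilation happens here
    }

    lua_close(L);
    return 0;
}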

This is a big topic, and you can use it to your advantage - take the time to learn about it and I think you will in general be very happy that you did!

How does the program understand the code? Does it read per each line of string (Wouldn't this be SLOW?), or does it semi compile (like setting all the commands into an array from the beginning), or how?

Like Jason said, you probably won't be handling the parsing or execution of the scripted instructions yourself; you'll be using a library that does that for you.
Usually, when you ask the library to load a script, it gets compiled into bytecode, but more advanced techniques exist that compile the script code into native instructions your CPU can execute directly, which improves performance a lot.
This should all be specified somewhere in your chosen library's feature list.

Should i just leave it, and use direct c++ (The problem is that i would like to have a way that the user didn't need to compile anything).

There are various reasons one might choose to use scripting in a project. You should research why you would use it, and then decide whether you'll get advantages out of it or not. Don't force yourself to use scripts if you don't need to, unless you're just learning and have some spare time.
I personally am using scripting (Mono) in my current game project for various reasons, including:

  • game code separation from engine code
  • easier to maintain and add features
  • very fast compilation times

Some libraries (I don't know if all of them do) allow you to distribute precompiled files which contain bytecode. This should reduce your scripts' file sizes and protect them against easy editing.
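As a hedged sketch of how that looks with Lua specifically: the standalone luac tool can write a script out as bytecode ahead of time (for example, luac -o player.luac player.lua), and the normal load functions accept either source text or bytecode. The file name below is made up, and precompiled chunks only load on the same Lua version and build:

// Sketch: loading a chunk precompiled offline, e.g. with "luac -o player.luac player.lua".
// The file name is made up; precompiled chunks are tied to the Lua version/build.
#include <lua.hpp>
#include <cstdio>

int main()
{
    lua_State* L = luaL_newstate();
    luaL_openlibs(L);
    // luaL_dofile accepts source text or precompiled bytecode transparently.
    if (luaL_dofile(L, "scripts/player.luac") != 0)
        std::printf("error: %s\n", lua_tostring(L, -1));
    lua_close(L);
    return 0;
}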

How (if you imagine you had to compile it, but not in c++) should i handle the script code, or in other words, how should my program try to understand it (e.g. Now there's a bracket, so go to...).

That is called Parsing, and the library does it for you.

If you're interested in how to communicate with the script's methods, you should look at what others are doing to get some ideas for your implementation. I have an implementation similar to Unity3D's, where nodes with attached scripts have their script's "update" function called every frame, and so on...
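As a sketch of that kind of per-frame hookup, here it is with the Lua C API rather than Mono (the "update" name just mirrors the convention described above; the script body and frame count are made up):

// Sketch: calling a script-defined update(dt) once per frame (Lua 5.x C API).
#include <lua.hpp>
#include <cstdio>

static void call_update(lua_State* L, double dt)
{
    lua_getglobal(L, "update");          // look up the script's update function
    if (!lua_isfunction(L, -1)) { lua_pop(L, 1); return; }   // script defines no update()
    lua_pushnumber(L, dt);               // pass the frame time as an argument
    if (lua_pcall(L, 1, 0, 0) != 0)      // call it: 1 argument, 0 results
    {
        std::printf("script error: %s\n", lua_tostring(L, -1));
        lua_pop(L, 1);
    }
}

int main()
{
    lua_State* L = luaL_newstate();
    luaL_openlibs(L);
    luaL_dostring(L, "function update(dt) print('updating, dt = ' .. dt) end");

    for (int frame = 0; frame < 3; ++frame)   // stand-in for the real game loop
        call_update(L, 1.0 / 60.0);

    lua_close(L);
    return 0;
}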

Unlike many here, I have written my own scripting language and the accompanying compiler, so I guess I can give some extra insight.

How does the program understand the code?

It doesn't. Have you ever compiled something by hand, say for a Motorola CPU? A compiler is in essence a table where MOVE from_reg, to_reg corresponds to a certain bit pattern. A component in your program knows those bit patterns and evolves its internal state accordingly, just as a CPU would. More advanced scripting languages compile the script just in time, so the CPU can execute it directly.
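To make that concrete, here is a toy sketch (a made-up instruction set, not any real engine) of a component that knows a handful of "bit patterns" (opcodes) and evolves its state the way a CPU would:

// Toy sketch of a bytecode interpreter: a made-up instruction set, not a real engine.
#include <cstdio>
#include <vector>

enum Op { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

void run(const std::vector<int>& code)
{
    std::vector<int> stack;
    size_t pc = 0;                       // "program counter" into the bytecode
    for (;;)
    {
        switch (code[pc++])              // decode one instruction and act on it
        {
        case OP_PUSH:  stack.push_back(code[pc++]); break;
        case OP_ADD:   { int b = stack.back(); stack.pop_back(); stack.back() += b; } break;
        case OP_MUL:   { int b = stack.back(); stack.pop_back(); stack.back() *= b; } break;
        case OP_PRINT: std::printf("%d\n", stack.back()); break;
        case OP_HALT:  return;
        }
    }
}

int main()
{
    // Equivalent of the script statement "print((2 + 3) * 4)".
    run({ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_PRINT, OP_HALT });
    return 0;
}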

Anyway, if you're thinking about scripting in terms of gaming and performance, I'm afraid the design has gone wrong somewhere. Scripting should be implemented in a way that makes it a non-issue performance-wise, which typically means being event-driven.

Should i just leave it, and use direct c++ (The problem is that i would like to have a way that the user didn't need to compile anything).

You can get a lot of mileage out of C++. Use scripting for "gameplay logic".

How (if you imagine you had to compile it, but not in c++) should i handle the script code, or in other words, how should my program try to understand it (e.g. Now there's a bracket, so go to...).

Just in the same way you read the code. Are you aware of formal languages and BNF? Perhaps take a look at them. They're very easy at their core (albeit they will give you headaches).
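For a taste of what that looks like, a tiny (E)BNF grammar for arithmetic expressions might read (purely illustrative, not a full language):

    <expr>   ::= <term> { ("+" | "-") <term> }
    <term>   ::= <factor> { ("*" | "/") <factor> }
    <factor> ::= NUMBER | "(" <expr> ")"

In a recursive-descent parser, each rule typically becomes one function, which is about the simplest way to turn such a grammar into working code.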


FWIW, for my project, I am using my own script language (sort of like JavaScript and ActionScript mixed with C, Java, and C#).


and, generally the way it is run works in a number of stages.

first, there is a parser, which:
starts breaking the code down into individual "tokens", for example, a brace, or an identifier (variable name), or a number, ... each according to their specific rules (numbers contain digits, ...);
starts matching the tokens against various syntax patterns, going down the parts of the tree for which the syntax matches;
in doing so, produces a tree-like structure (an "AST") representing the code it has seen along the way.
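as a rough illustration of that first tokenizing step, here is a toy lexer (written for this post, not the one from my language):

// Toy tokenizer sketch: breaks source text into tokens by simple character rules.
#include <cctype>
#include <cstdio>
#include <string>
#include <vector>

struct Token { std::string text; const char* kind; };

std::vector<Token> tokenize(const std::string& src)
{
    std::vector<Token> out;
    size_t i = 0;
    while (i < src.size())
    {
        if (std::isspace((unsigned char)src[i])) { ++i; continue; }       // skip whitespace
        if (std::isdigit((unsigned char)src[i]))                          // numbers: runs of digits
        {
            size_t start = i;
            while (i < src.size() && std::isdigit((unsigned char)src[i])) ++i;
            out.push_back({ src.substr(start, i - start), "number" });
        }
        else if (std::isalpha((unsigned char)src[i]) || src[i] == '_')    // identifiers
        {
            size_t start = i;
            while (i < src.size() && (std::isalnum((unsigned char)src[i]) || src[i] == '_')) ++i;
            out.push_back({ src.substr(start, i - start), "identifier" });
        }
        else                                                              // single-char punctuation
        {
            out.push_back({ std::string(1, src[i++]), "punct" });
        }
    }
    return out;
}

int main()
{
    for (const Token& t : tokenize("x = foo(42) + 1;"))
        std::printf("%-10s %s\n", t.kind, t.text.c_str());
    return 0;
}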

the next stage is the front-end compiler which:
walks along this structure, figuring out various pieces of information (where is this variable declared? does this operator have a known type? ...) and starts spitting out appropriate globs of bytecode.
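a toy sketch of that "walk the tree, emit bytecode" step (made-up AST and opcodes, much simpler than a real front-end):

// Toy sketch: walking an expression AST and emitting stack-machine bytecode.
#include <cstdio>
#include <vector>

enum Op { OP_PUSH, OP_ADD, OP_MUL };

struct Node                      // minimal AST: either a number literal or a binary operator
{
    char op;                     // '+', '*', or 0 for a literal
    int value;                   // used when op == 0
    const Node* lhs;
    const Node* rhs;
};

void emit(const Node* n, std::vector<int>& code)
{
    if (n->op == 0)                              // literal: push its value
    {
        code.push_back(OP_PUSH);
        code.push_back(n->value);
        return;
    }
    emit(n->lhs, code);                          // compile the operands first...
    emit(n->rhs, code);
    code.push_back(n->op == '+' ? OP_ADD : OP_MUL);   // ...then the operator
}

int main()
{
    // AST for (2 + 3) * 4
    Node two{0, 2, nullptr, nullptr}, three{0, 3, nullptr, nullptr}, four{0, 4, nullptr, nullptr};
    Node add{'+', 0, &two, &three};
    Node mul{'*', 0, &add, &four};

    std::vector<int> code;
    emit(&mul, code);
    for (int b : code) std::printf("%d ", b);    // prints the raw bytecode stream
    std::printf("\n");
    return 0;
}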

at this point, bytecode may be saved for later, pulled from files, or simply passed to the back-end.


then, we get to the interpreter backend, which in my case currently:
begins by converting this bytecode into a list-like structure (representing individual bytecode operations), usually whenever a function or method is first called;
splits apart this structure into a collection of "traces", which represent various non-branching sequences of bytecode instructions (technically, this is a "Control Flow Graph", or "CFG");
it may either directly execute these traces (via embedded function pointers), or (if a counter runs out) pass them off to the JIT, which then spits out a mix of function calls back into the interpreter and directly compiled machine-code sequences for various operations.


granted, these last stages aren't strictly required, and there are many interpreters which instead directly execute bytecode (by decoding and invoking the logic for each bytecode instruction one-at-a-time).

and, some much simpler interpreters will simply walk over the AST, and some directly on the input text.


the tradeoff mostly has to do with performance, and the relative amount of resources "invested" into a piece of code.
simpler strategies are better when code will most likely only ever be run once. typically, execution speeds are "very slow" (often 1000x-10000x or more slower than native).

directly interpreting bytecode is generally better when code will be run more than once (so hopefully isn't "dead slow"), but doesn't need to be "particularly" fast. bytecode interpretation can usually get within about 100x-200x of native speeds (IME with "while()" loops and large "switch()" blocks).

(some people have also had good luck with getting good performance out of "computed goto" and "label pointers", but these are generally GCC specific).
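for reference, the "computed goto" / "label pointer" style of dispatch looks roughly like this (it relies on the GCC/Clang "labels as values" extension, not standard C++, and the opcodes are made up):

// Sketch of computed-goto dispatch (GCC/Clang "labels as values" extension, not standard C++).
#include <cstdio>

int run(const int* code)
{
    static void* dispatch[] = { &&do_push, &&do_add, &&do_halt };   // addresses of the handlers
    int stack[64]; int sp = 0;
    #define NEXT() goto *dispatch[*code++]                          // jump straight to the next handler

    NEXT();
do_push: stack[sp++] = *code++;            NEXT();
do_add:  sp--; stack[sp - 1] += stack[sp]; NEXT();
do_halt: return stack[sp - 1];

    #undef NEXT
}

int main()
{
    const int code[] = { 0, 2, 0, 3, 1, 2 };     // push 2, push 3, add, halt
    std::printf("%d\n", run(code));              // prints 5
    return 0;
}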


trace-graphs can get a little faster, IME generally around 50-60x of native (though some experimental interpreter designs have run much faster than this, like 8-10x of native, but it seems hard to replicate this effect in a "real" interpreter). (granted, some of that could have to do with architecture: my experimental interpreter was more Dalvik-like, using untagged registers and three-address operations, whereas my main interpreter is a stack machine and currently still uses tagged references).

this is partly because a lot of the "figuring out what to do" work can be moved out of the execution path, so it largely becomes a matter of simply calling through function pointers (generally using a "trampoline loop").
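a sketch of the general trampoline-loop idea (not my actual trace format): each trace is just a function that does its straight-line work and returns a pointer to the next trace to run:

// Sketch of a trampoline loop: each "trace" does some non-branching work,
// then tells the loop which trace to run next (or none, to stop).
#include <cstdio>

struct Context { int i; int sum; };

struct Trace;                                    // forward declaration so traces can point at each other
typedef const Trace* (*TraceFn)(Context&);
struct Trace { TraceFn fn; };

extern const Trace loop_body, loop_check, done;  // a tiny control-flow graph of three traces

// sum += i; i++  (a straight-line block)
const Trace* body_fn(Context& c)  { c.sum += c.i; ++c.i; return &loop_check; }
// if (i < 10) go back to the body, else fall through to "done"
const Trace* check_fn(Context& c) { return c.i < 10 ? &loop_body : &done; }
const Trace* done_fn(Context&)    { return nullptr; }

const Trace loop_body  = { body_fn };
const Trace loop_check = { check_fn };
const Trace done       = { done_fn };

int main()
{
    Context c = { 0, 0 };
    // The trampoline: keep calling through function pointers until a trace says "stop".
    for (const Trace* t = &loop_check; t != nullptr; t = t->fn(c)) { }
    std::printf("sum = %d\n", c.sum);            // 0+1+...+9 = 45
    return 0;
}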


a JIT can still be faster though, reaching speeds comparable to native C or C++ code (and doing pretty much anything else a traditional compiler can do), though this isn't always the case. for example, my current JIT is fairly naive and can get within about 3x of native C speeds in some cases, but typically runs a bit slower than this (as much of its operation is still handled via interpreter machinery). often this doesn't matter much, as I am not usually using the scripting language for speed-critical tasks. (basically, it goes fastest if treated like a C-like subset of Java, but using it this way is lame, and is not really what it was designed for...). (reaching C-like speeds generally requires things like figuring out how to cache variables in registers, performing register allocation, ..., which some of my past JITs have done, but my current JIT doesn't.)


note that, while interpreters can be interesting, writing them can also end up eating a large amount of time and effort, so they are not necessarily a good idea if a person has other things they want to get done (and it may take years of time and effort invested in them before they stop looking like a joke compared with more mature pieces of technology).

