Vincent_M

Understanding Emscripten


I'm getting into Emscripten, and it sounds pretty interesting. When I first started learning about it, it sounded like I would compile my C/C++ through a compiler (gcc/g++ or clang/clang++), and the object files would be translated into compiled JavaScript. By compiled JavaScript, I thought they meant the bytecode that would normally be generated by the JS engine's own compiler, which would reduce the amount of JS code sent to the browser (for browser implementations) and save compile time. This compiled JS would also carry many optimizations, such as strongly typed variables and the other optimizations that gcc/clang generally perform.

 

After learning more about Emscripten, it sounds like it uses LLVM to generate bytecode that gets run through a JavaScript-based interpreter, where that JS interpreter is itself executed by the JS engine (Node, V8, SpiderMonkey, etc.). Is this correct?

 

Also, LLVM isn't a compiler, but rather a later phase in the compilation process, right? It sounds like Emscripten works with LLVM, and its own Fastcomp compiler core to create the byte-code.

 

I've read that when configured correctly, Emscripten can output code that executes faster than JavaScript code written by hand. It may also have comparable speed to C/C++ code (up to 50% C/C++ speed).

I know vanishingly little about Emscripten, but I'm pretty familiar with LLVM.

LLVM is a toolkit with a couple of highly relevant components for this sort of work. First and more general-purpose is LLVM Bitcode, a sort of portable assembly language using Static Single Assignment rules. Secondly it is a collection of compiler back-ends which convert Bitcode into machine code for various machines.

I can't say for sure what Emscripten actually uses LLVM for.



snake5 said:
There is no "JS-based interpreter". C/C++/.. code -> LLVM IR -> JS code (that does not contain any interpreters).

That is not correct. Emscripten's compiler produces bytecode compatible with asm.js, which is an array-based virtual machine implemented in JavaScript.

 

It also packages up the asm.js script, and all the other components necessary to run the program, which is why one might have assumed that no interpreter was involved.


Quote
LLVM is a toolkit with a couple of highly relevant components for this sort of work. First and more general-purpose is LLVM Bitcode, a sort of portable assembly language using Static Single Assignment rules. Secondly it is a collection of compiler back-ends which convert Bitcode into machine code for various machines.

I can't say for sure what Emscripten actually uses LLVM for.

AND

snake5, on 23 Jul 2015 - 9:56 PM, said:
There is no "JS-based interpreter". C/C++/.. code -> LLVM IR -> JS code (that does not contain any interpreters).

Quote
That is not correct. Emscripten's compiler produces bytecode compatible with asm.js, which is an array-based virtual machine implemented in JavaScript.

 

I'm still pretty confused, but this is along the lines of what I had gathered before starting this thread. I read a presentation on asm.js back in February, and it sounds like it ports compiled C/C++ code over to optimized JS (strong typing, etc.), like what SeanMiddleditch mentioned:

 

 


Emscripten also provides implementations of various C libraries that wrap the JavaScript environment. That is a slightly more complex topic that requires you to understand linkers and how the full C/C++ compilation model works, and I'm not sure that's the case.

The result is that C code like

    int a = func(b * 2);
    int c = a - 1;

gets translated into LLVM IR like

    $0 = %b * 2;
    %a = call func $0;
    %c = %a - 1;

and then the Emscripten backend translates that into JS like

    var _0 = (b * 2)|0;
    var _1 = func(_0)|0;
    var c = (_1 - 1)|0;

The final bit is the form described by asm.js, which is legal JavaScript but with some extra hints to help specialized JIT engines generate faster machine code.

The JS is just JS. You load it into a browser just like any other JS. It can be interpreted by any JavaScript engine, though it'll probably be very slow on any browser that doesn't explicitly optimize for asm.js formatted code. Emscripten is just converting C++ source into JavaScript source.
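To make the quoted translation concrete, here is a minimal sketch of the same two C statements written out by hand as a runnable asm.js-style function. The body of `func` is a made-up placeholder (the post never defines it), and the "use asm" directive that a real asm.js module starts with is left out so the snippet runs as plain JavaScript anywhere. Treat it as an illustration of the style, not as actual Emscripten output.

```javascript
// Hypothetical stand-in for the C function `int func(int)`; the original
// post does not say what it does, so this body is an assumption.
function func(x) {
  x = x | 0;            // parameter annotation: x is a 32-bit integer
  return (x + 1) | 0;   // result coerced back to a 32-bit integer
}

// asm.js-style rendering of:  int a = func(b * 2); int c = a - 1;
function compute(b) {
  b = b | 0;
  var _0 = 0, _1 = 0, c = 0;   // locals start as integer zeros
  _0 = (b * 2) | 0;
  _1 = func(_0) | 0;
  c = (_1 - 1) | 0;
  return c | 0;
}

console.log(compute(5));   // prints 10
```

Nothing here needs an interpreter: any JavaScript engine can run it directly, and an engine that recognizes the pattern can treat every value as a plain 32-bit integer.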

 

And, backing up a statement from snake5 here, I think I can see how LLVM fits into the process: as optimization during the conversion from C/C++ to JS:

 

 


One does not _need_ something like LLVM to do this, but given the vast differences in the computing model of C and JavaScript, it really helps to have a good optimizer simplify the input C++ code to simpler machine-like instructions that can be translated to the target language.

 

I still can't wrap my head around how Fastcomp fits into the picture if asm.js also generates JS. Is Fastcomp used by asm.js to actually generate the JS? It sounds like Fastcomp was a replacement for the compiler core by Emscripten.

 


asm.js is so named because it only really supports simple operations on integer-like constructions and typed arrays, much like assembly language only really supports simple operations on registers and memory. It's not a special bytecode, just a valid subset of JavaScript that is capable of emulating the C++ environment and that is easy for specialized JavaScript JIT engines to optimize into actual machine code.

It also sounds like asm.js could just be doing JS generation of simple C/C++ operations. Things like class scope, and possibly C function scoping and structs, are left to LLVM and Fastcomp.

 


Projects like WebAssembly seek to provide an actual binary "machine code" for the Internet to get around the size and speed limitations of using JavaScript for distributing and executing compiled languages like C++ or C#.

This interests me a lot. JS opens many doors, but it can't do everything. The fact that we currently have to convert all dependency libraries into JS source gives me a nasty feeling. It basically open-sources your applications too. It makes me wonder how heavier applications, like Google's HTML5 player for YouTube, work. It makes sense to send the compressed data over a stream (e.g. a WebSocket), then decompress it on the client side. When users upload videos in a supported format, the server probably converts them to a common video format that's stored on their cloud, which is probably why videos take so long to upload. A WebGL fragment shader could then convert the video's color space (most likely YUV) to RGB for output to the canvas when streaming. Of course, this means that the HTML5 player's JS is sent over the pipe to the client, and everyone has access to it. I haven't been able to find the source in my dev tools to verify this, however.

 

Hopefully an open-standard machine language spec is developed so that browsers can write a more efficient VM for code. It'd be really interesting to be able to write C, C++ or C# applications as libraries and such, and be able to compile them into a common web-standard binary. We could go back to compiling our dependency libraries and the executable as separate, compiled sources that both have the advantage of being smaller in download size, less bloated in memory consumption, and faster in execution. In a way, I think I'm describing an open standard for a web-based OS packaged into a browser.

Edited by Vincent_M

...

Quote
That is not correct. Emscripten's compiler produces bytecode compatible with asm.js, which is an array-based virtual machine implemented in JavaScript.

It also packages up the asm.js script, and all the other components necessary to run the program, which is why one might have assumed that no interpreter was involved.

 

asm.js is not a virtual machine, where did you get that idea? (also, "array-based", wat?) It's a specification, supported by some browsers, that repurposes otherwise rather useless operations (such as "x|0") to declare the type of a variable (integer/float). That lets the engine generate simplified (and thus faster) code, provided that some usage rules are not broken.
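As a small sketch of those type annotations: `x|0` forces a value to a signed 32-bit integer and unary `+` marks it as a double. The function names below are made up for illustration. The code is ordinary JavaScript, so it behaves the same whether or not the engine has any asm.js-specific optimizations.

```javascript
// `x | 0` coerces a value to a signed 32-bit integer.
function addInt(a, b) {
  a = a | 0;
  b = b | 0;
  return (a + b) | 0;   // the sum is wrapped back into 32 bits
}

// Unary `+` marks a value as a double.
function addDouble(a, b) {
  a = +a;
  b = +b;
  return +(a + b);
}

console.log(addInt(2147483647, 1));   // -2147483648: wrapped to 32 bits
console.log(addInt(3.9, 0));          // 3: the fractional part is dropped
console.log(addDouble(0.5, 0.25));    // 0.75
```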

 

Also, what "all the other components" are you talking about? There's just the runtime library to include, and that doesn't include an interpreter, just the C/C++ runtime library implementation, including special ArrayBuffer-based heap emulation code. I wonder what you're confusing for an interpreter here.
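To give a feel for what "runtime library code on top of an ArrayBuffer" can mean, here is a deliberately tiny bump allocator. It is a hypothetical sketch, far simpler than a real malloc and not taken from Emscripten's runtime; `HEAP_SIZE`, `heapTop` and `bumpAlloc` are invented names.

```javascript
var HEAP_SIZE = 1024;
var buffer = new ArrayBuffer(HEAP_SIZE);   // the emulated C heap
var heapTop = 8;                           // address 0 stays unused so it can act as NULL

// Hands out the next free 8-byte-aligned region; returns 0 when out of memory.
// There is no free(): a bump allocator only ever moves forward.
function bumpAlloc(size) {
  size = size | 0;
  var ptr = heapTop;
  var aligned = (size + 7) & ~7;           // round the request up to a multiple of 8
  if (ptr + aligned > HEAP_SIZE) return 0;
  heapTop = ptr + aligned;
  return ptr;
}

var a = bumpAlloc(4);
var b = bumpAlloc(10);
console.log(a, b);   // prints 8 16
```

The "pointers" it returns are just integer byte offsets into the buffer, which is all the compiled code needs.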



Quote
including special ArrayBuffer-based heap emulation code

Hence 'array-based virtual machine'.

 

You can't implement most of the semantics of a C-like language without support for a flat, random-access memory model, and JavaScript lacks such a thing. Thus we have to treat each segment as a giant array, and generate code that interacts almost entirely with those arrays.

 

You imply it's just the C/C++ runtime library that is affected, but it's much lower level than that: you can't even implement something as simple as the assignment operator without the heap emulation.
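A rough sketch of what that means for assignment through a pointer: every load and store becomes an index into a typed-array view of one shared buffer. The view names `HEAP32` and `HEAPU8` follow the convention used in Emscripten-generated code, but the helper functions and addresses are made up for illustration.

```javascript
var buffer = new ArrayBuffer(1024);    // one flat block of "memory"
var HEAP32 = new Int32Array(buffer);   // the same bytes viewed as 32-bit ints
var HEAPU8 = new Uint8Array(buffer);   // the same bytes viewed as unsigned bytes

// C: void store(int *p, int v) { *p = v; }
function store(p, v) {
  p = p | 0;
  v = v | 0;
  HEAP32[p >> 2] = v;                  // byte address -> Int32Array index
}

// C: int load(const int *p) { return *p; }
function load(p) {
  p = p | 0;
  return HEAP32[p >> 2] | 0;
}

// A memcpy-like byte copy, working through the byte view.
function copyBytes(dst, src, n) {
  dst = dst | 0;
  src = src | 0;
  n = n | 0;
  var i = 0;
  for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
    HEAPU8[(dst + i) | 0] = HEAPU8[(src + i) | 0];
  }
}

store(8, 123456789);       // *(int *)8 = 123456789
copyBytes(32, 8, 4);       // copy those four bytes to address 32
console.log(load(32));     // prints 123456789
```

Because both views alias the same buffer, a value written as an int can be copied byte by byte and read back as an int, which is the flat-memory behavior C code expects.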



Quote
I still can't wrap my head around how Fastcomp fits into the picture if asm.js also generates JS. Is Fastcomp used by asm.js to actually generate the JS? It sounds like Fastcomp was a replacement for the compiler core by Emscripten.

 

Historically, the final Emscripten JavaScript code generation pass was written as a JavaScript command-line application (executed in the Node.js shell). After a while, this became a performance bottleneck (mostly due to having to parse LLVM IR bytecode files in JS), so we needed to optimize the JavaScript code generation pass. This was done by implementing the final JS code generation in C/C++, directly as a backend for LLVM (note that this is not currently a tablegenning backend, but is similar to the LLVM CppBackend pass). The project lived in the GitHub branch "fastcomp", and the name stuck around as the colloquial term for the portion of the Emscripten compiler that is implemented as part of the LLVM toolchain, as opposed to the Emscripten frontend, which is a bunch of other Python and JS scripts that comprise the whole Emscripten toolchain.

 

Currently "fastcomp" is the only supported code generation mode, but for a while we supported both the old and new code generation modes in parallel. Also, fastcomp is only able to target the asm.js style of JavaScript. The old pre-fastcomp JS compiler backend was able to target both asm.js and non-asm.js styles of JavaScript, in order to give projects a managed migration path from the very old non-asm.js codegen to the more performant asm.js codegen. Today there is no reason to use the old non-fastcomp compiler backend anymore, and it has long since been deprecated and removed from the Emscripten code tree, as the fastcomp backend has matured enough to be the only codegen path.



Quote
It may also have comparable speed to C/C++ code (up to 50% C/C++ speed).
Only in benchmarks. I've yet to see an actual project that did not use 10x more memory, load many times slower and execute enough code to stress the CPU well enough for measurements.

 

There are quite a few projects out there by now that utilize Emscripten to deliver full applications on the web. Most of these certainly do not use ten times the memory compared to native (or handwritten JavaScript), nor do they load many (how many?) times slower. Here is a list of some off the top of my head that have shipped:

   - The Humble Mozilla Bundle: https://marketplace.firefox.com/app/humble-asmjs-store

        - Contains asm.js versions of the following native games that can be run directly in a web browser: AAAaaa for Awesome!, Democracy 3, Dustforce DX, FTL: Faster than Light, Jack Lumber, Osmos, Super Hexagon, Voxatron, Zen Bound 2

   - The Internet Archive MS-DOS Games: https://archive.org/details/softwarelibrary_msdos_games

   - The Internet Archive MESS and MAME Arcade Games: https://archive.org/details/messmame

   - Unity3D Dead Trigger 2 demo (very old now): http://beta.unity3d.com/jonas/DT2/

   - Unity3D AngryBots demo (very old now): http://beta.unity3d.com/jonas/AngryBots/

   - Unity3D WebGL Benchmark (somewhat old now): http://beta.unity3d.com/jonas/WebGLBenchmark/

   - Since Unity 5.0, Unity supports asm.js+WebGL deployment: http://docs.unity3d.com/Manual/webgl-building.html

   - Since Unreal Engine 4.5, it has been possible to deploy to asm.js+WebGL. UE 4.7 makes the UI workflow much easier without having to "hack" the engine internals.

   - Unfortunately I was not able to find good demos of Unreal Engine offerings hosted on the web. It is possible to locally build the UE4 demos if one downloads the engine though. There is the Tappy Chicken demo at least: https://www.unrealengine.com/html5/

   - Dune 2, Transport Tycoon Deluxe, X-COM: Enemy Unknown, Caesar 3: http://epicport.com/en

   - My HTML5 ScummVM demo (very old now, predates asm.js even, so performance is not representative): http://clb.demon.fi/html5scummvm/

   - If you are on a Firefox OS mobile device, it is possible to download the Disney Where's My Water? and Where's My Perry? games from the Marketplace. Those ports use Emscripten to deploy the games on the phone.

   - Autodesk FormIt 360: http://formit360.autodesk.com/app/ . This is a good example of a non-game application using asm.js.

 

The number of projects deploying to the web has been steadily rising for the past year. Looking ahead, we are working on the WebAssembly specification, which aims to improve the asm.js experience on the web even further, and we are also adding support for WebGL 2, SIMD and multithreading in order to bring feature and performance parity even closer to native.
