cr88192

  1. when I was messing with WebGL (to a very limited extent), I had a similar issue. I addressed it mostly by disabling alpha in the canvas: apparently the browser treats the canvas's alpha channel as alpha-blending with the page background, so any "transparent" parts of the canvas let the page background peek through. dunno if this helps.
  2. yes, this is a big one. with an old project, nearly all the infrastructure is generally already in place, but with a new project one is often pretty much back to the bare language, or is using a different language with less-familiar APIs (ex: Java, Dart, or Java via GWT, ..., vs me generally much preferring to use C).

    likewise with new designs: I have had much more success with designs which build on prior designs than with things designed entirely new. designs which already exist seem to have a few advantages: they are known to be implementable, they are generally free of "didn't take something into account, so it is unimplementable as-described or will otherwise be internally overly complex or inefficient", and they have fewer cases of "this feature seemed nice on paper but is nearly useless in practice". often, a lot of the details will already have been worked out (people will have fiddled with it, making sure it addresses various special cases, is relatively efficient, ...).

    this is not necessarily to say much for "elegance" or "perceived simplicity": often, the more elegant and simple-looking the thing is, the more dragons lurk in the implementation (hidden complexities and inefficiencies). often, "things which work well" end up very ugly in a design-aesthetics sense (and sometimes with features which seem blatantly counter-intuitive, ...).

    this doesn't mean one can't write new code or create new designs, but it does often favor going the "embrace and extend" route with the design aspects, and relying more on iteration than on clean design.
  3. I seem to have a fair bit of difficulty with a few major things: making UIs/artwork/... which don't suck; doing stuff which lacks an obvious way forward (I can do pretty well at throwing together implementations of various specs, or at cloning stuff, but things are often a lot slower and more painful if I have to find the way forward for myself); trying to do anything "from a clean slate" (vs hacking on or extending things); ...

    some general things I seem to have difficulty with: thinking "high-level" (pretty much my entire world is "whatever I have on hand at the moment"; it gets frustrating with people always expecting me to think "higher level", as for me these sorts of "high-level" thoughts simply don't really exist in the first place); thinking of "the future" as something which actually exists (it always just seems so distant, whereas whatever is going on at the moment, I can see as it takes place); being expected to plan/research/... things in isolation, vs just going and "doing whatever" (and working out the specifics as they come up); ...

    my way forward is usually just to pile up various things and options / experiences / ..., and see which is more usable/promising/works-better/..., with each thing seemingly opening the way to the next (like, if I do something myself, I have a better feel for what is involved and how everything works), or, if all else fails, lots of fiddling with stuff.
  4. (if I understand correctly) this is basically what I had meant by a slab allocator. a slab allocator can be fast, and can also reduce memory use by allowing multiple objects to share the same object-header (all of them implicitly have the same logical type and share other details).

    for 4 byte items, you would allocate space for 256 items (1KiB) and then add them to a free-list of 4 byte items. for 8 or 16 byte items it would be similar, just allocating 2KiB or 4KiB instead.

    then, for a range of sizes, you would use a number of free-lists of fixed-size members, mapping the size to a specific free-list with something like:

        nb=(sz+15)>>4;

    while this isn't perfect, allocations/frees can generally be made pretty fast (in the case where the free-lists aren't empty). allocation in this case basically looks like (a fuller sketch follows after the footnote below):

        if(mm_freelist[nb])
        {
            ptr=mm_freelist[nb];
            mm_freelist[nb]=*(void **)ptr;
            return(ptr);
        }
        ... fall back to general-case allocation ...

    faster is possible via a few tricks, namely having functions hard-coded for a specific allocation/free size, and using separate per-thread contexts (to avoid the need for locking during free-list allocations). for a general-case allocator, it is usually necessary to use a lock to keep threads from stomping on each other (and for operations which work with or update the heap).

    another trick is to use pre-made "allocation handles", where one first creates a handle-object describing the type of allocation (size, type, flags, ...) and then does allocations/frees using this handle. internally, the handle may contain function pointers for the specific allocation/free logic, and some data members that may be used for special free-lists or other things. these may or may not use slabs, but will typically use a pool-allocation strategy (where dedicated free-lists exist for specific types of object, ...).

    this may not necessarily apply to "general" code, but in my case code sometimes pops up which requires high-speed allocation/freeing of typically-small objects (often with specific type-information and special semantics/behaviors). likewise, these large numbers of small objects make up a big part of the heap: there are often millions of them in dumped memory-stats, with 16-64 byte objects comprising the vast majority, vs a relatively small number of significantly larger allocations (thousands of kB-range objects, and tens of MB-range objects).

    keeping the object-header small is generally also a goal. currently, my MM/GC uses a 16-byte header, which basically holds:
      * the object type-ID (objects in my main MM each have a qualified type, externally represented as a string giving a type-name);
      * the object allocation size;
      * data about where the allocation call was made (useful for debugging);
      * reference counts (for ref-counted data);
      * some flags and other things;
      * a check-value (used to help detect if the header was corrupted, say by an overrun, *).

    a lot of this is bit-packed, with a few special header sub-variants (such as for objects currently in the free-list, ...). it was originally an 8-byte header, but was expanded to 16 partly because: I wanted to support larger objects (the same header is used by large objects as well, and would otherwise limit allocations in 64-bit apps to 256MB); I needed space for additional metadata; and many objects require 16-byte alignment anyway (for SSE/etc); ...
    *: detecting corrupt headers is useful for things like heap forensics, where the MM may detect that a header is corrupt, dump diagnostic information, try to detect which memory object overflowed or underflowed, and make a guess as to where in the codebase to start looking for the offending code, in addition to raising an exception. also, when freeing an object, it helps the MM detect whether the object overran its bounds, without requiring additional memory for extra check-values. though, with, say, 12 bits for a check-value, there is a 1/4096 chance of a trashed header escaping notice. this can be reduced some by additional "sanity checks" (example: does the size in the header fall within what could fit in the given memory-cells, and is it appropriate for this part of the heap? are the object type-ID and source-ID valid?), but this needs to be balanced against performance (leaving it more as something for heap-analysis).
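    for what it's worth, a minimal sketch of the size-binned free-list scheme described above (hypothetical names like mm_refill and MM_NLISTS; per-thread contexts, the shared object headers, and the real general-case fallback are all omitted):

        #include <stddef.h>
        #include <stdlib.h>

        #define MM_CELL   16    /* free-list granularity, 16-byte steps */
        #define MM_NLISTS 64    /* free-lists cover sizes up to 1008 bytes */

        static void *mm_freelist[MM_NLISTS];

        /* carve a fresh slab into 256 items and push them onto free-list 'nb';
           in the real MM, all items in a slab share one object header */
        static void mm_refill(int nb)
        {
            int i, sz = nb * MM_CELL;
            char *slab = malloc(256 * sz);
            if (!slab) return;
            for (i = 0; i < 256; i++) {
                void *p = slab + i * sz;
                *(void **)p = mm_freelist[nb];  /* link item into the list */
                mm_freelist[nb] = p;
            }
        }

        void *mm_alloc(size_t sz)   /* assumes sz > 0 */
        {
            int nb = (int)((sz + 15) >> 4);   /* map size to a free-list */
            void *ptr;
            if (nb >= MM_NLISTS)
                return malloc(sz);            /* general-case fallback */
            if (!mm_freelist[nb])
                mm_refill(nb);
            ptr = mm_freelist[nb];
            if (ptr)
                mm_freelist[nb] = *(void **)ptr;
            return ptr;
        }

        void mm_free(void *ptr, size_t sz)
        {
            /* size passed by the caller here; the real MM would get it
               from the object header instead */
            int nb = (int)((sz + 15) >> 4);
            if (nb >= MM_NLISTS) { free(ptr); return; }
            *(void **)ptr = mm_freelist[nb];  /* push back onto the list */
            mm_freelist[nb] = ptr;
        }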
  5. FWIW: for allocating small objects (generally under 1-4kB), I prefer bitmaps over fixed-size cells instead of linked lists.

    pros (vs a linked-list allocator): they tend to have a smaller per-object overhead (no internal pointers needed, just a constant per-space overhead for the bitmaps); they implicitly merge adjacent free memory when memory is freed; object lookup from arbitrary pointers can be made very fast (good for things like dynamic type-checking, which can be made O(1)); the basic logic is fairly simple; and it is easier to add logic to detect/analyze overruns (an overrun will corrupt object data, but will not necessarily cause critical damage to heap metadata, which may be stored elsewhere in memory).

    cons: direct allocation/freeing becomes linearly slower with object size (need to scan/update bitmaps); high overheads for larger objects (due to using lots of bitmap entries / cells); more limited "effective range" of object sizes (vs being similarly effective over a wide range of sizes, as is the case for lists).

    partial solutions: use a pool allocator to speed up allocation/freeing (the pool allocator then serves as the primary allocator and uses the bitmap allocator for backing, with the pool allocator holding free-lists of fixed-size items); or use several different strategies, picking the one most appropriate for the given object size.

    for example, my current main allocator works sort of like this (16 byte cells):
      * under 16 bytes: use a slab-based allocator (say, allocate memory for items in groups of 256 and pad them to a power-of-2 size);
      * 16-6143 bytes: use the bitmap allocator (with a pool allocator to make it faster);
      * 6kB+: use a "large object" heap, which manages memory objects mostly via sorted arrays of pointers and object headers (mostly using binary lookup and binary insert).

    I had at one point considered adding an additional layer of 256-byte cells to be used for 4kB to 256kB, but never did so.

    though, yes, a linked list works. a few ideas: pad object sizes up to multiples of a certain value (ex: 8 or 16 bytes), which will help reduce fragmentation; use free-lists of fixed allocation sizes (avoids the need for searching) or of size-ranges (reduces searching); only merge with the next item on free, and/or have a special case to walk the heap and merge adjacent free items as needed.
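    a minimal sketch of bitmap allocation over fixed-size cells (hypothetical names; assumes one fixed heap region, no pool-allocator front-end, and the caller supplying the size on free, where a real version would keep per-object metadata):

        #include <stddef.h>

        #define CELL_SIZE 16
        #define N_CELLS   (1 << 20)   /* 16MB heap, for illustration */

        static unsigned char bm_heap[N_CELLS * CELL_SIZE];
        static unsigned char bm_bits[N_CELLS / 8];  /* 1 bit per cell, 1=used */

        static int  cell_used(int i) { return (bm_bits[i >> 3] >> (i & 7)) & 1; }
        static void cell_set(int i, int v)
        {
            if (v) bm_bits[i >> 3] |=  (1 << (i & 7));
            else   bm_bits[i >> 3] &= ~(1 << (i & 7));
        }

        /* linear scan for a run of n free cells; this is the part that gets
           linearly slower with object size, hence the pool-allocator front-end */
        void *bm_alloc(size_t sz)
        {
            int n = (int)((sz + CELL_SIZE - 1) / CELL_SIZE), i, j;
            for (i = 0; i + n <= N_CELLS; i++) {
                for (j = 0; j < n && !cell_used(i + j); j++);
                if (j == n) {
                    for (j = 0; j < n; j++) cell_set(i + j, 1);
                    return bm_heap + (size_t)i * CELL_SIZE;
                }
                i += j;   /* skip just past the used cell we hit */
            }
            return 0;     /* out of memory */
        }

        /* freeing just clears bits, so adjacent free space merges implicitly;
           object lookup from a pointer is also O(1): cell=(ptr-heap)/CELL_SIZE */
        void bm_free(void *ptr, size_t sz)
        {
            int i = (int)(((unsigned char *)ptr - bm_heap) / CELL_SIZE);
            int n = (int)((sz + CELL_SIZE - 1) / CELL_SIZE), j;
            for (j = 0; j < n; j++) cell_set(i + j, 0);
        }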
  6. Your first game / programming project?

      so I could license my engine and other stuff under whatever terms I felt like. this meant discarding and needing to rewrite some amount of code, but this wasn't a huge loss.

    generally, I am now using MIT/X11 licensing for most of my infrastructure and utility code (so people can basically do whatever with it); however, my 3D engine proper is mostly proprietary (note that the source is still available, so it is sort of like MS Shared Source or similar).

    personally, I feel at present that MIT/X11 or BSD licensing is likely better for "things in general" than the GPL; the GPL and LGPL also put some legal burden on users of the code (whereas with MIT and BSD, people can pretty much "do whatever").

    the reason for putting most of the 3D engine under a proprietary license was so that people wouldn't be able to (legally) just grab all the engine source and run with it (or make their own games and not pay anything), and also so that I could (at least theoretically) ask people to pay for it (and not just have it go up on download sites or whatever...). though, Creative Commons BY-NC-ND would be pretty close to the existing terms here, and I could potentially consider moving the 3D engine over to CC-BY-NC-ND or CC-BY-NC or similar at some point.

    practically though, it doesn't really make a difference, as basically it is just going on donations and the honor system, though no one is donating anything or otherwise showing any real interest in the project.
  7. Your first game / programming project?

    when I was young, I was mostly messing around with tools to manipulate Doom and Quake data in QBasic. I also went about as far as trying to do Wolf3D-style rendering in QBasic, but it was very laggy even drawing a single block.

    by the time I migrated to C (by this point in middle school, ~16 years ago), I had moved mostly to trying to write OS stuff. however, my relative lack of coding skills (I spent years essentially dealing largely with disk-driver, filesystem, and kernel-space memory-management issues) and the ultimate realization that I had little hope of competing with existing OS's eventually killed this.

    I had recently briefly considered a partial revival of this project, in the form of an OS built around partial Win32 emulation with either native or emulated x86, but this didn't get particularly far. at best, it would have been a small hobby project anyway, more likely something unlikely to ever see much use outside of (maybe) people running it in VMware or QEMU or similar.

    after the collapse of the original OS project (~2003/2004), a lot of the code was re-purposed (initially with globs of code ripped off of Quake glued on) as an attempt at making 3D modeling and mapping tools (but they were never particularly good in a user-interface sense). also, my 3D modeling and mapping skills are terrible in general, it seems.

    I didn't really start looking seriously into game development until around 2010 or so, and then (for the sake of being free of the GPL) decided to drop all Quake-related code and use my 3D tools code as the basis for a 3D engine. I then spent years mostly battling with performance issues (as, basically, one may find that performance will often be eaten up by endless "little things", *).

    *: something may seem pretty fast in isolation, but may often be not-so-fast when workloads are scaled up a bit and it is competing for CPU cycles with lots of other "pretty fast" things (and profilers don't really answer questions like "what in particular is going on that is making my framerates not-particularly-smooth?...").

    now, it is now, and this has been my life thus far...
  8. lib hell

      simple answer: yes.

    less simple answer: there is some informal/de-facto standardization of LIB and OBJ file formats, at least as far as Windows goes (as AR archives and COFF objects, known as OBJ/LIB with MSVC and O/A with GCC). however, different compilers have various minor format differences (in the structures both within LIB/AR and within the COFF objects), making things problematic. some compilers have also used different formats entirely (for example, a few older compilers used a 32-bit version of the OMF format and its associated LIB format instead of COFF/AR). there are also differences in the contents of various header files (ex: what exactly is a "FILE"?), differences in parts of the C ABI, and often a very different C++ name-mangling scheme and ABI (so C++ object files are usually incompatible).

    the end result is that, while it is sometimes possible to force things to link between compilers, usually the result will either not work (at all, or at least not correctly) and/or be prone to crashing.

    this can apply between versions of the same compiler as well. for example, both MSVC and GCC have changed their ABIs in various ways between compiler versions, so code linked across compiler versions is prone to not work and/or crash. this is generally less of an issue for DLLs, as the basic C calling conventions (cdecl and stdcall) are pretty much frozen, both MSVC and GCC agree in most areas WRT the basic C ABI, and MS is averse to changing things which would break existing software. however, the C++ ABI is compiler-specific, and there are a lot of minor edge cases where the C ABI differs between compilers (namely WRT passing/returning structs, the handling of types like "long double" and "__m128", ...), so some care is still needed. it may also be observed that many/most Windows libraries tend to use a fairly constrained C subset for their APIs (and COM and similar are built on top of the C ABI).

    when possible, build everything from source with the same version of the same compiler.
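    as an illustration of that "constrained C subset" style (a hypothetical header, sketching the usual conventions rather than any particular library): stick to plain C types, opaque handles, explicit calling conventions, and extern "C" guards, and the boundary stays on the frozen part of the ABI:

        /* mylib.h -- a DLL interface kept to the stable C ABI */
        #ifndef MYLIB_H
        #define MYLIB_H

        #ifdef __cplusplus
        extern "C" {            /* no C++ name mangling at the boundary */
        #endif

        #ifdef _WIN32
        #define MYLIB_CALL __stdcall   /* pin the calling convention */
        #else
        #define MYLIB_CALL
        #endif

        /* opaque handle: callers never see the struct layout, so each
           compiler is free to lay out the implementation as it likes */
        typedef struct mylib_ctx_s mylib_ctx;

        mylib_ctx * MYLIB_CALL mylib_open(const char *name);
        int         MYLIB_CALL mylib_query(mylib_ctx *ctx, int id);
        void        MYLIB_CALL mylib_close(mylib_ctx *ctx);

        /* note: no structs passed by value, no "long double", no C++
           types; these are the edge cases where compilers disagree */

        #ifdef __cplusplus
        }
        #endif

        #endif /* MYLIB_H */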
  9. New VM (WIP), thoughts?...

    This is very common on 'closed platforms'... I've worked on a bunch of games that were written in Lua, but then *interpreted* by LuaJIT (because we had to disable the JIT part). In many cases, it's still only ~5x slower than native though ;)

    yeah, this is one good point about having an interpreter fallback: it may be more portable, and it doesn't leave things SOL if for some reason JIT is not really a possibility.

    though 5x does seem a little fast for an interpreter, but I guess it depends a lot on what kinds of code one is testing with, and maybe also on the specifics of the hardware one is running on, ... my figures were mostly from past tests involving calculation-centric tests on AMD x86 hardware with prior interpreters. I don't yet have any real information on the current interpreter though (now being rewritten in C).

    ADD: very initial test (artificial, randomized counter-opcodes): the interpreter loop pulls off around 130-200 Mips (~17-26 clock cycles per opcode). could be better, could be worse... will need to get it more complete to be able to determine its C-relative performance.

    ADD 2: simple NOPs: 360 Mips (~9 cycles per operation), so about a 2x speed difference is due to each operation incrementing a counter in the former case ( "frm->ctx->stat_n_op++;" ) vs doing nothing (and accumulating at the end of the current trace). it is 107 Mips (~32 cycles per operation) with randomized 3-address arithmetic operators (ex: "frm->reg[op->d].i=frm->reg[op->s].i + frm->reg[op->t].i;" ). the operation payload seems to be a notable factor in this case.
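    for context, the sort of definitions implied by those snippets; this is a guess at the surrounding code, not the actual VM source:

        typedef union { int i; long long l; float f; double d; } VMReg;

        typedef struct VMFrame_s VMFrame;
        typedef struct VMOp_s    VMOp;

        struct VMOp_s {
            void (*run)(VMFrame *frm, VMOp *op); /* per-opcode handler */
            short d, s, t;                       /* 3-address operands */
        };

        struct VMFrame_s {
            VMReg *reg;                             /* register file */
            struct { long long stat_n_op; } *ctx;   /* per-context stats */
        };

        /* the measured 3-address arithmetic payload */
        static void op_add_i(VMFrame *frm, VMOp *op)
        {
            frm->ctx->stat_n_op++;   /* dropping this was worth ~2x */
            frm->reg[op->d].i = frm->reg[op->s].i + frm->reg[op->t].i;
        }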
  10. New VM (WIP), thoughts?...

      yeah. part of the issue I am facing is that a few targets are kind of a pain to deploy C onto, hence why a VM to run the C code starts looking like more of an option (less need to deal with an assortment of wacky build-tools, multiple CPU architectures, ...). (can't really support iOS though, don't really have the money to afford Apple HW...)

    as for RWX memory, yep. not sure about iOS (never developed on it), but I am left wondering if the double-mapping trick would work (this being a workaround for targets which don't allow RWX memory, but only RW and RX, where one basically maps the memory twice, using the RW mapping to write into the RX mapping), or if they went further and disallowed any user-created executable memory. in the latter case, one will probably need an interpreter fallback.

    for interpreters, I have generally had good luck with more or less unpacking the bytecode into structs and function pointers, then typically using manually-unrolled loops (themselves given via function pointers) to execute them. like, with a "loop-and-switch", it seems hard to break much below the 100x mark (100x slower than native C, *), but with the function-pointer tricks I have gotten down to around 10x for some experimental interpreters (and around 40x for some in-use interpreters). but, yeah, getting below 10x is pretty hard IME absent a JIT. for a JIT, the last 2x-3x is the hard part (one is then competing with the C compiler's optimizer); to a large degree, I haven't really gone up this final ramp.

    *: at around 100x, one usually finds that the "while()" loop and "switch()" consume nearly the entire running time. earlier on I had called this the "switch limit", thinking it a theoretical limit for interpreter speed (until noting all the craziness I could do with function pointers). now, the limit is how quickly I can run things like:

        op=ops[ 0]; op->run(frm, op);
        op=ops[ 1]; op->run(frm, op);
        op=ops[ 2]; op->run(frm, op);
        op=ops[ 3]; op->run(frm, op);
        op=ops[ 4]; op->run(frm, op);
        op=ops[ 5]; op->run(frm, op);
        op=ops[ 6]; op->run(frm, op);
        op=ops[ 7]; op->run(frm, op);
        op=ops[ 8]; op->run(frm, op);
        op=ops[ 9]; op->run(frm, op);
        op=ops[10]; op->run(frm, op);
        op=ops[11]; op->run(frm, op);
        op=ops[12]; op->run(frm, op);
        op=ops[13]; op->run(frm, op);
        op=ops[14]; op->run(frm, op);
        op=ops[15]; op->run(frm, op);
        ...

    but then one is still left trying to figure out ways to cut down how many operations are needed to accomplish a given task, which is part of the reason for some of the funkiness in my bytecode (basically, trying to match common cases in C code). so, for example, while a register IR doesn't beat a stack machine in terms of code density, nor necessarily in time-cost-per-operation, it can beat it in terms of needing fewer operations per statement (by addressing variables more directly, giving more freedom to the front-end compiler, ...).

    typically, there is a process which decodes the bytecode and tries to build a "trace-graph" (a sketch of this structure follows below), and part of the overly low-level design is about trying to keep this logic relatively simple (for example: minimizing the need for analysis or pattern-matching in favor of linear unpacking, and mostly exposing the "guts" of the interpreter at the bytecode level). also, partly because the VM would be distributed along with the code that runs on it (as opposed to distributing them separately), generality is less critical in this case.
    ( ADD: for an x86 JIT, typically each VM operation will map to 3-5 x86 instructions, but this can be cut down a bit if the JIT has the logic needed to cache things in registers... a "naive JIT" may not bother, so, say:

        BINOP.I add, r0, l1, l2
        BINOP.I add, l4, r0, l1

    might become, say:

        ;; first op
        mov eax, [ebp-12]
        mov ecx, [ebp-16]
        add eax, ecx
        mov [ebp-48], eax
        ;; second op
        mov eax, [ebp-48]
        mov ecx, [ebp-12]
        add eax, ecx
        mov [ebp-24], eax

    whereas a slightly less stupid JIT would probably notice that some of this sort of stuff is redundant. a naive JIT, though, has the advantage of being smaller and simpler. )
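    a minimal sketch of the trace + trampoline structure described above (hypothetical names, building on the struct guesses earlier in the thread; real trace-building and branch handling are more involved):

        /* a trace: one decoded EBB's worth of ops, run via an unrolled body */
        typedef struct VMTrace_s VMTrace;
        struct VMTrace_s {
            VMOp   **ops;                                 /* unpacked ops */
            VMTrace *(*run)(VMFrame *frm, VMTrace *tr);   /* dispatch body */
            VMTrace *next;                                /* fallthrough */
        };

        /* one manually-unrolled dispatch body (here sized for 4-op traces);
           the trace-builder picks a body matching the trace length */
        static VMTrace *trace_run4(VMFrame *frm, VMTrace *tr)
        {
            VMOp **ops = tr->ops, *op;
            op = ops[0]; op->run(frm, op);
            op = ops[1]; op->run(frm, op);
            op = ops[2]; op->run(frm, op);
            op = ops[3]; op->run(frm, op);
            return tr->next;  /* a branch op would select a different trace */
        }

        /* the trampoline loop: call-threaded code, no switch in the hot path */
        void vm_exec(VMFrame *frm, VMTrace *tr)
        {
            while (tr)
                tr = tr->run(frm, tr);
        }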
  11. New VM (WIP), thoughts?...

    work well on Android + web-browsers?

    Unreal and Unity manage just fine. Keep in mind that the output of Emscripten is not compressed. You typically serve Web content with DEFLATE enabled, which compresses text very well. In some cases the compressed asm.js has been shown to be _smaller_ than the equivalent native machine code (I don't know if this is the common case or a special case though). In any event, the size of your textures/sounds/models/etc. is going to outstrip your binary size if we're talking games. Toss in the traffic from interaction (WebSockets, XHR, etc.) and the size of your compiled binary is the least of your concerns, especially if you properly use the HTML5 application cache.

    could be... though, as-is, I have most of my assets compressed as well (most textures use a custom JPEG variant, and audio also uses a custom lossy codec). the release assets in their original formats (PNG, WAV, ...) are closer to about 400MB, but lossy-compress to around 80MB. the development-assets directory is closer to around 11GB (though it includes lots of working data and data not currently included in the final version, ...). the last release version of my game project was around 132MB, with about 80MB of this being data and 52MB code and other stuff (while Deflate-compressed). text usually compresses to about 10%-30% of its original size IME. uncompressed, this is about 30MB for binaries (EXE/DLL) and 27MB for C source+headers (21MB C, 6MB H). project size is approx 880 kLOC.

    ( EDIT / Removed: prior observations of significant expansion in a few early tests. )

    this was in contrast to GWT and Google Dart, which typically seemed to have a much smaller amount of code expansion in their JS output (though I lack any sufficiently large globs of Java to throw at GWT or similar to really find out). so, using Java (via GWT) or Dart would probably be mostly a non-issue. the hope had been mostly to try generating most of the ASM.js client-side. could look into it more... was mostly just messing with it for now, mostly due to lacking much better to do...

    ADD / EDIT: more testing with Emscripten has shown that the relation between input and output file sizes is apparently far from a simple matter, as I have gotten anywhere between significant code expansion and little expansion, depending on which code is tested... an example of very little expansion: SQLite, for which the JS output is nearly the same size as the C input. an example of more significant expansion: my codec library. granted, it consists almost entirely of micro-optimized fixed-point arithmetic, bit-twiddling, and pointer arithmetic, plus a fair bit of macros (Self-Correction: memory error, it was only ~10x bigger, not 100x bigger...).

    ADD 2: looking into it a bit more, I may be better off using C and Emscripten for compiling the FRBC VM backend (as opposed to the existing Java-based interpreter), and then maybe considering compilation to ASM.js from within Emscripten (apparently, this is plenty doable...). advantages: it can leverage Emscripten's existing address-space and API wrappers (fewer cross-language issues), and should probably (hopefully) perform better than an interpreter written in Java (and compiled to JS). still needs further investigation. would probably implement (as needed):
      * basic threaded-code interpreter;
      * ASM.js JIT (for Emscripten);
      * x86 (and maybe x86-64) JIT (for native x86).
  12. New VM (WIP), thoughts?...

    @Apoch: granted, but the choice of a very low-level representation was deliberate: it matches the execution model expected by typical C code (where C is the primary HLL), and it is IME fairly effective at allowing going more directly from bytecode to passably efficient output with a "naive backend".

    for the Java-based interpreter, much of the complexity thus far has actually gone into the emulated virtual address space, but this is pretty much inescapable I think (most C code assumes a linear address space). most of the rest is basically a big mass of inner classes, the whole thing thus far being roughly 9 kLOC. the core ISA is basically implemented; not yet implemented are the system calls and API wrapping, but I will basically need to have the compiler written for a lot of that part.

    while, arguably, the value sizes are not ideal, say, for targeting JavaScript, non-fixed sizes would pose a big headache for running C. again, for better or worse, lots of C code tends to assume certain value sizes; like, you will have a big mess if "int" isn't 32 bits, ... most of this stuff basically has to be emulated anyway on these targets.

    I have used a lot of higher-level designs for VMs in the past (typically stack machines), but these have the drawback of typically leaving more complexity in the backend, either in terms of code complexity or slower execution times (a simple stack-based interpreter is simple, but not particularly fast). generally it requires (internally) largely going over to a register-machine model or similar anyway to get good speeds out of a threaded-code backend or JIT (typically by "unrolling" the stack over the logical temporaries/registers). ( ADD/CLARIFICATION: I had prioritized being able to have a moderately fast and low-complexity backend over having a simple compiler. )

    the FRBC model is actually fairly close to that used in my BGBScript VM JIT backend, albeit the BSVM uses a stack-based high-level bytecode. granted, there is a difference in that FRBC2C allows accessing locals and arguments directly, whereas most of my other VMs would require loading/storing and performing operations only on temporaries. this should cut operation counts by about 2x-3x. for example, y=m*x+b; could be compiled into 2 operations rather than 6 (see the example at the end of this post).

    while x86 emulation would also be an option, my past x86 emulator was pretty bulky and had pretty bad performance (~70x slower than native). though, the address-space machinery in the Java-based interpreter is at least vaguely similar (though in some ways it has more in common with how I did pointer operations in BGBScript, namely using "pointer objects" for internal operations rather than raw addresses).

    ADD 2: also like the BSVM, it breaks code into EBBs/traces and uses call-threaded code combined with a trampoline loop (as opposed to "loop-and-switch" or similar, which is usually a bit slower IME). some aspects of the bytecode were also designed around the behavior of the algorithms which do the trace-building (mostly to allow the trace-builder to be single-pass).

    as for LLVM and PNaCl, I am aware of them. PNaCl has a drawback mostly in that it is browser-specific, and I wanted to be able to support non-Chrome browsers, leaving pretty much JavaScript as the primary target (either directly, or via something like Google Web Toolkit).
    I couldn't think of a good way to fit LLVM in with all this, and it wasn't an ideal fit for what I was doing, so I didn't bother; I also had most of a C compiler front-end available anyway (from a past project trying to use C as a script language; it turned out C wasn't a great option for scripts...). I had also looked at the Quake3 C VM, but noted that it would need a bit of modification to fully support C (they were essentially running a C subset), and I would also need to rewrite most of it anyway to not be stuck with the GPL.

    ADD: as for "why?": partly because I am bored / feel like it, and partly for personal use (probably mostly for my own code). or such...
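    to illustrate the operation-count point above (hypothetical mnemonics in the style of the earlier BINOP.I example, not the actual FRBC encoding):

        ;; y = m*x + b; on a stack machine: 6 operations
        LOAD  m
        LOAD  x
        MUL.I
        LOAD  b
        ADD.I
        STORE y

        ;; same statement with a 3-address register IR that can
        ;; address locals directly: 2 operations
        BINOP.I mul, r0, m, x
        BINOP.I add, y, r0, b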
  13. (I had debated whether to put this here or in the journal, but I am looking for people's thoughts on some of this...)

    well, here is how this got going: for a while, I have wanted a VM which could extend the reach of C to targets where C is normally unavailable or overly inconvenient (namely web-browsers and Android).

    while there is Emscripten for browsers, my initial tests had shown that the size of the output code expands considerably (*), bringing doubts about its realistic viability for moderate/large codebases, more so as I am running a personal server and don't exactly have massive upload speeds (they give like 30 Mbps down but 2 Mbps up, but a person could get 25 down / 5 up for "only" $179/month on a 3-year contract... cough...).

    while the NDK is available for Android, it has some serious drawbacks, making it not really a great option (which version of ARM does the device run? what if it runs x86? ...). things are nicer if Java/Dalvik can be used here.

    *: (EDIT: Removed. Issue turns out to be "not so simple".)

    also, recently, I have put my game project on hold, having basically reached a stage where I have run out of ideas for new functionality and burnt out on always dealing with the same basic issues. I have gone and messed with writing small apps as tests, to try out various pieces of technology (ex: various transcompilers to JavaScript, ...).

    so, I decided recently (~several weeks ago) to start working on a new VM, consisting of several major components:

    a C compiler front-end, which compiles C source into the bytecode format;
    * based on a pre-existing C compiler frontend of mine, which originally targeted a different IL.
    ** it also had partial support for C++, so a C++ subset may be possible (will probably lack templates).
    ** it has spent recent years mostly just being used as a code-processing / glue-code generation tool.
    * the IR is statically-typed and register-based, vaguely Dalvik-like.

    an interpreter backend, which will be rewritten as needed for each target.
    * the current (still incomplete/untested) backend is written in Java, but C and Dart or JS backends are planned.
    * the plan for the JS backend would be to dynamically compile the bytecode into JS on the client.
    ** main roadblock: dealing with JS; not sure of the best approach to debugging in JS.
    ** Java via Google Web Toolkit is a more likely initial target for browsers ATM.
    * it uses a design intended mostly to allow a (relatively) simple interpreter to generate efficient code.
    ** this basically means a register IR.
    ** while stack machines are simpler overall, a simple stack interpreter will give lackluster performance.
    *** getting more speed out of a stack machine means a lot of additional complexity.
    ** also, don't confuse simplicity with smallness:
    *** N*M opcode cases may make code bulkier, but don't add much to overall implementation complexity.
    * bytecode images are likely to be Deflate or maybe LZMA compressed.

    the VM design in question here is much lower-level than some of my other VMs, and makes the compiler logic a fair bit more complicated, so this is still the main hold-up at present (can't finish/test the backend without being able to produce code to run on it). the design is intended to be high-level enough to gloss over ABI differences and allow some implementation freedom, but this is mostly all it really does (it otherwise isn't that far above machine code in a level-of-abstraction sense).
    note that, unlike, say, Flash, this will not require any browser plugins or similar; rather, the plan is that the VM will itself be sent over to the browser and then used to transform the code client-side. this does carry the assumption that the VM will be smaller than the code which runs on it.

    this is worrying as, as-is, it means lots of code I have as-of-yet been unable to verify; I would need to write a bytecode assembler and disassembler to test the backend more directly. will probably need these eventually anyway.

    for those interested, here is a recent-ish working spec for the bytecode: http://cr88192.mooo.com:8080/wiki/index.php/FRBC2C

    still to be seen if I will be able to get all this to a usably-complete level anytime soon.

    thoughts?...
  14. Questions for all programmers.

      1. QBasic (had moved over to C by middle school though...).

    2. No (was in elementary school; others were trying to read simple words and do arithmetic, but back then, IIRC, I was hyperlexic or something...).

    3. self-taught.

    4. self-taught (initially).

    5. CS in college was probably a bit of a waste: it didn't really teach anything new, mostly just busy-work and general-ed classes, plus getting owned by math classes, as my ability to deal with higher-level math (ex: calculus and similar) sucks.

    6. initially, the QBasic IDE on MS-DOS (6.20 IIRC). then, later, on DOS, mostly EDIT, and JMACS when using Linux (back in those days, if one bothered to start X, it would mostly just give an xterm and a clock). after that, when later using mostly Windows (NT4 and 2K), I had mostly been coding in Notepad, but have since moved mostly to Notepad2, which has a few more features and also syntax highlighting, which is sometimes useful (albeit I am annoyed that it doesn't exactly preserve fixed-width: bold chars are several pixels wider, and I am annoyed...). I sometimes use VS for C# and Eclipse when coding in Java, though, and for C and C++ I mostly just use VS as a debugger.
  15. Riddle

      EDIT: made a guess, decided to leave it for other people to figure out (can't just hand it to them on a platter...).