• entries
627
1447
• views
1008922

A bipolar guy in a pressure-cooker industry

## Code Reuse In Actual Practice

It’s very common to hear engineers talking about "code reuse" - particularly in a positive light. We love to say that we’ll make our designs "reusable". Most of the time the meaning of this is pretty well understood; someday, we want our code to be able to be applied to some different use case and still work without extensive changes.

But in practice, code reuse tends to fall flat. A common bit of wisdom is that you shouldn’t even try to make code reusable until you have three different use cases that would benefit from it. This is actually very good advice, and I’ve found it helps a lot to step back from the obsession with reusability for a moment and just let oneself write some "one-off" code that actually works.

This hints at the possibility of a few flaws in the engineering mindset that reuse is a noble goal.

## Why Not Reuse?

Arguing for reuse is easy: if you only have to write and debug the code once, but can benefit from it multiple times, it’s clearly better than writing very similar code five or six times…​ right?

Yes and no. Premature generalization is a very real thing. Sometimes we can’t even see reuse potential until we’ve written similar systems repeatedly, and then it becomes clear that they could be unified. On the flip side, sometimes we design reusable components that are so generic they don’t actually do what we needed them to do in the first place.

This is a central theme of the story of Design Patterns as a cultural phenomenon. Patterns were originally a descriptive thing. You find a common thread in five or six different systems, and you give it a name.

Accumulate enough named things, though, and people start wanting to put the cart before the horse. Patterns became prescriptive - if you want to build a Foo, you use the Bar pattern, duh!

So clearly there is a balancing act here. Something is wrong with the idea that all code should be reusable, but something is equally wrong with copy/pasting functions and never unifying them.

But another, more insidious factor is at play here. Most of the time we don’t actually reuse code, even if it was designed to be reusable. And identifying reasons for this lapse is going to be central to making software development scalable into the future. If we keep rewriting the same few thousand systems we’re never going to do anything fun.

## Identifying Why We Don’t Reuse

Here’s a real world use case. I want to design a system for handling callbacks in a video game engine. But I’ve already got several such systems, built for me by previous development efforts in the company. Most of them are basically the exact same thing with minor tweaks:

• Define an "event source"

• Define some mechanism by which objects can tell the event source that they are "interested" in some particular events

• When the event source says so, go through the container of listeners and give them a callback to tell them that an event happened

Easy. Except Guild Wars 2 alone has around half a dozen different mechanisms for accomplishing this basic arrangement. Some are client-side, some are server-side, some relay messages between client and server, but ultimately they all do the exact same job.

This is a classic example of looking at existing code and deciding it might be good to refactor it into a simpler form. Except GW2 is a multi-million line-of-code behemoth, and I sure as hell don’t want to wade through that much code to replace a fundamental mechanism.

So the question becomes, if we’re going to make a better version, who’s gonna use it?

For now the question is academic, but it’s worth thinking about. We’re certainly not going to stop making games any time soon, so eventually we should have a standardized callback library that everyone agrees on. So far so good.

But what if I want to open-source the callback system, and let other people use it? If it’s good enough to serve all of ArenaNet’s myriad uses, surely it’d be handy elsewhere! Of course, nobody wants a callback system that’s tied to implementation details of Guild Wars 2, so we need to make the code genuinely reusable.

There are plenty of reasons not to use an open-source callback library, especially if you have particular needs that aren’t represented by the library’s design. But the single biggest killer of code reuse is dependencies.

Some dependencies are obvious. Foo derives from base class Bar, therefore there is a dependency between Foo and Bar, for just one example. But others are more devilish.

Say I published my callback library. Somewhere in there, the library has to maintain a container of "things that care about Event X." How do we implement the container?

Code reuse is the name of the game here. The obvious answer (outside of game dev) is to use the C++ Standard Library, such as a std::vector or std::map (or both).

In games, though, the standard library is often forbidden. I won’t get into the argument here, but let’s just say that sometimes you don’t get to choose what libraries you rely on.

So I have a couple of options. I can release my library with std dependencies, which immediately means it’s useless to half my audience. They have to rewrite a bunch of junk to make my code interoperate with their code and suddenly we’re not reusing anything anymore.

The other option is to roll my own container, such as a trivial linked list. But that’s even worse, because everyone has a container library, and adding yet another lousy linked list implementation to the world isn’t reuse either.

## Policy-Based Programming to the Rescue

The notion of policy-based architecture is hardly new, but it is sadly underused in most practical applications. I won’t get into the whole exploration of the idea here, since that’d take a lot of space, and I mostly just want to give readers a taste of what it can do.

class ThingWhatDoesCoolStuff
{
std::vector<int> Stuff;
};

This clearly makes our nifty class dependent on std::vector, which is not great for people who don’t have std::vector in their acceptable tools list.

Let’s make this a bit better, shall we?

template <typename ContainerType>
class ThingWhatDoesCoolStuff
{
ContainerType Stuff;
};

// Clients do this
ThingWhatDoesCoolStuff<std::vector<int>> Thing;

Slightly better, but now clients have to spell a really weird name all the time (which admittedly can be solved to great extent with a typedef and C++11 using declarations).

This also breaks when we actually write code:

template <typename ContainerType>
class ThingWhatDoesCoolStuff
{
public:
{
Stuff.push_back(stuff);
}

private:
ContainerType Stuff;
};

This works provided that the container we give it has a method called push_back. What if the method in my library is called Add instead? Now we have a compiler error, and I have to rewrite the nifty class to conform to my container’s API instead of the C++ Standard Library API. So much for reuse.

You know what they say, you can solve any problem by adding enough layers of indirection! So let’s do that real quick.

// This goes in the reusable library
template <typename Policy>
class ThingWhatDoesCoolStuff
{
private:
// YES I SWEAR THIS IS REAL SYNTAX
typedef typename Policy::template ContainerType<int> Container;

// Give us a member container of the desired type!
Container Stuff;

public:
{
}
};

// Users of the library just need to write this once:
struct MyPolicy
{
// This just needs to point to the container we want
template <typename T> using ContainerType = std::vector<T>;

template <typename T>
{
static inline void PushBack (MyPolicy::ContainerType * container, T && element)
{
// This would change based on the API we use
container->push_back(element);
}
};
};

Let’s pull this apart and see how it works.

First, we introduce a template "policy" which lets us decouple our nifty class from all the things it relies on, such as container classes. Any "reusable" code should be decoupled from its dependencies. (This by no means the only way to do so, even in C++, but it’s a nice trick to have in your kit.)

The hairy parts of this are really just the syntax for it all. Effectively, our nifty class just says "hey I want to use some container, and an adapter API that I know how to talk to. If you can give me an adapter to your container I’ll happily use it!"

Here we use templates to avoid a lot of virtual dispatch overhead. Theoretically I could make a base class like "Container" and inherit from it and blah blah vomit I hate myself for just thinking this. Let’s not explore that notion any further.

What’s cool is that I can keep the library code 100% identical between projects that do use the C++ Standard Library, and projects which don’t. So I could publish my callback system exactly once, and nobody would have to edit the code to use it.

There is a cost here, and it’s worth thinking about: any time someone reuses my code, they have to write a suitable policy. In practice, this means you write a policy about once for every time you change your entire code base to use a different container API. In other words, pffffft.

For things which aren’t as stable as containers, the policy cost may become more significant. This is why you want to reuse in only carefully considered ways, preferably (as mentioned earlier) when you have several use cases that can benefit from that shared abstraction.

## Concluding Thoughts

One last idea to consider is how the performance of this technique measures up. In debug builds, it can be a little ugly, but optimized builds strip away literally any substantial overhead of the templates.

So runtime performance is fine, but what about build times themselves?

Admittedly this does require a lot of templates going around. But the hope is that you’re reusing simple and composable components, not huge swaths of logic. So it’s easy to go wrong here if you don’t carefully consider what to apply this trick to. Used judiciously, however, it’s actually a bit better of a deal than defining a lot of shared abstract interfaces to decouple your APIs.

I’ll go into the specific considerations of the actual callback system later. For now, I hope the peek at policy-based decoupling has been useful.

Remember: three examples or you don’t have a valid generalization!

## Source-Level Debugging For Epoch Programs

This weekend marks a major milestone for the development of the Epoch programming language. For the first time, Windows debuggers such as Visual Studio and WinDbg can perform source-level debugging on Epoch programs.

In a nutshell, this means that the comfortable modern development features of setting breakpoints and stepping through code are now available to Epoch programmers.

One notable thing left to achieve is runtime state inspection. There is currently not enough data generated by the Epoch compiler to reliably inspect variables, function parameters, and so on in the debugger. This will be my next major point of focus.

## How We Got Here

Attaining this functionality was not easy, but it was definitely worth the investment. It all started almost exactly a year ago, when I decided that being unable to debug the self-hosting process for 64-bit Epoch was unacceptable.

Initially debug information was generated via piping some bogus line numbers into LLVM and then routing the generated block of CodeView symbols into MSPDB140.dll to generate a somewhat-working PDB file on disk. This implementation took about two weeks.

That wasn’t enough, though; it introduced a heavy dependency on Visual Studio (something I’ve been keen to avoid, despite strongly encouraging use of VS with Epoch) and also had limitations via the API of MSPDB140.dll that were…​ inscrutable, to say the least.

So I set out in search of a complete understanding of the PDB file format and how to generate my own debug information for it. The intervening year wasn’t all dedicated to PDB work; a fair amount of time went into Visual Studio integration and other tidbits of self-hosting effort. (Not to mention there were a few major spans of downtime. This gets exhausting after a while!)

The Epoch repo commit log shows the gory details of how everything came together, but the high-level is pretty simple; using a suite of tools, I reverse engineered large sections of the PDB format and developed an Epoch implementation of code to write them out.

Noteworthy projects:

Target debuggers have been VS2015 and WinDbg, both of which work now with source-level breakpoints and stepping. x64dbg also sort-of works, although it doesn’t like to display source for some reason; everything else seems to be fine, so I don’t know the tool well enough to say if it’s my bug or theirs.

## What Comes Next

As alluded to above, my next major project is to get variable data inspection working. This is a dark corner of the PDB format that seems poorly understood in the community, so it should be exciting to try and forge ahead here.

At a minimum, I’ll need to start generating types data in the PDBs, so that debuggers know how to interpret various memory addresses correctly. There’s probably some other voodoo too, but that’ll have to wait until I discover what’s in store.

## Using Poison to Reverse Engineer Code

Recently IaEUR(TM)ve been working on a rather difficult task, namely creating PDB debug database files from the href="https://github.com/apoch/epoch-language">Epoch Language compiler toolchain.

This is difficult in part because the format of PDB files is generally not well-understood, and is certainly poorly documented. I canaEUR(TM)t go much further without a hearty thanks to the LLVM project and particularly their tool llvm-pdbdump which makes it much easier to test whether or not a generated PDB is sane. When llvm-pdbdump has good information about the state of a given PDB, it is invaluable; and when it falls short, as is inevitably the case with a format like PDB, it at least gives me a starting point for understanding why things have gone wrong.

However, there is another tool, from Microsoft themselves, called Dia2Dump.exe which uses an authoritative implementation of the PDB format, via the file MsDia140.dll on Visual Studio 2015. This library is (as near as I can tell) close to or identical to the code used by Visual Studio itself for debugging programs. It also seems to parallel the implementations in WinDbg and DbgHelp.dll, both of which I use extensively in my research.

Last but not least, I must mention the Microsoft-PDB repo on GitHub, which is partial source for the implementation of the PDB format. It does not actually compile right now, so itaEUR(TM)s hard to use, but it has a significant purpose for me: I can cross-reference functions in MsDia140.dll with this code, and use that for some serious reverse-engineering.

## Debugging

Sometimes when feeding data into a black box like MsDia140.dll it can be hard to know what code paths are taken and why. For example, letaEUR(TM)s look at the function GSI1::readHash (see here to follow along in the source).

This function does some stuff I still donaEUR(TM)t fully understand, so letaEUR(TM)s walk through the process of gaining more understanding.

First we need a partially malformed PDB. This is easy to do since PDB files are sensitive to tiny changes, often in non-obvious ways. In particular, IaEUR(TM)m going to work on the Publics stream. This is a fragment of a Multi-Stream File (aka MSF) which contains, among other things, publicly visible debug symbols for some program.

At the beginning of the stream, there is a structure which llvm-pdbdump is sadly cryptic about. Thankfully, llvm-pdbdump contains some sanity checks which seem to align well with the checks made by MicrosoftaEUR(TM)s code, so itaEUR(TM)s at least easy to use the tool to verify what weaEUR(TM)re spitting out.

readHash is responsible for decoding part of this data structure, which appears to be some kind of hash table for accelerating symbol lookups. Inside the code for readHash (see link above) there is a call to a pesky function called fixHashIn. By attaching WinDbg to a running copy of Dia2Dump.exe and setting liberal numbers of breakpoints, I traced a failure in my PDB generation code to this single function. fixHashIn is vomiting because IaEUR(TM)m feeding it data it doesnaEUR(TM)t like.

The first thing to note is that fixHashIn begins with a decrement instruction to decrease the value of one of its parameters. This parameter is supposedly the number of buckets in the hash table, or so I extrapolate from the source.

In my case, the parameter has a value of zero! Clearly I donaEUR(TM)t want my hash table to have zero buckets, so it becomes apparent why fixHashIn is choking. What I donaEUR(TM)t immediately understand is why it thinks zero is the number of bucketsaEUR|aEUR' I had thought that I was passing a value in (8 bytes per entry * 16 entries) that would work. Clearly I was wrong, but where was the zero coming from?

## MSF Files

A little more background is in order. In an MSF file (MSF being a superset of PDB files), data is divided into streams, each of which is built up of one or more blocks. A block can be different sizes, but IaEUR(TM)m using 1KB (1024 bytes) for convenience. Data not used is filled with junk bytes.

Crucially, I pad my blocks with zeroes. If somehow the PDB interpreter is reading one of my padding bytes, it might be incorrectly assuming I want to feed it a zero-size hash tableaEUR|aEUR' obviously a problem. So what to do?

## Poison

And now the meat of everything!

Instead of padding my file with zeroes, I use carefully crafted poison values. For my purposes IaEUR(TM)m working with 32-bit data, so a poison value is usually 4 bytes long. A good example is 0xfeedface which is a funny but valid hex number that happens to be the right size.

The important thing is that we canaEUR(TM)t just pad every 32-bit slot with 0xfeedface. Instead, we want to make permutations of the poison value - one unique permutation per slot. Every possible 4-byte sequence of my PDBaEUR(TM)s "padding" is now a unique string of digits.

HereaEUR(TM)s the magic part: when I run this in the debugger, I can walk into the fixHashIn function, and look at its parameters.

My first run of this process is surprising - despite poisoning a bunch of data around where I thought this zero was coming from, the value is still zero when we reach the fixHashIn function! This indicates one of two things.

1. The value is read from a place I didnaEUR(TM)t poison

2. The value might be computed somehow

To rule out the possibility that IaEUR(TM)m not poisoning enough, I expand the poison to the entire file instead of just one blockaEUR(TM)s worth of padding bytes. The debugger still stubbornly shows the parameter as zero, meaning that the zero is being computed from some other data being fed in, not read directly from the file on disk.

This line of the Microsoft PDB source is illuminatingaEUR|aEUR' but href="https://github.com/Microsoft/microsoft-pdb/blob/master/PDB/dbi/gsi.cpp#L65">this line even more so. At line 65 is a comment stating that fixHashIn is called from two placesaEUR|aEUR' one of them is the loader for the Publics stream, but one is for a totally unrelated stream called Globals!

## Conclusions

It turns out IaEUR(TM)ve been hitting breakpoints all evening in fixHashIn, but the call stack is wrong. The calls IaEUR(TM)ve been seeing are from a totally different stream of data.

This post may not have a cheerful ending, but I hope the value of poisoning data is clear: I may have taken days to realize my mistake without having 100% proof that the evil zero was not coming from my Publics stream.

In any event, I use the poison technique a lot, and this is just one sampling of my adventures with the PDB format. Maybe IaEUR(TM)ll have a better story of success tomorrow!

Source

## Using Poison to Reverse Engineer Code

Recently I’ve been working on a rather difficult task, namely creating PDB debug database files from the Epoch Language compiler toolchain.

This is difficult in part because the format of PDB files is generally not well-understood, and is certainly poorly documented. I can’t go much further without a hearty thanks to the LLVM project and particularly their tool llvm-pdbdump which makes it much easier to test whether or not a generated PDB is sane. When llvm-pdbdump has good information about the state of a given PDB, it is invaluable; and when it falls short, as is inevitably the case with a format like PDB, it at least gives me a starting point for understanding why things have gone wrong.

However, there is another tool, from Microsoft themselves, called Dia2Dump.exe which uses an authoritative implementation of the PDB format, via the file MsDia140.dll on Visual Studio 2015. This library is (as near as I can tell) close to or identical to the code used by Visual Studio itself for debugging programs. It also seems to parallel the implementations in WinDbg and DbgHelp.dll, both of which I use extensively in my research.

Last but not least, I must mention the Microsoft-PDB repo on GitHub, which is partial source for the implementation of the PDB format. It does not actually compile right now, so it’s hard to use, but it has a significant purpose for me: I can cross-reference functions in MsDia140.dll with this code, and use that for some serious reverse-engineering.

## Debugging

Sometimes when feeding data into a black box like MsDia140.dll it can be hard to know what code paths are taken and why. For example, let’s look at the function GSI1::readHash (see here to follow along in the source).

This function does some stuff I still don’t fully understand, so let’s walk through the process of gaining more understanding.

First we need a partially malformed PDB. This is easy to do since PDB files are sensitive to tiny changes, often in non-obvious ways. In particular, I’m going to work on the Publics stream. This is a fragment of a Multi-Stream File (aka MSF) which contains, among other things, publicly visible debug symbols for some program.

At the beginning of the stream, there is a structure which llvm-pdbdump is sadly cryptic about. Thankfully, llvm-pdbdump contains some sanity checks which seem to align well with the checks made by Microsoft’s code, so it’s at least easy to use the tool to verify what we’re spitting out.

readHash is responsible for decoding part of this data structure, which appears to be some kind of hash table for accelerating symbol lookups. Inside the code for readHash (see link above) there is a call to a pesky function called fixHashIn. By attaching WinDbg to a running copy of Dia2Dump.exe and setting liberal numbers of breakpoints, I traced a failure in my PDB generation code to this single function. fixHashIn is vomiting because I’m feeding it data it doesn’t like.

The first thing to note is that fixHashIn begins with a decrement instruction to decrease the value of one of its parameters. This parameter is supposedly the number of buckets in the hash table, or so I extrapolate from the source.

In my case, the parameter has a value of zero! Clearly I don’t want my hash table to have zero buckets, so it becomes apparent why fixHashIn is choking. What I don’t immediately understand is why it thinks zero is the number of buckets…​ I had thought that I was passing a value in (8 bytes per entry * 16 entries) that would work. Clearly I was wrong, but where was the zero coming from?

## MSF Files

A little more background is in order. In an MSF file (MSF being a superset of PDB files), data is divided into streams, each of which is built up of one or more blocks. A block can be different sizes, but I’m using 1KB (1024 bytes) for convenience. Data not used is filled with junk bytes.

Crucially, I pad my blocks with zeroes. If somehow the PDB interpreter is reading one of my padding bytes, it might be incorrectly assuming I want to feed it a zero-size hash table…​ obviously a problem. So what to do?

## Poison

And now the meat of everything!

Instead of padding my file with zeroes, I use carefully crafted poison values. For my purposes I’m working with 32-bit data, so a poison value is usually 4 bytes long. A good example is 0xfeedface which is a funny but valid hex number that happens to be the right size.

The important thing is that we can’t just pad every 32-bit slot with 0xfeedface. Instead, we want to make permutations of the poison value - one unique permutation per slot. Every possible 4-byte sequence of my PDB’s "padding" is now a unique string of digits.

Here’s the magic part: when I run this in the debugger, I can walk into the fixHashIn function, and look at its parameters.

My first run of this process is surprising - despite poisoning a bunch of data around where I thought this zero was coming from, the value is still zero when we reach the fixHashIn function! This indicates one of two things.

1. The value is read from a place I didn’t poison

2. The value might be computed somehow

To rule out the possibility that I’m not poisoning enough, I expand the poison to the entire file instead of just one block’s worth of padding bytes. The debugger still stubbornly shows the parameter as zero, meaning that the zero is being computed from some other data being fed in, not read directly from the file on disk.

This line of the Microsoft PDB source is illuminating…​ but this line even more so. At line 65 is a comment stating that fixHashIn is called from two places…​ one of them is the loader for the Publics stream, but one is for a totally unrelated stream called Globals!

## Conclusions

It turns out I’ve been hitting breakpoints all evening in fixHashIn, but the call stack is wrong. The calls I’ve been seeing are from a totally different stream of data.

This post may not have a cheerful ending, but I hope the value of poisoning data is clear: I may have taken days to realize my mistake without having 100% proof that the evil zero was not coming from my Publics stream.

In any event, I use the poison technique a lot, and this is just one sampling of my adventures with the PDB format. Maybe I’ll have a better story of success tomorrow!

## Debugging Information Success

Early this morning, Epoch achieved a minor success in the debugging department. I was able to generate a working PDB file with symbols for a small test program, run the program in Visual Studio and WinDbg, and step through the disassembly.

Along the way, the current Epoch function was correctly tracked, indicating two things: stack unwinding information is working correctly, and function-address-to-symbol mapping is working as well.

The next step is to get line number information piped through to the PDB from the parser. This will be a major haul, but well worth it since it will allow source-level debugging of Epoch programs.

Once that is complete, I plan to tackle type metadata and variable tracking, so that the values of variables can be visualized in the debugger. That’s an even bigger lift, so I don’t expect it any time soon.

That said, I plan on waiting until debugging is in a good state before resuming work on self-hosting, because having debug information available makes that process vastly more convenient and approachable.

All in all it’s a good day for Epoch!

## Debugging Information Success

Early this morning, Epoch achieved a minor success in the debugging department. I was able to generate a working PDB file with symbols for a small test program, run the program in Visual Studio and WinDbg, and step through the disassembly.

Along the way, the current Epoch function was correctly tracked, indicating two things: stack unwinding information is working correctly, and function-address-to-symbol mapping is working as well.

The next step is to get line number information piped through to the PDB from the parser. This will be a major haul, but well worth it since it will allow source-level debugging of Epoch programs.

Once that is complete, I plan to tackle type metadata and variable tracking, so that the values of variables can be visualized in the debugger. ThataEUR(TM)s an even bigger lift, so I donaEUR(TM)t expect it any time soon.

That said, I plan on waiting until debugging is in a good state before resuming work on self-hosting, because having debug information available makes that process vastly more convenient and approachable.

All in all itaEUR(TM)s a good day for Epoch!

Source

## Debugging Epoch Programs

My recent adventures in self-hosting the 64-bit Epoch compiler have led me to a significant conclusion: it isn’t worth trying to self-host a compiler when you can’t debug the target language.

A much better use of my time would be to improve the languishing PDB generation experiment and get the code set up to actually emit usable debug symbols for Visual Studio and WinDbg.

It presently takes several minutes to build a candidate compiler; given that fact, it makes little sense to try and brute-force my way to correctness. Debuggers are valuable tools and shouldn’t be left as afterthoughts in the development of what aims to be a production language.

So I’m dusting off the code for PDB emission and working on a tiny shim DLL that will provide some hard-coded one-off features that might be needed in the course of getting the legacy 32-bit compiler to generate debug information about 64-bit executables.

One such thing that has come up is that, since vanilla Epoch lacks pointer arithmetic, it is hard to do serialization well. The shim DLL currently contains a single function, GetBufferPtr, which takes an input pointer and offset and returns the pointer adjusted by that offset. In other words, it’s a glorified pointer-add.

This isn’t really satisfying to me as a long-term way to write Epoch code, but I’ve decided that debug information is more important than implementing 64-bit features, including self-hosting. As such, it’ll have to suffice for a while.

## Debugging Epoch Programs

My recent adventures in self-hosting the 64-bit Epoch compiler have led me to a significant conclusion: it isnaEUR(TM)t worth trying to self-host a compiler when you canaEUR(TM)t debug the target language.

A much better use of my time would be to improve the languishing PDB generation experiment and get the code set up to actually emit usable debug symbols for Visual Studio and WinDbg.

It presently takes several minutes to build a candidate compiler; given that fact, it makes little sense to try and brute-force my way to correctness. Debuggers are valuable tools and shouldnaEUR(TM)t be left as afterthoughts in the development of what aims to be a production language.

So IaEUR(TM)m dusting off the code for PDB emission and working on a tiny shim DLL that will provide some hard-coded one-off features that might be needed in the course of getting the legacy 32-bit compiler to generate debug information about 64-bit executables.

One such thing that has come up is that, since vanilla Epoch lacks pointer arithmetic, it is hard to do serialization well. The shim DLL currently contains a single function, GetBufferPtr, which takes an input pointer and offset and returns the pointer adjusted by that offset. In other words, itaEUR(TM)s a glorified pointer-add.

This isnaEUR(TM)t really satisfying to me as a long-term way to write Epoch code, but IaEUR(TM)ve decided that debug information is more important than implementing 64-bit features, including self-hosting. As such, itaEUR(TM)ll have to suffice for a while.

Source

## Epoch 64-bit compiler progress

Just a short while ago, the first working 64-bit compiler for Epoch was produced!

Well, "working" might be a minor stretch; it launches, prints a simple text banner message, and then exits cleanly. But that represents a lot of operational code by itself.

• The 32-bit compiler is able to lex, parse, type-check, and code-gen the entirety of the 64-bit compiler’s source code.

• The 32-bit linker can emit 64-bit binaries, assisted by LLVM’s machine code generation facilities.

• The 64-bit compiler binary is a completely functional Windows executable image.

• This executable can run to completion on 64-bit Windows versions.

• Inside the compiled binary is a table of string constants.

• 64-bit Epoch code can load those strings and route them out to the command-line console.

• A number of support DLL calls are involved in this process, including loading garbage collection metadata and stack tracing for identifying GC roots.

All told there are hundreds of thousands of lines of code involved. Building the 64-bit compiler takes about 164 seconds (just over two and a half minutes) when using debug versions of LLVM. (For comparison, the 32-bit compiler can self-host in under 20 seconds, but that’s an unfair comparison because that build process uses optimized Release versions of LLVM.)

I’m pretty pleased with this progress. There are still many things left to get working, though.

• 64-bit globals do not work correctly; all of them are currently stuffed into a single random address which may or may not crash when dereferenced.

• More support DLL calls need to be implemented or eliminated.

• Certain code constructs do not work correctly yet; this is worked around for the time being by not using them in the compiler, but they will be good to get working as soon as is practical.

• A large number of hacks and temporary shims exist in the linker. This will need to be cleaned up substantially before self-hosting is really practical.

• Debug metadata and symbols are not generated correctly yet.

• Visual Studio integration has a number of bugs, ranging from the pesky to the outright unusable.

• It is exceedingly likely that there will be bugs in the compiler, meaning that 64-bit self-hosting is still a ways out even if the basics are operational.

Making the highly optimisic presumption that this will all happen soon, I think it’s fair to say that once all of the above is addressed (and 64-bit self-hosting is complete) it will be time to cut another release of Epoch.

In all probability, though, I’ll course-correct sometime between now and then, but it never hurts to have objectives!

## Epoch 64-bit compiler progress

Just a short while ago, the first working 64-bit compiler for Epoch was produced!

Well, "working" might be a minor stretch; it launches, prints a simple text banner message, and then exits cleanly. But that represents a lot of operational code by itself.

• The 32-bit compiler is able to lex, parse, type-check, and code-gen the entirety of the 64-bit compileraEUR(TM)s source code.

• The 32-bit linker can emit 64-bit binaries, assisted by LLVMaEUR(TM)s machine code generation facilities.

• The 64-bit compiler binary is a completely functional Windows executable image.

• This executable can run to completion on 64-bit Windows versions.

• Inside the compiled binary is a table of string constants.

• 64-bit Epoch code can load those strings and route them out to the command-line console.

• A number of support DLL calls are involved in this process, including loading garbage collection metadata and stack tracing for identifying GC roots.

All told there are hundreds of thousands of lines of code involved. Building the 64-bit compiler takes about 164 seconds (just over two and a half minutes) when using debug versions of LLVM. (For comparison, the 32-bit compiler can self-host in under 20 seconds, but thataEUR(TM)s an unfair comparison because that build process uses optimized Release versions of LLVM.)

IaEUR(TM)m pretty pleased with this progress. There are still many things left to get working, though.

• 64-bit globals do not work correctly; all of them are currently stuffed into a single random address which may or may not crash when dereferenced.

• More support DLL calls need to be implemented or eliminated.

• Certain code constructs do not work correctly yet; this is worked around for the time being by not using them in the compiler, but they will be good to get working as soon as is practical.

• A large number of hacks and temporary shims exist in the linker. This will need to be cleaned up substantially before self-hosting is really practical.

• Debug metadata and symbols are not generated correctly yet.

• Visual Studio integration has a number of bugs, ranging from the pesky to the outright unusable.

• It is exceedingly likely that there will be bugs in the compiler, meaning that 64-bit self-hosting is still a ways out even if the basics are operational.

Making the highly optimisic presumption that this will all happen soon, I think itaEUR(TM)s fair to say that once all of the above is addressed (and 64-bit self-hosting is complete) it will be time to cut another release of Epoch.

In all probability, though, IaEUR(TM)ll course-correct sometime between now and then, but it never hurts to have objectives!

Source

## Epoch Code-Generation Update

A few minutes ago, the first 64-bit self-hosted compiler for Epoch finished the code-generation process…​ unsuccessfully.

For context, this means that the 64-bit compiler (as built by the existing 32-bit compiler) was lexed, parsed, type-checked, and turned into LLVM IR. What didn’t happen is a successful code emission, i.e. the compiler is not yet producing a working executable image.

What it does produce is about 6.8 MB of errors, or just over 121,000 lines of output. This indicates that something in the code-gen process is off. We’re generating LLVM IR but it can’t be turned into machine code because it is malformed in some way.

Inspection of the error output shows that one of the biggest offenses is bad linkage on a global variable. Epoch aspires to minimize the use of global state but it’s a useful construct while bootstrapping a compiler. Fixing this mistake is trivial and reduces the error volume to much less.

In fact, the vast bulk of the output is actually the text of the LLVM IR in pretty-printed form. This "dump" is generated to help diagnose code-gen bugs, but it’s meant for much smaller programs than the entire compiler! Culling the dumped IR shows that there are in fact only 208 errors left (after the global linkage fiasco was addressed). And all of them are the same "sort" of error…​

## Terminator found in the middle of a basic block!

LLVM divides code into sections called basic blocks. A single basic block represents a linear sequence of instructions, i.e. every instruction in it is executed exactly once. The way to accomplish branching flow control is to end a basic block with a branch instruction, likely a conditional branch of some kind. Branches target other basic blocks to allow different code paths to execute based on the branch.

The dreaded error "Terminator found in the middle of a basic block!" means that the constraints have been violated. Someone tried to issue a branch instruction in the middle of a block, which ruins the idea that every instruction in the block executes exactly once.

In concrete terms, this error signals a bug in the code generation process. It means that somewhere along the line, the Epoch compiler lost track of a basic block being terminated, and continued shoving instructions into it after a branch.

Thankfully, LLVM barfs a "label" when it emits this error, and that label is sufficient to locate the offending basic block:

1>  ; <label>:41                                      ; preds = %11
1>    br label %9, !dbg !3338
1>    br label %42

Sure enough, there are two branches being attempted here. The larger context is uninteresting (it’s a nested if-statement inside a binary tree insertion routine) but the specific failure appears many times, meaning that it’s probably a small number of actual code-generation bugs to solve.

## Testing, 1 2 3

As with any good software, robustness in a compiler happens only when enough bugs have been fixed while simultaneously ensuring that no new ones are introduced. The best tool I know of for doing this is automated testing. Now that a compiler bug has been identified, the objective is to replicate it in as tiny a program as possible.

This "test case" provides two things: a way to reproduce the bug on-demand so a fix can be tested, and a way to detect if the bug ever reappears. The Epoch compiler test suite is still small, but invaluable for addressing this sort of problem. I will add this particular code to the test suite and hopefully have a fix in short order.

## Epoch 64-bit self-hosting progress

For a decent while now, I’ve been working on self-hosting the Epoch 64-bit compiler. This involves getting the compiler to a point where it is robust enough to actually compile itself. In order to do this, I’m using a modified 32-bit compiler which generates 64-bit binaries. Once a working 64-bit compiler is emitted, I can feed that compiler back into itself, thus completing the head-trip ouroboros that is self-hosting or "bootstrapping" a compiler.

At the moment, the compiler can successfully lex, parse, type-check, and partially code-gen itself. In practical terms, this means that the front-end of the compiler is working fine, but the back-end - the set of systems responsible for turning code into machine language and emitting a working executable - remains incomplete. For a slightly different perspective, I’m generating LLVM IR for most of the compiler at this point.

The bits that are left are corner cases in the code generation engine. There are things like intrinsic functions that need to be wired up, special semantics to implement, and so on. In particular, right now, I’m working on solving a corner case with the nothing concept. nothing is an Epoch idiom for expressing the idea that there is no data; except, unlike traditional null, nothing is its own type. If something has a type it cannot be nothing - again, unlike null. The usefulness of this may seem questionable, but the distinction makes it possible to avoid entire classes of runtime bugs, because you can never "forget" to write code that handles nothing - the compiler enforces this for you!

Anyways, the trick with nothing is that you can pass a literal nothing to a function as an argument, to signify that you have no semantically valid data to pass in. This is handled correctly by the parser and type checker, but falls down in code generation because we can’t actually omit the parameter from the function call.

What happens is the code generator creates a function with, say, 3 parameters. If the second parameter is nothing at a call site, we have to still pass something over to the function, from LLVM’s perspective. So we generate a dummy parameter that essentially translates the nothing semantics into null semantics - something LLVM can recognize.

Now things get complicated.

If we have an algebraic sum type that includes the type nothing, and we pass a sum-typed variable into a function which expects concrete types, the code goes through a process called type dispatching. This process basically matches an overload of a function with the runtime types of the arguments passed in. Think of it like virtual dispatch with no objects involved. (Strictly speaking, type dispatch in Epoch is multiple dispatch rather than the single dispatch seen in more popular languages.)

To facilitate all this, the compiler inserts annotations into the code, so that it can deduce what set of overloads to choose from when the runtime dispatcher is invoked. Some of these annotations survive at runtime - analogs of virtual-table pointers in C++.

Annotations are passed as hidden parameters on the stack when invoking a function. And at last we reach the real wrinkle: a nothing annotation can come from two distinct places: either the construction of a sum-typed variable which allows nothing as a base type, or a literal nothing passed to a function call.

The headache is that, to LLVM, both uses look like a function call. There is special case logic that exists to fix up the annotations for sum-typed constructors. Unfortunately, that logic collides with the logic needed to fix up annotations for general function call usage because LLVM doesn’t know the difference.

It’s an imminently solvable problem, but it’s a headache. Hopefully once this bug is gone there won’t be too many more to swat before I can start code-generating working 64-bit compilers.

(Spoiler: I’m not optimistic.)

## Epoch 64-bit self-hosting progress

For a decent while now, IaEUR(TM)ve been working on self-hosting the Epoch 64-bit compiler. This involves getting the compiler to a point where it is robust enough to actually compile itself. In order to do this, IaEUR(TM)m using a modified 32-bit compiler which generates 64-bit binaries. Once a working 64-bit compiler is emitted, I can feed that compiler back into itself, thus completing the head-trip ouroboros that is self-hosting or "bootstrapping" a compiler.

At the moment, the compiler can successfully lex, parse, type-check, and partially code-gen itself. In practical terms, this means that the front-end of the compiler is working fine, but the back-end - the set of systems responsible for turning code into machine language and emitting a working executable - remains incomplete. For a slightly different perspective, IaEUR(TM)m generating LLVM IR for most of the compiler at this point.

The bits that are left are corner cases in the code generation engine. There are things like intrinsic functions that need to be wired up, special semantics to implement, and so on. In particular, right now, IaEUR(TM)m working on solving a corner case with the nothing concept. nothing is an Epoch idiom for expressing the idea that there is no data; except, unlike traditional null, nothing is its own type. If something has a type it cannot be nothing - again, unlike null. The usefulness of this may seem questionable, but the distinction makes it possible to avoid entire classes of runtime bugs, because you can never "forget" to write code that handles nothing - the compiler enforces this for you!

Anyways, the trick with nothing is that you can pass a literal nothing to a function as an argument, to signify that you have no semantically valid data to pass in. This is handled correctly by the parser and type checker, but falls down in code generation because we canaEUR(TM)t actually omit the parameter from the function call.

What happens is the code generator creates a function with, say, 3 parameters. If the second parameter is nothing at a call site, we have to still pass something over to the function, from LLVMaEUR(TM)s perspective. So we generate a dummy parameter that essentially translates the nothing semantics into null semantics - something LLVM can recognize.

Now things get complicated.

If we have an algebraic sum type that includes the type nothing, and we pass a sum-typed variable into a function which expects concrete types, the code goes through a process called type dispatching. This process basically matches an overload of a function with the runtime types of the arguments passed in. Think of it like virtual dispatch with no objects involved. (Strictly speaking, type dispatch in Epoch is multiple dispatch rather than the single dispatch seen in more popular languages.)

To facilitate all this, the compiler inserts annotations into the code, so that it can deduce what set of overloads to choose from when the runtime dispatcher is invoked. Some of these annotations survive at runtime - analogs of virtual-table pointers in C++.

Annotations are passed as hidden parameters on the stack when invoking a function. And at last we reach the real wrinkle: a nothing annotation can come from two distinct places: either the construction of a sum-typed variable which allows nothing as a base type, or a literal nothing passed to a function call.

The headache is that, to LLVM, both uses look like a function call. There is special case logic that exists to fix up the annotations for sum-typed constructors. Unfortunately, that logic collides with the logic needed to fix up annotations for general function call usage because LLVM doesnaEUR(TM)t know the difference.

ItaEUR(TM)s an imminently solvable problem, but itaEUR(TM)s a headache. Hopefully once this bug is gone there wonaEUR(TM)t be too many more to swat before I can start code-generating working 64-bit compilers.

(Spoiler: IaEUR(TM)m not optimistic.)

Source

## Epoch Code-Generation Update

A few minutes ago, the first 64-bit self-hosted compiler for Epoch finished the code-generation processaEUR|aEUR' unsuccessfully.

For context, this means that the 64-bit compiler (as built by the existing 32-bit compiler) was lexed, parsed, type-checked, and turned into LLVM IR. What didnaEUR(TM)t happen is a successful code emission, i.e. the compiler is not yet producing a working executable image.

What it does produce is about 6.8 MB of errors, or just over 121,000 lines of output. This indicates that something in the code-gen process is off. WeaEUR(TM)re generating LLVM IR but it canaEUR(TM)t be turned into machine code because it is malformed in some way.

Inspection of the error output shows that one of the biggest offenses is bad linkage on a global variable. Epoch aspires to minimize the use of global state but itaEUR(TM)s a useful construct while bootstrapping a compiler. Fixing this mistake is trivial and reduces the error volume to much less.

In fact, the vast bulk of the output is actually the text of the LLVM IR in pretty-printed form. This "dump" is generated to help diagnose code-gen bugs, but itaEUR(TM)s meant for much smaller programs than the entire compiler! Culling the dumped IR shows that there are in fact only 208 errors left (after the global linkage fiasco was addressed). And all of them are the same "sort" of erroraEUR|aEUR'

## Terminator found in the middle of a basic block!

LLVM divides code into sections called basic blocks. A single basic block represents a linear sequence of instructions, i.e. every instruction in it is executed exactly once. The way to accomplish branching flow control is to end a basic block with a branch instruction, likely a conditional branch of some kind. Branches target other basic blocks to allow different code paths to execute based on the branch.

The dreaded error "Terminator found in the middle of a basic block!" means that the constraints have been violated. Someone tried to issue a branch instruction in the middle of a block, which ruins the idea that every instruction in the block executes exactly once.

In concrete terms, this error signals a bug in the code generation process. It means that somewhere along the line, the Epoch compiler lost track of a basic block being terminated, and continued shoving instructions into it after a branch.

Thankfully, LLVM barfs a "label" when it emits this error, and that label is sufficient to locate the offending basic block:

1>  ; <label>:41                                      ; preds = %111>    br label %9, !dbg !33381>    br label %42

Sure enough, there are two branches being attempted here. The larger context is uninteresting (itaEUR(TM)s a nested if-statement inside a binary tree insertion routine) but the specific failure appears many times, meaning that itaEUR(TM)s probably a small number of actual code-generation bugs to solve.

## Testing, 1 2 3

As with any good software, robustness in a compiler happens only when enough bugs have been fixed while simultaneously ensuring that no new ones are introduced. The best tool I know of for doing this is automated testing. Now that a compiler bug has been identified, the objective is to replicate it in as tiny a program as possible.

This "test case" provides two things: a way to reproduce the bug on-demand so a fix can be tested, and a way to detect if the bug ever reappears. The Epoch compiler test suite is still small, but invaluable for addressing this sort of problem. I will add this particular code to the test suite and hopefully have a fix in short order.

Source

## Welcome to the Bag of Holding

This is a quick test of HubPress.io to see how I like it. Assuming all goes well, I will probably resume posting Bag of Holding entries here soon. Ancient archives from the Bag of Holding are on GameDev.net if you happen to like my writing. They are very, very old though.

For now, here’s a sneak preview of where Epoch is headed:

simplelist<integer> types = new 0, nothing

## Welcome to the Bag of Holding

This is a quick test of HubPress.io to see how I like it. Assuming all goes well, I will probably resume posting Bag of Holding entries here soon. Ancient archives from the Bag of Holding are on GameDev.net if you happen to like my writing. They are very, very old though.

For now, hereaEUR(TM)s a sneak preview of where Epoch is headed:

simplelist types = new 0, nothing

Source

## On Embedding General-Purpose Scripting Languages

I spend a lot of time thinking about programming languages. Most often it's from the perspective of a consumer, i.e. a programmer using Language X. Sometimes it's from the viewpoint of someone who wants to explore Language Y, either for fun or for practical usage. Occasionally I see languages from the vantage of a language creator.

Recently, I've sunk a lot of brain cycles into thinking about languages as a non-programmer. More specifically, I've been thinking about the tendency for large (and sometimes small) games to use "scripting" languages to do various things.

I think there are basically two (often overlapping) desires in play when using an embedded scripting language. One is to make programmers more productive. Often this is believed to be possible by using "higher level" languages to write certain types of logic. The other common desire is to make non-programmers more productive, by enabling them to directly modify logic in the host program/game.

The more I ponder this, the more I'm convinced that in most cases, neither desire actually justifies the use of a general-purpose scripting language.

It is fairly common in programming circles to hear assertions like "Language X is more productive than Language Y." A popular tag for this used to be "rapid development" although I hear that term less nowadays. I will not attempt to tackle the debate about whether or not X can be more productive than Y. However, what I would like to point out is that this is actually irrelevant!

If we're talking about embedding a scripting language into a host program, like a game, we aren't comparing X and Y. We're comparing "X+interop layers+Y" to Y by itself. This distinction may seem pedantic, but it's crucial. If we wanted to add X to our Y-based program in order to boost programmer productivity, we have to be careful here.

Depending on how the job is done, adding a language to a program can actually be a massive net loss of productivity. Suppose we have a C++ program, written by C++ programmers, who implement a Lua interpreter in their program. Now the average programmer working on the program must know not only two different languages, but must also be intimately familiar with how the two interoperate.

If we designed APIs like this, we'd revile the result as a leaky abstraction, and rightly so. I content that embedding a general-purpose language is almost always a net loss, for the following reasons:

• Programmer knowledge is now required to span multiple languages
• Programmers MUST be fluent in the boundary layer between languages
• Interop is likely less performant than staying in a single language, especially if experts in the interop are unavailable
• Impedance mismatches are guaranteed to crop up, i.e. languages think of things differently and have different idioms
• The mental overhead of managing multiple languages is taxing and leads to higher bug rates and nastier bugs
So if you want to embed a general-purpose scripting language in your game to make development more productive, don't. Focus on making better tools and making better architectures. A good architecture allows you to build high-level logic almost as if it were scripting anyways. There is no need to do it in a different language.

Note that this is in no way an argument against using, say, C# to write your level editor and C++ to write your game.

The other motivation I mentioned above for doing scripting systems is to allow non-programmers to do powerful things.

I'm going to point out something that feels painfully obvious to me, but in harsh experience, far too many people just cannot wrap their head around:

Shoving a general-purpose programming language in a non-programmer's face is a great way to lose team members and make people very frustrated.

I can safely assume that most of my readers are not astronauts. So if I were to cram you into an Apollo capsule and yell GOOD GODDAMN LUCK while slamming it shut, I can safely assume that you'd react negatively. You're not trained to think about the tools in front of you. You're not informed as to what they're for. You don't know what to expect after the door closes and the last echoes of my jerk voice fade away. You're in an alien environment and can't even tell what will happen next.

That is what putting a designer in front of a general-purpose scripting engine feels like.

Why do I keep mentioning languages as "general purpose"? Because there is an alternative, which works, and works well.

Domain-Specific Languages
Many people conflate DSLs with "very limited languages." This is unfair. A DSL can be immensely powerful and still be confined to a particular domain. It can even be Turing complete if you really want.

I half-jokingly posted on Twitter the other day that Excel combined with a verb system would make a great DSL for game designers.

And the more I think about it, the less I think that's a terrible joke of an idea.

I might just give it a shot someday.

## MMOs! Y u no EZ?

A common aspiration among nascent game developers is to make an MMO. It's actually a pretty common aspiration in many perfectly functional, well-established studios, as well; but at least on GDNet, we mainly see newbies who want to make the next World of Warcraft, and so it's easy to be misled into believing that the MMO fascination is somehow peculiar to those who "don't know any better."

The reality, though, is that plenty of non-newbies have tried and failed to make successful MMOs. Many projects begin and never see the light of day; still more launch and are ultimately failures as business ventures. There's a common cause here, I think: making an MMO is hard, and for reasons which are not altogether obvious. The newbie wants to make an MMO because he doesn't understand how hard it is; the veteran industry studio fails to make an MMO because they don't understand how hard it is, either.

To do something hard, and do it well, requires preparation and a lot of awareness. Even understanding why it is hard can be tough in and of itself. It doesn't help that basically every successful MMO developer ever is highly protective of their secrets - and rightly so.

As a successful MMO developer, I'm going to uphold the tradition, and not talk about the secret sauce that makes it possible to solve the hard problems of MMOs. But I will take some time to shed some light on what those problems really are, and why they are hard.

Content Volume
This is the one most people get hung up on, so I'll dispense with it right away. Producing the sheer amount of content that goes into a genuine MMO is hard, yes, but only because it's expensive. If you have a boatload of money and a lot of talented content creators, this problem isn't really that big of a deal. After all, huge single-player games get made all the time.

If you're not driving home in a dump truck full of hundred dollar bills, then this problem is obviously going to be pretty hard.

The Speed of Light
This is a genuinely hard problem because we can't change the speed of light. However, the speed of light (or more accurately, the speed of signal propagation in physical network equipment) represents a very real limit to how responsive a multiplayer game can be. The farther away your computer is from the "other" computer, the longer it takes to send a signal to that computer. For the sake of simplicity in this article, just assume that the back-of-the-napkin math limits us to some given latency in the tens of milliseconds for an average trip time. (I'm being very fuzzy deliberately.)

The bottom line here is that if I want to have realtime interactions with someone on the opposite side of the country, I'm going to have to wait an almost perceptible amount of time for my signals to reach them, and then an almost perceptible amount of time for their response to come back to me. Put together, this means that even in the purely optimal case where I can talk directly to other players, I might notice latency, or "lag", if they are far enough away.

So what about if I can't talk directly to my peers, but need a server?

Network Architecture
This is where things get truly fun. MMOs are traditionally client-server based, meaning that for me to talk to player B, I talk to my server first, who then relays my interactions to player B on my behalf. This is why server performance matters so much; if the server is busy and takes 60 milliseconds to forward my message to player B, I just made the speed-of-light problem that much worse.

But suppose I need to not just poke player B, but also demonstrate to the other 60 people in the immediate area that I did in fact poke player B. Now my server has to send 60 more messages. Each of those messages takes time to send - we don't have the luxury of sending multiple messages in parallel.

This means that the network that handles the game (i.e. typically a datacenter setup) has to be able to successfully blast out 60 of those messages to a potentially geographically diverse set of people, while maintaining minimal latency. Remember, again, every millisecond we pile on to the message trip time is a millisecond the player might end up feeling as lag.

Now, routing internet traffic is a pretty well-handled problem these days (in most of the world). Figuring out how to get our messages to each of our 62 players in a timely fashion is thankfully not that hard. What is hard is the fact that access to that level of internet beefiness is not cheap and not easy to attain. Internet Service Providers (ISPs) have peer agreements with datacenter installations, which govern how much traffic can come and go from the datacenter, and what quality of service it receives - i.e. how much latency and how much bandwidth will be involved.

A lot of people like to suggest "the cloud" as a solution here. The problem is that cloud providers generally can't compete with the quality of service that a good peering contract will give you. Yes, cloud services are great and still improving fast, but as of today, a dedicated datacenter is a better way to host a latency-sensitive game.

Datacenters, and peering agreements, are expensive propositions. This is back to the content volume problem, though - money can solve it. So let's look at other network related headaches.

The N^2 problem
Suppose we have a client-server model where 10 players are gathered to do whatever. Player 0 suddenly decides to jump up into the air. Now, players 1-9 need to see that, and player 0 needs confirmation that his character is jumping. So that's a total of 10 messages to relay.

Now suppose all the players jump simultaneously. Each player generates 10 messages, as we saw a second ago. Total up the messages for all the players: we must now relay 100 messages - 10 squared. This is generally referred to as "an N-squared problem."

N^2 problems occur all over the place in networked game construction. They have implications on many aspects of implementing an MMO: physics, game rule simulation, network throughput, security (think DDoS problems) and so on. There are ways to solve and/or mitigate the pain, but generally, every system that is affected by N^2 needs to solve the problem a little bit differently.

This means an interesting thing for software architecture: what solves your graphics N^2 may not solve your physics N^2, and both of those might fail to work with combat rules, and even then you need to figure out the network throughput issues, and so on and so forth. Suddenly you have a dozen systems that behave all slightly differently that have to be stitched together into a harmonious whole. That's not always easy, and in fact, it's one of the things that makes implementing a good MMO very difficult.

Nonlinear scaling
The final general principle I'd like to cover is that scaling is rarely linear. Let's say you make a multiplayer game. With 5 players, your server is doing great - 10% CPU usage. With 10 players, 18% CPU usage. With 20 players, 32% CPU usage. If you're following the trend, this sounds awesome! We can probably hit 60 players before running out of CPU! Except you add the 40th player and the CPU pegs out at 100%. What went wrong?

Scaling is rarely linear! We extrapolated that we could triple our 20-player numbers and get 60 players comfortably, but this extrapolation turns out to be flawed. What most people don't realize is that nonlinear scaling is everywhere in MMOs.

Take our N^2 scaling problem from earlier, even. It looks kinda linear if you don't look hard enough, but once you throw on the numbers, it bites hard. Believe it or not, there are worse ways to scale than N^2.

Network traffic, memory usage, CPU usage, database transaction volume, communication between back-end server systems... all these can scale in nonlinear ways. And when they do, you want to know about it long before it turns into a player-facing outage. This means testing, and load testing in particular. Good load tests are hard to do in general, but stress-testing an entire MMO infrastructure and proving it robust is particularly difficult. It makes the difference between a good launch and a bad launch, between profitable operation and burning money.

Parting Thoughts
There are many more challenges to making an MMO a successful business: security, legal issues, attracting a big enough player base, maintaining the player base (aka retention), monetization models, community management, and so on. Even for the issues I've touched on above, I've barely scratched the surface.

I hope this underscores the fact that MMOs are hard. Not impossible, certainly not impossible. Just very, very hard. It takes a diverse collection of skills and expertise to do them right, and sadly that mix just doesn't happen that often. I hope this writeup isn't depressing to aspiring MMO-makers; instead, I hope it gives you a start at understanding what exactly you will need in order to be successful.

## Fun with templates. "Fun." Ha, ha.

So I have an interesting situation where I wrote a math-intensive API to allow easy composition of various functions. The details are pretty easy so I'll just show an example of what the API usage looks like:

formula.Create() .Set(0.0f) .AddWithWeight(some_data_source, 0.5f) .Multiply(other_data_source) .Display();The API is meant to operate on non-primitive objects, such as 2D arrays. This gives the end user a really powerful way to stitch together different operations without making the code unreadable or long.

Internally it does what you would guess: Create() allocates some working space to do the math in, and the operations pretty much do what it says on the tin, using that working space. Each function simply returns a reference to the "formula" object (*this) so that the operations can be chained arbitrarily.

Now, here comes the interesting bit: I want to change the API to support not just 2D arrays, but more sophisticated data structures, like, say, a quadtree. Operations should be able to mix and match types so that you can do things like prune the quadtree to only reference certain cells, grab all the objects in those cells, and do some calculus on some property of those objects.

Also, dynamic allocation is forbidden (inside the API implementation itself) and the syntax needs to retain the readability and succinctness of the original.

My first thought (being obsessed with compilers) was to build a sort of syntax tree that represents the operations and then traverse that to do the math. This is a failure on the dynamic allocation front, though, and also introduces a lot of overhead in terms of understanding how the code works. In other words, it involves a lot of machinery that the end user shouldn't have to care about, and maintainers shouldn't be burdened with.

My second thought (being obsessed with compilers) was to throw template metaprogramming at it. This has only one weakness, namely that templates do not work with floating-point numbers - making it impossible to do things like the AddWithWeight() call in the first example. Well, impossible without some fudging, anyways.

To fix the limitation on floating point literals, I decided to use a simple layer of indirection. Instead of putting "1.0" or "7.5" in my code, I use an enumeration value like CONSTANT_ONE or CONSTANT_FOO, since those can be used for template specializations. Then I do just that: specialize some templates to yield the correct floating point numbers on demand.

The next step is to describe the API from a user's perspective. I fiddled around a bit and came up with this:

 CreateFormula< int, Add, Add, Multiply, DumpResult >();This does add some ugly to things, namely in that the "int" type is now littered all over the code. Frankly, though, I think this is more of a problem in the silly examples than in real life, because in real life the API is used with many different types, so being able to specify a type at each step of the process is pretty handy. It also opens up the door for template-specialization-based optimizations.

Now that I know what I want the API to look like, it's time to build the implementation:

#include enum EDataSource { DATA_SOURCE_CONSTANT_0, DATA_SOURCE_CONSTANT_1, DATA_SOURCE_CONSTANT_4,};template struct Add { };template struct Add { static void Invoke (T * state) { *state += T(1); }};template struct Multiply { };template struct Multiply { static void Invoke (T * state) { *state *= T(0); }};template struct Multiply { static void Invoke (T * state) { *state *= T(4); }};struct DumpResult { static void Invoke (int * state) { std::cout << *state << std::endl; }};struct Null { static void Invoke (int *) { }};template struct CreateFormula { CreateFormula () { T state = T(); O1::Invoke(&state); O2::Invoke(&state); O3::Invoke(&state); O4::Invoke(&state); }};int _tmain (int argc, _TCHAR* argv[]) { CreateFormula< int, Add, Multiply, DumpResult >(); CreateFormula< int, Add, Add, Multiply, DumpResult >(); return 0;}What we have here is pretty straightforward. Each operation is represented by a struct template. In this case, operations always take exactly two parameters: one to tell the code what type the operation works on, and one to specify the data for the operation. We then use template specialization, as described above, to yield specific data values based on the enumeration. In more realistic code, instead of just having constants, we could have things like "list containing the locations of all nearby sandwiches" or whatever else.

Note that the operation structs have a single static function, Invoke. Invoke takes some "state" and goofs with it; nothing more, nothing less. This is the guts of each operation.

There are two operations worth calling out: DumpResult and Null. DumpResult is simply a way to take the state from the previous operation and put it on the screen. Null is a way to say "don't do anything" - this will come in handy momentarily.

The ugly beast is the CreateFormula template. This guy needs lots of optional template parameters so we can have formulas of varying length. Each optional parameter defaults to Null so that operations which are omitted from the template's type definition can be elided entirely.

Internally, CreateFormula has a single constructor, which does something that should sound really familiar: creates some state, passes it off to a series of operations, and then returns. Note that no dynamic allocation is necessary here; more sophisticated data structure usage might make it tempting to add more dynamic allocations, but that's not really the API's problem - it's down to the user to do the right thing.

Last but not least, in the main() function we see two examples of the template in action. It looks pretty darn close to the original, exactly like the syntax we wanted for the new model, and has the added benefit (over the original) of being more explicit about what's going on at each step.

As you might guess, the real application for this is a lot more complex. It doesn't limit itself to simple operations like add and multiply - there's analytical differentiation, integration by sampling, optimization (i.e. "highest" and "lowest") queries, and so on. So the actual operations are much richer. This is the main reason why operator overloading is not useful; we simply run out of squiggly symbols for all the different operations.

Another thing worth mentioning here is that modern compilers are exceptionally good at figuring out what's really going on in this code. When using int as the state type, for example, Visual C++ compiles the entire thing out and just passes a literal result to std::cout. Pretty cool.

I have no idea if this is remotely useful, mostly because I'm not convinced it's useful to me even, but it was a nifty trick and I felt like sharing. So there.

## Stupid bit rot

After a couple of days of preoccupation with other things, I sat down tonight to start hacking on Epoch. I was unprepared for the bizarre puzzle that unfolded immediately after.

I had been mucking around for a little while when it came time to compile and test my changes. Everything seemed to build cleanly, and then I started the program, only to be greeted by bright red letters stating "Unsupported type for native code generation."

Mind you, this is my own error message, in my own code, which I wrote, and should remember. Needless to say, there was a rather loud WTF emitted regardless. Time to start poking around.

My usual routine when things go wrong is to binary search my changes until I pinpoint what's causing the error. So I dutifully removed a bunch of new code, rearranged some other code back to its old state, and tried again.

Unsupported type for native code generation.

Rinse, repeat, eventually get to a point where none of my changes are left. Uh... wat? So I did the only thing that makes sense: shelved my entire set of changes (thankfully Mercurial has a great shelf function) and try the code that was demonstrably working a couple of days ago.

I checked all the timestamps of the relevant files, looking for DLLs that might have accidentally been recompiled, looking for source code that might have snuck in a bogus edit or three, probing as carefully as I could for anything and everything that might be different between the known good build and today's.

After concluding that everything was in order, I confidently compiled the working program and launched it, fully expecting to see the successful execution that for fuck's sake was still showing in my terminal window from days prior.

Unsupported type for native code generation.

God. Damn. It.

Same compiler, same runtime, same libraries. Different binary output. Unfortunately I didn't think to keep a copy of the old working .EXE to compare with the new busted one, because hey, I expected the universe to play fair and obey the typical laws of physics whereby when you don't change anything it doesn't mysteriously break three fucking days later for inexplicable reasons.

Compiling a different project yields the expected results: no changes = binary, bit-for-bit identical output to what it used to produce.

Something is different between now and last week, and I have no idea what.

Out of desperation, I decide to look at the revision history on the Mercurial repository, and start rolling back changes until I find something that does work.

It takes exactly ten seconds to realize what went wrong.

I had a chunk of changes that I made late one night in between the last successful test build and now. I never did run them to make sure they worked. Oops.

Turns out, obviously, they didn't work. There's a bug in the compiler that got triggered by the changes I made that night, and that was cascading into the eventual barf that shocked me so much earlier this evening.

So that got squared away, at the expense of most of my brain power and energy for the night. I did manage to hack in support for assertions and reeaaaaallly rudimentary unit test "success" in the new runtime, but that's about all I can stomach of this mess for one night.

My general plan of attack is to slowly get the compiler test suite passing again, one test at a time. So at least now one test passes... one of nearly 100. It'll take time, but I'm still enthusiastic about the direction the project is taking, and I'm sure the heavy lifting will be over sooner than I expect.

Time to become unconscious.

## More hacking on native executables

I've started the long and tedious process of slowly but surely hooking up every single Epoch language feature to the new LLVM bindings, so that the compiler can emit 100% native binaries instead of JIT compiling the native code when the program is started.

So far the only thing that works is printing out strings, and only statically defined strings at that. But that's something, because it means that the import thunk table works, function invocation works, and the embedded .data segment works. In less obscure terms, the Epoch runtime can be called from a native program and use constant data that's stored directly inside the executable file.

The infrastructure for doing all this took a bit of work to set up. The module responsible for generating EXE files from compiled IR is, give or take, about 1000 lines of Epoch code. The C++ shim that talks to LLVM is another 600 or so lines of code. The runtime itself is dead trivial but that's only because it doesn't have 99% of the language functionality in it, nor a garbage collector, nor a threading/task switching model.

It may not sound like a particularly large volume of code, but every line represents a significant battle to figure out all the intricacies of the Windows Portable Executable program format, the way early-bound DLLs work, how to map Epoch constructs into LLVM constructs, and so on. The amount of effort put into every ounce of this code is tremendous, given that I only have a few hours a week to hack on this project typically. The biggest hurdle is losing my mental context every time I have to call it quits for the night; if I could concentrate a solid five or six hours of focused work on Epoch, I could probably triple my productivity. Sadly, I just don't have that luxury right now.

Given the constraints I'm under, I'm pretty happy with progress thus far. It may take a while to get all of the various language features back to a fully functional state, but the project as a whole is already benefiting immensely from the reduced complexity of the pipeline as well as the general flexibility and power of the new architecture.

For now, though, I desperately need some sleep.

## Native Binary Project: Day Whatever

I continue to hack on the Epoch compiler, slowly shaping it into a powerhouse of executable binary generation. So far I've gotten the program string table and DLL import table built in a flexible and extensible manner. This is the first step towards getting actual code generation back online at full capacity.

Now that I can link to DLLs from native Epoch programs, I can start replacing the old runtime library with a newer, cleaner, and slimmer version. As part of the process, I'm rewriting the LLVM bindings, which means that code generation is going to be broken for a while, at least in part.

A few minutes ago, the compiler generated a rudimentary .EXE that imports the new Epoch runtime DLL and calls a function in it. This is all completely flexible - none of the data is hardcoded or hackish to make it work. This is in direct contrast to the old method of building Epoch .EXEs, which basically amounted to hardcoding a tiny program that launched the main program after JIT compiling it with LLVM.

My goal here is to do absolutely everything I can in Epoch, and slice out as much C++ code as possible. I'm still playing around with ideas in my head for how to make the LLVM bindings as slim as possible, but it shouldn't be hard to seriously cut down on the amount of fragile C++ hackery being done there.

It's kind of annoying how much sheer effort I can dump into a project like this, and then be able to summarize the entire affair with a single sentence. I'd rail on about all the detailed hacking and prodding that went into making this happen, but it's actually pretty boring and involves a lot of profanity. So instead, I'll just leave you with a short journal entry.

And now to bed, and now to bed.

## Rewrite ALL the things!

OK so I'm not really rewriting everything... just... a lot of it.

The current runtime infrastructure for Epoch is precarious at best, and desperately needs some improvements. There are a number of things I want to do that all happen to coincide nicely with a desire to rewrite the runtime system:

• Destroy the EpochLibrary DLL forever. This is an ancient crufty artifact that deserves death, and getting rid of it should simplify the entire build pipeline noticeably.
• Improve program start times. I've posted about this before; basically the Epoch runtime JIT compiles your program every time it starts up, which in the case of the compiler leads to about a 6 second stall. Gross.
• Emit true native binaries - tied in with the above point, I'd like to emit genuine native .EXE and .DLL files instead of mutant VM hybrid files. This will enable the use of things like dynamic linking which will be a huge win.
• Separate the compilation pipeline into modular pieces. Right now the parser is "reused" by brute-force including the code in both Era and the compiler itself; instead, I'd like to make the parser and AST stuff into DLLs that feed data around to each other.
I have a loose mental strategy for doing all this, which looks vaguely like the following:

• Build a new DLL (tentatively EpochLLVM) which wraps LLVM and allows the compiler to make simple C-API calls to set up LLVM IR and generate executable machine code from it.
• Retrofit the existing Epoch compiler to use this new DLL when generating binaries.
• Rebuild garbage collection and other runtime infrastructure in a separate DLL (tenatively EpochRuntime) or maybe a few DLLs.
• Self-host the compiler through this new chain.
• Convert Era to use the new chain.
• Build support for emitting Epoch DLLs.
• Proof-of-concept the DLL system by replacing the C++ lexer used for Scintilla syntax highlighting with the actual lexer.
• Split the remaining parser, AST, and codegen logic into separate DLLs.
• Self-host again using this new infrastructure.
• Ship Release 16.
This ought to keep me busy well through the end of the year...

The benefits will be huge though. In addition to being able to write Epoch DLLs, getting faster start times, and cleaning up the modularity of the code a lot, this will pave the way towards integrating higher-level source processing tools with Era, as well as giving me an opportunity to revisit some historical but gross decisions, like the use of UTF-16. Last but not least, it gets even more of the implementation of the language and tools moved into Epoch instead of C++.

Overall it's a daunting amount of work, but I think it can be managed. The real trick will be staying interested in the process during the long dark period where nothing quite works. I hope my skeletal plan will give me plenty of moments to sit back and enjoy visible progress, but we shall see.

So, here goes nothing!

## Slimming down the Epoch runtime and improving program start times

Right now a big limitation of the Epoch program execution model is the fact that .EXEs generated by the compiler are not in fact native binaries. They are compact bytecode wrapped in a thin shim that churns said bytecode through an LLVM wrapper layer each time the program is launched. This means that every single time you start an Epoch program, it is basically getting re-built and re-optimized from bytecode and then JITted to native machine code.

This means that the compiler, for example, takes about 6 seconds to start on my machine. Pretty sad.

The way around this is to build native binaries using LLVM and just stuff them on disk. The runtime still needs to exist in order to provide garbage collection support, marshaling into other native processes, and so on. However, it will be much skinnier and more efficient without carrying the weight of the LLVM wrapper all over the place. Net result should be faster startup times for Epoch programs and less memory consumption overall.

Easier said than done.

LLVM doesn't directly generate machine code that's really all that suitable for emitting to a binary on disk. It leaves out a lot of platform-dependent details like how to link to DLLs and how to handle global data. Much of the infrastructure is designed to assume you're emitting machine code for JIT runtime purposes, not necessarily for serialization.

Now obviously this isn't insurmountable; plenty of compilers depend on LLVM (notably Clang) and generate binaries just fine. So the real magic is going to lie in finding all the places where JIT code doesn't serialize nicely and working around them.

Off we go!

## Epoch Mailing List

I've spun up a mailing list for the Epoch project, mostly because I'm tired of having conversations about it span half a dozen websites and PMs and blah blah.

Here's the link: clicky. Or you can email epoch-language@googlegroups.com to accomplish the same thing.

It should be open to anyone to come and talk about the language or just ask questions. I'll be seeding the discussion with a few subjects that are still open questions in my mind and hopefully people will jump in and kick some thoughts around.