One Weird Trick to Write Better Code

Published September 28, 2015
Advertisement
Developers hate him!

We'll cover some standard tips and tricks here, but we're not really interested in those. We're looking for the One Weird Trick to rule them all. Hopefully each trick we encounter brings us closer to coding Mecca.

In the beginning
The first video game I ever wrote was called Ninja Wars.
ninjawars.png
Yes, that is an HTML table of images. I changed the src attribute to move stuff around.

The top of the Javascript file looked like this:var x = 314;var y = 8;var prevy= 1;var prevx= 1;var prevsw= 0;var row= 304;var endrow= 142;var sword= 296;var yrow = 0;var yendrow = 186;var I = 0;var e = 0;var counter = 0;var found = 0;var esword = 26;var eprevsw = 8;var bluehealth = 40;var redhealth = 40;var n = 0;var you = 'ninja';var bullet = 'sword';var enemy = 'enemy';var ebullet = 'enemysword';var besieged = 0;var siegecount = 0;var esiegecount = 0;var ebesieged = 0;var healthcount = 0;var player = 0;var starcount = 0;var infortress = false;var prevyou = you;var einfortress = false;var prevenemy = enemy;var previmg = "";var prevbullet= "";var eprevbullet= "";var randnum = 0;var randnum2 = 0;var randnum3 = 0;var randnum4 = 0;var buildcount = 0;var characters = new Array(4);characters = ['ninja','tank','heli','builder'];var bullets = new Array(3);bullets = ['sword','bullet','5cal','sword'];var echaracters = new Array(3);echaracters = ['enemy','tank2','eheli','ebuilder'];var ebullets = new Array(3);ebullets = ['enemysword','bullet2','e5cal','enemysword'];var health = new Array(4);health = [40,30,20,10];var prevorb = 0;var prevnum = 0;
Hopefully this looks somewhat familiar, and I'm not the only one to start out writing code like this. Regardless, this debacle demonstrates our first trick:

Trick #1: globals are evil

We don't even know why they're evil, we just intuitively know.

edit: I should clarify that I'm no longer against globals. But that's getting ahead of ourselves.

Pretty soon we learn about objects. We can group our variables:class Ninja{ int x, y; int previousX, previousY; int health = 100;}class Sword{ int x, y; int previousX, previousY; int sharpness = 9000;}
We can even use inheritance to avoid copy-pasting:class Movable{ int x, y; int previousX, previousY;}class Ninja : public Movable{ int health = 100;}class Sword : public Movable{ int sharpness = 9000;}
Inheritance is nice! Nice enough to serve as our next trick:

Trick #2: object-oriented programming

Object-oriented is so great that it forms the core of many classic games, including Doom 3. Which, coincidentally, is open source.
Doom 3 loves inheritance. Don't believe me? Here's a small subset of its class hierarchy:idClass idEntity idAnimatedEntity idWeapon idAFEntity_Base idAFEntity_ClawFourFingers idAFEntity_Vehicle idAFEntity_VehicleFourWheels idAFEntity_VehicleSixWheels idAFEntity_Gibbable idAFEntity_WithAttachedHead idActor idPlayer idAI
Imagine you're an employee at id Software. This inheritance hierarchy works great for a few months. Then, one fateful Monday, disaster strikes. The boss comes in and says "Hey, change of plans. The player is now a car."

Look at idPlayer and idAFEntity_VehicleFourWheels in the hierarchy. Big Problem #1: we need to move a lot of code.

Big Problem #2: the boss comes to his senses and calls off the "player is a car" idea. Instead, we're adding turrets to everything. The car is becoming a Halo Warthog, and the player is getting a giant turret mounted on his back.

As lazy programmers, we decide to use inheritance again to avoid copy-pasting. But look at the hierarchy. Where can we put the turret code? The only ancestor shared by idPlayer and idAFEntity_VehicleFourWheels is idAFEntity_Base.

We'll probably put the code in idAFEntity_Base and add a boolean flag calledturret_is_active. We'll only set it true for the car and the player. This works, but the terrible, logical conclusion is that our base classes end up loaded with tons of cruft. Here's the source code for idEntity.

Go ahead, scroll through it. You don't have to read it all.

https://github.com/id-Software/DOOM-3-BFG/blob/9c37079c16015fc58de29d3de366e0d93dc11f8a/neo/d3xp/Entity.h#L163

The point is, that's a lot of code. Notice how every single entity -- down to the last piece of physics debris -- has a concept of a team, and of getting killed. Clearly not ideal.

If you're a Unity developer, you already know the solution: components! Here's what they look like:

unity.jpg

Rather than inheriting functionality, Unity entities are just bags of components. This solves our earlier turret problem easily: just add a turret component to the player and car entities.

Here's what Doom 3 might look like if it used components:idPlayer idTransform idHealth idAnimatedModel idAnimator idRigidBody idBipedalCharacterController idPlayerControlleridAFEntity_VehicleFourWheels idTransform idAnimatedModel idRigidBody idFourWheelController...
What have we learned?

Trick #3: in general, favor composition over inheritance

Take a moment to review the tricks we've covered so far: global variables bad, objects good, components better.

You won't believe what happens next!

Let's take a small detour into the world of low-level performance with a very simple question: which function is faster?double a(double x){ return Math.sqrt(x);}static double[] data;double b(int x){ return data[x];}
We'll hand-wave a lot of complexity away and just assume that these two functions eventually compile down to one x86 instruction each. Function a will probably compile to sqrtps, and function b might compile to something like lea("load effective address").
sqrtps takes about 14 CPU cycles on a modern Intel processor, according to Intel's manual. What about lea?

The answer is "it's complicated". It depends on where we load data from.Registers ~40 per core, sort of 0 cyclesL1 32KB per core 64B line 4 cyclesL2 256KB per core 64B line 11 cyclesL3 6MB 64B line 40-75 cyclesMain memory 8GB 4KB page 100-300 cycles
That last number is important. 100-300 cycles to hit main memory! This means in any given situation, our biggest bottleneck is probably memory access. And from the looks of it, we can improve this by using L1, L2, and L3 cache more often. How do we do that?

Let's return to Doom 3 for a real-life example. Here's the Doom 3 update loop:for ( idEntity* ent = activeEntities.Next(); ent != NULL; ent = ent->activeNode.Next() ){ if ( g_cinematic.GetBool() && inCinematic && !ent->cinematic ) { ent->GetPhysics()->UpdateTime( time ); continue; } timer_singlethink.Clear(); timer_singlethink.Start(); RunEntityThink( *ent, cmdMgr ); timer_singlethink.Stop(); ms = timer_singlethink.Milliseconds(); if ( ms >= g_timeentities.GetFloat() ) Printf( "%d: entity '%s': %.1f ms\n", time, ent->name.c_str(), ms ); num++;}
From an object-oriented perspective, this code is pretty clean and generic. I'm assuming RunEntityThink calls some virtual Think() method where we could do just about anything. Very extensible.

Uh oh, here comes the boss again. He has some questions.

  • What's executing? Er... sorry boss, it depends on what entities are active at the time. We don't really know.
  • In what order is it executing? No idea. We add and remove entities to the list at random times during gameplay.
  • How can we parallelize this? Gee boss, that's a tough one. Since the entities execute randomly, they may be accessing each other's state. If we split them up between threads, who knows what might happen.

In short:

shrug.gif
But wait, there's more! If we look closely, we see the entities are stored in a linked list. Here's how that might look in memory:

linked-list.png

This makes our L1, L2, and L3 cache very sad. When we access the first item in the list, the cache thinks, "hey, I bet the next thing they'll want is nearby", and it pulls in the next 64 bytes after our initial memory access. But then we immediately jump to a completely different location in memory, and the cache has to wipe out those 64 bytes and pull in new data from RAM.

Trick #4: line up data in memory for huge performance gains
Like this:for (int i = 0; i < rigid_bodies.length; i++) rigid_bodies.update();for (int i = 0; i < ai_controllers.length; i++) ai_controllers.update();for (int i = 0; i < animated_models.length; i++) animated_models.update();// ...
An object-oriented programmer might be infuriated at this. It's not generic enough! But look, now we're iterating over a set of contiguous arrays (not arrays of pointers, mind you). Here's what it looks like in memory:

components.png

Everything is all lined up in order. The cache is happy. As a side bonus, this version allows us to answer all those pesky questions the boss had. We know what's executing and in what order, and it looks much easier to parallelize.

edit: Don't get too hung up on cache optimization. There's a lot more to it than just throwing everything in an array. I only bring it up to prove a point, and I'm only qualified to give the basic introduction. Check out the links at the bottom for more info.

Time to wrap this up and get to the point. What's the One Weird Trick? What do tricks 1-4 (and many more) all have in common?

The One Weird Trick: data first, not code first
Why were globals so evil (trick #1)? Because they allowed us to get away with lazy data design.

Why did objects help us (trick #2)? Because they helped us organize our data better.

Why did components help us even more (trick #3)? Because they modeled our data better by matching the structure of reality better.

Even the CPU likes it when we organize our data correctly (trick #4). No really, what is the trick actually
Let's break it down in practical terms. Here's a representation of a typical Unity-like component-based game design.

uml.png

Each component is an object. It has some state variables listed at the top, and then some methods to do stuff with those variables. This is a well-designed object-oriented system, so the variables are private. The only code that can access them is the object's own methods. This is called "encapsulation".

uml-encapsulation.png

Each object has a certain amount of complexity. But fear not! OOP promises that as long as we keep the state private, that complexity will stay encapsulated within the object, and won't spread to the other objects.

Unfortunately, this is a lie.

throne-of-lies.jpg

Sometimes, we need a function to access two or three objects. We end up either splitting the function between those objects, or writing a bunch of getters and setters so our function can access what it needs. Neither solution is very satisfying.

Here is the truth. Some things cannot be represented as objects very well. I propose an alternate paradigm, one which represents every program perfectly:

data-flow.png

Once we separate process from data, things start to make more sense.

Object-oriented helps us write good code because it encourages us to encapsulate complexity (i.e. state). But it forces us to do so in a certain way. Why not instead encapsulate like this, if it makes sense within our problem space?

data-flow-encapsulation.png

In summary, design data structures to match your specific problem. Don't shoehorn a single concept into a bunch of separate, encapsulated objects.

Next, write functions that leave the smallest possible footprint on that data. If possible, write pure stateless functions.
That's the trick.

Conclusion
If this struck a chord with you, it's because I stole most of it. Get it straight from the source:
Thanks for reading!

Mirrored on my blog
14 likes 7 comments

Comments

snake5

I cannot overlook the numerous misunderstandings that together form this article.

We don't even know why they're evil, we just intuitively know.

Leave religion out of programming, please. There's nothing wrong with globals as a language construct. The issue is data access - the more code has access to some data, the harder it is to predict the state of the data and, as a consequence, it's harder to understand what the code does. Universally accessible class instances are not immune from this.

Then, one fateful Monday, disaster strikes. The boss comes in and says "Hey, change of plans. The player is now a car."

One not-so-weird trick: Instead of trying to use the slightest opportunity to prove a point, how about finding the minimal change set that satisfies the requirements and postponing refactoring to some time after the launch date? For example, putting an invisible player into a car and letting the car be controlled by the attached player. Some programmers like to moan about how things could be better if we all just used the One True Technique™. It's fine to think the code sucks, most code does for various reasons - but that technique is probably not going to yield the desired results.

Take a moment to review the tricks we've covered so far: global variables bad, objects good, components better.

You seem to be confusing composition with components. Components are those things you see in Unity. Composition is the act of combining multiple things into one. Plain class member variables qualify, pointers or handles to class instances qualify, even multiple inheritance qualifies. There are pros and cons to every method, so it helps to know all of them.

This makes our L1, L2, and L3 cache very sad.

What makes cache sad is that you don't use the memory it has loaded (which is very much the case with small heap-allocated objects and random small memory accesses), not the distance between loads. Using intrusive linked lists easily improves the situation. So does integration of very small arrays into container objects.
In the case of idEntity, looks like there could be a usual linked list but the entity objects themselves are probably big enough to avoid cache issues.

rigid_bodies[i].update();

This appears to be an extremely naive line of code. Every physics library I've seen exposes the rigid body pointer only, it doesn't allow structs to be embedded into custom code. So by your logic, cache should be crying waterfalls already. Most likely it's the same with those other two lines of code. Particles/bullets/gibs would be a better fit. Anything small in large quantities.

September 29, 2015 10:56 AM
evanofsky

Thanks for reading and commenting.

Leave religion out of programming, please. There's nothing wrong with globals as a language construct.


Sorry, I should clarify this in the article. These days, I agree with you and I'm not against globals at all. The article is written to mirror the things I learned over the years, and for a long time I thought globals were evil "without knowing why". I'll add a note explaining that my stance toward globals has since changed.

One not-so-weird trick: Instead of trying to use the slightest opportunity to prove a point, how about finding the minimal change set that satisfies the requirements and postponing refactoring to some time after the launch date? For example, putting an invisible player into a car and letting the car be controlled by the attached player.[/size]


I'll admit it's a contrived example, but the solution you propose here is basically composition, which was the whole point of that segment. So it doesn't bother me too much.

<snip> criticism about the cache bit


I completely agree, but this isn't an article about cache optimization. I'm not qualified to write that article. All I wanted to get across is that memory is often the biggest performance bottleneck. Of course if you really want to optimize cache use, you need to measure your code, print out your data, and all those other things Mike Acton recommends. I just wanted to introduce the topic in two pages.

I don't think you should design your game around cache. But I do think the methods I talk about in the article can set you up in a way that makes it easier to optimize for the cache when the time is right.

This appears to be an extremely naive line of code. Every physics library I've seen exposes the rigid body pointer only, it doesn't allow structs to be embedded into custom code.

Yes, I've used physics engines too. Those lines were pretty much pseudocode to simplify things. Somewhere in your physics engine of choice, there is a loop that boils down to "for (entity in list) entity.update()"

September 29, 2015 12:42 PM
Aardvajk
Great article. Shame about the pointless nitpicking comments.
September 29, 2015 02:44 PM
Alpha_ProgDes
In summary, design data structures to match your specific problem. Don't shoehorn a single concept into a bunch of separate, encapsulated objects.

Next, write functions that leave the smallest possible footprint on that data. If possible, write pure stateless functions.
That's the trick.

I'm not sure if you're promoting Data-Oriented Programming or Functional Programming (in particular Lisp/Scheme). You should make the title a bit more specific. It's a bit misleading. I thought this was an entry about better general programming. Not promoting a particular paradigm for Game Programming.

September 30, 2015 01:25 PM
WoopsASword

I'm not completly happy with this article.

You squished a lot of basic topics into 1 article. Basically :"HOW TO DEVEOP 101".

And the sad part, it's not true for everything you develop.

October 04, 2015 08:38 PM
Glass_Knife

It takes a lot of time, experience, practice, and many failures to really understand what you're saying. It is hard, because the things we learn at the beginning really seem to make sense, and going against them feels wrong. The whole time I'm reading this, I'm nodding, "Yep, yep, yep."

Hater's gunna hate. And great job!

October 07, 2015 10:14 PM
Aardvajk
Honestly, et1337, fancy posting your own thoughts in your own journal. How dare you. No wonder everyone is ripping it to pieces. The presumption of it all. Anyone would think you own the place. smile.png

(I fondly remember some hilarity in my journal a few months back when someone weighed in to tell me I didn't know what I was talking about. Was very funny.)
October 08, 2015 07:04 AM
You must log in to join the conversation.
Don't have a GameDev.net account? Sign up!
Profile
Author
Advertisement
Advertisement