We see in 3, but 4 is more! The making of a 4D game using XNA Game Studio.

## New Year's Resolution

Unfortunately, it appears as if DBP10 is dramatically sooner than I thought it would be. There's no way I'll be able to get anything ready by March, and I don't relish the prospect of waiting another year.

Perhaps it's an opportunity though. I'm growing increasingly fond of SlimDX, and would enjoy no longer having to constantly second guess every method and algorithm to avoid 360 performance pitfalls. If I punt XNA entirely, I rid myself of that baggage, plus make distribution much easier (unless they start packing the XNA runtimes in Windows Update, nobody will ever download them). This also has the distinct benefit of dodging the content pipeline, which while an interesting concept, always felt too "magic" to me, and too tied into the IDE and project files. Being able to use a "plain" project is appealing.

Also, from what I've been hearing, XBLIG is a bit of a black hole, unless you have a total breakaway hit that strikes just the right chords like avatar drop. Let's face it, a 4D game is pretty esoteric and difficult to even explain, much less play (most likely, we'll just have to see when it's done). I don't even own my own 360, and have no particular desire to own one, much less give my payment credentials to a service that can't be cancelled without jumping through frustrating support call hoops. It's a shame to lose the huge install base of 360 and being able to just say "it's on XBLIG, check it out", but I think the pros outweigh the cons. Plus, making a SlimDX framework for the game will be something that's reusable later for other projects that I want to do after this.

Which brings me around to the New Year's resolution part. I want to put this thing to bed within 2010, so that I can move on to other, less eccentric, projects. A playable concept demo should be enough to gauge interest, and could be expanded on from there if it ends up being worthwhile. Otherwise, at least I can say I did it, and move on without feeling like a complete quitter :/

Almost forgot. Speaking of quitting, I've shelved the emulator project. Not due to any technical hurdles or anything, there was just no reason to continue. It was a wrapper of bsnes, a fantastic and extremely accurate SNES emulator. My goal was to add shader support and such, but recent developments on the emulator itself have rendered that goal redundant. I don't consider it a waste in the least though. I learned a lot about SlimDX that will serve me well, and a good change of pace is always nice. I just need to focus completely on the 4D project in the coming months and see if I can at least post some progress before my GDNet+ expires.

## New Approach

I suppose taking a break from the editor actually did some good after all. I was constantly dreading working on it, and putting it off to do other things. Recently, however, I sat down and really thought about why that is. The main reason I suppose is obvious and hardly unique to me: GUI programming is complicated, but boring. There's a million details that have to be "just so" for a GUI heavy app to be considered decent and presentable. Any particular detail may be simple in itself, but their sum can add to a level of complexity that can quickly become pathological if not carefully architected first. I started to find myself buried under a pile of nuance and minutiae that was incredibly demotivating.

The second reason was also pretty clear, but until now I was plugging my ears and humming a tune ignoring it. My entire concept of a good editor was based upon experience using editors that were released for consumption, widely used, and officially supported products in their own right. This editor will only ever be used by me, is applicable only to this specific project, requires no support of any kind, and doesn't have a team of dedicated tool developers to make it their focus. Any effort spent making it into a "proper" product is wasted effort, distracting from the ultimate goal of making a game. The editor exists solely to ease the burden of producing levels, and nothing more.

And that was it. My initial concept of "easing the burden" was by default a full-on GUI app like all the others I'd ever used. It was either that, or just writing a raw text file and throwing it to the engine; there is nothing in between.

Or is there?

I paused to consider the only other "editor" I was using at the time: the WPF designer. It had never occurred to me until recently, but in all the time that I've used it, I've NEVER clicked, dragged, instantiated, deleted, altered, moved, or otherwise manipulated a single thing in the designer window itself. Everything I ever did was done in the markup window; the designer window existing solely as feedback for my alterations. All that time, and I'd never considered my burden to be anything but sufficiently eased.

GUI layout is plenty different from level layout, but in the end, my levels are going to be very simple arrangements of basic 4D shapes. Anything so complicated as to require delicate manipulation with a mouse-rich interface, will be too complex for the player to comprehend and navigate in the game, to say nothing of too complex to render in realtime on the 360.

So, to drive this home, I plan to simplify my efforts significantly to a markup description of the scene, and a feedback window to view the results from different angles. While perhaps not as ideal as a proper editor, I feel the burden will be sufficiently eased to meet my needs, while freeing most of my architectural efforts to be used on the game itself.

In other words, I'm too lazy and incompetent to produce a proper tool.

## Crazy Corruption

Everything was going pretty smoothly. The emulator core was wrapped up in a nice managed interface, the WPF UI was hosting a winforms control that could be rendered to via SlimDX, xaudio2 output via SlimDX, etc. All the pieces were falling into place... except for the occasional startup crash. I was using vs2010 beta 1, so I was just writing it off as some possible incompatibility with SlimDX and the beta framework. After doing a little refactoring however, I was suddenly getting crashes 15-30 seconds in, and startup crashes much more frequently. I was making use of a few threads for the different components (one for the core itself, one for audio, with rendering and UI on the main thread). Intermittent crashes absolutely scream "synchronization problem" so naturally I started there. Reviewing my WorkerThread class carefully though, I just couldn't see what might be wrong.

Why not use the debugger? Unfortunately, the startup crashes seemed to kill the app even with the debugger attached, which is why I expected something external was going wrong at first. Oh how wrong I was, but that's getting ahead of myself. After refactoring though, the crashes were happening later as well, not just during startup, so I decided to see if the debugger could catch them again. Fired it up, waited a little bit, and:

FatalExecutionEngineError.

That's... not good.

Especially when that was only the error half the time. Other times I'd get NullReferenceException from things that can't possibly be null, StackOverflowExceptions when the callstack couldn't possibly be that deep, "this" reference being the wrong type or even missing altogether, or maybe just a good old fashioned AccessViolationException. It was absolute insanity.

The managed heap was obviously becoming deeply corrupted somehow. The emulator core has a fibers library to implement coroutines for synchronizing all the various parts of the system, which is part of the reason I was interested in it. I've always been a fan of coroutines, and don't think they get enough play in general, but I digress. I was wary in the beginning of what effect that kind of thing might have on the CLR, but I'd seen articles around that illustrated how to use fibers with .NET, so perhaps it wasn't a problem. The coroutine library, in addition to implementing them in assembly, also has support for just using the windows fiber API directly. Just to make sure nothing was going wrong there, I switched it to use the windows fibers version, but no joy. Content for the moment that the problem wasn't there, I turned back to SlimDX. I commented out the video code; still crashing. I commented out the audio code; still crashing. Updated to the latest version of the emulator source; still crashing.

What the FRACK. Seriously.

I was failing utterly at debugging. Maybe I'd have to use windbg or something and look at the core dump, but I'm just not that hardcore. Without any kind of clue as to what was REALLY wrong, I'd just have to shelve the whole thing and go back to working on the Marble4D editor.

Giving up on it for the night, I tried to get some sleep (which REALLY sucks when you have a mysterious unresolved bug). In the morning I took a step back and tried to reason through it. I just KNEW it had to be something relating to fibers, since in the beginning I was unsure if it'd work while hosted by the CLR at all. Perhaps there was something to that.

Wouldn't video and audio always have crashed though if that were the case, since the events are triggered by the core? Well, video refresh is triggered after the coroutine scheduler comes back to the main thread, and the way I was buffering up audio, I have the video refresh callback in the native interface also fire the audio event with all the samples buffered during that entire frame. If not those, then what...

Then it hit me. In one of the obscure crashes, the "this" reference for my worker thread manager object was corrupted and listed as a DevicePolledEventArgs.

INPUT! The emulator core waits as long as it can to poll input, to get the very latest state. The implication of this was that it was happening in a fiber, rather than the main thread. The input request triggers a callback that crosses the managed barrier back into my app, which fires an input event with a DevicePolledEventArgs. One of the things I changed in refactoring was having the input event create a new one each time the event fires, rather than reusing the same one over and over. I was getting that strange feeling where you just KNOW that the problem is there, even if you're not sure exactly why. I commented out the input callbacks, crossed my fingers, and ran it again.

No crash.

I ran it again and again and again and again both from inside the IDE and from explorer. Still nothing (yet). It would seem that fibers and other forms of "fake" threads interfere with the managed heap and the GC, at least in beta1. This is a definite case in support of my personal programming mantra: a brain is the best debugger. The unfortunate implication is that I'll have to poll input before the frame starts, rather than waiting until the last possible moment, but them's the breaks.

If anyone reads this and is an expert on fibers, I'd be very interested in your thoughts. Is calling back into CLR code from a native fiber known to be unsafe? Am I just doing something wrong?

At any rate, once I have some kind of input up and running, I'll post some screenshots to reveal exactly which system and emulator I'm wrapping. I'll probably keep working on this mostly until around the new year perhaps, then switch gears back to Marble4D with a fresh outlook. Hopefully I'll be able to get some kind of simple demo ready for DBP 10. It's definitely nice to crawl out from under all the limitations imposed by the 360 CLR for a while and use more features and APIs. Definitely learning a LOT from this, and SlimDX rules. :)

## Reacquainted with an old friend (enemy?)

Disappointed after my XNA module player idea went bust, I was itching for another side project to change things up and keep from getting burned out working on the same thing constantly. Most of the other ideas I had, while simpler than a 4D game to be sure, were full fledged projects in their own right. I wanted something smaller and less architecturally taxing, so I could get charged up from brisk progress and tangible results. In addition, I wanted something that would exercise a different skillset than just another XNA project.

I've been a fan of emulators for a long time, so when the notion got floated for a more windows-centric fork of one of my favorite emulators, I had an idea. The emulator is written in native code, as most are. Unfortunately, it's been years since I've touched the stuff. However, there was a middle ground: c++/cli.

Using c++/cli, the core could be wrapped into a CLI compliant object model, thus facilitating the use of my more recent skillset. I'm no fan of c++ to be perfectly honest, but the simple fact is that it's ubiquitous, particularly in the games industry. So I figure it's best to maintain a moderate working knowledge of it, particularly since I had already invested the effort to learn it years ago.

To its credit though, c++/cli does a fantastic job of interop between native and managed code. I was constantly googling the syntax to get things like events and properties working, but that was to be expected. Also to be expected was general rustiness all around. I was constantly forgetting things like the semicolon after the class definition, the ^ before reference types, proper includes, etc.

Speaking of, one thing I sure as hell don't miss is header files. They feel so clumsy and archaic compared to a proper module system to which I had grown accustomed. Preprocessor macros are also a minefield of errors waiting to happen.

That aside, it's still relatively painless to get a managed class up and running that can then be consumed by C#. I was constantly trying to break it with different combinations of managed/unmanaged calls and marshalling, but every reasonable scenario I could think of was working. I'm sure there are scenarios when things get nasty, but for now it's working great, even in x64 mode (which took registry hacks to get working in vc++ express, but thankfully I found an automated script that handled that for me).

So we'll see how it goes. I'm definitely enjoying this a lot, as a fresh change of pace, and am finding it easier to work on it for longer. I'll discuss it in more detail and show pictures if and when it gets working. Ideas are still brewing for the 4D game, and in time I'll be able to attack it again from a fresh perspective.

## *Crickets*

Since it's been over a month without a post, I feel compelled to make one, even if just to make sure there's an entry for august. The wheels are still turning, albeit slowly. Another heatwave a while back sapped my motivation, and more recently I've been tempted by other, much simpler, projects. I'm not going to give up on this though. A 4D game simply must exist, and come hell or high water it's going to happen.

Anyway, work on the editor progresses slowly, and has revealed cause for refactoring. The first thing wrong was my unnecessarily complex polychoron object model. I had a "Polychoron Part" class, which a polychoron would contain one or more of. As an example, I have a PlatformTesseract which will be the main surface object that the levels will be made of. The top and bottom of these have a grid-line shader applied, and all 6 sides (left/right, front/back, kata/ana) are just a solid color. This requires 2 separate geometry buffers, one for each shader. One polychoron part would be for the top and bottom, the other for the sides.

This worked, but was ugly and unwieldy. Since polychora are subclassed for specific purposes anyway, and they already contain a list of cells that they're made of, I shifted the responsibility of managing different surface materials to the polychoron subclass, rather than this awkward, extraneous part class. This worked swimmingly and greatly simplified slice code, allowing me to completely delete the part class. If refactoring were Tetris (and it kind of is), deleting an entire source file is like clearing 4 lines at once.

With that fresh, clean refactored feeling, I set out to allow my game objects to be consumed by the editor. This presented another problem though (of course it does, why wouldn't it).

In the editor, 4D objects are to be displayed in 2D panels that cover each permutation of axes. The question, then, is how best to flatten the objects into the 2D plane without them just becoming jumbled messes. I decided that objects should draw the least number of lines possible (in other words, only their actual edges). Unfortunately, I have no way to achieve this yet.

Currently, polychora are just a list of 3D cells, each of which is sliced into a 2D face for rendering; the sum of the 2D faces making up the full 3D "slice" of the object, as detailed in one of my first posts. However, nothing in this process has anything to do with edges. My first instinct was just to draw the slices as wireframe, then flatten them down into 2D for the editor, but this will yield a lot of useless extra lines where the slice seams are (and there are a lot). These seams are typically invisible in 3D, except for the very occasional single-pixel gap wrought by floating point error. A little AA can smooth over those gaps for 3D, but what to do in 2D with all the extra lines?

Trying to detect "useless" lines dynamically would just be a mess, so I'm not even going to go down that road. Instead, what I plan to do is further elaborate on the Cell class so that it knows what 2D faces it's made of. I can then write a slicing algorithm analogous to the current one, which will slice the 2D faces into 1D line segments. This should give me an optimal wireframe ideal for editing. Additionally, slicing an object down to edges will allow me to render objects as wireframe in-game, which could be useful for some kind of extra-dimensional awareness mechanic to alert the player of nearby objects on the hidden axis.
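To make the idea concrete, here is a minimal sketch of slicing one face into a line segment. It assumes the slice is taken along the W axis and that a face is a convex polygon with its vertices stored in order; `Vec4` and `FaceSlicer` are hypothetical names, not the game's actual types, and degenerate cases (such as a face lying entirely in the slicing hyperplane) are ignored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical 4D point type; the real project's vector type is not shown in this journal.
public struct Vec4
{
    public double X, Y, Z, W;

    public Vec4(double x, double y, double z, double w) { X = x; Y = y; Z = z; W = w; }

    // Linear interpolation between a (t = 0) and b (t = 1).
    public static Vec4 Lerp(Vec4 a, Vec4 b, double t)
    {
        return new Vec4(
            a.X + (b.X - a.X) * t,
            a.Y + (b.Y - a.Y) * t,
            a.Z + (b.Z - a.Z) * t,
            a.W + (b.W - a.W) * t);
    }
}

public static class FaceSlicer
{
    // Slices a convex polygonal face with the hyperplane W = w.
    // Returns the points where the face's edges cross the hyperplane:
    // two points (one segment) for a clean cut, none if the face misses it.
    public static List<Vec4> Slice(IList<Vec4> face, double w)
    {
        List<Vec4> hits = new List<Vec4>();
        for (int i = 0; i < face.Count; i++)
        {
            Vec4 a = face[i];
            Vec4 b = face[(i + 1) % face.Count];
            double da = a.W - w;
            double db = b.W - w;

            // The edge crosses the hyperplane when its endpoints are on opposite sides.
            if ((da < 0.0) != (db < 0.0))
            {
                double t = da / (da - db);
                hits.Add(Vec4.Lerp(a, b, t));
            }
        }
        return hits;
    }
}
```

Running every face of every cell through something like this would yield only true edges of the slice, with none of the seam lines that a wireframe of the triangulated slice would show.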

My new long term goal is to regroup and try for DBP 10, if there is one. Seems to be a pretty successful contest each year though, so I don't see why they'd stop.

## Sounds Like a Hack

While the side project is shelved for now, I did discover something interesting that might be useful to other XNA users working with sound. XNA 3.0 added the SoundEffect API to bypass the complexity of XACT. Unfortunately, the sounds must still be authored ahead of time and can only be instantiated through the content pipeline... Right?

NOT SO!

This is a total unabashed hack, but it works on both PC and 360. I successfully generated a sine wave at runtime with custom loop points, and it works great (after getting the units right for the loop point that is, but more on that later).

Also, this is NOT suitable for "interactive" audio, which is to say you can't have a rolling buffer of continuously generated sound data. It almost works for that, but the gap between buffers is noticeable, and especially jarring on the 360. Here's to hoping they improve that in a future XNA release. Nevertheless, the ability to generate sound effects at runtime still provides interesting possibilities.

Anyway, down to business. The first thing that bars our way is the fact that SoundEffect has no public constructor. This can be easily remedied with the crowbar that is reflection:

    _SoundEffectCtor = typeof(SoundEffect).GetConstructor(
        BindingFlags.NonPublic | BindingFlags.Instance, Type.DefaultBinder,
        new Type[] { typeof(byte[]), typeof(byte[]), typeof(int), typeof(int), typeof(int) },
        null);

As can be seen, SoundEffect has a private constructor that takes 2 byte arrays and 3 ints. Fantastic. So... what are they?
Digging deeper with Reflector (which is a tool any .NET developer should have handy) we find that the first byte array is a WAVEFORMATEX structure, and the second byte array is the PCM data. The first 2 ints are the loop region start and the loop region length (measured in samples, NOT bytes), and the final int is the duration of the sound in milliseconds. I'm not sure why that's a parameter, since it could be computed from the wave format and the data itself, but whatever.

While most of the parameters are straightforward, we'll need to construct a WAVEFORMATEX byte by byte. Fortunately, the MSDN page for it tells us what we need to know. Eventually, I came up with this:

    #if WINDOWS
    static readonly byte[] _WaveFormat = new byte[]
    { // WAVEFORMATEX little endian
        0x01, 0x00, // wFormatTag
        0x02, 0x00, // nChannels
        0x44, 0xAC, 0x00, 0x00, // nSamplesPerSec
        0x10, 0xB1, 0x02, 0x00, // nAvgBytesPerSec
        0x04, 0x00, // nBlockAlign
        0x10, 0x00, // wBitsPerSample
        0x00, 0x00 // cbSize
    };
    #elif XBOX
    static readonly byte[] _WaveFormat = new byte[]
    { // WAVEFORMATEX big endian
        0x00, 0x01, // wFormatTag
        0x00, 0x02, // nChannels
        0x00, 0x00, 0xAC, 0x44, // nSamplesPerSec
        0x00, 0x02, 0xB1, 0x10, // nAvgBytesPerSec
        0x00, 0x04, // nBlockAlign
        0x00, 0x10, // wBitsPerSample
        0x00, 0x00 // cbSize
    };
    #endif

The first thing that should be apparent is that it's different for the PC and the 360. This is because the 360 is big-endian, whereas PCs are little. This also applies to the PCM data itself.

The first member is the format of the wave (0x1 for PCM). Next is the number of channels (2 for stereo), followed by the sample rate (44100 Hz, written in hex), the bytes per second (sample rate times atomic size), the bytes per atomic unit (two 2-byte samples), the bits per sample (16), and the size of the extended data block (0 since PCM doesn't have one). This will give us a pretty standard 44.1kHz, 16-bit, stereo wave to work with. It could just as easily be made mono with the appropriate adjustments.
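If hand-writing the bytes feels error prone, the same header can be computed. This is only a sketch: `WaveFormatBuilder` and its `bigEndian` flag are names made up for illustration, and it covers plain PCM only. It derives the block alignment and average byte rate from the other fields, as described above.

```csharp
using System;

public static class WaveFormatBuilder
{
    // Builds the 18-byte PCM WAVEFORMATEX header.
    // bigEndian = true gives the Xbox 360 layout, false the PC layout.
    public static byte[] Build(int channels, int sampleRate, int bitsPerSample, bool bigEndian)
    {
        int blockAlign = channels * bitsPerSample / 8;  // bytes per atomic unit
        int avgBytesPerSec = sampleRate * blockAlign;   // bytes per second

        byte[] header = new byte[18];
        int offset = 0;
        offset = Write(header, offset, 1, 2, bigEndian);              // wFormatTag (PCM)
        offset = Write(header, offset, channels, 2, bigEndian);       // nChannels
        offset = Write(header, offset, sampleRate, 4, bigEndian);     // nSamplesPerSec
        offset = Write(header, offset, avgBytesPerSec, 4, bigEndian); // nAvgBytesPerSec
        offset = Write(header, offset, blockAlign, 2, bigEndian);     // nBlockAlign
        offset = Write(header, offset, bitsPerSample, 2, bigEndian);  // wBitsPerSample
        Write(header, offset, 0, 2, bigEndian);                       // cbSize
        return header;
    }

    // Writes the low `size` bytes of `value` at `offset` in the requested byte order
    // and returns the offset just past them.
    private static int Write(byte[] buffer, int offset, int value, int size, bool bigEndian)
    {
        for (int i = 0; i < size; i++)
        {
            int shift = 8 * (bigEndian ? size - 1 - i : i);
            buffer[offset + i] = (byte)(value >> shift);
        }
        return offset + size;
    }
}
```

Calling `WaveFormatBuilder.Build(2, 44100, 16, false)` should reproduce the little-endian array above byte for byte, and passing `true` the big-endian one.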

The next parameter is the sound data itself. This is stored as a series of 16-bit values alternating between the left and right channels. Here's a snippet that generates a sine wave:

    _WavePos = 0.0F;
    float waveIncrement = MathHelper.TwoPi * 440.0F / 44100.0F;
    for (int i = 0; i < _SampleData.Length; i += 4)
    {
        short sample = (short)(Math.Round(Math.Sin(_WavePos) * 4000.0));
    #if WINDOWS
        _SampleData[i + 0] = (byte)(sample);
        _SampleData[i + 1] = (byte)(sample >> 8);
        _SampleData[i + 2] = (byte)(sample);
        _SampleData[i + 3] = (byte)(sample >> 8);
    #elif XBOX
        _SampleData[i + 0] = (byte)(sample >> 8);
        _SampleData[i + 1] = (byte)(sample);
        _SampleData[i + 2] = (byte)(sample >> 8);
        _SampleData[i + 3] = (byte)(sample);
    #endif
        _WavePos += waveIncrement;
    }

This will generate a 440Hz (A) tone. Again notice the endian difference, and how the 16-bit sample is sliced into 2 bytes for placement into the array. It's written to the array twice so that the tone will sound in both channels.

Next we have the loop region. The loopStart is the inclusive sample offset of the beginning of the loop, and loopStart + loopLength is the exclusive ending sample. In this context, sample includes both the left and right channel samples, so really a 4-byte atomic block. If you pass in values measured in bytes, playback will run past the end of your sound and the app will die a sudden and painful death.

Finally, the duration parameter. I just calculate the length of the sound in milliseconds and pass it in (soundData.Length * 250 / 44100). I'm not sure if this parameter actually has an effect on anything, but it's still prudent to set it.
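Since the byte/sample mix-up is so easy to make, a few tiny helpers can keep the units straight. This is a sketch for the 44.1kHz, 16-bit stereo format used here; `PcmMath` and its members are hypothetical names, not part of XNA.

```csharp
public static class PcmMath
{
    public const int SampleRate = 44100;
    public const int BytesPerFrame = 4; // 2 channels * 2 bytes per sample

    // Number of 4-byte atomic blocks (one left plus one right sample) in the buffer.
    public static int FrameCount(int byteLength)
    {
        return byteLength / BytesPerFrame;
    }

    // Duration in milliseconds. For buffers made of whole frames this matches
    // byteLength * 250 / 44100; the long cast just avoids overflow on large buffers.
    public static int DurationMs(int byteLength)
    {
        return (int)((long)FrameCount(byteLength) * 1000 / SampleRate);
    }

    // True when the loop region, measured in frames, fits inside the buffer.
    public static bool IsValidLoop(int loopStart, int loopLength, int byteLength)
    {
        return loopStart >= 0
            && loopLength >= 0
            && (long)loopStart + loopLength <= FrameCount(byteLength);
    }
}
```

For a one second buffer of 176400 bytes, a loop of `(0, 44100)` passes the check, while `(0, 176400)`, the same region mistakenly given in bytes, does not.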

Once you have all this, you can just invoke the constructor and supply your arguments, and you should get a nice new SoundEffect from which you can spawn instances and play it just as you would with one you'd get from the content pipeline.

That about covers it. Certainly not as useful as full real-time audio would be, but I thought it was cool anyway, and would hopefully be useful for some scenarios at least.

## Undo/Redo and Debug Anecdote

Oddly enough, undo/redo was actually rather easy to implement. WPF's routed command facility makes it a snap to wire the shortcuts and add the callbacks. The implementation is simply two stacks of an IUndoRedo interface. New actions are pushed onto the Undo stack, and the Redo stack is cleared. If an action is undone, it is popped from the undo stack and pushed onto the redo stack, and vice versa when it is redone. Custom actions can implement this interface and their Undo and Redo methods will be invoked appropriately. Currently I have a MultiPropertyChangedAction that listens to the MultiPropertyGrid and will save all the previous values of a property when it is about to change, so that it can be easily undone. That alone covers a lot of what needs to be undo-able. Other things will include spawning a new object, deleting objects, dragging an object around in the viewports, etc.
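For the curious, the two-stack scheme boils down to something like the following. It's a stripped-down sketch rather than the editor's actual code: the `IUndoRedo` name and its `Undo`/`Redo` methods match the description above, but `UndoRedoStack` and `DelegateAction` are made up for illustration, and the WPF command wiring is left out.

```csharp
using System;
using System.Collections.Generic;

public interface IUndoRedo
{
    void Undo();
    void Redo();
}

public sealed class UndoRedoStack
{
    private readonly Stack<IUndoRedo> _undo = new Stack<IUndoRedo>();
    private readonly Stack<IUndoRedo> _redo = new Stack<IUndoRedo>();

    public bool CanUndo { get { return _undo.Count > 0; } }
    public bool CanRedo { get { return _redo.Count > 0; } }

    // Records an action that has just been performed; the redo history is discarded.
    public void Push(IUndoRedo action)
    {
        _undo.Push(action);
        _redo.Clear();
    }

    // Reverts the most recent action and makes it available for redo.
    public void Undo()
    {
        if (_undo.Count == 0) return;
        IUndoRedo action = _undo.Pop();
        action.Undo();
        _redo.Push(action);
    }

    // Reapplies the most recently undone action and makes it undoable again.
    public void Redo()
    {
        if (_redo.Count == 0) return;
        IUndoRedo action = _redo.Pop();
        action.Redo();
        _undo.Push(action);
    }
}

// Hypothetical action built from two delegates, purely for demonstration.
public sealed class DelegateAction : IUndoRedo
{
    private readonly Action _undo;
    private readonly Action _redo;

    public DelegateAction(Action undo, Action redo)
    {
        _undo = undo;
        _redo = redo;
    }

    public void Undo() { _undo(); }
    public void Redo() { _redo(); }
}
```

The `CanUndo` and `CanRedo` properties are the sort of thing a routed command's CanExecute handler could consult to enable or disable the menu items.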

----------
EDIT: Yoink. I jinxed the side project by talking about it. Turns out SoundEffect has more limitations than I realized. Pitch shifting is constrained to +/- 1 octave, and there's no effective way to start playing from an offset within a sound. Oh well, back to the editor I guess.
----------

Since the side project has been shelved for now, I'll instead regale any readers with a humorous tale of debugging woe. While running the game and mouse-looking around, things were working great. For some reason I alt-tabbed out to look at something else, and clicked the "show desktop" button to minimize everything, including the game. After bringing the game up again, I noticed something was missing...

All of my tesseracts were gone. Empty nothingness staring back at me like the void of anxiety inside at the realization of another obscure bug. Did it freeze? Was there another race condition in my task manager? Thankfully, the FPS counter was still in the corner dutifully ticking away, so it didn't freeze. The numbers looked right, too, and went down just as they should as I pushed the key to spawn more tesseracts. It was still running, but why wasn't I seeing anything?

Being relatively new to shaders and all that jazz, I immediately suspected something was broken in my rendering code. It was going blank after a minimize, so maybe some kind of lost device situation? To test, I ran it again, and hit ctrl-alt-del to bring up the interrupt menu, which I'm pretty sure causes a lost device. Canceling and going back to the desktop, all the tesseracts were still there. They would ONLY disappear after a minimize, not for any other reason.

Even so, maybe my shader constants were getting messed up somehow. XNA claims to be able to recover most kinds of graphics assets fully after a device lost, but maybe there was some kind of bug with minimizing? I added a special key that would reset the constants to appropriate values when pressed. I ran the game, and before minimizing, I pressed the key. The display didn't change, since the values were still correct, and the console reported that the values had been set. So far, so good.

I minimize and reopen to emptiness. Crossing my fingers, I press the key. Nothing. Seething with frustration, I mash the key and fill the console output with "minimize test" but my rage was insufficient to sway the program to render once more. What the hell was wrong? Maybe I just wasn't asking it to nicely enough. RenderStates.PrettyPlease | RenderStates.WithCherry?

I start reading my Update and Draw methods again and again trying to find out what the eff was wrong. If all my shader constants and render states were fine, it had to be something else. Maybe camera updates were going wonky or something. In desperation, I completely comment out the camera code so the view can't be moved at all, and ran again. Holding my breath, I minimize and reopen, only to be met with...

A floating red tesseract.

YES. I took a quick break to relish the discovery of the problem area. Relaxed and confident, I plow into the camera code. It was obviously getting moved in such a way that you could no longer see the scene. How though? There was code to clamp the camera coordinates to reasonable values on both rotational axes, so even the most erratic movement should be fine. Perplexed, I add a line to print out the camera angles when a key is pressed. Before minimizing I get typical -180 to 180 horizontal and -90 to 90 vertical. Minimizing and reopening yet again, I push the key and still see typical values of -Infinity and NaN. Maybe next I'll- wait, what?

I don't care how high your mouse DPI is, you're not going to be scrolling to -Infinity anytime soon. Besides, my input manager will normalize the coordinates based on the client window size, so-

Oh.

Seems that when you come back from being minimized, the IsActive flag in Game becomes true a few updates before the client width and height are set back to nonzero values. Slapped an if around it and all is well. NaN is fun stuff.
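The fix amounts to something like this. It's a sketch only: the exact normalization formula in my input manager isn't shown in this entry, so the mapping to -1..1 below is an assumption, and `MouseNormalizer` is a made-up name.

```csharp
public static class MouseNormalizer
{
    // Unguarded version: with size == 0, float division gives NaN (0 / 0)
    // or +/-Infinity (nonzero / 0) rather than throwing an exception.
    public static float NormalizeUnguarded(int pixel, int size)
    {
        return (pixel / (float)size) * 2.0f - 1.0f;
    }

    // Guarded version: maps a pixel coordinate to [-1, 1] across `size` pixels,
    // and reports 0 while the client area has no size (e.g. just after a minimize).
    public static float Normalize(int pixel, int size)
    {
        if (size <= 0) return 0.0f;
        return (pixel / (float)size) * 2.0f - 1.0f;
    }
}
```

Because floating point division by zero never throws, nothing complains at the point of failure; the NaN just quietly spreads into whatever is computed from it, which is why the tesseracts vanished without a peep.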

## Success! Now I can... Wait, what was I doing again

It's embarrassing that it took a month to finish just this one control, but I think I can finally put it to rest and move on. It helps to remind myself that this control could be useful in any future WPF app I ever make. Out of the box it can edit any class that provides a string converter, flags and non-flags enums (of any underlying type), arbitrary structs, and classes that provide a default constructor. I think that's sufficient coverage for a lot of cases, even without adding custom type editors. I'll still probably do that anyway, at least for stuff like colors. Here's a few shots of the control:

The control before items are added, and after a single item is added.

The flags, struct, and class editors. Simply check the boxes in the flags dropdown for the combination you want. The struct and class editors are just a dialog with a nested MPG, the main difference being that for classes the "null" checkbox is available.

A red outline is shown around editors whose contents cannot be assigned back to the property. The background of the editor cell will be gray if the value differs among the objects selected by the MPG.

Getting all the keyboard input, focus, mouse capture, data binding, etc. of this thing working correctly was at times enormously frustrating. I could itemize the challenges, but honestly I'm so tired of this control that I don't really want to drag my memory through it again. If anyone is using WPF and is interested though, I'd be happy to discuss it and share the code.

Switching gears... Somewhere along the line I took a break and refactored my XNA input manager, since I wasn't quite happy with it. The primary reason for making one was to provide quick methods for checking if buttons were just pressed or released during the current frame. From that though, I also wanted to rein in the inconsistent input APIs. The keyboard, gamepad, and mouse classes all exposed their states in slightly different ways, and I wanted to have one, single flat state for all digital inputs, and all analog inputs. For digital, I combined all inputs exposed by the 3 devices into a single, large enum, and allow the user to query if an input is down, up, just pressed, or just released. This should make it easy to bind controls to any device, or any combination of devices.
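The digital half can be sketched roughly as follows. The enum here is a tiny hypothetical stand-in for the real (much larger) one, and `DigitalInputState`, `BeginFrame`, and `MarkDown` are illustrative names; the actual manager would fill in the state from the XNA keyboard, mouse, and gamepad classes each frame.

```csharp
using System;

// Hypothetical flat enum; the real one would list every key and button of all three devices.
public enum DigitalInput { KeySpace, MouseLeft, PadA, Count }

public sealed class DigitalInputState
{
    private bool[] _previous = new bool[(int)DigitalInput.Count];
    private bool[] _current = new bool[(int)DigitalInput.Count];

    // Call at the start of each frame: the current state becomes the previous one,
    // and the current state is cleared, ready to be filled in again.
    public void BeginFrame()
    {
        bool[] swap = _previous;
        _previous = _current;
        _current = swap;
        Array.Clear(_current, 0, _current.Length);
    }

    // Call once for every input that is held down this frame, from any device.
    public void MarkDown(DigitalInput input) { _current[(int)input] = true; }

    public bool IsDown(DigitalInput input) { return _current[(int)input]; }
    public bool IsUp(DigitalInput input) { return !_current[(int)input]; }

    // Edge queries compare this frame's state with last frame's.
    public bool JustPressed(DigitalInput input)
    {
        return _current[(int)input] && !_previous[(int)input];
    }

    public bool JustReleased(DigitalInput input)
    {
        return !_current[(int)input] && _previous[(int)input];
    }
}
```

Swapping two preallocated arrays instead of allocating new ones each frame keeps the garbage collector out of the picture, which matters on the 360.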

Similarly for analog, I made an enumeration of all axes available, and normalize them to a range of -1.0 to 1.0. The user can then query the current sample, or for the delta from the last sample. This works great for all the gamepad axes, but was slightly awkward for the mouse. To make it work, the input manager recenters the pointer after every mouse sampling. This allows one set of code to correctly handle camera control from either the mouse or the gamepad (albeit with different sensitivities). For the gamepad, camera movement is proportional to the current sample. If the player is holding the stick steady at 1.0, the camera will move at a certain speed, a different speed at 0.5, etc. For the mouse, camera movement is typically handled with the mouse delta, rather than the mouse sample. Recentering the mouse, however, effectively turns the sample into a delta, and we get the expected behavior.

One final challenge was the aspect ratio of the mouse. Normalizing horizontal and vertical axes to -1 to 1 means that the movement along the axes has different magnitudes, as the screen is not square. To deal with this, I provide aspect corrected mouse axes, as well as the non-aspect ones (which are themselves useful to tracking pointer position in view space).
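One way to do the aspect correction, sketched under the assumption that the pointer is recentered every sample so each reading is an offset from the window center. `MouseAxes` and its methods are hypothetical names, and measuring the horizontal axis in units of half the window height is just one reasonable choice, not necessarily the one my manager uses.

```csharp
public static class MouseAxes
{
    // Offset from the window center, scaled so the window edge on that axis maps to +/-1.
    // Returns 0 if the extent is not positive, to avoid dividing by zero.
    public static float Normalized(int offsetFromCenter, int extent)
    {
        if (extent <= 0) return 0.0f;
        return offsetFromCenter / (extent * 0.5f);
    }

    // Aspect-corrected horizontal axis: measured against half the window *height*,
    // so the same pixel distance has the same magnitude on both axes.
    // On a widescreen window this can exceed +/-1 near the left and right edges.
    public static float AspectCorrectedX(int offsetFromCenterX, int height)
    {
        return Normalized(offsetFromCenterX, height);
    }
}
```

On a 1280x720 window, a 360 pixel move reads as 0.5625 horizontally but 1.0 vertically with the plain axes; with the corrected horizontal axis both read 1.0.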

Now I still have the undo/redo stack to implement, then it's finally back to working with 4D things. The level format ought to be interesting, especially if I add spatial partitioning...

## Bleh

Anyone in the Seattle area right now is I'm sure aware of the loathsome heatwave taking place. 90 plus degree temperatures in an area where it's not uncommon to have no AC at all does not foster motivation. I'm not a big fan of the day star to begin with, and record breaking heat is doing little to sway me.

Anyway, the icy wind blasting through the window now is doing much to revitalize me, and I'm continuing to pursue the MultiPropertyGrid. Types that support String in their TypeConverter are editable via a simple text box, while enums are handled with a combo box. User entered strings are validated through the type converter to make sure they're valid, and if they fail the user is notified with a message box and the control's error template is activated. Flags enums might be a little more complicated. Maybe a multi-select listbox. Beyond that, types will need to supply their own WPFTypeEditorBase derived editor control through the Editor attribute. The type is then retrieved through the TypeDescriptor interface and instantiated and data-bound into the property grid.
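The text box validation step can be sketched with the standard `TypeDescriptor` / `TypeConverter` classes from `System.ComponentModel`. `PropertyTextValidator` is a made-up helper name, and using the invariant culture is just a choice for the example; the grid itself may well want the user's culture instead:

```csharp
using System;
using System.ComponentModel;

public static class PropertyTextValidator
{
    // Tries to convert user-entered text to targetType through that type's TypeConverter.
    // On failure, 'error' holds a message suitable for showing to the user.
    public static bool TryParse(Type targetType, string text, out object value, out string error)
    {
        value = null;
        error = null;
        TypeConverter converter = TypeDescriptor.GetConverter(targetType);
        if (!converter.CanConvertFrom(typeof(string)))
        {
            error = "No string conversion available for " + targetType.Name;
            return false;
        }
        try
        {
            value = converter.ConvertFromInvariantString(text);
            return true;
        }
        catch (Exception e)
        {
            // Converters report bad input by throwing; the exact exception type
            // varies between converters, so everything is caught here.
            error = e.Message;
            return false;
        }
    }
}
```

Enums go through the same path, since their converter accepts the member names as strings.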

Once this is out of the way I'll need to handle an undo-redo stack somehow, then maybe I'll finally be able to focus on the level format. At this point it's abundantly clear that there's no way I'll be ready in time for DBP 09, but it's still a useful deadline to keep motivation up.

## Editor Braindump

Almost a week since the last update, so I thought I'd just talk a bit about the editor.

For some reason, writing one feels kind of like how doing homework felt. Well, minus the procrastination anxiety. It definitely has that fatiguing tedium of a project that you just want to put behind you so you can get on to other things; a project that, while technically necessary, doesn't really give you the feeling that you're getting any closer to your goal. Instead, building the boat that will get you across the river that separates you from your goal, the whole time wondering if maybe you could just swim it and skip all the bother.

In this case, "swimming it" would be my crazy (lazy?) idea of trying to make levels out of plain text files. Maybe draw the box with hyphens and pipes and junk, and annotate it with some XML or something. When faced with the prospect of reimplementing a property grid control in WPF, that was sounding really tempting. At first I was cursing WPF for not just including one out of the box, but then realized I would need to customize it anyway, so perhaps it doesn't make much difference. Still though, it feels like for every two steps forward, I take another step back; a net gain, but frustrating nonetheless.

I suppose part of the problem is that such a project is nontrivial in any windowing framework, so it's exposing all the weakest parts of my understanding of WPF, of which there are many, even after reading a rather comprehensive book about it. It is by far the most enormous and complex API I have ever used (hardcore win32 hackers will probably laugh, but hey, we're all bound by the limits of our experience). Not just in terms of the number of classes, methods, properties, events, etc. (seriously though, there's a lot, pull up any FrameworkElement derived class and try to mousewheel through it as fast as you can), but in terms of how those classes interrelate and all the functionality they expose: XAML, Dependency properties, routed events, commands, data binding, retained rendering, measure/arrange layout, templates, styles, triggers...

With all of this power available, I feel a constant nagging doubt that I'm not using it properly, or enough. If I just google a little harder, just follow that next link, maybe a whole new and better way of solving the problem will be revealed. In fact, this actually happened when trying to make an XNA control. I was looking for all kinds of ways to render arbitrary pixel data. One way I read about involved using reflection to crack open a class and force the data through; another involved using an interop bitmap class and some win32 methods. Then almost by accident I stumbled across some posts about WriteableBitmap, which was added in a later version of the framework and did exactly what I wanted, but that I had no clue existed.

I don't want this to seem like I'm coming down on WPF, cause it's really quite good. In particular, the resolution independence is something I've been wanting for a long time, what with poor vision and the constantly increasing resolution of displays over the years. It's just that learning to use it was very intimidating, at least for me. I think I'm finally starting to come to grips with it though, and most importantly I'm learning to relax and just solve the problem, instead of obsessing over the perfectly elegant and proper solution using every single feature to its fullest.

I have an OrthographicViewport control that I'll use to display geometry viewed top-down in the XY, XW, and WY planes, with 3 more in a tab control that will show height with the XZ, YZ, and WZ planes. The views can be scrolled, zoomed, and updated, with custom major/minor grid-lines for each axis, with numbers on the major line. They're all data-bound together so they all use the same list of objects, and sync up their zoom factors and scrollbars (scrolling right on the XY display will also scroll right on the XW and XZ displays, etc.). Next I'm trying to make a MultiPropertyGrid that, given a collection of objects, will show only the properties those objects have in common, and allow you to edit them with custom type editors. I'll post some screenshots when it's working.
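The "properties in common" part of the MultiPropertyGrid idea can be sketched as a set intersection over `TypeDescriptor.GetProperties`. This is only a rough illustration: `CommonProperties` is a hypothetical name, and treating two properties as "the same" when their name and type both match is my assumption about what the grid should do:

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

public static class CommonProperties
{
    // Returns the names of properties that every object exposes with the same property type.
    public static List<string> Find(IEnumerable<object> items)
    {
        Dictionary<string, Type> common = null;
        foreach (object item in items)
        {
            var seen = new Dictionary<string, Type>();
            foreach (PropertyDescriptor p in TypeDescriptor.GetProperties(item))
            {
                seen[p.Name] = p.PropertyType;
            }
            if (common == null) { common = seen; continue; }

            // Keep only the entries the current object also has, with a matching type.
            var kept = new Dictionary<string, Type>();
            foreach (KeyValuePair<string, Type> pair in common)
            {
                Type type;
                if (seen.TryGetValue(pair.Key, out type) && type == pair.Value)
                {
                    kept[pair.Key] = pair.Value;
                }
            }
            common = kept;
        }
        List<string> names = common == null ? new List<string>() : new List<string>(common.Keys);
        names.Sort(StringComparer.Ordinal);
        return names;
    }
}
```

Going through `TypeDescriptor` rather than raw reflection means the same lookup also hands back each property's converter and editor attributes when it's time to build the editing controls.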

Previously on Skipping a D:
> First, before we can improve performance, we have to know where we stand. - ...we then have a full 24,000 intersection checks. - That's a LOT of computation, especially on the 360. - 25 tesseracts. Release build. No debugger. - It was time to use... (zoom in on face) THREADS. - Something kept bugging me though. - I wanted to try and make it lockless to further reduce any blocking. - If it didn't work I could always roll back...
And now the conclusion.

Well, to get started, any thread pool is going to need threads, so we need to set some up:

        // The calling (main) thread doubles as worker 0.
        _Threads[0] = Thread.CurrentThread;
        Thread.SetData(_ThreadIndexStoreSlot, 0);
    #if XBOX
        Thread.CurrentThread.SetProcessorAffinity(XBoxCoreMap[0]);
    #endif
        if (!forceSingleThread)
        {
            for (int i = 1; i < ThreadCount; i++)
            {
                _Threads[i] = new Thread(delegate()
                    {
    #if XBOX
                        Thread.CurrentThread.SetProcessorAffinity(XBoxCoreMap[i]);
    #endif
                        Thread.SetData(_ThreadIndexStoreSlot, i);
                        _TaskInitWaitHandle.Set();
                        ThreadProc();
                    });
                _Threads[i].Start();
                // Block until the new thread has read 'i', before the loop changes it.
                _TaskInitWaitHandle.WaitOne();
            }
        }
    }

_Threads is an array of all the worker threads held by the manager. At the top, we set the first element to be the current thread, set its thread local index value to 0, and set its hardware thread affinity to the first entry in the mapping. XBoxCoreMap is an array holding the indices of the hardware thread slots that the worker threads should use. I define it as { 1, 3, 4, 5 }, because slots 0 and 2 are reserved by the XNA framework itself. So the main thread gets slot 1 (which is actually its default, but I set it explicitly just to be thorough).

After the main thread setup, we have a loop that spins up the other workers. We new up a thread for each entry, giving it a ThreadStart delegate that sets its processor affinity, its thread local index, then signals a wait handle. This wait handle is crucial. As I mentioned in part 1, it keeps the main thread in sync with the worker threads, allowing each one to read the current value of 'i' before the main thread changes it for the next loop iteration. After the wait handle, it drops into the main worker thread procedure, which we'll see later.

Now that we have our workers ready to go, how do we actually go about performing tasks? The DoTasks method is the main entry point for the task manager that other code will call when it wants parallel work done:

    // The list's type argument was lost from this listing; Action (a parameterless delegate) is assumed.
    public void DoTasks(List<Action> tasks)
    {
        if (_Disposing) { throw new InvalidOperationException("Cannot run after disposal."); }
        if (_Running) { throw new InvalidOperationException("Cannot run while already running."); }
        _Running = true;
        _Tasks = tasks;
        try
        {
            if (_ForceSingleThread)
            {
                for (int i = 0; i < tasks.Count; i++) { tasks[i](); }
            }
            else
            {
                _CurrentTaskIndex = 0;
                _WaitingThreadCount = 0;
                // Wake every worker at once, then pitch in from the main thread.
                _ManagerCurrentWaitHandle.Set();
                TaskPump();
                while (_WaitingThreadCount < ThreadCount - 1) { Thread.Sleep(0); }
                if (_Exceptions.Count > 0)
                {
                    DoTasksException e = new DoTasksException(_Exceptions);
                    _Exceptions.Clear();
                    throw e;
                }
            }
        }
        finally
        {
            // Close the handle just used and flip to the other one for the next call.
            _ManagerCurrentWaitHandle.Reset();
            _ManagerCurrentWaitHandle = _ManagerCurrentWaitHandle == _ManagerWaitHandleA
                ? _ManagerWaitHandleB : _ManagerWaitHandleA;
            _Tasks = null;
            _Running = false;
        }
    }

The first thing we have is simple checks to make sure the manager hasn't been disposed of, and that it's not already running (on another thread). Next we mark it as running, and set the internal _Tasks list to the list that got passed in. This list will be read in parallel by the worker threads, so I use a concrete List instead of an IList. With an interface, there's no way of knowing what's actually going on inside the accessors, and if they're actually thread safe, so instead I use a List which is clearly defined as being an array internally. Here we see the _ForceSingleThread flag again. If this is set by the constructor, the task manager will not create any threads at all and will execute all the tasks serially. This is mainly for diagnostic and comparison purposes.

The meat is in the 'else' clause, which prepares to execute the tasks by setting the start index to 0, and the number of finished threads to 0. It then signals a ManualResetEvent that simultaneously activates all the worker threads. This avoids the one-by-one activation of my previous implementation. It then enters TaskPump, which is the method that actually acquires and executes work items. Thus, my other goal of getting the main thread to do work as well is achieved. When it comes back from that, it waits in a loop for the other threads to finish.

Once all threads are finished, we check to see if any threads added an exception to the internal _Exceptions list. If they did, we aggregate them all into a single DoTasksException and throw that back to the caller so they can examine it, or allow it to fall through to a debugger.

Not to be overlooked, the finally block below plays a critical role. First it closes the wait handle we opened earlier to start the workers. Next, it switches the current wait handle that it's using to the other of the 2 defined in the class. This will make more sense when we see ThreadProc:

    void ThreadProc()
    {
        while (true)
        {
            Interlocked.Increment(ref _WaitingThreadCount);
            _ManagerWaitHandleA.WaitOne();
            if (_Disposing) { return; }
            else { TaskPump(); }

            Interlocked.Increment(ref _WaitingThreadCount);
            _ManagerWaitHandleB.WaitOne();
            if (_Disposing) { return; }
            else { TaskPump(); }
        }
    }

This is the loop that all worker threads spend their time in. The first thing they do as they enter is safely increment a value that indicates the current number of threads that are in a waiting state, then wait on wait handle A. At first it seems that the loop repeats the same code twice. The second portion is the same as the first, except it waits on handle B. This plays into the "current wait handle" we saw on the main thread. The first time DoTasks is called, it will signal handle A, while handle B is closed. This means that all workers will run TaskPump, then block on handle B. The main thread then closes handle A and uses handle B next time. Similarly, the workers will move when B is signaled, run TaskPump, then block once more on A. This A/B/A/B pattern makes it simple to tell all workers that they can start running, while at the same time being sure that they'll stop when you want them to. With only a single handle, you would need to wait until all workers are confirmed to be activated, but not wait too long or one worker might finish its work and pass the wait handle again before it closes. As a final note, the _Disposing flag is set when the manager is disposed of and tells the workers to return, thereby ending the thread.

The final piece of the puzzle is TaskPump itself:

    void TaskPump()
    {
        // The list's type argument was lost from this listing; Action is assumed.
        List<Action> tasks = _Tasks;
        int taskCount = tasks.Count;
        while (_CurrentTaskIndex < taskCount)
        {
            int taskIndex = _CurrentTaskIndex;
            // Claim taskIndex only if no other thread has advanced the shared counter.
            if (taskIndex == Interlocked.CompareExchange(ref _CurrentTaskIndex, taskIndex + 1, taskIndex)
                && taskIndex < taskCount)
            {
                try { tasks[taskIndex](); }
                catch (Exception e)
                {
                    lock (_ExceptionsLock) { _Exceptions.Add(new TaskException(tasks[taskIndex], e)); }
                }
            }
        }
    }

In this method, we enter a loop that will continuously execute until the _CurrentTaskIndex indicates that all work items have been fetched. The next bit is the dangerous lockless part. First we make a local copy of the current task index. Next, we compare that local copy to the result of an Interlocked.CompareExchange call. This call will safely store the second argument into the location of the first, but ONLY if the third argument is equal to the first before the replacement, all as an atomic operation. If our local copy of the task index matches the shared copy, then we effectively "claim" that index for the executing thread. If it doesn't match, then another thread has altered the value between when we made our local copy and when we tried to make the check. This means that another thread has claimed the index and we must try again with the new value. If we pass the check, then we use our claimed index to retrieve the corresponding task from the task list, and execute it. If it throws an exception, we catch it, lock the exception list, and add it in for the main thread to sort out later. I don't try to avoid locks with this part, because if task code is throwing exceptions on a regular basis, then something's broken and should be fixed.

With all the pieces in place, it was time for the moment of truth. I hooked everything up, deployed to 360, and...

150 tesseracts.

AWESOME. I was totally stoked. I now had a full 3x improvement over the single threaded version. Maybe this lockless thing wasn't as hard as they said! Of course, several runs later, I was finally punished for my hubris.

A seemingly random crash out of nowhere. I went cold. Thankfully I was currently in debug mode and caught the exception, and I knew I had to try and fix it then and there, since if it was some kind of arcane synchronization bug, I might not be able to get it to happen again for a long time. As it turned out, one of the workers was trying to read a task just past the end of the list. But how? The check in the while loop should catch that right? Wrong.

Let's say 2 threads pass the check in the while loop, but pause before attempting to claim an index. Further let's say one of the threads moves through the compare exchange and grabs one successfully, and that immediately after that, the other thread does the same. The current index counter has now been incremented twice. This is fine and dandy, but what if there was only actually 1 work item left in the list? This is where the "&& taskIndex < taskCount" check that I glossed over in my explanation comes in. After slapping that in, no more crashes (yet).
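For comparison, the same kind of claim can be written without a retry loop by using `Interlocked.Increment`, which hands every caller a distinct value. To be clear, this is a different technique from the compare-exchange loop the manager uses, and `IndexDispenser` is a hypothetical class for illustration only. The point is that the range check is still required, because several threads can run past the end at once:

```csharp
using System.Threading;

public sealed class IndexDispenser
{
    private readonly int _count;
    private int _next; // index of the next unclaimed item

    public IndexDispenser(int count) { _count = count; }

    // Returns true with a unique index while items remain; false once the list is exhausted.
    public bool TryClaim(out int index)
    {
        // Increment returns the new value, so subtract 1 to get the slot this caller reserved.
        index = Interlocked.Increment(ref _next) - 1;
        // Several threads may overshoot the end together, so every claim is range checked.
        return index < _count;
    }
}
```

A worker loop would then be `int i; while (dispenser.TryClaim(out i)) { tasks[i](); }`, with the counter reset (by creating a new dispenser here) between batches.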

Is it foolproof now? Honestly, who knows. It's working great so far, and I'll be continuing to rely on it for the foreseeable future, so we'll see how it goes. In part 1 I promised a bonus, so here's the full code for the manager for those interested:
If anyone tries it out, please let me know how it works out for your program and if it actually gets you more performance.

So my tale of threading comes to an end. Perhaps next time I'll talk a bit about trying to wrangle a 4D editor out of WPF.

These entries are a bit rapid fire so far, but that's mainly the result of having a bunch of backlogged topics that finally added up enough for me to break down and write them. Things will slow down pretty soon.

Switching gears to a more concrete issue this time, I'm going to talk about the performance challenges of these algorithms, and a few things I've done to fight back. This is part 1 of a 2 part entry, because it got pretty long as I was writing it. Part 2 will have a bonus that other XNA developers might hopefully find useful.

First, before we can improve performance, we have to know where we stand. As discussed in the previous entry, a tesseract is composed of 40 tetrahedra, each of which must potentially be sliced to produce a 3D scene. Each slicing consists of 6 intersection checks, the successful of which will yield a linear interpolation factor that we use to generate a 3D vertex from the two 4D vertices of the edge. For a modest scene of say 100 visible tesseracts (meaning none of them are early-outed by lying entirely outside the camera realm) we then have a full 24,000 intersection checks. Many of these will require further linear interpolations to generate the geometry. I'd estimate roughly half on average for visible polychora.

That's a LOT of computation, especially on the 360, which means I was in for a rude awakening, and indeed it came. I want to ideally run at 60fps, so I spawned visible tesseracts in my test app to see how many I could get before dropping below that. Going into this I had read up about the performance pitfalls of XNA, did my best to avoid creating garbage, used ref passing for structs where I could; thought I had all my ducks in a row. Well, I got to... (drum roll)

25.

25 tesseracts. Release build. No debugger. I was... less than encouraged. Was this all a waste? Should I just pack it in and call it quits? My dream was crumbling, the sky was falling, the-

Oops. Logic error... The geometry was being sliced twice per frame instead of once.

50.

Okay, so that's... less slow. You can't scoff at a 2x performance boost right off the bat I suppose. This includes disabling the back-face calculation I mentioned in the previous entry, which bought me about 3 or 4 I think (I really wish I had kept detailed notes of this as it was happening). Even so, I was starting to face harsh reality. The naive approach of just plowing through all the work on the main thread with ears plugged and humming a tune pretending the other cores don't exist was right down the toilet. It was time to step up and get with the times. It was time to use... (zoom in on face) THREADS.

Reading the XNA forums had previously made it clear to me that the standard .NET thread pool was ill-suited for use on the 360 because on that platform, you must manually assign your threads to a hardware thread slot; no scheduler does the work for you. So it was time to go googling again. I eventually came upon a thread pool implementation designed for use on the 360. I didn't feel comfortable just slapping it into my engine though. This is also a learning exercise after all, and parallelism is an interesting topic, so I instead used the code as a guide and wrote my own.

It went pretty well actually. I used a standard .NET queue to store the work items, spun up some worker threads, assigned them to 360 hardware threads, and used an AutoResetEvent to block the workers until the main thread called DoTasks on my task component. When this happened, the main thread would one at a time awaken a worker by signaling the wait handle, waiting for the worker to signal back with another wait handle that it was activated. Once activated, the worker would lock the task queue, dequeue a work item, then run it. Once the main thread saw that there were no more tasks to be dispatched, it would wait for all outstanding tasks to complete, then return.

This wasn't too bad for a first attempt I thought, so I set about to make my game use it. Every polychoron was now a work item, slicing its geometry into a thread local buffer. Once all work items were done, the main thread would render each of the worker thread geometry buffers.

Unsurprisingly, there was a bug. I used a loop to spin up my worker threads, and used the loop index inside a closure delegate to assign each thread a unique index so they can write to their own element of shared data arrays. For some reason though, all the indices were coming out as 2, which was the terminal loop value on my windows build, instead of being 0 and 1 like they should be. As it turns out, C# closures capture the variable itself rather than a copy of its value, and by the time the worker threads were activated, the loop counter had progressed to its final value. As a fix, I added a wait handle to the initialization so the main thread would wait until the worker was done reading the local values before proceeding to the next iteration. I'm suspicious, although not sure (because I haven't actually run it) that the linked implementation might suffer from it as well. If anyone has used it (specifically on the 360) I'd be curious to know how it went, otherwise perhaps I'll investigate at some point.
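The capture behavior is easy to reproduce without any threads at all. The sketch below uses a hypothetical `ClosureCapture` class: the first method captures the `for` loop variable directly, the second copies it into a local declared inside the loop body. That local copy is a common alternative to the wait handle fix described above, not what the task manager does:

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCapture
{
    // Every delegate captures the single loop variable 'i', so once the loop
    // has finished they all return its final value (count).
    public static List<Func<int>> CaptureLoopVariable(int count)
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            result.Add(delegate() { return i; });
        }
        return result;
    }

    // A local declared inside the loop body is a fresh variable on each
    // iteration, so each delegate keeps its own value.
    public static List<Func<int>> CaptureLocalCopy(int count)
    {
        var result = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int index = i;
            result.Add(delegate() { return index; });
        }
        return result;
    }
}
```

With `count` set to 2, invoking the first list's delegates after the loop gives 2 and 2, while the second list gives 0 and 1.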

Anyway, this worked out pretty nicely, and I was quite pleased to see my new performance number:

100.

The clouds are parting now. Starting to get to the point where I consider it adequate, especially if I was willing to settle for 30fps, because then I could reach around 200. I declared a preliminary victory and changed focus to other things, like the editor.

Something kept bugging me though. The 360 was now using 3 hardware threads to do the work instead of only 1, but I was "only" getting about 2x performance. I initially shrugged it off as the fact that 2 of those threads are actually on the same core, but then I thought "why not let the main thread do work too instead of just waiting for the other threads to do it." So I tried to add yet another worker thread and assign it to the same hardware thread as the main thread. Much to my dismay though, this had essentially no impact on performance.

I shrugged it off again, and again went back to the editor. Working on an editor is dry business though, so as a distraction I decided to completely tear down the task manager and try to fix 3 main problems that I suspected. First, activating workers one at a time felt inefficient; there has to be a way to activate them all at once and let them take care of acquiring their own work items. Second, I was still convinced that getting the main thread to consume work items as well would help. Third, I wanted to try and make it lockless to further reduce any blocking. Experts frequently warn against trying that, and for good reason as I'll detail next time, but still I wanted to try, even if just for fun. If it didn't work I could always roll back...

TO BE CONTINUED...

## Geometry Slicing

Before moving on to more platform specific topics, I'll detail the general algorithms in this installment.

The first thing I did was write a pseudo 5x5 matrix class. Rotations, scales, skews, etc. can all be handled by the built in 4x4 matrix provided by XNA, but to be able to combine translation in there with a matrix multiply, you need one more dimension than the space you're working in. As it turns out though, I only actually needed the fifth row, not the fifth column. XNA uses row vectors, so the translation portion of the matrix is in the fifth row, but since I'm not doing any kind of 4D projection, the fifth column is always [0, 0, 0, 0, 1]. I implemented the class to always assume that, and substantially cut down on the number of floating point operations required. This means that it's not actually a "true" matrix though, so I called it AffTrans4D, since its sole purpose is to represent affine transformations. For similar reasons, I didn't implement a Vector5, since the last element would always be 1, so I just kept using XNA's Vector4.
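Here's a small sketch of the "4x4 plus a translation row" idea. `Vec4` and `Affine4` are stand-in types written for this example (the real class uses XNA's Matrix and Vector4, and its layout may differ); the row-vector convention matches the description above:

```csharp
public struct Vec4
{
    public float X, Y, Z, W;
    public Vec4(float x, float y, float z, float w) { X = x; Y = y; Z = z; W = w; }
}

// Hypothetical stand-in for AffTrans4D: a 4x4 linear part plus the fifth (translation) row.
public sealed class Affine4
{
    public readonly float[,] M; // rotation / scale / skew, row-vector convention
    public readonly Vec4 T;     // translation row

    public Affine4(float[,] linear, Vec4 translation) { M = linear; T = translation; }

    // v' = v * M + T. The implied fifth column [0, 0, 0, 0, 1] never has to be multiplied.
    public Vec4 Transform(Vec4 v)
    {
        return new Vec4(
            v.X * M[0, 0] + v.Y * M[1, 0] + v.Z * M[2, 0] + v.W * M[3, 0] + T.X,
            v.X * M[0, 1] + v.Y * M[1, 1] + v.Z * M[2, 1] + v.W * M[3, 1] + T.Y,
            v.X * M[0, 2] + v.Y * M[1, 2] + v.Z * M[2, 2] + v.W * M[3, 2] + T.Z,
            v.X * M[0, 3] + v.Y * M[1, 3] + v.Z * M[2, 3] + v.W * M[3, 3] + T.W);
    }

    // Builds the transform equivalent to applying 'a' first, then 'b'.
    public static Affine4 Concat(Affine4 a, Affine4 b)
    {
        var m = new float[4, 4];
        for (int r = 0; r < 4; r++)
            for (int c = 0; c < 4; c++)
                for (int k = 0; k < 4; k++)
                    m[r, c] += a.M[r, k] * b.M[k, c];
        // Combined translation is a.T * b.M + b.T, which is just b applied to a.T.
        return new Affine4(m, b.Transform(a.T));
    }
}
```

Compared with a full 5x5 multiply (125 multiplications), the concatenation here needs 64 for the linear part plus 16 for the translation row.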

With that out of the way (complete with vigorous unit testing to make sure it actually worked right), it was time to actually make some geometry. This is where it starts to get weird. In 3D, the basic unit of geometry is the triangle. Some combination of these can form 2D faces, which in turn make up a 3D object. In 4D, the situation is similar. The basic unit of 4D geometry is the tetrahedron, some combination of which makes up a 3D "Cell," which are the basic facets of a 4D shape (also called a polychoron). As an example, the tesseract (or hypercube), is composed of 8 cubic cells, two capping the extremes of each axis. This is surprising at first, to realize that the 4D analog of the cube contains a full 8 of its 3D counterparts, but it gets worse...

In 3D, perhaps the most common arrangement of triangles into a face is the quad, which is a mere 2 triangles. A 3D cube therefore is built from 12 triangles, 2 for each face. Unfortunately, the situation isn't so simple for the tesseract. Subdividing a cube into tetrahedrons yields more primitives than subdividing a square into triangles:

We have not 2, but 5 primitives within. A tesseract is consequently an arrangement of 40 tetrahedrons, each of which itself has 4 triangles. So going strictly by triangle composition, a tesseract is roughly 13 times as much geometry. Many of those triangles are shared by cells, just as edges are shared by faces, but this is just an illustration of the kind of complexity explosion that can happen even with simple geometry. Curved and sphere-like shapes are even worse, but that's another discussion.

Anyway, we have all this geometry, but what do we do with it? To render it, we need a 3D slice that we can add to the scene. This part was intimidating at first, but the algorithm actually isn't too bad. Getting a slice of an object is the same as the sum of the slices of its component tetrahedra. To slice a tetrahedron, we must examine each of its 6 edges to find if they cross the camera realm, and use those intersections to construct 3D geometry. This can be made much easier by first doing a sort of "view" transform on all the 4D geometry such that the camera realm aligns with W = 0. Once this is done, we can easily check for intersections if the two endpoints of an edge have opposing signs for their W component. If they do, a nice quick linear interpolation can give us the actual intersection point.
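The per-edge check can be sketched like this. `EdgeSlice` is a hypothetical helper that uses plain float arrays `{x, y, z, w}` instead of XNA's Vector4, and counting a W of exactly zero as positive is an assumption made to keep the sign test simple:

```csharp
public static class EdgeSlice
{
    // Tests the edge a-b (already in view space) against the realm W = 0.
    // On a crossing, writes the 3D intersection point to (x, y, z) and returns true.
    public static bool TryIntersect(float[] a, float[] b, out float x, out float y, out float z)
    {
        x = y = z = 0f;
        float wa = a[3], wb = b[3];
        // Same side of the realm: no crossing. (W == 0 is treated as positive.)
        if ((wa < 0f) == (wb < 0f)) { return false; }

        // Interpolation factor where W reaches zero; wa - wb cannot be zero here.
        float t = wa / (wa - wb);
        x = a[0] + (b[0] - a[0]) * t;
        y = a[1] + (b[1] - a[1]) * t;
        z = a[2] + (b[2] - a[2]) * t;
        return true;
    }
}
```

For example, the edge from (0, 0, 0, -1) to (2, 4, 6, 1) crosses the realm halfway along, at (1, 2, 3).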

That's all fine and good, but what do we do once we have all the intersection points? It helps to first think about what kinds of cross-section a tetrahedron yields. If there are 0, 1, or 2 intersection points, we can discard it completely, as that means it's barely touching the camera realm and would yield at most a line segment. If there are 3 intersections, that means the slice is a triangle, which is useful to us and can be shipped off to the rendering system. If there are a full 6 intersections, then the entire tetrahedron is parallel with the camera realm, and we could choose to either draw the whole thing, or discard the whole thing. I chose to discard it, since a fully parallel tetrahedron means the 4D shape is lying cell-wise on the camera realm, barely intersecting it. The most interesting case is if there are 4 intersections, because this forms a quad. So what? Just draw a quad! Well, the rub is that if you just have a set of 4 points, how exactly do you arrange them into the proper quad? In other words, which edge is the common edge?

My first take on this was to just draw all 4 possible quads, since one of them will be correct, and that one will contain all the incorrect ones, so you wouldn't actually see the overdraw. This felt sloppy though, so I was considering all kinds of cross product/normal comparison trickery to figure it out. Then I realized that the reason I didn't know which quad was correct was that I was doing the intersections in an arbitrary order, and that if I did the checks in a very specific order, I could be sure which was the right one:

As can be seen here, if we number the edges thusly and do the checks in that order, any quad slice we get will come out in triangle-strip order, so we can be sure which way we need to construct the triangles, at no additional performance cost at all!
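As a rough sketch of the whole per-tetrahedron step: the edge order used below (0-1, 0-2, 0-3, 1-2, 1-3, 2-3) is one ordering that has the strip property, chosen for this example and not necessarily the numbering from my diagram. It works because opposite edges sit at mirrored positions in the list, so when 4 edges cross, the first and last hits are opposite corners of the quad and so are the middle two. `TetraSlicer` is a hypothetical name, and the per-call allocations are for clarity only (real 360 code would reuse buffers):

```csharp
using System.Collections.Generic;

public static class TetraSlicer
{
    // Edge check order as vertex index pairs; opposite edges are mirrored (0-1/2-3, 0-2/1-3, 0-3/1-2).
    static readonly int[,] Edges = { { 0, 1 }, { 0, 2 }, { 0, 3 }, { 1, 2 }, { 1, 3 }, { 2, 3 } };

    // verts: 4 view-space vertices, each {x, y, z, w}.
    // Appends the slice's triangles (3 points each, as {x, y, z}) to output and
    // returns how many triangles were added: 0, 1, or 2.
    public static int Slice(float[][] verts, List<float[]> output)
    {
        var hits = new List<float[]>(4);
        for (int e = 0; e < 6; e++)
        {
            float[] a = verts[Edges[e, 0]], b = verts[Edges[e, 1]];
            float wa = a[3], wb = b[3];
            // With this strict sign test (W == 0 counts as positive) only 0, 3, or 4 hits occur.
            if ((wa < 0f) == (wb < 0f)) { continue; }
            float t = wa / (wa - wb);
            hits.Add(new float[] { a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t, a[2] + (b[2] - a[2]) * t });
        }

        if (hits.Count == 3)
        {
            output.Add(hits[0]); output.Add(hits[1]); output.Add(hits[2]);
            return 1;
        }
        if (hits.Count == 4)
        {
            // Strip order: triangles (0, 1, 2) and (1, 2, 3) share the 1-2 diagonal.
            // Winding is not made consistent here, which matches drawing with culling off.
            output.Add(hits[0]); output.Add(hits[1]); output.Add(hits[2]);
            output.Add(hits[1]); output.Add(hits[2]); output.Add(hits[3]);
            return 2;
        }
        return 0;
    }
}
```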

This is lucky, because computing a 4D normal is about 4 times as costly as a 3D one. In 3D, if we have 2 vectors, we can take their cross-product and get a 3rd which is a normal to the surface containing those 2 vectors. But what to do in 4D? As it turns out, Cross4 requires 3 input vectors, yielding one result. It took some rather desperate googling to find the answer, but in the end the solution is to take the determinant of a 4x4 matrix with the top row as the components of the answer, and the lower 3 rows as your argument vectors. This is a rather heavy op though, so I try to avoid it as much as possible. The one place where I couldn't avoid it though was when trying to do proper face-culling on the gpu. The 3D slices were working, but I couldn't for the life of me think of an efficient way to figure out which side of the slice is the "front" without taking the 4D normal of the tetrahedron, projecting that into 3D, and using that to determine the front face. This makes the slicing algorithm about twice as complicated though, so for now I've simply decided to disable face culling and live with the overdraw. I'm so massively CPU bound though that it's not a huge issue yet. Still though, if anyone has any ideas about this I'd love to hear them.
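Written out, that determinant is a cofactor expansion along the top row, which gives four 3x3 determinants with alternating signs. A sketch using plain float arrays `{x, y, z, w}` (the `Cross4D` class is a hypothetical name, and the overall sign of the result depends on the handedness and argument order you pick):

```csharp
public static class Cross4D
{
    // Determinant of the 3x3 matrix [a b c; d e f; g h i].
    static float Det3(float a, float b, float c, float d, float e, float f, float g, float h, float i)
    {
        return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
    }

    // Expands the 4x4 determinant with rows [ex ey ez ew], u, v, w along its top row.
    // The result is perpendicular to all three inputs.
    public static float[] Cross(float[] u, float[] v, float[] w)
    {
        return new float[]
        {
             Det3(u[1], u[2], u[3], v[1], v[2], v[3], w[1], w[2], w[3]),
            -Det3(u[0], u[2], u[3], v[0], v[2], v[3], w[0], w[2], w[3]),
             Det3(u[0], u[1], u[3], v[0], v[1], v[3], w[0], w[1], w[3]),
            -Det3(u[0], u[1], u[2], v[0], v[1], v[2], w[0], w[1], w[2])
        };
    }
}
```

That's four 3x3 determinants where the 3D cross product needs three 2x2 ones, which is where the extra cost comes from.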

I think that about covers the main algorithms. If anyone has any questions or I forgot to explain something, feel free to ask. The next entry will be more about what I did to improve performance on the 360.

## First Entry

This is the first entry, but I've actually been working on the game for about 6 months. It didn't occur to me to write about it until recently, and even then I waffled back and forth about actually doing it. I didn't want to use some generic blog site, so I just took the plunge and got GDNet+. I suppose it's only fair, since I've been visiting the site for who knows how long, but it has to be close to 10 years. I'm kind of kicking myself for not actually registering back then, as it would be cool to have a 1999 join date.

Anyway, back to the game...

I've always found the concept of 4D space fascinating, and wanted to explore it in an interactive way. The only other games I could find that did this were simplistic browser applets or dry mathy concept demos. My driving goal for this is to make 4D accessible and, hopefully, fun. I myself have no idea how this game will eventually play or look in full motion, but I REALLY want to find out.

I chose a simple ball-rolling gameplay mechanic because it's familiar, not too complicated, and puts the emphasis on deftly traversing the world. Even just visualizing 4D space is hard, to say nothing of navigating it, so I want that to be the primary focus of the game. Most of the player skill involved will likely be manipulating the 4D "camera" to obtain a desired 3D view of the world, enabling the player to roll their glome to the goal.

The 4D camera is confusing, so I'll try to clarify. In 3D space, we have the standard axes X, Y, and Z. A combination of distances along these axes can locate any point in 3D space. In 4D, we add an additional axis, W. Now to identify any point, we require a 4-element vector. These are thankfully provided by most 3D frameworks, thanks to the use of homogeneous coordinates in 3D rendering. That's where luck runs out though, as I had to write my own 5x5 matrix class to handle 4D transforms, as well as figure out how to do a 4D cross product, which was a bit of an adventure in itself.

When visualizing 4D objects, there are a number of approaches:

The first 2 techniques are analogous to the techniques used for 3D cameras to generate a 2D image. Perspective in particular is great for that because our brains are adapted to interpret perspective-projected 2D images into 3D information. However, this completely breaks down in 4D, since we can't rely on any kind of built-in neural circuitry to do it for us. For this reason, I've chosen to use simple cross-sections. This will allow the player to more easily interpret at least a portion of the game world, instead of being overwhelmed with trying to disentangle twice-projected objects.

Basically, the 4D camera is a 3D hyperplane (or realm) that bisects the 4D world. Where 4D objects intersect with this realm, they generate 3D cross-sections (or slices, as I refer to them in the code). Slices then populate a dynamically generated 3D scene that is presented to the player with a traditional perspective camera. Generating these slices was actually an interesting process that I'll detail later.
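As a rough picture of what slicing involves, here is a minimal Python sketch of one possible approach. It is not necessarily how the game does it, and the function names are hypothetical. The realm is described by an assumed `normal` and `offset` (the points `p` where `dot(normal, p) == offset`). Every tesseract edge whose endpoints lie on opposite sides of the realm contributes one point to the slice, found by linear interpolation.

```python
from itertools import product


def tesseract(half=1.0):
    """Vertices and edges of an axis-aligned tesseract centred on the origin."""
    verts = [tuple(half * s for s in signs) for signs in product((-1, 1), repeat=4)]
    # Two vertices share an edge when they differ in exactly one coordinate.
    edges = [(i, j) for i in range(16) for j in range(i + 1, 16)
             if sum(a != b for a, b in zip(verts[i], verts[j])) == 1]
    return verts, edges


def slice_points(verts, edges, normal, offset):
    """4D points where edges cross the realm dot(normal, p) == offset."""
    dist = [sum(n * c for n, c in zip(normal, v)) - offset for v in verts]
    points = []
    for i, j in edges:
        di, dj = dist[i], dist[j]
        # Simplification: only strict crossings count; vertices lying
        # exactly in the realm are ignored.
        if di * dj < 0:
            t = di / (di - dj)
            points.append(tuple(a + t * (b - a) for a, b in zip(verts[i], verts[j])))
    return points
```

Slicing the default tesseract with `normal=(0, 0, 0, 1)` and `offset=0` yields the eight corners of an ordinary cube. A full slicer would still have to express these points in the realm's own 3D basis and build triangles from them, which this sketch leaves out.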

This basic foundation is already complete, and my test program can render a sea of randomly generated, sliced tesseracts:

I generate so many mainly as a performance indicator, especially for gauging performance on the 360, which, as most XNA users can tell you, leaves a bit to be desired with regard to floating-point performance. Even so, I think I managed to get decent numbers, and I'll discuss that later as well.

Still, just cross-sections won't be enough to communicate to the player information about the extra dimension. For this I will attempt to use other forms of feedback, both visual and otherwise. Another perk of the abstract marble roller gameplay concept is that I'm free to use things like the color and shading of an object to communicate information, rather than being constrained by trying to reproduce "real" objects. For example, the shaders I'm using for lighting actually compute 4D light. This means that if two surfaces appear parallel in the view realm, but are shaded differently, then they are sloped differently along the W axis. I plan to try and set traps for the player in this way, so that they have to be careful to observe the full nature of the objects they're rolling across. Another idea is to make the character "afraid of heights" so that the controller rumbles when the player gets close to an edge, extradimensional or otherwise.
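The shading trick can be illustrated with a small CPU-side Python sketch. This is not the actual shader: it assumes plain Lambert (diffuse) lighting, and `normalize` and `lambert4` are hypothetical names. The point is that the dot product runs over all four components, so two normals that look identical inside the realm can still produce different brightness.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambert4(normal, to_light):
    """Diffuse term from a 4D surface normal and a 4D direction to the light."""
    n = normalize(normal)
    l = normalize(to_light)
    return max(0.0, sum(a * b for a, b in zip(n, l)))


# Both normals point straight "up" (+Y) as far as the 3D view can tell,
# but the second one also leans along W.
flat = (0.0, 1.0, 0.0, 0.0)
tilted = (0.0, 1.0, 0.0, 1.0)
to_light = (0.0, 1.0, 0.0, 1.0)  # assumed light direction, partly along W

flat_brightness = lambert4(flat, to_light)
tilted_brightness = lambert4(tilted, to_light)
```

Here the tilted surface faces the light head-on and comes out fully lit, while the flat one is noticeably dimmer (about 0.71), even though both would look parallel in the slice.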

So that's the general overview of what I'm doing. This first entry feels a bit scattered, because I wasn't quite sure where to start, but I suppose that's the price for waiting so long. I have more specific topics in mind for future entries though, some of which might be useful to other XNA devs.