More Than A Lego
engine design, GUI, render batching, overview
Okay, I spent two minutes and then decided I'd start from what I'm working on right now (render batching, which will be the in-depth topic of the next entry). However, first I found it somewhat necessary to do something that I don't really believe in (when one is working without a roadmap, that is): I decided to jot down a graphical outline of my engine so far to get an overview of all the major components that are partially present at this stage.
Before moving on to that, however, I'll try to lay down some objectives for the engine itself. These criteria are relatively lax and don't necessarily have much to do with reality, in that they don't really mean anything: at present there simply is no measurable game-world size, triangle count, fill rate or other metric at hand, which makes assumptions about a target platform kind of moot. If anything, it has to run smoothly on my laptop (GT 640M, 1.2 GHz i3). These broad-stroke objectives are:
Target: OpenGL 3.0, N-core, no physics, 2D AI, 60 Hz
Desired: automatic scaling down to single core, GL 2.0
GL 3.0 promoted to core what I consider to be one of the most useful features on GPUs below GL 4.0: transform feedback. Since it makes it possible to process copious amounts of vertex data on the GPU much as one would on the CPU, that alone can be considered a guiding factor for the target setup.
To start off - I hate graphs. Unlike maps they're useless if not done correctly. Moreover, unlike maps (which usually follow a preset and well-defined topography), graphs are free-form. If you do them wrong, they do more harm than good. Also, flowcharts never tell the whole story, because they're directional. Very few software titles are uniformly directional in terms of operation - most follow a variety of feedback loops that create complex flow paths that turn a Monet into a Jackson Pollock faster than a child turns a cone of ice cream into "Mommy, can I have another one?".
Anyway - I hope that the layout I came up with isn't too confusing and that the overall flow is at least somewhat reflected by all those pesky arrows. Far from everything is present in it, though.
In general the entire framework is divided into two practical components - the game itself and the editor - and two logical components that do a lot of virtual work, but at the end of the day accomplish little more than safe and efficient marshaling (including clustering, translation and checking) of messages between the two application components and the OS/API. These logical components are the Driver (a complete misnomer, since I haven't had the heart to rename it after it started out as a wrapper for an API-oblivious graphics pipeline) and GSL (my own little GUI system), which is a fairly new thing I'm working on, but performs a number of important tasks - particularly from the editor's point of view. I'll leave both of these for future discussion, however, as both are ever-evolving and far from "complete" in any meaningful sense.
The remaining components are:
- a diagnostics component, which handles profiling and memory tracking and collates all information about the data that passes through the Driver interface, exposing it as a topmost GUI layer
- a custom paged memory manager and a guard-based profiler and call tracer
- the graphics device library (e.g. GL/D3D/software renderer; currently focusing on GL only)
- the audio component, which is currently quite measly and is based on DirectSound
- a thread-safe exception net that is currently Windows-specific, but displays extended information when an exception is thrown
- and the game "engine" itself, which is essentially a data-driven pipeline that connects files on the disk with your gaming experience
My current efforts, as noted above, are going towards devising a robust, efficient and scalable way of doing render batch scheduling. Since it has always been my idea to settle for the simplest possible scene graph - a regular grid[1] with possibly one or two levels of child nodes (e.g. a regular grid of octrees) - I need to look at the problem the other way around and figure out how to do culling and visibility testing at run-time on a data structure that's fairly flat in terms of optimizations.
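To make the "flat grid" idea a bit more concrete, here's a rough sketch - names, the cell size and the whole interface are made up for illustration, none of this is actual engine code - of the first step such a scheduler would take: mapping the view volume's AABB onto the cells of a regular grid, so that only those cells get culled and batched further.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical sketch: given the 2D footprint of the view volume's AABB,
// collect the indices of every grid cell it overlaps. Anything outside
// these cells can be skipped before any finer visibility work happens.
struct CellIndex { int x, y; };

std::vector<CellIndex> cellsOverlapping(float minX, float minY,
                                        float maxX, float maxY,
                                        float cellSize)
{
    std::vector<CellIndex> out;
    // Convert world-space extents to inclusive cell coordinates.
    int x0 = (int)std::floor(minX / cellSize);
    int y0 = (int)std::floor(minY / cellSize);
    int x1 = (int)std::floor(maxX / cellSize);
    int y1 = (int)std::floor(maxY / cellSize);
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            out.push_back({x, y});
    return out;
}
```

Each returned cell would then hand its contents (or its child octree, in the two-level variant) to the culling and batching stages.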
Since my realistic target environments are low-poly worlds[2], I'm not particularly worried about overdraw or visibility culling; collision, however, is a considerably more concerning issue, which in my case needs to be done per game-world polygon (i.e. without artist-defined collision meshes). Artist-defined collision will still be there for everything but the game world itself. Anyway - as I'm still working on hull computation and intersection tests, that's a topic for another time.
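For the sake of illustration, per-polygon collision against raw world geometry ultimately bottoms out in a primitive test like the classic Möller-Trumbore ray/triangle intersection. The sketch below is just that textbook test (the Vec3 helpers and epsilon are my own assumptions, not engine code) - the real work, as noted, is in the surrounding hull computation:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b)   { return {a.x-b.x, a.y-b.y, a.z-b.z}; }
static Vec3  cross(Vec3 a, Vec3 b) { return {a.y*b.z-a.z*b.y, a.z*b.x-a.x*b.z, a.x*b.y-a.y*b.x}; }
static float dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Moller-Trumbore: returns true and the distance t along the ray
// (orig, dir) if it hits triangle v0-v1-v2.
bool rayTriangle(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float& t)
{
    const float eps = 1e-6f;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;   // ray parallel to triangle
    float inv = 1.0f / det;
    Vec3 s = sub(orig, v0);
    float u = dot(s, p) * inv;                // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;              // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * inv;
    return t > eps;                           // hit must lie in front of the ray
}
```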
What everything inside the blue box adds up to, however, is kind of a clusterlol, for two reasons:
1) it's a huge collection of interconnected puzzles that need to fit in place like magic; working all of this out is hard, daunting and fun at the same time
2) instead of tackling the traditional problems of working out visibility, collision and batching on a statically optimized data structure (e.g. a BSP tree), I want to accomplish the same with streamable geometry. "Yes, but any open-world RPG out there already does that," you might say. True, but those guys probably know what they're doing. Or so I'm told (well, I personally do find Skyrim's loading times truly impressive). In any case - doing this for indoor scenes requires radical rethinking, which starts with the editor.
With that aside, I'd like to outline a few more goals, some of which have already been partially implemented and some of which are just something I want to do:
- minimal artist intervention: the level editor is the artist's only tool. As such it needs to be up to par. I already have the "geometry part" up and running to a very large extent (CSG, mesh operations, etc), but the editor truly is the most complex (not really "complicated", but complex) component in the mix
- zero turnaround: the entire objective is to use a data management scheme that works in small chunks and can literally be "streamed" from the disk in real-time, both for in-game rendering and for editing. Consequently, my objective is zero-latency compilation. The truth is that lightmaps cannot always be calculated interactively (although I'd say one can get pretty close on a multicore system), and the PVS is something I haven't even looked into yet. Hence, both can be expected to take a fair amount of offline computational time, which is directly at odds with my goals. It's 2013 - there has to be a better way.
- WYSIWYG editing: most professional engines offer this provisionally - that is, you can enter the game world, walk around and do stuff, but you still need to perform a final build of the level; my objective is to remove this build step, or at the very least hide it
- a good look-ahead system: the whole point of a streamable world is keeping the right components loaded at the right time. In theory this sounds simpler for open-world RPGs (in practice it's much, much more costly, especially considering that terrain is much heavier on raw geometry than a well-constructed indoor scene); nevertheless, I consider a good look-ahead/caching system vital
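The look-ahead goal above can be sketched in a few lines - again, everything here (names, chunk size, the fixed-step sampling) is an assumption of mine, not a description of the actual system. The idea is simply to extrapolate the camera along its velocity and collect the grid chunks it will cross over the next few seconds, so they can be prefetched before they're needed:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Chunk {
    int x, y;
    bool operator==(const Chunk& o) const { return x == o.x && y == o.y; }
};

// Sample the extrapolated path pos + vel*t at `steps` points over the
// next `seconds`, returning the sequence of distinct chunks crossed.
std::vector<Chunk> lookAhead(float px, float py, float vx, float vy,
                             float seconds, float chunkSize, int steps)
{
    std::vector<Chunk> out;
    for (int i = 0; i <= steps; ++i) {
        float t = seconds * i / steps;
        Chunk c { (int)std::floor((px + vx * t) / chunkSize),
                  (int)std::floor((py + vy * t) / chunkSize) };
        if (out.empty() || !(out.back() == c))  // skip consecutive repeats
            out.push_back(c);
    }
    return out;
}
```

A real version would weight the prediction (turning players invalidate straight-line extrapolation quickly) and feed the result into whatever pins chunks in the residency cache.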
[1] "Huh?" Exactly!
[2] I'd like to expand on this, because there's actually a very valid point to be made: a number of things I bring up here and possibly later have to do with relatively high-end optimizations for when the computer's resources have been used up and the system is truly bogged down (like instancing for CPU bottlenecks, or software rendering to perform rough visibility checks - stuff one can find in a real game that has to spit out a truly majestic amount of geometry and still perform well). In reality all this is laughably at odds with environments that have a few hundred to a few thousand polygons (as are the ones I'm aiming for). I'd like to stress that that's not the point. My aim here is to go that extra mile, probably test it in the lab, and act like I'm going to take full advantage of it one day. For now it's make-believe; magic; a substitute for going back to working on a 486 and trying to get by with less than 1% of the resources we have at our disposal today. Why on earth would anyone do that? I'm not sure. Because I can. And because doing stuff the hard way makes you a better person - most of the time. Also - because that's how you learn.
Image: the Twelve Apostles in Australia, Feb 2013. Quite the enriching experience.