Viability of a 3D tiling engine

I am currently working on a small game which takes place in a semi-urbanized setting; here are a few reference photos to give you a feel for the kind of environment:
http://koobazaur.com...pics/city01.jpg
http://koobazaur.com...cityscape04.jpg
http://koobazaur.com...tystreets05.jpg
http://koobazaur.com...tystreets08.jpg

Of course, the first thing that daunted me is the sheer number of house meshes I will need, the amount of effort in making them and, worse, the eventual sense of "repetition" as you walk down the streets. And so I thought - why not generate the cities using tiles? Most of the existing 3D tile engines I've seen (and there aren't many) use really big tiles: NWN and Dungeon Siege both define a tile as an entire house or structure. But I wanted finer detail than that - houses themselves made of different tiles, arranged according to a tile-based "floor plan." A throwback to old isometric games, but in full 3D!
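
To make that a bit more concrete, here is a minimal sketch of the kind of layout structure I have in mind (the names and numbers are made up for illustration, not code from my prototype): the world is just a 3D grid of cells, and each cell references a shared tile mesh plus a rotation.

[code]
// Minimal sketch of a grid-based tile layout (hypothetical names).
// Each cell stores an index into a shared tile-mesh library plus a rotation,
// so the layout of a whole district is only a few megabytes at most.
#include <cstdint>
#include <vector>

struct TileCell {
    uint16_t tileId;    // index into the shared tile-mesh library (0 = empty)
    uint8_t  rotation;  // 0..3 quarter-turns around the vertical axis
    uint8_t  flags;     // e.g. inside / outside / transition
};

struct TileLayout {
    int width, height, depth;        // grid dimensions, in tiles
    std::vector<TileCell> cells;     // width * height * depth entries

    TileCell& at(int x, int y, int z) {
        return cells[(y * depth + z) * width + x];
    }
};

// A 256 x 8 x 256 district at 4 bytes per cell is exactly 2 MB of layout
// data, which is why streaming the layout itself is nearly free.
[/code]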

Pros:
- Easy and fast world creation
- Randomization and procedural generation
- Easy fading away / hiding "upper floors" to not obstruct player's view
- Easy to partition world into "inside" and "outside" tiles with "transition" tiles like doors or windows
- Low memory footprint for the layouts, meaning a potentially huge world, or even single-threaded, loadless streaming that takes less than a frame!

Cons:
- Purely procedural generation is too random (looks unrealistic!) - needs a lot of restrictions
- "Blocky" design with everything orthogonal to each other. Diagonal tiles are not enough to break this monotony.
- Making more complex patterns requires specialized tiles, and sometimes a series of specialized tiles next to each other - cumbersome
- Poor handling of gradual vertical differences (want to build a house on an incline? forget about it!)

Regardless, I went ahead and started coding. Here is what my current test implementation looks like, with randomized tiles:
http://koobazaur.com...e_tile_test.jpg

Now, just as I reached the point where I desperately needed to create more tiles for more variety, I started to doubt the viability of the tile engine, mainly due to technical issues. Consider that the typical 4-story house I tested with has a 3x3 footprint, meaning it's made of 3 * 4 * 3 = 36 tiles, plus a roof. Rounding up for extra details, a single house comes to roughly 40 tiles = 40 meshes. Multiply that by a few tens of houses, and rendering all that is no small feat! A scene such as this:

http://koobazaur.com..._tile_test2.jpg

was enough to drop my framerate to 20 fps, and that's without any other objects to render, and no real physics, AI or even game logic. I've spent the past few days experimenting and researching (what little material there was), and realized that a 3D tile engine would necessarily suffer from the following technical issues:

No LODing
- Can't "LOD away" tiles, if you see 40, you have to render 40
- With a Single-Mesh house, it can be LODed to a single simple mesh with a single shader/material, thus drawn with just one draw call (not "40"), or taking tiny space in a dynamic batch buffer
- Unless you make lo-poly versions of each house, which defeats the point of tiling
- LODing individual tiles not beneficial since they are usually already fairly low poly
Low-poly tiles
- With so many tiles to draw, the only viable techniques are instancing or dynamic batching (see the sketch after this list)
- Instancing - requires low-poly meshes (NVidia 2006 GDC talk: "no more than 100")
- Dynamic batching - since you keep rebuilding the batch, it needs to be low-poly due to the sheer number of tile copies (remember, each mesh is copied as many times as there are tiles that use it!)
Low-res textures
- With so many tiles, you end up with a lot of texture and shader switches
- You have to pack several textures into a bigger atlas => the resulting tile textures are fairly low-res
Instance data building overhead and poor culling performance
- If you see 30 houses, that is 30 * 40 = 1200 instances, each needing its own instance data (instead of just 30 with single-mesh houses)!
- In my benchmark, with around 50 houses, I reached a point where, even seeing only half of them, it took longer to build the instance data than to render it!
- Meaning: if you only see half of all your tiles, it's faster to just render all of them than to actually cull them - culling is not efficient
No efficient caching method
- Impractical to cache per-frame instance data - due to the fine granularity of the world, even small camera movements lead to heavy fragmentation of the cached data
- Even if you did cache it, you would need a separate cache for every single mesh type (since with instancing you can't group multiple meshes in a single vertex buffer)
- Impractical to use one huge static buffer: huge overhead from overlapping vertices at tile corners that you can't weld away because of differing UV coords
- Even if you did use one huge static buffer, the many texture/shader switches caused by the number of tiles would hurt performance even when you are only seeing a small subset of them (i.e. staring away from the city center)
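
For reference, this is roughly what the D3D9 stream-instancing path mentioned above looks like (a simplified sketch with hypothetical names, not my actual code; the vertex declaration also needs extra per-instance elements for the matrix rows, which I've omitted):

[code]
// Rough sketch of D3D9 hardware instancing for one tile type: stream 0
// carries the shared tile mesh, stream 1 carries one world matrix per
// instance, and a single draw call emits every visible copy of that tile.
#include <d3d9.h>
#include <d3dx9.h>

void DrawTileTypeInstanced(IDirect3DDevice9* dev,
                           IDirect3DVertexBuffer9* tileVB,     // shared tile geometry
                           IDirect3DIndexBuffer9*  tileIB,
                           UINT vertexCount, UINT triCount, UINT vertexStride,
                           IDirect3DVertexBuffer9* instanceVB, // per-instance matrices, refilled each frame
                           UINT instanceCount)
{
    dev->SetStreamSource(0, tileVB, 0, vertexStride);
    dev->SetStreamSourceFreq(0, D3DSTREAMSOURCE_INDEXEDDATA | instanceCount);
    dev->SetStreamSource(1, instanceVB, 0, sizeof(D3DXMATRIX));
    dev->SetStreamSourceFreq(1, D3DSTREAMSOURCE_INSTANCEDATA | 1u);
    dev->SetIndices(tileIB);
    dev->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, vertexCount, 0, triCount);

    // Reset the frequencies so ordinary (non-instanced) draws are unaffected.
    dev->SetStreamSourceFreq(0, 1);
    dev->SetStreamSourceFreq(1, 1);
}
[/code]

The draw call itself is cheap; the pain points listed above are the per-frame refill of the instance buffer and the sheer number of per-tile cull tests feeding it.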


Conclusion
And so, the only viable games are steep-angle isometric ones (RPGs or RTSes with a limited camera). Think NWN, Dungeon Siege, StarCraft 2, etc.
- Try to lower the camera angle or go third-person, and the poor culling and lack of viable caching will destroy your framerate
- Try to go FPS, and the player will quickly notice the low-poly tiles and low-res textures
- Try to make a racing game and enjoy your 90 degree turns everywhere.

These are my conclusions based on quite a bit of prototyping meshed with thought-experiments. I figured I'd share my findings and see if anyone else here has had any other experiences, thoughts, or even solutions!
Well, I don't have much experience in 3D, but why can't you just model each house with tiles, and then automatically convert them to single meshes when loading?
Yes, but in that case it's not really a tile engine anymore. And, as I said, using one big static buffer (which is what this would effectively be, just split between individual houses) would waste a lot of memory due to overlapping verts. Plus, it takes away the ability to "fade out" upper floors for better visibility, or to cleanly separate "inside" and "outside" tile areas for inside/outside visibility culling.

I know I focused mainly on just the houses being tiled, but I was extrapolating those tests to a case where your whole world is (in part or wholly) made of tiles, with no clear differentiation between individual "houses" or structures.
For a small team or a single person, I think tiling offers a good way of constraining the amount of artwork needed to create relatively big levels. With the help of good lighting, shadows and multi-texturing you can increase the perceived variety of content. It also helps keep the art style of the levels consistent. Also, thinking of them as tiles doesn't necessarily mean all of them being of the same size/shape. They'd be tiles only in the sense of use and reuse.
[size="2"]I like the Walrus best.
What graphics API are you using? Also, are you using an octree or quadtree for culling?

With Direct3D stream instancing, I'd think it would perform fairly well to group each kind of tile (object) into its own instanced draw call. No actual object vertex data would need to get copied/recreated, only the instance transform matrices. I'd imagine the frustum culling itself (even when accelerated by an octree) should take more time than filling the instance data vertex buffer. Then for further optimization you could try occlusion culling, either by a crude depth-only software rasterizer, or hardware occlusion queries. In short, very "standard" 3D renderer methods, nothing really specific to the tile world approach.
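
Something along these lines (a very rough sketch with made-up names): cull at whatever granularity the octree gives you, then bucket the surviving tiles by mesh type so each bucket becomes one instanced draw call.

[code]
// Rough sketch (made-up names): group the visible tiles returned by the
// octree query by mesh type; each bucket then maps to one instanced draw.
#include <d3dx9.h>
#include <unordered_map>
#include <vector>

struct TileInstance {
    int        meshType;  // which tile mesh this cell uses
    D3DXMATRIX world;     // its placement in the world
};

using InstanceBuckets = std::unordered_map<int, std::vector<D3DXMATRIX>>;

InstanceBuckets BucketVisibleTiles(const std::vector<TileInstance>& visibleTiles)
{
    InstanceBuckets buckets;
    for (const TileInstance& t : visibleTiles)   // visibleTiles = octree frustum query result
        buckets[t.meshType].push_back(t.world);
    return buckets;                              // one instanced draw call per bucket
}
[/code]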
You've been beaten to it ;)


Just because your input data is tiles, that doesn't mean your rendering has to be done on those same tiles.
As Storyyeller suggests, after assembling your tiles into a building, you can crunch all those sub-meshes into a single building mesh. At the same time you can generate your LOD meshes.

If you want the ability to cull the tops of buildings, or inside/outside, etc, you can compile them into as many sub-meshes as you want. There's a lot of optimisation you can do to the mesh once you decide to join the tiles together like this. You can weld verts (or even discard/merge whole primitives) during compilation by transferring the texture data to a new atlas. Shader switching should be avoided by trying to use the same shaders on different tiles.

You don't necessarily need instancing for this to work either.

Regarding the low-res textures caused by atlasing -- SVT/Megatexturing would fit perfectly with this and provide the benefits of atlasing while still giving you high resolution.
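
As a sketch of what I mean by crunching the tiles together (made-up types; welding and atlasing left out for brevity), the compile step is basically just pre-transforming each tile's verts into building space and appending them to one combined buffer:

[code]
// Sketch only: collapse a building's tiles into one mesh at build/load time.
#include <d3dx9.h>
#include <cstdint>
#include <vector>

struct Vertex { D3DXVECTOR3 pos; D3DXVECTOR3 normal; D3DXVECTOR2 uv; };

struct MeshData {
    std::vector<Vertex>   verts;
    std::vector<uint32_t> indices;
};

void AppendTile(MeshData& building, const MeshData& tile, const D3DXMATRIX& tileToBuilding)
{
    uint32_t base = uint32_t(building.verts.size());
    for (Vertex v : tile.verts) {
        D3DXVec3TransformCoord(&v.pos, &v.pos, &tileToBuilding);        // full transform for positions
        D3DXVec3TransformNormal(&v.normal, &v.normal, &tileToBuilding); // ignores translation for normals
        building.verts.push_back(v);
    }
    for (uint32_t i : tile.indices)
        building.indices.push_back(base + i);   // re-base indices into the combined buffer
}
[/code]

Vert welding, atlas building and LOD generation would all run on the combined mesh data afterwards.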
If you have 3ds Max you could take a look at GhostTown, which is a great city-generation script.
[quote]What graphics API are you using? Also, are you using an octree or quadtree for culling?[/quote]

DirectX 9 and Octrees


[quote]With Direct3D stream instancing, I'd think it would perform fairly well to group each kind of tile (object) into its own instanced draw call. No actual object vertex data would need to get copied/recreated, only the instance transform matrices. I'd imagine the frustum culling itself (even when accelerated by an octree) should take more time than filling the instance data vertex buffer. Then for further optimization you could try occlusion culling, either by a crude depth-only software rasterizer, or hardware occlusion queries. In short, very "standard" 3D renderer methods, nothing really specific to the tile world approach.[/quote]

Yes, except multiply those "standard" rendering methods by however many tiles your typical structure is made of. So where you would normally model a house as a single mesh, with tiles the same house could use 50 individual tile meshes. That roughly translates to 50x as many cull tests and 50x as many matrix copies into the instance buffer, and that's where the performance hits really come from.

And in fact, even filling the instance buffer has a pretty big overhead, since you pretty much have to copy each matrix one by one. Even with culling off, this can easily take longer than the actual draw calls, as I found out. I'm wondering if there's something I'm doing wrong though, since that does seem a little ridiculous (but then again, it is something like 10,000+ copies of a float4x4). The buffer is created in the default pool with dynamic usage and locked with discard, so it should maximize performance.
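
For reference, this is the kind of per-frame fill loop I mean (a simplified sketch, not my exact code):

[code]
// Lock the dynamic instance buffer once with DISCARD and write each surviving
// tile's matrix straight into the mapped memory. With 10,000+ tiles this copy
// alone becomes comparable in cost to the draw calls it feeds.
#include <d3d9.h>
#include <d3dx9.h>
#include <vector>

UINT FillInstanceBuffer(IDirect3DVertexBuffer9* instanceVB,  // created with D3DUSAGE_DYNAMIC, default pool
                        const std::vector<D3DXMATRIX>& visibleTileTransforms,
                        UINT maxInstances)
{
    void* mapped = nullptr;
    instanceVB->Lock(0, 0, &mapped, D3DLOCK_DISCARD);

    D3DXMATRIX* dst = static_cast<D3DXMATRIX*>(mapped);
    UINT count = 0;
    for (const D3DXMATRIX& world : visibleTileTransforms) {
        if (count >= maxInstances) break;
        dst[count++] = world;                    // one float4x4 per visible tile
    }

    instanceVB->Unlock();
    return count;                                // instances to draw this frame
}
[/code]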

Lastly, as I mentioned, caching the instance data (per house, for example, for a fast frustum cull) is also inefficient, since you need a different instance-data buffer for each tile type. So a house made of many different tiles would mean many different small buffers, giving you only a small benefit (albeit I may be underestimating it; benchmarking would really be needed to gauge that).


[quote]You've been beaten to it ;)[/quote]

Oh crap, that's hella cool! Reading through that, I'm getting a lot of ideas - particularly how you can do a LOD via a render-to-texture based on the overall "shape" of the building!

However, while it does beat me to random house generation with a swift kick to the face and grace, it does not beat me to a tiling engine. It does use tiles to generate the houses, but it does not use tiles to represent the world, or even use those tiles after generating the house. The houses work as brushes (not tiles), and even the collision is for the whole brush (not per tile). What this means is that collision feels fake for non-flat tiles, you still can't have vertical visibility fades, can't separate "inside" and "outside" areas using tiles, and so on. And it seems like you need to pre-compute a good chunk of your houses (like the low-LOD mesh) for it to work.

I guess what I am mostly talking about here is grid-based tiling applied to the whole world, not arbitrary tile-generated meshes free-floating in it (albeit that is where I originally started with the houses, before I began thinking about the benefits of extending the idea to the whole world rather than keeping it precomputed inside a single object).

I do wonder how they do the rendering; the documentation doesn't seem to say whether they merely render each tile mesh individually, or batch (and optimize) them into a single mesh. I know they do a single mesh for the low LOD, but not for the high-detail version. I also wonder how the performance was on the whole thing, if you were to create an entire town you could run through (rather than just a fancy background or the occasional house here and there) - especially if you did some fancy houses, rather than just the standard brush-sized box.

Nonetheless, much appreciated that link - great food for thought, and it will come in handy in further experimentation!


[quote]Just because your input data is tiles, that doesn't mean your rendering has to be done on those same tiles.
As Storyyeller suggests, after assembling your tiles into a building, you can crunch all those sub-meshes into a single building mesh. At the same time you can generate your LOD meshes.

If you want the ability to cull the tops of buildings, or inside/outside, etc, you can compile them into as many sub-meshes as you want. There's a lot of optimisation you can do to the mesh once you decide to join the tiles together like this. You can weld verts (or even discard/merge whole primitives) during compilation by transferring the texture data to a new atlas. Shader switching should be avoided by trying to use the same shaders on different tiles.

You don't necessarily need instancing for this to work either.

Regarding the low-res textures caused by atlasing -- SVT/Megatexturing would fit perfectly with this and provide the benefits of atlasing while still giving you high resolution.[/quote]

Hmmm, well as you can see from my replies, I have been deliberately avoiding that kind of solution. "Baking" your tiles into optimized world geometry would greatly increase the load time of your world (particularly optimizing thousands of tile groups, and applying something like a BSP to actually be able to render the behemoth efficiently), which would probably necessitate doing it in advance. And at that point, as I mentioned, you no longer really have a tile engine, just a regular mesh engine whose mesh-generation tool happens to be tile-based. It also greatly increases level-design and prototyping time (those who ever used Hammer and went to make dinner while the "building BSP tree / calculating lightmaps" dialog slowly filled up know exactly what I mean).

At that point, you're really better off with a simple BSP- and portal-based engine, plus a separate world-building tool that operates on tiles but ultimately converts them to a format your game understands.

But, as my prototyping is finding and as the posts suggest, this may be the only technically feasible way to "create a world with tiles."
[quote]Or how the performance was on the whole thing, if you were to create an entire town you could run through (rather than just a fancy background or the occasional house here and there). Especially if you did some fancy houses, rather than just the standard brush-sized box.[/quote]
AFAIK, this tech was used for large parts of Gears of War (to reduce the time taken for the team to build large urban areas).

I've done a lot of Hammer/BSP work before, so I understand your opposition to baking ;)
An interesting study might be Minecraft and its clones. You can think of it as the simplest possible implementation of your concept -- the whole world is a grid of tiles, where the tiles are cubes!
Minecraft-esque worlds have an ungodly number of these "tiles" visible, so to get decent performance from the renderer, some amount of baking has to be done on the tiles. Perhaps you can find a decent middle ground, though, between naive instancing of tiles and full mesh baking/optimisation.
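
As a sketch of what that kind of partial baking might look like (illustrative only, not Minecraft's actual code): the world is split into chunks, each chunk's cells are baked into one mesh with hidden faces skipped, and a chunk is only re-baked when one of its cells changes.

[code]
#include <cstdint>
#include <vector>

const int CHUNK = 16;

struct Chunk {
    uint8_t cells[CHUNK][CHUNK][CHUNK];   // 0 = air, otherwise a block/tile id
    bool    dirty = true;                 // set whenever a cell is edited
    std::vector<float> bakedVerts;        // baked geometry, uploaded to the GPU
};

static bool Solid(const Chunk& c, int x, int y, int z)
{
    if (x < 0 || y < 0 || z < 0 || x >= CHUNK || y >= CHUNK || z >= CHUNK)
        return false;                     // (neighbouring chunks ignored for brevity)
    return c.cells[x][y][z] != 0;
}

// Returns the number of faces that would be emitted; real code would append
// their quads into bakedVerts and re-upload the chunk's vertex buffer.
int RebakeChunk(Chunk& c)
{
    static const int dir[6][3] = { {1,0,0},{-1,0,0},{0,1,0},{0,-1,0},{0,0,1},{0,0,-1} };
    c.bakedVerts.clear();
    int facesEmitted = 0;
    for (int x = 0; x < CHUNK; ++x)
    for (int y = 0; y < CHUNK; ++y)
    for (int z = 0; z < CHUNK; ++z) {
        if (!Solid(c, x, y, z)) continue;
        for (const int* d : dir)
            if (!Solid(c, x + d[0], y + d[1], z + d[2]))
                ++facesEmitted;           // only faces touching air reach the GPU
    }
    c.dirty = false;                      // re-run whenever a cell in this chunk changes
    return facesEmitted;
}
[/code]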

Some more links that may not be entirely relevant for your prototype, but may provide ideas/inspiration:
http://www.vision.ee.ethz.ch/~pmueller/wiki/CityEngine/PaperBuildings
http://www.procedural.com/company/publications/urban-simulation.html
Aye, the "what you see is what you get" style of game-making seems to be the fad nowadays, and for good reasons. So for a tile engine to implement that, it has to "think" in tiles as well, not post-baked geometry. This thread has been largely to show my findings of the feasibility, as well as get some fresh ideas, like the UDK link or the suggestions. All your posts have been really useful, both in terms of considering new ideas and providing resources, so big kudos to all of you; keep it coming :)

So far, it seems to confirm what I have sadly been finding in my own tests - that a fully dynamic tile engine really does not seem to be feasible. A certain degree of pre-computation is necessary for it to run at acceptable framerates.

And as for Minecraft, I always found it quite impressive how many cubes it manages to render efficiently - props to the designer. Albeit I don't know exactly how he does it; you said some amount of baking is done, but the world is fully dynamic. Does it constantly bake and re-bake stuff "on the fly" in memory? I know it's open source so I could probably look through that, but if you know how it works off the top of your head (or have a link to a "big picture" overview), that would save me a bit of a slog.

Also Hodgman, thanks for the two extra links - reading them as we speak, great stuff! It's funny how spot-on this is. The whole reason behind my prototyping and this thread was that, when I started working on my game ideas, I realized that having the same house repeated 200 times in a city is just boring, and I don't want to create 200 individual meshes. So hey, why not randomly generate a house? How to do it most easily? Tiles with rule sets (such as base shape, floor plan, number of windows, "flavor", etc.)! And then I realized that if I randomly generate an outside from tiles, I can use that information to also randomly generate the inside (don't you love how in Dragon Age every house is the same inside?). So I started thinking: why even limit the tiles to buildings? Define your whole world with tiles that are either "inside," "outside" or "both" (i.e. walls), and all of a sudden you have an instant, cheap and easy inside/outside visibility and collision culling mechanism, loadless inside/outside transitions, and a really easy way to create and edit your whole world! And thus, a week later, comes this thread.
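
Roughly what I have in mind for that inside/outside idea (purely an illustrative sketch, not working code from my prototype):

[code]
// Each cell in the layout gets tagged during generation; the renderer can
// then do a very coarse visibility cull based on where the camera is.
#include <cstdint>

enum TileRegion : uint8_t {
    REGION_OUTSIDE    = 0,   // streets, yards, roofs
    REGION_INSIDE     = 1,   // interior floors, walls, ceilings
    REGION_TRANSITION = 2,   // doors, windows -- belong to both sides
};

// Coarse cull: when the camera is indoors, outside-only tiles can be skipped
// wholesale (transition tiles are always kept so doorways still show through).
inline bool TileMayBeVisible(TileRegion cameraRegion, TileRegion tileRegion)
{
    if (tileRegion == REGION_TRANSITION) return true;
    return tileRegion == cameraRegion;   // same side of the wall as the camera
}
[/code]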