Modern approach to 3/4 overhead tile engines

5 comments, last by Ravyne 17 years, 1 month ago
For some time now, 3D accelerators have been more common than 2D accelerators. Many of these 3D accelerators have poor (and inconsistent) support for even relatively basic 2D operations. As such, modern 2D games should target 3D APIs rather than their 2D counterparts (or use a 2D API that targets a 3D API internally). The question then becomes: how can we best map a tiled 2D engine onto modern 3D APIs and hardware? Particularly D3D 8-level hardware (Shader Model 1.x), using DirectX, OpenGL, or both. Specifically, I'm interested in hearing some opinions/evidence on:
  • The playfield (i.e. the map) - This is essentially static geometry and happens to have a relatively low vertex count. Is it better to send only the visible geometry to the accelerator each frame, or would it be better to send the entire geometry to the card once, displaying only a portion of it for each frame? If the latter, is it best to draw all the geometry and let the clipper take care of it, or would it be better to build an index list of the visible geometry (using a quad-tree, of course) for each frame and draw that?
  • Animated tiles - If we do send the geometry only once, what's the best way to change the texture coordinates of a particular tile? Is it even possible to do so? Would animated tiles be better off separate from the static ones? Stated simply: what's the best way to put a frame-based 2D animation from a texture sheet onto a piece of geometry?
  • To Z, or not to Z - Since we're essentially talking about a series of (potentially stacked) textured quads which are trivial to draw in a back-to-front order, the depth-buffer is essentially unnecessary. However, is it more performant to enable Z, then draw front-to-back, in order to get the early-z reject benefits to eliminate overdraw the way many FPSs do?
What I'm aiming for is a modern 2D RPG much like the 16-bit classics FF II, FF III and Chrono Trigger. Default resolution would be 640x480 (to retain the pixel-art look) with up-sampling to 1280x960 using simple pixel-doubling or the scale2x algorithm (again, to preserve the pixel-art look). Of course there will be an option to scale to other resolutions, at some sacrifice to the pixel-art look. The default unit of size for tiles will be 32x32 texels, on a 512x512 texture sheet containing 256 tile textures. Maps will consist of two layers (below and above player/NPCs/objects), both layers allowing two textures per tile. I've written a couple of 2D RPGs like this in the past, but only using 2D APIs. I'm fine integrating the various systems, but I'm curious to hear more about how to implement the new 3D renderer optimally. I'm targeting relatively modern, but not bleeding-edge, hardware - SM 1.x - which is, IIRC, GeForce 3/4MX and up on the nVidia side, and Radeon 8500 and up on the ATI/AMD side.
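To make the sheet arithmetic concrete: a 512x512 sheet of 32x32 tiles is a 16x16 grid, which gives the 256 tiles mentioned above. Below is a minimal sketch of mapping a tile index to texture coordinates. The names (`UVRect`, `tileUV`) are hypothetical, and the row-major index order with the origin at the top-left of the sheet is an assumption, not something stated in the post.

```cpp
#include <cassert>

// Hypothetical helper type: a rectangle in normalized texture space.
struct UVRect { float u0, v0, u1, v1; };

// Sizes taken from the post: 32x32 tiles on a 512x512 sheet.
constexpr int kSheetSize   = 512;
constexpr int kTileSize    = 32;
constexpr int kTilesPerRow = kSheetSize / kTileSize;       // 16
constexpr int kTileCount   = kTilesPerRow * kTilesPerRow;  // 256

// Maps a tile index (assumed row-major, 0..255) to its UV rectangle.
inline UVRect tileUV(int index) {
    assert(index >= 0 && index < kTileCount);
    const int col = index % kTilesPerRow;
    const int row = index / kTilesPerRow;
    const float step = static_cast<float>(kTileSize) / kSheetSize;  // 1/16
    return { col * step, row * step, (col + 1) * step, (row + 1) * step };
}
```

For example, `tileUV(17)` is the tile in the second column of the second row, so its rectangle starts at (0.0625, 0.0625).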

throw table_exception("(? ???)? ? ???");

Processors are so fast now you don't need a 2d accelerator at all. I'm doing it the 'old way', with no acceleration at all (I have my own alpha-blended bitblit) and I'm blitting thousands of transparent sprites every frame at 60fps with enough CPU time left over to run five or six instances of the game at once.

The advantage of the 3d api is that you get scaling and rotation for 'free'. The disadvantage is that everything has to be in power-of-2 sized textures, you can't take control at the pixel level without using pixel shaders, and texture filtering can cause all sorts of blending errors when using tile sheets.
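On the filtering point: one commonly used mitigation (not something the poster above describes) is to pull each tile's texture coordinates in by half a texel, so that bilinear filtering is less likely to sample from the neighbouring tile on the sheet. A rough sketch, assuming a square sheet and a hypothetical `insetUV` helper:

```cpp
#include <cassert>

// Hypothetical helper type: a rectangle in normalized texture space.
struct UVRect { float u0, v0, u1, v1; };

// Converts a pixel rectangle (x, y, w, h) on a square sheet into UVs,
// inset by half a texel on every side. This is an assumed mitigation for
// filtering bleed; it shrinks the sampled area of the tile slightly.
inline UVRect insetUV(int x, int y, int w, int h, int sheetSize) {
    assert(sheetSize > 0 && w > 0 && h > 0);
    const float inv = 1.0f / sheetSize;
    return { (x + 0.5f) * inv,     (y + 0.5f) * inv,
             (x + w - 0.5f) * inv, (y + h - 0.5f) * inv };
}
```

Switching the sampler to point (nearest) filtering is another option when the tiles are drawn at 1:1 or at whole-number scales, which would also suit the pixel-art look.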
That might be another possibility as well. It does at least solve the support and consistency issues. I also have a very fast software renderer that supports most of the features I need; the last remaining bits would be easy enough to add, and I've been meaning to anyhow. Ultimately that may be the road travelled, but I would still like to hear more on how this would all come together using a 3D accelerator.


Almost all commercial 2D games nowadays use a 3D world with just a 2D camera, i.e. stick the camera onto a plane. The camera then just rotates and slides around in the plane. This works great for top-view and side-view games. To determine whether disabling the Z-buffer is quicker, try profiling. This trick might work on some platforms and not on others.
I am currently using 3D acceleration for a 2D engine of my creation. The only difference is that I'm using DirectX9.

* Is it better to send only the visible geometry to the accelerator each frame, or would it be better to send the entire geometry to the card once, displaying only a portion of it for each frame?

I found that sending only the visible geometry will improve performance. Because the game is 2D it would be easier to either create a quad-tree or a simple double (or triple for blending) nested for loop to render only the known visible geometry. It is highly recommended that you use index buffers for everything except for maybe per-pixel rendering and single line renderings.
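A minimal sketch of the nested-loop approach, for illustration only. It assumes a pixel-space camera position at the top-left of the view, square tiles, and a row-major map; the names `TileRange`, `visibleTiles` and `visibleTileIndices` are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range of tile coordinates: [x0, x1) by [y0, y1).
struct TileRange { int x0, y0, x1, y1; };

// Computes which tiles overlap the view, clamped to the map bounds.
// camX/camY are the view's top-left corner in pixels (an assumption).
inline TileRange visibleTiles(int camX, int camY, int viewW, int viewH,
                              int tileSize, int mapW, int mapH) {
    assert(tileSize > 0 && viewW > 0 && viewH > 0 && mapW > 0 && mapH > 0);
    TileRange r;
    r.x0 = std::min(std::max(camX / tileSize, 0), mapW);
    r.y0 = std::min(std::max(camY / tileSize, 0), mapH);
    // Round the far edge up so partially visible tiles are included.
    r.x1 = std::min(std::max((camX + viewW + tileSize - 1) / tileSize, r.x0), mapW);
    r.y1 = std::min(std::max((camY + viewH + tileSize - 1) / tileSize, r.y0), mapH);
    return r;
}

// Double nested loop over the visible range, collecting row-major tile indices.
inline std::vector<int> visibleTileIndices(const TileRange& r, int mapW) {
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(r.x1 - r.x0) *
                static_cast<std::size_t>(r.y1 - r.y0));
    for (int y = r.y0; y < r.y1; ++y)
        for (int x = r.x0; x < r.x1; ++x)
            out.push_back(y * mapW + x);
    return out;
}
```

With a 640x480 view and 32x32 tiles this visits a few hundred tiles per layer at most, regardless of map size, which is why the loop is often enough on its own without a quad-tree.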

* What's the best way to put a frame-based 2D animation from a texture-sheet onto a piece of geometry?

Texture coordinates, of course. I would use an array of texture coordinates only. This will save all the positions of the individual textures while maintaining the frames for that particular texture animation. Just plug the coordinates into your 2D sprite rendering function and you've got it made.
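One way the "array of texture coordinates" idea could look in practice is sketched below. The types and the millisecond-based timing are assumptions made for this example, not details from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a rectangle in normalized texture space.
struct UVRect { float u0, v0, u1, v1; };

// One animation: a UV rectangle per frame, in playback order,
// plus a fixed frame duration (assumed; real data might vary per frame).
struct TileAnimation {
    std::vector<UVRect> frames;
    unsigned msPerFrame;
};

// Picks the frame for a given elapsed time, looping at the end.
inline const UVRect& currentFrame(const TileAnimation& anim, unsigned elapsedMs) {
    assert(!anim.frames.empty() && anim.msPerFrame > 0);
    const std::size_t i = (elapsedMs / anim.msPerFrame) % anim.frames.size();
    return anim.frames[i];
}
```

The returned rectangle is what would be handed to the sprite rendering function each frame; the quad's positions never change.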

* However, is it more performant to enable Z, then draw front-to-back, in order to get the early-z reject benefits to eliminate overdraw the way many FPSs do?

Oh yeah it is! I find that the more overdraw you have the slower the performance will be, but if you know you can avoid serious overdraw then I wouldn't even use the Z-Buffer. Using the Z-Buffer is a small performance hit, and should only be used when necessary.
For me, Z-buffering is one of the biggest draws of 3D. There are a lot of cases in a traditional 2D engine where there really is no perfect draw order, and you have to either do weird things with splitting up objects into small pieces, or constrain the engine so that certain things don't happen. Z buffering eliminates those problems.

You've got me interested in your first bullet point. I'm going to build a test prototype tonight to test which way might be better. I could see having acceptable performance just trying to draw the whole thing, considering how extremely simple and low-polygon-count a level would be. It would be interesting to see if just drawing the whole level would be faster than using a quad-tree structure, considering the overhead of using a quad-tree. I use a simple node-based structure that calculates a set of visible tiles in the map as offsets from the camera position, and just draws a chunk of tiles. It's fast enough (FPS measured in the thousands on an NVidia GF FX card) but I've never really considered it alongside a quad-tree or brute-force method.

As far as animated tiles go, I always do objects like that separate from the static world geometry. Originally, when I was working on the first Golem engine, I just stored frame animations as separate textures, where each frame had a color map texture for the frame of animation and an alpha map texture for the shadow drawn on the ground plane. There was a performance hit from all the texture state switching, but it was never significant enough to try anything else. Frame rates typically stayed in the 500 - 600 range for me. (We're talking about an orthographically projected view here, I mean. You're not going to get a vertex transformation bottleneck, and fill rate isn't really going to be an issue, so you can afford a few slightly more expensive state changes.)

Lately, though, I've been using full 3D characters in a 2D world because I detest the limitation and increased memory requirements of requiring 8, 16 or 32 pre-rendered facing directions. 8 and 16 look ugly, and 32 bloats memory usage to ungodly proportions, forcing artificial limitations on the number of monster and character types allowed per level. Fully 3D characters drastically reduce memory requirements and increase performance, since far fewer texture state changes are required (monsters are sorted by texture skin, and each monster type usually has only 1 or 2 textures per skin). The artificial stiltedness of pre-rendered facing directions is replaced by rotatable 3D models that can face in any possible direction, and whose orientation can be represented by quaternions and smoothly interpolated.

Given your desire to use tile-sheets though, I'm wondering if you could implement a simple vertex program to do the texture-coordinate switching. Pass an object's current frame of animation as a vertex attribute, which would cause the shader to offset the texture coordinates by some amount, based on the number of sprites per row/col on a texture sheet. This would let you set the texture coordinates of an animation frame defaulting to the first row/first col sprite in the sheet, upload the frame geometry to a VBO, and not touch it again. It might be trickier if object animations are split up among many different texture sheets, but that could be worked around.
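The offset the vertex program would apply can be sketched on the CPU side as follows. This is only an illustration of the arithmetic, written in C++ rather than shader code; `Vec2` and `frameUV` are hypothetical names, and the row-major frame order is an assumption.

```cpp
#include <cassert>

// Hypothetical 2D vector type for texture coordinates.
struct Vec2 { float x, y; };

// baseUV is the vertex's coordinate within the first (row 0, col 0) sprite,
// as stored once in the vertex buffer. The frame number selects which cell
// of the sheet to shift into; cols/rows describe the sheet layout.
inline Vec2 frameUV(Vec2 baseUV, int frame, int cols, int rows) {
    assert(cols > 0 && rows > 0 && frame >= 0 && frame < cols * rows);
    const float du = 1.0f / cols;  // width of one sprite in UV space
    const float dv = 1.0f / rows;  // height of one sprite in UV space
    return { baseUV.x + (frame % cols) * du,
             baseUV.y + (frame / cols) * dv };
}
```

On Shader Model 1.x-class hardware the integer divide and modulo may be awkward to express in the vertex program itself, so it might be simpler to compute the column and row on the CPU and pass the resulting offset in as a constant or attribute, leaving the shader with a single add.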
If you do ever run that test I'd be glad to hear the results. I'll post mine too, since it appears that I'll end up benchmarking that myself.

The z-buffer certainly does have other benefits, so that's something to consider as well, particularly when particle effects come into play.

Various granularities of quad-trees may be useful as well. Rather than going down to the atomic elements, the tiles, perhaps a 4x4 block of tiles would be appropriate? One screen's worth of tiles? Something in between?

I'm sticking with classic, 4-way directional movement, at least for now. But eventually I'd like to move away from flat tiles, to simple "3D tiles" -- basically very simple textured geometry and lowish-poly-count models retaining the "squished sprite" look. That's really more of an engine 2.0 thing though, since that will require more advanced sub-systems. At that time I may opt for more free-form movement as well; we share the same opinions on the bloated requirements for doing the same with 2D art.


This topic is closed to new replies.
