How to Make Tile-Rendering Fast

6 comments, last by MasterWorks 18 years, 7 months ago
My roguelike game Witherwyn is written in C# and uses DirectX to draw a tilemap. Right now, I'm doing the naive thing: I create a vertex buffer containing a single unit square made of two triangles. To draw a single tile, I translate this square to the right place, set its texture, and draw it. I do this for each layer and for each visible tile I need to draw. Tiles that aren't currently visible, but that the player has seen, get a 50% opaque black quad drawn on top of them to fade them out.

dev.SetStreamSource(0, tile_vb, 0);
dev.VertexFormat = CustomVertex.PositionNormalTextured.Format;
dev.SetTexture(0, TileTex[texID]);
		
Matrix trans = Matrix.Translation(x*(float)GameData.TileSize*zoom,y*(float)GameData.TileSize*zoom,0);
Matrix scale = Matrix.Scaling(zoom*(float)GameData.TileSize,zoom*(float)GameData.TileSize,1);

scale.Multiply(trans);

dev.SetTransform(TransformType.World,scale);
dev.DrawPrimitives(PrimitiveType.TriangleList, 0, 2);
The problem is that it can be kind of slow (~60-70 ms when the map is zoomed out, on a GeForce4 MX). I'd like to add animations to my game (it's currently turn-based) and I need a higher framerate. I have several thoughts about this:

* I know that changing the texture of a tile is expensive, and I should draw all the tiles of one texture at a time. The thing is, I don't know an easy way to manage that.
* I know that DX is not optimized to render one quad at a time and that I should be sending several thousand triangles at a time for good performance. How do I batch up my primitive drawing in this way?
* Maybe I should put all my tiles on one big texture instead of having lots of small ones? How would this interact with texture filtering? I don't want tiles bleeding into each other.

Thanks for any input!

Shedletsky's Bits: A Blog | ROBLOX | Twitter
Time held me green and dying
Though I sang in my chains like the sea...

First off, you can save a small amount of CPU power by using a triangle strip instead of a triangle list, i.e. you would need 4 vertices in your buffer rather than 6, which means 2 fewer vertices per tile drawn. To draw a quad with a triangle strip you just need to order the vertices in a Z pattern.
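To make the Z pattern concrete, here is a minimal sketch (not from the original post) of how a 4-vertex strip expands into two triangles. The class and member names are made up for illustration; the index order follows Direct3D's documented strip rule, where every second triangle swaps its last two vertices to keep the winding consistent.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class StripDemo
{
    // Corner order for one quad stored as a 4-vertex strip (the "Z" pattern).
    public static readonly string[] ZOrder =
        { "top-left", "top-right", "bottom-left", "bottom-right" };

    // Expands a triangle strip of vertexCount vertices into index triples.
    // Odd-numbered triangles swap their last two indices so that all
    // triangles keep the same winding.
    public static List<int[]> StripTriangles(int vertexCount)
    {
        var tris = new List<int[]>();
        for (int i = 0; i + 2 < vertexCount; i++)
        {
            tris.Add(i % 2 == 0
                ? new[] { i, i + 1, i + 2 }
                : new[] { i, i + 2, i + 1 });
        }
        return tris;
    }
}
```

With four vertices this gives the triangles (0, 1, 2) and (1, 3, 2), so the quad is covered with 4 stored vertices instead of 6.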
That said, you are correct: you really should draw all the tiles that share a texture at once. That requires a dynamic vertex buffer. Using an index buffer would increase performance further, but the math can be complicated and it's best to get it working with just the vertex buffer first.

You mentioned that you have a fixed-size square around the origin which you are translating. What you want to do instead is populate the vertex buffer with all the tiles that use one texture. You need to use triangle lists here, since the tiles are a bunch of disconnected triangles. Supposing Y points up, all you need is simple addition. If one corner of the square is at (1,1) and the tile should end up at (x,y), then the translated position of that vertex is (x+1,y+1), and likewise for the other corners (1,-1), (-1,-1), and (-1,1); it helps to draw a quick sketch. I assume you don't need to worry about rotations, which makes your job and the CPU's a lot easier, since otherwise you would need matrix transforms (which are fairly simple, but take more time than mere addition [wink]).

So: populate the vertex buffer with all the tiles of one texture, render them, then move on to the next set. I expect you will find this to be significantly more efficient. Feel free to ask any other questions you may have.
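The grouping step described above could be sketched roughly like this. It is CPU-side only and makes several assumptions that are not in the thread: a made-up `TileVertex` layout, a map stored as `int[row, col]` of texture IDs, tiles positioned by their top-left corner in pixels with Y growing downward, and each tile using the full 0..1 UV range. Copying each list into a real vertex buffer and issuing the draw call is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical vertex layout: 2D position plus texture coordinates.
public struct TileVertex
{
    public float X, Y, U, V;
    public TileVertex(float x, float y, float u, float v)
    {
        X = x; Y = y; U = u; V = v;
    }
}

public static class TileBatcher
{
    // Appends the six vertices (two triangles, triangle-list order) of one
    // tile. Positions are computed by plain addition, no matrices involved.
    public static void AppendQuad(List<TileVertex> batch, float x, float y, float size)
    {
        var tl = new TileVertex(x,        y,        0f, 0f);
        var tr = new TileVertex(x + size, y,        1f, 0f);
        var bl = new TileVertex(x,        y + size, 0f, 1f);
        var br = new TileVertex(x + size, y + size, 1f, 1f);
        batch.Add(tl); batch.Add(tr); batch.Add(bl);   // first triangle
        batch.Add(tr); batch.Add(br); batch.Add(bl);   // second triangle
    }

    // Groups all tiles by texture ID, so each texture needs one
    // SetTexture call and one draw call.
    public static Dictionary<int, List<TileVertex>> Build(int[,] map, float tileSize)
    {
        var batches = new Dictionary<int, List<TileVertex>>();
        for (int row = 0; row < map.GetLength(0); row++)
        {
            for (int col = 0; col < map.GetLength(1); col++)
            {
                int texId = map[row, col];
                List<TileVertex> batch;
                if (!batches.TryGetValue(texId, out batch))
                {
                    batch = new List<TileVertex>();
                    batches[texId] = batch;
                }
                AppendQuad(batch, col * tileSize, row * tileSize, tileSize);
            }
        }
        return batches;
    }
}
```

Each dictionary entry then corresponds to one batch: set the texture once, upload that list, and draw `list.Count / 3` triangles.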

Alternatively, have you considered the Sprite class? I personally prefer the method above, but people here will undoubtedly tell you the Sprite class is good, so you might want to see if it's for you.
I haven't actually tried the Sprite class, but when it was first introduced (DX7?) I heard some bad things about it. It would need to scale up to the tens of thousands of sprites to handle drawing my maps (100x100xN layers). My guess is that it was designed for simple arcade-like games and is not up to the task.

Anyway. Thanks for your advice. rating++;


I would definitely recommend using an index buffer. It's not THAT complicated at all, it saves memory and can grant a performance benefit.
Assuming the use of an index buffer, I would render the geometry as a triangle list, so you can simply add them together into one large index buffer.

Imagine you've got a 2x3 tilemap like this:

0---1---2---3
|A  |A  |B  |
|   |   |   |
4---5---6---7
|B  |C  |B  |
|   |   |   |
8---9---10--11

The letters stand for different textures on the quads, the numbers are the indices of the vertices in your vertex array. If you use an index buffer and you use the same UV-coordinates for every texture, you could store all needed vertices in one vertex buffer, simplifying your task because you don't have to switch vertex buffers.
So you've got a vertex array containing those twelve vertices. If you use triangle lists for your tile rendering, the index buffer should look something like this:
{
  0, 1, 4,  1, 5, 4,  <---- tris of left quad with texture A
  1, 2, 5,  2, 6, 5,  <---- tris of right quad with texture A
  2, 3, 6,  3, 7, 6,  <---- tris of quad with texture B in the upper row
  ...
}

Going on like this would give you an index buffer to draw your whole tile-map in one draw-call (with one texture only of course)... but that isn't what you want.
Now you want to use different textures. I would use a different index buffer (or at least a separate index array) for every texture. This would mean one index array for texture A, one for texture B and one for C. Whether you copy them into one index buffer per texture (which would mean 3 index buffers here) or use only one index buffer for all textures is up to you, I guess. But the different index arrays for the three textures could look like this:
{ 0, 1, 4,  1, 5, 4,  1, 2, 5,  2, 6, 5 }  <---- indices for use with texture A
{ 2, 3, 6,  3, 7, 6,  6, 7, 10,  7, 11, 10,  4, 5, 8,  5, 9, 8 }  <---- indices for use with texture B
{ 5, 6, 9,  6, 10, 9 }  <---- indices for texture C

As you may have noticed, nothing changed with the order of the vertices / indices, I simply put the indices representing each quad that uses a texture in the corresponding index array.
Perhaps you would want to look into index buffers first, but I think you will find they're quite easy to use.
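For what it's worth, the per-texture index arrays can be generated rather than written by hand. The following is only a sketch under my own assumptions: vertices are numbered row by row as in the diagram, with (columns + 1) vertices per row, textures are identified by a `char` per tile, and the helper names are invented. Each quad contributes its indices in the order top-left, top-right, bottom-left, then top-right, bottom-right, bottom-left.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class TileIndices
{
    // Six indices (two triangles) for the quad at (row, col) in a grid
    // with `cols` tiles per row, i.e. cols + 1 vertices per row.
    public static int[] QuadIndices(int row, int col, int cols)
    {
        int stride = cols + 1;          // vertices per row
        int tl = row * stride + col;    // top-left
        int tr = tl + 1;                // top-right
        int bl = tl + stride;           // bottom-left
        int br = bl + 1;                // bottom-right
        return new[] { tl, tr, bl, tr, br, bl };
    }

    // Builds one index list per texture; textures[row, col] names the
    // texture used by that tile. Quads are visited row by row.
    public static Dictionary<char, List<int>> BuildPerTexture(char[,] textures)
    {
        int rows = textures.GetLength(0);
        int cols = textures.GetLength(1);
        var result = new Dictionary<char, List<int>>();
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                List<int> list;
                if (!result.TryGetValue(textures[r, c], out list))
                {
                    list = new List<int>();
                    result[textures[r, c]] = list;
                }
                list.AddRange(QuadIndices(r, c, cols));
            }
        }
        return result;
    }
}
```

For the 2x3 example map (A A B / B C B) this produces one list each for A, B and C, with the quads of each texture listed row by row. The vertex buffer stays the same twelve vertices throughout; only the index list changes per texture.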
That is a good point. You can have a static vertex buffer that fills the screen and simply use a dynamic index buffer to draw the desired tiles. Makes life that much simpler [wink]
I'm going to need to go through the docs on this - I didn't know that there were two kinds of vertex buffers (static/dynamic). Two questions:

1. If I'm going to fill a vertex buffer full of all the quads with a single texture in the view frustum, do I have to rebuild this buffer from scratch every frame? Most tiles will move only rarely.

2. Let's envision the extreme case where my tile map has no two tiles that use the same texture. Is it true that bad performance due to frequent texture swapping is unavoidable at this point?


I'd say the Sprite class is definitely worth a look, in your case. It's fast and simple enough to quickly implement so you could compare rendering speeds. Easy as:

device.BeginScene();
device.Clear(ClearFlags.Target, Color.Black, 0, 0);
sprite.Begin(SpriteFlags.None);
sprite.Draw(someTexture, new Rectangle(0,0,32,32), new Vector3(0.5f,0.5f,0), new Vector3(x,y,0), Color.White);
sprite.End();
device.EndScene();
device.Present();


BTW, your roguelike is looking great so far. I have an incurable weakness to roguelikes; I love to delve into in-development ones! Keep us posted on how it develops. :)
Quote:Original post by Telamon
I'm going to need to go through the docs on this - I didn't know that there were two kinds of vertex buffers (static/dynamic). Two questions:

1. If I'm going to fill a vertex buffer full of all the quads with a single texture in the view frustum, do I have to rebuild this buffer from scratch every frame? Most tiles will move only rarely.

The differences between the two kinds of buffers are significant, but if you want to keep things general you have to use a dynamic buffer so you can have animation, changing vertices, etc. The idea with a dynamic buffer is that you make it massive, and each frame you write all the vertices to the next available positions, repeating until it is full (hopefully after many frames). Whenever you must change texture or blend mode (say after 120 vertices), you have to drawPrimitive all the vertices written so far (0-119) and start anew (the next drawPrimitive, with the new texture, starts at vertex 120). Whenever you update your dynamic vertex buffer you have to lock it, so be careful with the locking flags, as that is where you can leverage the power of the type of VB you are using. In my example, you would lock with NOOVERWRITE each time, except when the VB is full: then you lock with DISCARD and start again at vertex 0.
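The bookkeeping behind that NOOVERWRITE/DISCARD scheme can be sketched without any DirectX calls. Everything here is an illustrative assumption: the `LockMode` enum stands in for whatever lock flags your API exposes, and `DynamicBufferCursor` is a made-up name. The sketch only decides where the next run of vertices goes and which lock mode to request; the actual lock, copy and draw are up to you.

```csharp
using System;

// Stand-in for the real lock flags (hypothetical enum for this sketch).
public enum LockMode { NoOverwrite, Discard }

// Tracks the next free position in a large dynamic vertex buffer.
public sealed class DynamicBufferCursor
{
    private readonly int capacity;   // buffer size in vertices
    private int next;                // first unused vertex

    public DynamicBufferCursor(int capacityInVertices)
    {
        if (capacityInVertices <= 0)
            throw new ArgumentOutOfRangeException("capacityInVertices");
        capacity = capacityInVertices;
    }

    // Reserves `count` vertices. Returns the start vertex to write at and
    // reports which lock mode to use: NoOverwrite while there is room,
    // Discard when the buffer is full and writing restarts at vertex 0.
    public int Reserve(int count, out LockMode mode)
    {
        if (count <= 0 || count > capacity)
            throw new ArgumentOutOfRangeException("count");

        if (next + count > capacity)
        {
            next = 0;
            mode = LockMode.Discard;
        }
        else
        {
            mode = LockMode.NoOverwrite;
        }

        int start = next;
        next += count;
        return start;
    }
}
```

For example, with a 1000-vertex buffer, reserving 600, 300 and then 200 vertices gives start offsets 0, 600 and 0, with the third reservation asking for a discard lock.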

If you decide that static VB is for you, the idea is that you can upload a bunch of vertices to your card where they will be stored for fast rendering. This is the case where you want to use matrix transformations so that you can quickly reposition/scale/etc a large number of vertices without needing a lock. I don't think doing this repeatedly for 4/6 vertices (a tile) is worthwhile...

Quote:Original post by Telamon
2. Let's envision the extreme case where my tile map has no two tiles that use the same texture. Is it true that bad performance due to frequent texture swapping is unavoidable at this point?

Your performance will suffer but keep in mind you can probably change texture dozens of times per frame and still get solid performance. But hundreds of times is not a good idea.

The key to 2D type applications is to bunch groups of sprites together into one large "texture" so that you can kind of ignore the limitation in most cases. Make large textures and then use texture coordinates to index your tiles/sprites into that texture in whatever way you choose. If you use dynamic VBs you can even update the texture coordinates each frame for essentially no penalty. For instance, if you had an explosion animation of 64 frames in a large texture, you could render 50 explosions in one drawPrimitive regardless of which frame each explosion was on.

It's very easy to make a render class for a dynamic VB because all you have to do is keep adding vertices to your vertex queue until you hit a new texture (or it gets full), then render it and start over. It makes no difference how many sprites you have or what they are doing. Just try to sort by texture in some way. I use a layer system, where I place each sprite on a layer and render layer 1, 2, 3,... in succession, so I just try to keep sprites that use the same texture on the same layer if at all possible.
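A rough sketch of both ideas follows; names and layout are my own assumptions, not anything from the posts above. `Atlas.TileUv` assumes a square atlas of equally sized square tiles laid out row by row, and pulls each edge in by half a texel, which is one common way to reduce neighbouring tiles bleeding in under bilinear filtering (this inset is an addition of mine, not something the thread prescribes). `SpriteQueue` only records which draw calls would be issued: it starts a new batch whenever the texture changes or the queue is full.

```csharp
using System;
using System.Collections.Generic;

public static class Atlas
{
    // Returns { u0, v0, u1, v1 } for one tile of a square atlas.
    // The half-texel inset is an assumption to limit filtering bleed.
    public static float[] TileUv(int tileIndex, int tilesPerRow, int tileSize, int atlasSize)
    {
        int col = tileIndex % tilesPerRow;
        int row = tileIndex / tilesPerRow;
        const float inset = 0.5f;   // measured in texels
        return new[]
        {
            (col * tileSize + inset) / atlasSize,
            (row * tileSize + inset) / atlasSize,
            ((col + 1) * tileSize - inset) / atlasSize,
            ((row + 1) * tileSize - inset) / atlasSize
        };
    }
}

// Hypothetical queue: collects sprites in draw order and records one
// (textureId, spriteCount) entry per draw call that would be needed.
public sealed class SpriteQueue
{
    private readonly int capacity;   // max sprites per batch
    private int currentTexture = -1;
    private int pending;

    public readonly List<KeyValuePair<int, int>> DrawCalls =
        new List<KeyValuePair<int, int>>();

    public SpriteQueue(int capacityInSprites)
    {
        capacity = capacityInSprites;
    }

    public void Add(int textureId)
    {
        // Flush on a texture change or when the queue is full.
        if (pending > 0 && (textureId != currentTexture || pending == capacity))
            Flush();
        currentTexture = textureId;
        pending++;
    }

    // A real renderer would upload the queued vertices and draw here.
    public void Flush()
    {
        if (pending == 0) return;
        DrawCalls.Add(new KeyValuePair<int, int>(currentTexture, pending));
        pending = 0;
    }
}
```

Feeding the queue sprites with textures 7, 7, 7, 3, 3, 7 and then flushing records three draw calls, which is why sorting by texture (or packing everything into one atlas) pays off: the same six sprites sorted would need only two.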


