ARGH! Tilemap Framerate in D3D is Awful

Started by
5 comments, last by barakus 18 years ago
This is making me very sad. Why was DirectDraw deprecated?

Here's the situation. I'm making a tile-based game in C# using MDX and Direct3D. I'm trying to achieve real-time framerates and having a dismal time of it. I have done some benchmarking with a PerformanceTimer and determined that my problem is that I am calling DrawPrimitives (DP) once for every visible tile. That is, to draw the map, I set the world transform for a tile, call DP on a square vertex buffer, then move to the next tile. Typically I have about 1000 tiles on the screen at once, so this didn't seem like it would be a problem, even if it is not the way to draw a 3D scene.

To draw the whole 100x100 map this way, without textures, gives 1.5 FPS. As an experiment, I made a single vertex buffer to render the whole map with one DP call (still passing in 60,000 vertices) and my FPS jumped to 40-80 (variance probably due to the presentation params and the vblank).

I now have 3 options, as far as I can tell, all of them unappealing.

1. Try using the MDX Sprite class. I hear different things about its performance from different people.

2. Put all my textures into one big texture, put my whole map in one big vertex buffer, set the texture coords of the individual tiles correctly, and blast the whole mess to the GPU with one DP call. This is not ideal since it makes updating the tilemap complicated, especially for layers above the ground layer (critters, items, etc.) that are sparse and would need constant updating. Also, I would need to batch my textures if they don't fit on a single large texture in video memory.

3. Use DirectDraw, even though it is deprecated. Or use OpenGL, which doesn't have this problem of the DP-equivalent call costing so much. OpenGL is not natively supported by C#, so I'd rather not. There's also stuff like SDL.Net, but it doesn't seem I should need a 3rd-party lib to draw a tilemap. I mean, come on.

So... What should I do? Discontinuing DirectDraw was so short-sighted. D3D really isn't a substitute. Why is the latency of DP so high? Modern GPUs have GBs of bandwidth, and I'm not close to using all of mine.
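
For reference, the texture-coordinate part of option 2 is mostly arithmetic. Here is a minimal sketch, assuming the big texture is a uniform grid of equally sized tile images; the TileAtlas class and its TileUV method are hypothetical names for illustration, not part of MDX.

```csharp
using System;

// Hypothetical helper (not an MDX type): maps a tile index to its
// texture coordinates inside a grid-shaped atlas texture.
public static class TileAtlas
{
    // Returns { u0, v0, u1, v1 } for the cell at 'tileIndex', counting
    // cells left to right, then top to bottom.
    public static float[] TileUV(int tileIndex, int columns, int rows)
    {
        if (columns <= 0 || rows <= 0)
            throw new ArgumentOutOfRangeException("columns");
        if (tileIndex < 0 || tileIndex >= columns * rows)
            throw new ArgumentOutOfRangeException("tileIndex");

        int col = tileIndex % columns;
        int row = tileIndex / columns;
        float cellW = 1f / columns;   // width of one cell in UV space
        float cellH = 1f / rows;      // height of one cell in UV space

        return new float[]
        {
            col * cellW, row * cellH,             // top-left corner
            (col + 1) * cellW, (row + 1) * cellH  // bottom-right corner
        };
    }
}
```

With a 4x4 atlas, tile 5 sits in column 1, row 1 of the grid. In practice the cells may need a little padding to stop neighbouring tiles bleeding into each other when filtered; that is left out of this sketch.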

Shedletsky's Bits: A Blog | ROBLOX | Twitter
Time held me green and dying
Though I sang in my chains like the sea...

PS - I am running on a 4-year-old laptop with a GeForce 440 Go. However, I have written games in OpenGL that can handle 200,000 triangles in a scene at 30 FPS, drawing them in the same stupid way.


the problem is that you are calling DP too much, as you noted.

with 3D hardware you need to submit your graphics to the card in large batches.

a 'batch' occurs when you set some render states, or change textures, and then submit one or more triangles via draw primitive.

as you say, you call draw primitive for each tile. this is perfectly acceptable in directdraw, but in Direct3D it is considered abusing the hardware.

so what you need to do, is sort your drawing by texture and render states.

that is, instead of

    set texture to rock
    drawprimitive for tile 0
    set texture to grass
    drawprimitive for tile 1
    set texture to dirt
    drawprimitive for tile 2
    set texture to rock
    drawprimitive for tile 3


you need to do

    set texture to rock
    draw all tiles that are visible and using rock
    set texture to grass
    draw all tiles that are visible and using grass
    etc.


now in this example you /could/ still use a drawprimitive for each tile, but that is still wasteful.

one way to solve this is to structure all /like/ tiles next to each other in a single static vertex buffer, so that you can render large sections of them at once by drawing many tiles that all use the same texture.

another approach, which is more flexible but less performant, is to use a dynamic vertex buffer that is locked and filled each frame with all of the quads you want to draw. by that time you've already recorded them, so you can put them into the vertex buffer with like textures adjacent. this may be necessary if your tiles can change the image they are displaying.
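
the sort-then-fill idea can be sketched roughly as follows. this is only an illustration under assumptions: TileQuad, Batch and TileBatcher are made-up names, the vertex data is a plain float array standing in for a locked vertex buffer, and only x/y positions are written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; none of these names come from MDX.
public struct TileQuad
{
    public int TexId;   // index of the texture this tile uses
    public float X, Y;  // tile position in tile units
    public TileQuad(int texId, float x, float y) { TexId = texId; X = x; Y = y; }
}

public struct Batch
{
    public int TexId;         // texture to bind before drawing this range
    public int FirstVertex;   // offset of the range in the vertex array
    public int TriangleCount; // number of triangles in the range
}

public static class TileBatcher
{
    public const int VertsPerTile = 6; // two triangles, no index buffer

    // Sorts the tiles by texture, writes x/y positions for every quad into
    // 'positions' (2 floats per vertex), and returns one Batch per run of
    // tiles that share a texture.
    public static List<Batch> Build(List<TileQuad> visible, float tileSize,
                                    out float[] positions)
    {
        var sorted = new List<TileQuad>(visible);
        sorted.Sort((a, b) => a.TexId.CompareTo(b.TexId));

        positions = new float[sorted.Count * VertsPerTile * 2];
        var batches = new List<Batch>();

        for (int i = 0; i < sorted.Count; i++)
        {
            TileQuad t = sorted[i];
            float x0 = t.X * tileSize, y0 = t.Y * tileSize;
            float x1 = x0 + tileSize, y1 = y0 + tileSize;
            float[] quad = { x0, y0, x1, y0, x0, y1,    // first triangle
                             x1, y0, x1, y1, x0, y1 };  // second triangle
            Array.Copy(quad, 0, positions, i * quad.Length, quad.Length);

            int last = batches.Count - 1;
            if (last >= 0 && batches[last].TexId == t.TexId)
            {
                // Same texture as the previous tile: extend the current batch.
                Batch b = batches[last];
                b.TriangleCount += 2;
                batches[last] = b;
            }
            else
            {
                // Texture changed: start a new batch at this tile's first vertex.
                batches.Add(new Batch { TexId = t.TexId,
                                        FirstVertex = i * VertsPerTile,
                                        TriangleCount = 2 });
            }
        }
        return batches;
    }
}
```

each returned Batch would then map to one texture change followed by one draw call over the range FirstVertex / TriangleCount, so the number of draw calls depends on the number of distinct textures rather than the number of tiles.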

in short, it is very possible to do in Direct3D, but you cannot accomplish it performantly using the same approaches you did with DirectDraw; 3D hardware simply doesn't work that way.

that being said, you might be happier with directdraw, or SDL or some other 2D library.

Raymond Jacobs, Owner - Ethereal Darkness Interactive
www.EDIGames.com - EDIGamesCompany - @EDIGames

Your engine must be really flawed because you should be getting much faster speeds than that. My engine gives me 1200+ FPS rendering around 150 animated sprites to a dynamic vertex buffer and using index buffers. Are you sure you made your buffer dynamic? Otherwise, you're gonna get a huge performance hit from locking the buffer.

Another thing, you shouldn't be drawing the whole map in one call. If the tile is not visible, do not upload it to the buffer.
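
The visibility test for a tilemap can be as simple as turning the camera rectangle into a range of tile indices. A small sketch, assuming an axis-aligned camera measured in the same world units as the tiles; TileCulling and VisibleRange are hypothetical names, not engine or MDX functions.

```csharp
using System;

// Hypothetical helper: works out which tiles a camera rectangle overlaps.
public static class TileCulling
{
    // Computes the inclusive range of tile columns (minX..maxX) and rows
    // (minY..maxY) touched by the camera rectangle, clamped to the map.
    // The range is conservative: a view edge lying exactly on a tile
    // boundary still includes the tile beyond it.
    public static void VisibleRange(
        float camX, float camY, float viewW, float viewH,
        float tileSize, int mapW, int mapH,
        out int minX, out int minY, out int maxX, out int maxY)
    {
        minX = Math.Max(0, (int)Math.Floor(camX / tileSize));
        minY = Math.Max(0, (int)Math.Floor(camY / tileSize));
        maxX = Math.Min(mapW - 1, (int)Math.Floor((camX + viewW) / tileSize));
        maxY = Math.Min(mapH - 1, (int)Math.Floor((camY + viewH) / tileSize));
    }
}
```

Only the tiles inside that range need to be written to the buffer, so the per-frame cost follows the screen size instead of the map size.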

And one more thing, DirectDraw can't even begin to compare to Direct3D for 2D games. Once you sort out the bottleneck in the engine (which isn't Direct3D's fault) you will be amazed at how powerful your engine can be.

The thing that bothers me the most is that it is actually not a hardware problem - otherwise it would occur in OpenGL when you draw a scene one quad at a time.

Also, 1000 DP calls a frame is just not that many, seems to me. What the hell is going on?


barakus:

I honestly don't know if my vertex buffer is dynamic or not - I cannibalized an MDX demo for my project - kind of learning by example here.

This is my VB definition:
    tile_vb = new VertexBuffer(
        typeof(CustomVertex.PositionNormalTextured), 6, device,
        Usage.WriteOnly, CustomVertex.PositionNormalTextured.Format,
        Pool.Default);
    tile_vb.Created += new System.EventHandler(this.OnCreateVertexBuffer);


I do lock it, but only once: when it is initialized, I fill it with a two-triangle list that together makes a square tile.

Then I draw each tile like this (but only for the visible ones):

    private void DrawTile(int texID, float x, float y)
    {
        /* This may need to be optimized at some point. The bottleneck is
         * we change the texture we are rendering with a lot. This is apparently
         * expensive. As a stop gap solution I have added an optimization
         * that takes advantage of temporal coherence. That is, if the texture
         * hasn't changed since last time, don't change it. This will happen
         * a lot given the current homogenous nature of the dungeon. We can
         * increase performance with this even more by rendering the dungeon layers
         * at a time.
         */
        dev.SetStreamSource(0, tile_vb, 0);
        dev.VertexFormat = CustomVertex.PositionNormalTextured.Format;

        //	if(last_tex != texID)
        //	{
            // otherwise a texture swap is not necessary
            dev.SetTexture(0, TileTex[texID]);
        //	}
        //	last_tex = texID;

        Matrix trans = Matrix.Translation(x*(float)GameData.TileSize*zoom,y*(float)GameData.TileSize*zoom,0);
        Matrix scale = Matrix.Scaling(zoom*(float)GameData.TileSize,zoom*(float)GameData.TileSize,1);
        scale.Multiply(trans);
        dev.SetTransform(TransformType.World,scale);//TransformType.World,Matrix.Translation(x,y,0));
        dev.DrawPrimitives(PrimitiveType.TriangleList, 0, 2);
    }


Well, your buffer creation code should look something like this:

    spriteBuffer = new VertexBuffer(
        typeof(CustomVertex.PositionColoredTextured),
        BUFFERSIZE, device, Usage.Dynamic | Usage.WriteOnly,
        CustomVertex.PositionColoredTextured.Format,
        Pool.Default);

This will create a vertex buffer with minimal locking latency.

Now, during the render loop, we iterate through the sprites. If a sprite is visible (i.e. within the view of the camera), we upload it to the buffer. You want to have a variable that counts the number of vertices that have been uploaded to the buffer (the number of sprites multiplied by the number of vertices on each sprite: 4 if you're using an index buffer in conjunction with a vertex buffer, 6 if you're using a vertex buffer alone).
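
To make the 4-versus-6 point concrete: with an index buffer, each sprite stores 4 unique corners and 6 indices pick out the two triangles. A minimal sketch of generating those indices, assuming each quad's corners are stored in the order top-left, top-right, bottom-left, bottom-right; QuadIndices is a hypothetical helper, not an MDX class.

```csharp
using System;

// Hypothetical helper: builds a 16-bit index list for a run of quads that
// each store 4 vertices in the order top-left, top-right, bottom-left,
// bottom-right.
public static class QuadIndices
{
    // 16-bit indices can address 65536 vertices, i.e. 16384 quads.
    public const int MaxQuads = 65536 / 4;

    public static ushort[] Build(int quadCount)
    {
        if (quadCount < 0 || quadCount > MaxQuads)
            throw new ArgumentOutOfRangeException("quadCount");

        var indices = new ushort[quadCount * 6];
        for (int q = 0; q < quadCount; q++)
        {
            int v = q * 4; // first vertex of this quad
            int i = q * 6; // first index of this quad
            indices[i + 0] = (ushort)(v + 0); // triangle 1: TL, TR, BL
            indices[i + 1] = (ushort)(v + 1);
            indices[i + 2] = (ushort)(v + 2);
            indices[i + 3] = (ushort)(v + 2); // triangle 2: BL, TR, BR
            indices[i + 4] = (ushort)(v + 1);
            indices[i + 5] = (ushort)(v + 3);
        }
        return indices;
    }
}
```

Because the pattern never changes, the index list can be built once at startup and reused every frame; only the vertex data has to be rewritten.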

Finally, when you either run out of sprites or need to change a renderstate, you draw everything that's in the buffer at the moment (don't draw the whole buffer, just the number of vertices you have counted), then change the renderstate if needed and start the process again.
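
The count-and-flush loop described above can be sketched like this. It is only an illustration under assumptions: QuadStream is a made-up class, the actual vertex writes are left out, and the draw call is passed in as a delegate so the logic stays independent of the graphics API.

```csharp
using System;

// Hypothetical batching helper: counts vertices as sprites are added and
// issues one draw per run of sprites sharing a texture, or when the
// buffer is full.
public sealed class QuadStream
{
    public const int VertsPerSprite = 6; // vertex buffer alone, no indices

    private readonly int capacity;          // buffer size in vertices
    private readonly Action<int, int> draw; // (texture id, vertex count)
    private int count;                      // vertices uploaded so far
    private int currentTex = -1;

    public QuadStream(int capacityInVertices, Action<int, int> draw)
    {
        if (capacityInVertices < VertsPerSprite)
            throw new ArgumentOutOfRangeException("capacityInVertices");
        if (draw == null) throw new ArgumentNullException("draw");
        this.capacity = capacityInVertices;
        this.draw = draw;
    }

    public void Add(int texId)
    {
        // Flush first if the texture changes or the sprite would not fit.
        if (count > 0 && (texId != currentTex || count + VertsPerSprite > capacity))
            Flush();
        currentTex = texId;
        count += VertsPerSprite; // a real version would write the vertices here
    }

    public void Flush()
    {
        if (count == 0) return;
        draw(currentTex, count); // draw only what was counted, not the whole buffer
        count = 0;
    }
}
```

Call Add for each visible sprite and Flush once at the end of the frame. Sorting the sprites by texture beforehand keeps the number of flushes low.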

A really good article right here on gamedev dealing with this subject can be found here. It's in C++ but it's fairly easy to convert to C#. Pay special attention to issue 4, batching.

Good luck!

EDIT: Also, something really bad about your drawing algorithm is that you're setting the stream source and texture for every tile you draw. This is a big no-no.

This topic is closed to new replies.
