2D in 3D: is my performance too low?

2 comments, last by Uttar 21 years, 9 months ago
Hello everyone,

Maybe this isn't the best forum, because my engine is rectangular right now, but I still think it is, since people on other forums would have no idea about 2D in 3D performance.

My 2D engine currently draws 1650 triangles (825 tiles, no objects). I have a GeForce4 Ti4200 and I'm getting about 187 FPS. But since I'd like to add objects and characters, I'd easily have 10000 triangles, I think. So I thought of an odd way to test this: drawing the same thing 6 times, for 9900 triangles, and with that I get 87 FPS. So, is it really normal to get so little from a Ti4200 with 2D in 3D?

Here are a few things I (hopefully) do right:
1. Using indexed triangle lists with a minimum of DIP (DrawIndexedPrimitive) calls (indexed triangle strips are NOT an option here).
2. Not using alpha blending (I use alpha testing because it doesn't reduce performance that much).
3. Using one large 256x256 texture. The DX SDK explains that they are the fastest, and I don't have to change textures much if I only use 256x256 textures. I can put 64 tiles on each one!
4. Using DirectInput for the keyboard and the Windows message system for the mouse (DI doesn't work well at all with the mouse because I'm in windowed mode).

There are still a few things I'm worried about:
1. How come it seems I'm index limited and NOT vertex limited? When I reduce the vertex count by a LOT (even down to 4, by always drawing the same tile), it doesn't help performance at all.
2. Wouldn't performance be too low if I decided to use lighting or vertex/pixel shaders?

Uttar
well, you are using triangle lists and drawing all the tiles. draw only the tiles on the screen. drawing things the way you are is inefficient, but it is usually the only way 2d in 3d works well. unfortunately 3d cards are not designed to do things the way you are doing them, and you will have to deal with it.

try non-indexed triangle lists. they may be faster. also make sure that you keep the number of locks down as well. this minimum number of DIP calls, i assume, is close to only 3 or 4, since you don't change the texture. the lock count should be the same or less.

tiles don't need alpha testing, unless you are drawing the screen with that many tiles (825 seems like a lot). how big are the tiles and how many do you draw to the screen? i certainly hope you are doing all your own transformations, since it seems that matrix transformations would slow things down immensely. furthermore, don't even bother making a pure device. keep it to software transformations, since you won't be using the d3d TnL pipeline anyway.

remember, you are transferring the vertex data each frame. this slows things down. there is not much you can do, since you think you are doing everything correctly yet don't mention important things like how many Lock() calls you do and how many DIPs you do.
Sorry for not mentioning DIPs and locks.

DIPs: In the 187 FPS example, I use one DIP. In the 87 FPS example, I use 6 DIPs. In each case, right now, each DIP draws 825 tiles.
Locks: Been optimizing that one for a while now, and I'll probably optimize it a little more later. Yep, one lock per frame, potentially 3 or 4 when I've finished the program.
Draw All Screen: That isn't what I do. Remember, this is rectangular. I'm only drawing those 825 tiles. I have a 100x100 array with all the tiles, so I have a potential 10000. I'm certainly NOT drawing that many!
The tiles are 32x32, and they are all in a 256x256 texture.
In a perfect world, I'd be drawing (1024/32)*(768/32), which is 32*24 = 768. But this isn't a perfect world, so I've got to draw 825, and I can't draw any fewer for sure.
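
A rough sketch of that tile count arithmetic, not the engine's actual code: `TileRange`, `visibleTiles` and the scroll parameters are made-up names. It assumes a non-negative scroll position measured in map pixels and a viewport that overlaps the map.

```cpp
#include <algorithm>

// Inclusive range of map tiles that overlap the viewport (hypothetical helper type).
struct TileRange {
    int firstCol, lastCol, firstRow, lastRow;
    int count() const {
        return (lastCol - firstCol + 1) * (lastRow - firstRow + 1);
    }
};

// scrollX/scrollY: top-left corner of the viewport in map pixels (assumed >= 0).
// viewW/viewH: viewport size in pixels. mapCols/mapRows: map size in tiles.
TileRange visibleTiles(int scrollX, int scrollY, int viewW, int viewH,
                       int tileSize, int mapCols, int mapRows) {
    TileRange r;
    // First tile touched by the left/top edge of the viewport.
    r.firstCol = std::max(0, scrollX / tileSize);
    r.firstRow = std::max(0, scrollY / tileSize);
    // Last tile touched by the right/bottom edge (last visible pixel, not one past it).
    r.lastCol = std::min(mapCols - 1, (scrollX + viewW - 1) / tileSize);
    r.lastRow = std::min(mapRows - 1, (scrollY + viewH - 1) / tileSize);
    return r;
}
```

With a 1024x768 viewport aligned to the tile grid this gives 32x24 = 768 tiles. Scrolled by half a tile in both directions it gives 33x25 = 825, which is where the extra row and column come from.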

Transformations: Well, I was using matrices and TnL... Between my post and your response I decided to change to vertex shaders. Works pretty well so far. I'll need a little more experimenting before making 2D lights work, however.

You also mention the idea of non-indexed triangle lists. Sounds pretty odd to me. Could you please explain? I'm saving a lot of vertices by using my index buffer. Without it, I'd have 6000 vertices. This way, I only have about 3500 vertices or less.
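
For reference, the saving comes from each tile needing 4 vertices plus 6 indices instead of 6 full vertices. A minimal sketch of building such an index list follows. `buildQuadIndices` is a hypothetical helper, and the per-tile vertex order (top-left, top-right, bottom-left, bottom-right) is an assumption, not necessarily what the engine uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds 16-bit indices for tileCount quads stored as 4 consecutive vertices each.
// Assumed vertex order per tile: 0 = top-left, 1 = top-right,
// 2 = bottom-left, 3 = bottom-right.
// 16-bit indices limit this to 16384 tiles (65536 vertices).
std::vector<std::uint16_t> buildQuadIndices(std::size_t tileCount) {
    // Two triangles per quad, both wound the same way.
    const unsigned offsets[6] = {0, 1, 2, 2, 1, 3};
    std::vector<std::uint16_t> indices;
    indices.reserve(tileCount * 6);
    for (std::size_t t = 0; t < tileCount; ++t) {
        for (unsigned offset : offsets) {
            indices.push_back(static_cast<std::uint16_t>(t * 4 + offset));
        }
    }
    return indices;
}
```

For 825 tiles this yields 4950 indices referring to 3300 vertices, versus 4950 vertices for a non-indexed triangle list.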

Thanks for the idea of not using alpha testing with the tiles. I'll only activate it when there are objects. Forgot that one.

Uttar
TnL and matrices are bad. vertex shaders are bad. use transformed (XYZRHW) vertices. since you are doing VERY simple translations, it will be much faster to do addition and subtraction than to do the matrix multiplications. also, sprites DEFINITELY will not be suitable for using matrices. later on i discuss why using index buffers is slower than straight vertex buffers.

your number of locks is good, as well as the number of calls to DIP. this actually surprised me. most people who ask for help on this are usually making one of those two mistakes. even 6-7 locks is plenty acceptable. basically, if you are locking each vertex buffer you have only once or twice per frame, and you are changing over 400-600 vertices each time, you're good. just don't ever read from the vertex buffers (i doubt you are though).

i tested my tile renderer, which does transformations in software (ie XYZRHW vertices), and drew a 33x25 map (ie 825 tiles) of 32x32 pixel tiles from a pretty large texture (1024x512) using DrawPrimitive(). they are in a triangle list. the vertex buffer gets locked once per frame and all the tile positions are copied into it. i maintain a local copy of the vertex buffer (ie i malloc() a buffer) which stores the initial positions of the tiles. when copying to the vertex buffer i handle offsetting the screen positions, as well as only copying the tiles that will be on the screen (ie my world is actually 100x100 tiles).
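
A minimal sketch of that copy step, under stated assumptions: `Tile`, `TileVertex` and `fillVisibleTiles` are hypothetical names, the vertex layout is six floats (roughly what an XYZRHW position plus one texture coordinate pair would need), and `out` stands in for the pointer a vertex buffer lock would return. It is not the renderer described above, just the same idea of translating by subtraction and skipping off-screen tiles.

```cpp
#include <cstddef>
#include <vector>

// Pre-transformed vertex: screen x, y, depth z, reciprocal w, texture u, v (assumed layout).
struct TileVertex { float x, y, z, rhw, u, v; };

// CPU-side copy of one tile: position in map pixels plus its texture rectangle.
struct Tile { float mapX, mapY, u0, v0, u1, v1; };

// Writes six vertices (two triangles) per visible tile into out.
// Returns the number of vertices written; stops early if capacity would be exceeded.
std::size_t fillVisibleTiles(const std::vector<Tile>& tiles,
                             float scrollX, float scrollY,
                             float viewW, float viewH, float tileSize,
                             TileVertex* out, std::size_t capacity) {
    std::size_t n = 0;
    for (const Tile& t : tiles) {
        // The "transformation" is just a subtraction of the scroll offset.
        const float x0 = t.mapX - scrollX, y0 = t.mapY - scrollY;
        const float x1 = x0 + tileSize, y1 = y0 + tileSize;
        // Skip tiles that lie completely outside the viewport.
        if (x1 <= 0.0f || y1 <= 0.0f || x0 >= viewW || y0 >= viewH) continue;
        if (n + 6 > capacity) break;
        const TileVertex quad[6] = {
            {x0, y0, 0.0f, 1.0f, t.u0, t.v0},  // top-left
            {x1, y0, 0.0f, 1.0f, t.u1, t.v0},  // top-right
            {x0, y1, 0.0f, 1.0f, t.u0, t.v1},  // bottom-left
            {x0, y1, 0.0f, 1.0f, t.u0, t.v1},  // bottom-left
            {x1, y0, 0.0f, 1.0f, t.u1, t.v0},  // top-right
            {x1, y1, 0.0f, 1.0f, t.u1, t.v1},  // bottom-right
        };
        for (const TileVertex& v : quad) out[n++] = v;
    }
    return n;
}
```

The return value divided by 3 would be the primitive count to pass to the draw call. A real engine would probably loop only over the visible rows and columns of the map rather than testing every tile.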

i get 247 frames per second when drawing just the 825 tiles. i am using a geforce3. i was running at 1024x768x32 in windowed mode (ie it was a WS_POPUP window sized to fill the screen, so isWindowed in the d3dparms struct was 1).

this translates to about 407,550 polygons per second, which is not too shabby. increasing the number of times things get drawn actually decreases the polygon throughput, mainly because it is redrawing the entire screen each time, so fillrate gets eaten rather quickly. making the tiles smaller means i can get a higher polygon throughput. this means that most of the drop in performance you see when redrawing the screen to test theoretical limits is due to fillrate. in reality, after you draw the tiles, you are unlikely to fill the screen more than 2 or 3 times (depending on whether you have layers and such).

don't forget, triangle lists are slow, so they won't be as fast as other primitive types. so don't expect throughput similar to normal 3d games (though you seem to understand this). unfortunately tile based games don't allow the use of anything other than triangle lists.

the number of vertices you are moving is important when dealing with transformations in normal 3d games. with the 2d in 3d problem, transformations are not that expensive and don't really warrant using TnL. the TnL will do transformations as if the vertices were in a 3d world, thus more work. you can easily outmatch the TnL hardware using simple addition and subtraction for your translations. furthermore, not using index buffers means the card does not have to read from two buffers, only a single one. this is faster. index buffers were a way to reduce the number of transformations that you had to do to draw the polygons. again, transformations are cheap since you can do them with simple addition and subtraction in 2d. don't worry about sending too much vertex data over the agp bus. it is unlikely you will saturate its bandwidth before you hit fill rate limitations. this goes for tile based games only (ie 2d in 3d stuff).
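
To put a rough number on the bus traffic, here is a back-of-the-envelope sketch. The 24-byte vertex (six 32-bit floats) and 16-bit indices are assumptions, and the function names are made up.

```cpp
#include <cstddef>

// Assumed vertex size: x, y, z, rhw, u, v as six 32-bit floats.
constexpr std::size_t kVertexBytes = 6 * 4;
// Assumed index size: 16-bit indices.
constexpr std::size_t kIndexBytes = 2;

// Bytes per frame for an indexed list: 4 vertices + 6 indices per tile.
std::size_t indexedBytes(std::size_t tiles) {
    return tiles * (4 * kVertexBytes + 6 * kIndexBytes);
}

// Bytes per frame for a non-indexed list: 6 full vertices per tile.
std::size_t nonIndexedBytes(std::size_t tiles) {
    return tiles * 6 * kVertexBytes;
}

// Upload rate in megabytes per second (1 MB = 1e6 bytes here).
double megabytesPerSecond(std::size_t bytesPerFrame, double fps) {
    return static_cast<double>(bytesPerFrame) * fps / 1.0e6;
}
```

Under these assumptions 825 tiles take 118,800 bytes per frame non-indexed and 89,100 bytes indexed (less if the indices never change and are uploaded only once). At 247 frames per second the non-indexed case is a bit under 30 MB/s, which is small next to the roughly 1 GB/s peak usually quoted for AGP 4x.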

hopefully when you try these suggestions your engine will be blazing and you won't feel so restricted about how much you can have on screen in your game.


also a stupid question, but are you compiling to release mode?


i am sorry if this post sounds a bit contradictory to the previous post (ie in relation to sending vertex data over the agp bus). i did not realize just how low the framerate you were getting was until i tested my engine. i forgot just how many tiles were needed to fill the screen. i am glad i went back to my old tile engine for testing.

also, don't forget to make sure no windows are on top of yours. this can cause stutters and slowdowns in the framerate as well.

[edited by - a person on July 2, 2002 4:23:53 AM]
