Performance questions for a 2D game

13 comments, last by zedzeek 18 years, 8 months ago
Hi, I'm working on a tile-based 2D RPG with some other folks. Anyway, when I joined the project it was running at 200 fps, which is pretty horrid since my machine should be able to handle a 2D RPG without flinching...

So, previously we had every image in its own file, and each file was loaded into its own texture. I wrote a system that still lets you keep images in their own files (for convenience), but at run time it fits multiple images into one texture. So now the game runs at 650 fps. Actually I was surprised that it ran so fast- I hadn't even written a system to sort images by texture when rendering. However, in practice it works quite well because I load all tiles into one texture and all characters into another texture, so batching is kind of automatic.

Now, the other optimization that I'm considering is batching vertex submission. Currently I'm just sending vertices one at a time with glVertex2f() and friends. Of course in a 3D game this would be the kiss of death, but for a 2D game I'm not sure, since there aren't as many vertices to push through. Also, every frame I'd basically have to create a new vertex array and fill it up, so that probably has some performance impact.

What do you guys think, should I spend some time on improving the vertex submission? If so, what would be a good way to do it- filling up a vertex array every frame?

Thanks very much,
roos
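
For readers wondering what "fitting multiple images into one texture" can look like, here is a minimal sketch. It is not roos's actual system (the thread never shows it); it assumes a simple "shelf" strategy that fills a square texture row by row, and the names Atlas, atlas_init and atlas_place are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shelf packer: places images left to right in rows
   ("shelves") inside one square atlas texture. */
typedef struct {
    int size;      /* atlas width and height in pixels */
    int cursor_x;  /* next free x on the current shelf */
    int shelf_y;   /* top edge of the current shelf */
    int shelf_h;   /* height of the tallest image on the current shelf */
} Atlas;

void atlas_init(Atlas *a, int size) {
    assert(a != NULL && size > 0);
    a->size = size;
    a->cursor_x = 0;
    a->shelf_y = 0;
    a->shelf_h = 0;
}

/* Reserves a w x h region. On success writes its top-left corner to
   (*out_x, *out_y) and returns true; returns false if it does not fit. */
bool atlas_place(Atlas *a, int w, int h, int *out_x, int *out_y) {
    assert(a != NULL && out_x != NULL && out_y != NULL);
    if (w <= 0 || h <= 0 || w > a->size) return false;
    if (a->cursor_x + w > a->size) {   /* row is full: open a new shelf */
        a->shelf_y += a->shelf_h;
        a->cursor_x = 0;
        a->shelf_h = 0;
    }
    if (a->shelf_y + h > a->size) return false;  /* atlas is full */
    *out_x = a->cursor_x;
    *out_y = a->shelf_y;
    a->cursor_x += w;
    if (h > a->shelf_h) a->shelf_h = h;
    return true;
}
```

Each image's texture coordinates then come from dividing its placed rectangle by the atlas size, e.g. u0 = x / size and u1 = (x + w) / size. Sorting images by height before placing them usually wastes less space with this kind of packer.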
You don't have to do any complicated batching with VBOs or vertex arrays.
One optimization you could do (assuming you haven't already) is to render as many quads as possible between glBegin()...glEnd(). I'm not sure how much that would help, because I mainly work with DirectX.
But anyway, vertex submission isn't the worst bottleneck, especially in a 2D game; fillrate is. It seems like your game is running pretty fast already, so I don't think you should spend any more time optimizing vertex submission.
650 FPS isn't anything to laugh at. (Neither is 200 FPS, but...) If you're getting speeds like this, optimizing the rendering should probably be the last of your worries. The only time I ever worry about my frame rate is when it starts dropping to around that magic 60 number ^_^

In any case, if you DO want to squeeze every last frame out of it, I don't know that vertex batching is going to help a whole lot. As pointed out earlier, at this point you're probably 100% fill-rate limited, so efficient vertex transfer won't make a lick of visible difference. I would look into a simple sort-by-material/texture routine, like you mentioned, but I think the biggest thing to look into may be finding a way to "cache" your scene as much as possible.
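
A sort-by-texture routine can be quite small. The sketch below is only an illustration: the Sprite struct and the function names are assumptions, not anything from the game in question. It sorts with qsort and counts how many binds a draw loop would need if it only binds when the texture changes from one sprite to the next.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sprite record; only the texture id matters for sorting. */
typedef struct {
    unsigned texture;  /* texture id the sprite samples from */
    float x, y;        /* screen position */
} Sprite;

static int by_texture(const void *a, const void *b) {
    unsigned ta = ((const Sprite *)a)->texture;
    unsigned tb = ((const Sprite *)b)->texture;
    return (ta > tb) - (ta < tb);
}

/* Sorts sprites so that those sharing a texture end up adjacent. */
void sort_by_texture(Sprite *sprites, size_t count) {
    assert(sprites != NULL || count == 0);
    qsort(sprites, count, sizeof *sprites, by_texture);
}

/* Counts the texture binds needed when a bind is issued only if the
   texture differs from the previous sprite's texture. */
size_t count_binds(const Sprite *sprites, size_t count) {
    size_t binds = 0;
    for (size_t i = 0; i < count; i++)
        if (i == 0 || sprites[i].texture != sprites[i - 1].texture)
            binds++;
    return binds;
}
```

One caveat for 2D: sorting changes draw order, and qsort is not guaranteed to be stable, so with overlapping or blended sprites it is probably safest to sort only within a single layer.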

While I'm not familiar with this game's actual setup, most of the time a 2D RPG involves very few moving characters and a lot of static background. If you could render the static areas to a texture at certain view-change intervals, you would be able to quickly throw the background up on screen and only draw the dynamic characters. This may or may not work for your setup, but it's worth looking into.
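
One way to decide when the cached background has to be redrawn is sketched below. It assumes the cache texture covers a world-space rectangle somewhat larger than the screen, so the camera can move a little before the cache goes stale. BackgroundCache, cache_covers and cache_update are hypothetical names, and the actual render-to-texture step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* World-space rectangle covered by the cached background texture.
   Assumed to be larger than the view so there is a scroll margin. */
typedef struct {
    int x, y, w, h;
    bool valid;   /* false until the cache has been rendered once */
} BackgroundCache;

/* True if the view rectangle lies entirely inside the cached area. */
bool cache_covers(const BackgroundCache *c, int vx, int vy, int vw, int vh) {
    assert(c != NULL);
    return c->valid &&
           vx >= c->x && vy >= c->y &&
           vx + vw <= c->x + c->w &&
           vy + vh <= c->y + c->h;
}

/* Call once per frame with the current view rectangle. Returns true when
   the caller must re-render the static tiles into the cache texture; the
   cached rectangle is then recentered on the view. */
bool cache_update(BackgroundCache *c, int vx, int vy, int vw, int vh) {
    if (cache_covers(c, vx, vy, vw, vh)) return false;
    c->x = vx - (c->w - vw) / 2;
    c->y = vy - (c->h - vh) / 2;
    c->valid = true;
    return true;
}
```

On frames where cache_update returns false, the background is a single textured quad offset by the difference between the view origin and the cache origin. Changing a tile would also need to clear the valid flag.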
// The user formerly known as Tojiro67445, formerly known as Toji [smile]
Quote:Original post by centipede
You don't have to do any complicated batching with VBOs or vertex arrays.
One optimization you could do (assuming you haven't already) is to render as many quads as possible between glBegin()...glEnd(). I'm not sure how much that would help, because I mainly work with DirectX.
But anyway, vertex submission isn't the worst bottleneck, especially in a 2D game; fillrate is. It seems like your game is running pretty fast already, so I don't think you should spend any more time optimizing vertex submission.


Thanks, that info really helps! Hmm, I'm a bit rusty with OpenGL myself since I've been using DirectX a lot the last year. So... Putting all the draws in between a glBegin() and glEnd() sounds like a good idea- I'll have to look into one thing though... I know there are certain OpenGL commands that you're NOT allowed to have within a glBegin/glEnd block, but I can't remember which. That might be a problem since for each image that's drawn, there's a little bit of setup code that needs to be done like binding texture, and setting up a transform using mostly glTranslatef.
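
On the glBegin/glEnd restriction: glBindTexture and matrix calls such as glTranslatef are among the commands that are not allowed inside a glBegin/glEnd block. One way around the per-image transform is to add the sprite position to the vertices on the CPU. The sketch below assumes an interleaved Vertex layout and a push_quad helper, both made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved vertex: position plus texture coordinate. */
typedef struct { float x, y, u, v; } Vertex;

/* Appends one textured quad to verts, with the sprite position already
   added to each corner so no per-sprite glTranslatef is needed.
   (u0,v0)-(u1,v1) is the sprite's sub-rectangle in the shared texture.
   The caller must ensure verts has room for 4 more entries.
   Returns the new vertex count. */
size_t push_quad(Vertex *verts, size_t count,
                 float x, float y, float w, float h,
                 float u0, float v0, float u1, float v1) {
    assert(verts != NULL);
    verts[count + 0] = (Vertex){ x,     y,     u0, v0 };
    verts[count + 1] = (Vertex){ x + w, y,     u1, v0 };
    verts[count + 2] = (Vertex){ x + w, y + h, u1, v1 };
    verts[count + 3] = (Vertex){ x,     y + h, u0, v1 };
    return count + 4;
}
```

The filled array can then be walked inside a single glBegin(GL_QUADS) block with glTexCoord2f and glVertex2f, or handed to glVertexPointer and glTexCoordPointer (stride sizeof(Vertex)) followed by glDrawArrays. Either way, all quads in one batch have to share one texture.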

I have another question while I'm at it. Say you call glBindTexture() 500 times for drawing 500 images, but each time you bind to the same texture. Should I implement a check that keeps track of the last bound texture and skips binding the texture if you do the same one twice? My guess is that it's not necessary- OpenGL already seems to eliminate the 499 redundant calls, judging from how my frame rate skyrocketed when I put all my images in 1 texture.

Quote:Original post by skittleo
200fps is not slow. Did you mean 20? Or could your problem lie in your timing system?


Yeah that's a good point :) There are two reasons I said 200 fps is slow... Firstly it's because this was a release build on a very very fast machine, so on lower-end machines it'd be likely to run much slower. Secondly, our game isn't done yet- right now it was giving 200 fps on just a simple tile map. In the near future we're planning to add loads of eye candy, like realistic water, multitexturing for smooth transitions between tiles, simple 2D lighting effects, particle systems, etc... So, I wanted to make sure that at least the core of the rendering was decently optimized before going forwards.

Quote:Original post by Toji
In any case, if you DO want to squeeze every last frame out of it, I don't know that vertex batching is going to help a whole lot. As pointed out earlier, at this point you're probably 100% fill-rate limited, so efficient vertex transfer won't make a lick of visible difference.


Hmm, yeah I guess you're right :) I'll probably just leave the tile rendering as it is and only worry about vertex batching for the particle systems.

Quote:Original post by Toji
While I'm not familiar with this game's actual setup, most of the time a 2D RPG involves very few moving characters and a lot of static background. If you could render the static areas to a texture at certain view-change intervals, you would be able to quickly throw the background up on screen and only draw the dynamic characters. This may or may not work for your setup, but it's worth looking into.


Thanks, I'll check into it, although I'm not quite sure how that'd work for this setup. The two thorns in my side are that the background is constantly scrolling, which makes it a bit tougher, and that some of the map tile layers aren't 100% "background" since they can be displayed on top of the character.

Anyway again thanks very much guys, I guess I'll ditch the idea of optimizing the vertex submission, and I might even not bother sorting them when rendering since grouping images into the same texture largely solves that problem.

Thanks,
roos
Quote:Original post by roos

I have another question while I'm at it...

Usually drivers notice redundant render state settings and won't execute them. So, you don't need to add any checking for them in your code (although it wouldn't hurt).
Sweet, thanks :)
Quote:Original post by centipede
Quote:Original post by roos

I have another question while I'm at it...

Usually drivers notice redundant render state settings and won't execute them. So, you don't need to add any checking for them in your code (although it wouldn't hurt).


Not true, e.g.

for (i = 0; i < 10000; i++)
    glBindTexture(GL_TEXTURE_2D, grass_textureID);

is 10x slower (not a few percent but 1000%) than

for (i = 0; i < 10000; i++)
{
    /* only bind when the texture actually changes */
    if (currenttextureID != grass_textureID)
    {
        glBindTexture(GL_TEXTURE_2D, grass_textureID);
        currenttextureID = grass_textureID;
    }
}
Quote:Original post by zedzeek
not true
eg
....

I'd disagree with you there, zedzeek. I've performed similar tests on multiple cards and MOST of the time the speed comes out just about even.

I think that the exact behavior in this case is entirely driver dependent. A driver maker is allowed a lot of interpretation of the OpenGL specs when it comes to optimizations like this, and thus while some drivers may contain a simple redundancy check, others may not. I think the best path for a game programmer is to implement one yourself regardless, as a double redundancy check (yours and the driver's) is certainly going to be less painful than none at all.
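
A do-it-yourself check along those lines could be wrapped up roughly as follows. This is a sketch under assumptions: it tracks a single texture target on a single texture unit, and TextureState and the two functions are hypothetical names. It only decides whether a bind is needed; the caller still makes the real glBindTexture call.

```c
#include <assert.h>
#include <stddef.h>

/* Remembers the last texture bound through this helper. A separate flag
   is used because texture name 0 is itself a valid binding. */
typedef struct {
    unsigned current;
    int has_current;   /* 0 until the first bind is recorded */
} TextureState;

/* Returns 1 if `texture` differs from the last recorded one (the caller
   should then call glBindTexture), 0 if the bind would be redundant. */
int texture_needs_bind(TextureState *s, unsigned texture) {
    assert(s != NULL);
    if (s->has_current && s->current == texture) return 0;
    s->current = texture;
    s->has_current = 1;
    return 1;
}

/* Forget the recorded binding, e.g. after other code has bound a texture
   directly or the bound texture has been deleted. */
void texture_state_reset(TextureState *s) {
    assert(s != NULL);
    s->has_current = 0;
}
```

Usage would look something like: if (texture_needs_bind(&state, id)) glBindTexture(GL_TEXTURE_2D, id);. The check is only reliable if every bind in the program goes through it, which is why the reset function is there.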

// The user formerly known as Tojiro67445, formerly known as Toji [smile]
The best way, in my opinion, to do what you are trying to do is to batch as many sprites as you can into one glDrawElements call. What you need to do is create an index buffer (in GL terms, a GLuint* buffer) that holds the indices into the vertex buffer for the sprites to draw. For example, if you have 3 textures then you need 3 index buffers, one for each texture. At load time you sort the sprites by texture and put their vertex indices into the matching index buffer; then all you have to do in your rendering is draw the buffers one by one. So if we had 3 textures we would just do
for (int i = 0; i < 3; i++) {
    /* tex[i], indexCount[i] and buffer[i] are placeholder per-texture arrays */
    glBindTexture(GL_TEXTURE_2D, tex[i]);
    /* count is the number of indices (not sizeof); GLuint indices need GL_UNSIGNED_INT */
    glDrawElements(GL_QUADS, indexCount[i], GL_UNSIGNED_INT, buffer[i]);
}

So as you can see, the texture gets set only 3 times and you have all your sprites on screen. I hope you understand what I am trying to say. I did something like that last year, and when I rendered around 2000 sprites I was still getting 600 fps.
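
Building the per-texture index buffers described above might look like the sketch below. It assumes sprite i owns vertices 4*i to 4*i+3 in one shared vertex array and that each sprite's texture id is available in a plain array. build_indices is a hypothetical helper, and plain unsigned stands in for GLuint.

```c
#include <assert.h>
#include <stddef.h>

/* Fills `indices` with the vertex indices of every sprite that uses
   `texture`, four per quad, assuming sprite i owns vertices
   4*i .. 4*i+3 in the shared vertex array.
   `indices` must have room for 4 * sprite_count entries.
   Returns the number of indices written. */
size_t build_indices(const unsigned *sprite_textures, size_t sprite_count,
                     unsigned texture, unsigned *indices) {
    assert(sprite_textures != NULL && indices != NULL);
    size_t n = 0;
    for (size_t i = 0; i < sprite_count; i++) {
        if (sprite_textures[i] != texture) continue;
        for (unsigned corner = 0; corner < 4; corner++)
            indices[n++] = (unsigned)(4 * i) + corner;
    }
    return n;
}
```

The return value is what would be passed as the count argument of glDrawElements for that texture, with GL_UNSIGNED_INT as the index type. If sprites move but do not change texture, only the vertex array needs refilling each frame; the index buffers can stay as built at load time.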
Quote:Original post by Toji
Quote:Original post by zedzeek
not true
eg
....

I'd disagree with you there, zedzeek. I've performed similar tests on multiple cards and MOST of the time the speed comes out just about even.


Did you try on an NVIDIA card? I tested the above bit of code about a month ago and there's a huge difference.

This topic is closed to new replies.
