While browsing topics about SDL, I saw SFML mentioned as a potentially better and more sophisticated API for certain purposes. I haven't had performance issues with my game in SDL yet; what made me try converting to SFML is the scaling, rotation and alpha blending features it has. Even though my game isn't crazy big, I decided to first rewrite my very first working build, which I hadn't split into multiple cpp and header files yet and which was only about 1K lines.
After I rewrote it successfully and ran it in Debug (in Visual Studio 2010), it ran at about 5 FPS or lower. After some searching I saw it mentioned that the debug libraries are slower and that I should use Release instead, so I did. When I ran that inside Visual Studio with F5 it was faster, but the frame rate was still unacceptable (60 FPS is the target). When I just built an executable and ran it outside VS it ran okay, but when I increased the number of particles drawn a bit, the performance dropped. When I compared the SDL build with the SFML build, the SDL one performed better.
Since the SFML sample projects that I compiled with the exact same settings, in the same solution file, ran at 2000+ FPS, I figured there must be something wrong with my code and that I am probably misusing some expensive SFML calls.
Most of the game objects are added this way:
void World::explosion(int x, int y){
for (int i=0, r=((rand() % 100) -80); i < 120 + r ; i++){
int xx = x + rand() % 10 - 5;
int yy = y + rand() % 10 - 5;
Every actual image is only loaded once, during the initialization, and every object only has a sprite associated with the image. The objects will be drawn later in the game loop.
Later I discovered that even if I don't draw the objects and don't use any SFML draw calls at all during the game loop, and just have the C++ logic running with its vector.push_back loops, the frame rate still drops drastically.
This would make me think that what I am doing is too expensive on the CPU, that too many vector operations are going on, and that this has nothing to do with SFML. But all this runs perfectly fine when I use SDL.
At this point I am not even sure what exactly to ask, or what information to provide. Is this perhaps something to do with my VS 2010 solution settings? How come the sample projects built with the debug SFML library in Visual Studio (pong, for example) run at 800+ FPS, while my game only gets 1-5 with those settings?
I have also removed event polling and the clock, to make sure I am not doing something bad with them that would affect the frame rate.
It looks like the C++ operations are the bottleneck, because the frame rate is the same even if I comment out the SFML draw lines. Yet the exact same, unchanged logic runs just fine in the SDL solution with SDL calls.
How should I tackle this problem, and what do you guys think causes it?
Let me know if you need my settings, parts of my code, or anything else.
Thanks in advance
EDIT:
Here are the frame rates of the sample pong.cpp included with SFML, and with my project
All are compiled from the same VS 2010 solution file
pong | Debug configuration | Started with F5 (Start Debugging) : 130 FPS
pong | Debug configuration | Started with CTRL + F5 (Start without Debugging) : 175 FPS
pong | Release configuration | Started with F5 (Start Debugging) : 700 FPS
pong | Release configuration | Started with CTRL + F5 (Start without Debugging) : 1600 FPS
myGame | Debug configuration | Started with F5 (Start Debugging) : 0 FPS, didn't draw another frame for 20 seconds
myGame | Debug configuration | Started with CTRL + F5 (Start without Debugging) : ~1 FPS
myGame | Release configuration | Started with F5 (Start Debugging) : 2-3 FPS
myGame | Release configuration | Started with CTRL + F5 (Start without Debugging) : 27 FPS
If you know beforehand how many elements will be in the vector, call reserve once before the loop (or resize, if you then assign by index instead of calling push_back) so the vector allocates its storage a single time instead of growing repeatedly inside the loop. To keep its elements contiguous in memory, a dynamic array must allocate a new, larger array and move the existing elements into it every time you add an element while it is at full capacity.
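int ct = 120 + (rand() % 100) - 80; // same particle count as the loop in World::explosion
v_particles.reserve( v_particles.size() + ct ); // room for the existing particles plus the new ones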
for ( int idx = 0; idx < ct; ++idx ) {
int xx = x + rand() % 10 - 5;
int yy = y + rand() % 10 - 5;
v_particles.push_back( Particle(xx, yy, 6) );
}
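To see the effect of reserve in isolation, here is a minimal sketch that counts how often the vector's capacity changes while particles are pushed. The Particle struct and the count_reallocations helper are hypothetical stand-ins, not the code from the original post.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the game's Particle class.
struct Particle {
    int x, y, type;
    Particle(int x_, int y_, int type_) : x(x_), y(y_), type(type_) {}
};

// Pushes `count` particles and returns how many times the vector's
// capacity changed, i.e. how many times it had to reallocate.
std::size_t count_reallocations(std::size_t count, bool reserve_first) {
    std::vector<Particle> v;
    if (reserve_first) {
        v.reserve(count); // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(Particle(0, 0, 6));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the count should be zero. Without it, the exact number depends on the standard library's growth factor, so treat it as something to measure rather than a fixed value.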
Also, if you're using Visual Studio, grab AMD CodeAnalyst, profile your code, and find out where exactly your bottleneck lies.
I have tried using vector.reserve before, set it to a large number, but it made no difference in performance.
Either way, the exact same code runs faster in my SDL version than in the SFML one, even if SFML's draw isn't called and only the logic is running...
You mentioned you changed it to the "Release" configuration. Does your "Release" configuration link with the debug versions of your libraries or the release versions?
11 functions, 82 instructions, Total: 442 samples, 100.00% of shown samples, 2.27% of total session samples
So here is how I tested performance: since there aren't many objects in this early version of my game, I increased the number of particles coming out of the player's thruster (the same number in both SDL and SFML).
In the SFML one it seems like this takes up everything.
Here is the code for Particle::show:
void Particle::show()
{
//Show image
if (initialized == false){
if (type == 1) part = p1;
else if (type == 2) part = p2;
else if (type == 3) part = p3;
else if (type == 4) part = p4;
else if (type == 5) part = bparticle;
else if (type == 6){
int pick = rand() % 4; // roll once so each of the four images is equally likely
if (pick == 0) part = p1;
else if (pick == 1) part = p2;
else if (pick == 2) part = p3;
else part = p4;}
initialized = true;
}
//Animate
//stuff here is not relevant, and its exactly the same in both
EDIT:
Since then I tested it with part.SetPosition(x,y); and App.Draw(part); commented out, so they are not drawn, and still Particle::show takes up 61% of the resources (was 74 previously)... with that commented out, the SDL and SFML Particle::show() are identical, except in SDL they actually get drawn.
I'll put the same code here for SDL for comparison, so you won't have to scroll back and forth.
void Particle::show()
{
//Show image
if (part == NULL){
if (type == 1) part = p1;
else if (type == 2) part = p2;
else if (type == 3) part = p3;
else if (type == 4) part = p4;
else if (type == 5) part = bparticle;
else if (type == 6){
int pick = rand() % 4; // roll once so each of the four images is equally likely
if (pick == 0) part = p1;
else if (pick == 1) part = p2;
else if (pick == 2) part = p3;
else part = p4;}
}
if(alive)
apply_surface( x, y, part, screen );
//Animate
And here is the CodeAnalyst result for the SDL code; Particle::show doesn't strain it nearly as much.
EDIT:
Since then I tested it with part.SetPosition(x,y); and App.Draw(part); commented out, so they are not drawn, and still Particle::show takes up 61% of the resources (was 74 previously)... with that commented out, the SDL and SFML Particle::show() are identical, except in SDL they actually get drawn.
This would suggest that the API is the bottleneck (as indicated earlier in the thread). If I recall correctly, SFML uses OpenGL 1.1 immediate-mode calls, which would mean that for every particle rendered there's a call to glBindTexture, glPushMatrix/glPopMatrix, and glBegin/glEnd. For something like a particle system, the overhead of these calls, while not significant individually, can snowball. To reduce the overhead from texture switching, place all of your particle textures on one sheet. Since SFML doesn't seem to have any feature that will allow us to assign a part of an Image to a Sprite, you'll need to do the rendering yourself through raw OpenGL calls. Doing so lets you eliminate some OpenGL calls and batch your draws, among other things.
For further optimization, you can use VBOs: update the vertices of all the particles on the CPU, then send the entire buffer to the GPU in one call.
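As a rough sketch of the CPU side of that batching idea, the function below packs every particle into one contiguous vertex array. The Vertex and ParticleQuad structs and build_batch are names made up for this illustration, and no OpenGL is called here. In a real renderer you would presumably upload the array once per frame (for example with glBufferData) and issue a single draw call for it.

```cpp
#include <cstddef>
#include <vector>

// One corner of a textured quad: screen position plus texture coordinate.
struct Vertex {
    float x, y;
    float u, v;
};

// Hypothetical per-particle data: top-left position, edge length, and the
// particle's sub-rectangle on the shared sheet in normalized coordinates.
struct ParticleQuad {
    float x, y, size;
    float u0, v0, u1, v1;
};

// Appends four vertices per particle into one contiguous array, in the
// order top-left, top-right, bottom-right, bottom-left.
std::vector<Vertex> build_batch(const std::vector<ParticleQuad>& particles) {
    std::vector<Vertex> batch;
    batch.reserve(particles.size() * 4);
    for (std::size_t i = 0; i < particles.size(); ++i) {
        const ParticleQuad& p = particles[i];
        const Vertex top_left     = { p.x,          p.y,          p.u0, p.v0 };
        const Vertex top_right    = { p.x + p.size, p.y,          p.u1, p.v0 };
        const Vertex bottom_right = { p.x + p.size, p.y + p.size, p.u1, p.v1 };
        const Vertex bottom_left  = { p.x,          p.y + p.size, p.u0, p.v1 };
        batch.push_back(top_left);
        batch.push_back(top_right);
        batch.push_back(bottom_right);
        batch.push_back(bottom_left);
    }
    return batch;
}
```

Because all particles share one sheet, the whole array can be drawn with a single texture bind, which is where the savings over per-sprite drawing would come from.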
SFML 1.6 is definitely too slow for this.
SFML 2.0 (current) is better if you draw multiple sprites that use the same texture.
SFML 2.0 (future) will be much better, just wait a little bit.
To reduce the overhead from texture switching, place all of your particle textures on one sheet. Since SFML doesn't seem to have any feature that will allow us to assign a part of an Image to a Sprite, you'll need to do the rendering yourself through raw OpenGL calls. By doing so, you can optimize out some OpenGL calls and will allow you to do batching, among other things:
As a corollary to my other post, this is why it doesn't work in 1.6 and sorta works in 2.0. Instead of using sf::Image, you use sf::Texture in 2.0, which allows you to load your image from a single sprite sheet.
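Whichever API ends up doing the drawing, picking one image out of a sprite sheet comes down to computing its sub-rectangle. Here is a minimal sketch, assuming the sheet is a regular grid of equally sized cells numbered left to right, top to bottom. TexRect and atlas_cell are hypothetical names, not part of SFML.

```cpp
// Normalized texture rectangle: (u0, v0) is the top-left corner and
// (u1, v1) the bottom-right corner, both in the range 0..1.
struct TexRect {
    float u0, v0, u1, v1;
};

// Returns the rectangle of cell `index` on a sheet laid out as a grid of
// `columns` x `rows` equally sized images (an assumed layout).
TexRect atlas_cell(int index, int columns, int rows) {
    const int col = index % columns;
    const int row = index / columns;
    const float cell_w = 1.0f / static_cast<float>(columns);
    const float cell_h = 1.0f / static_cast<float>(rows);
    TexRect r;
    r.u0 = col * cell_w;
    r.v0 = row * cell_h;
    r.u1 = r.u0 + cell_w;
    r.v1 = r.v0 + cell_h;
    return r;
}
```

For the four particle images p1 to p4 on a 2 x 2 sheet, indices 0 to 3 would each map to one quarter of the texture.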