SDL->SFML conversion; very strange performance issues


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

19 replies to this topic

#1 chondee   Members   -  Reputation: 135

Posted 16 January 2012 - 10:58 PM

Hi everyone,

While browsing topics about SDL, I saw SFML mentioned as a potentially better and more sophisticated API for certain purposes. I haven't had performance issues with my game in SDL yet; what made me try converting to SFML is its scaling, rotation and alpha blending features. Even though my game isn't crazy big, I decided to first rewrite my very first working build, which I hadn't yet divided into multiple cpp and header files and which was only about 1K lines.

After I had rewritten it successfully and ran it in Debug (in Visual Studio 2010), it ran at about 5 FPS or lower. After some searching around I saw it mentioned that the debug libraries are slower and that I should use Release instead, so I did. When I ran that inside Visual Studio with F5 it was faster, but the frame rate was still unacceptable (60 FPS is the target). When I just built an executable and ran it outside VS it ran okay, but when I increased the number of particles drawn a bit, the performance dropped. When I compared the SDL build with the SFML build, the SDL one got better performance.

Since the SFML sample projects that I compiled with the exact same settings, in the same solution file, ran at 2000+ FPS, I figured there must be something wrong with my code and that I am probably misusing some expensive SFML calls.

Most of the game objects are added this way:
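void World::explosion(int x, int y){
    // r is in [-80, 19], so each explosion spawns between 40 and 139 particles
    for (int i = 0, r = (rand() % 100) - 80; i < 120 + r; i++){
        // scatter each particle within about 5 pixels of the centre
        int xx = x + rand() % 10 - 5;
        int yy = y + rand() % 10 - 5;

        Particle a(xx, yy, 6);
        v_particles.push_back(a);
    }
}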

Every actual image is only loaded once, during the initialization, and every object only has a sprite associated with the image. The objects will be drawn later in the game loop.

Later I discovered that even if I don't draw the objects and don't use any SFML draw calls at all during the game loop, and just have the C++ logic running with its vector.push_back loops, the frame rate still drops drastically.

This would make me think that what I am doing is too expensive on the CPU, that too many vector operations are going on, and that this has nothing to do with SFML. But all of this runs perfectly fine when I use SDL.


At this point I am not even sure what exactly to ask, or what information to provide. Is this something to do with my VS 2010 solution settings, perhaps? How come the sample projects built with the debug SFML library in Visual Studio (pong, for example) run at 800+ FPS, while my game only gets 1-5 with those settings?
I have also removed event polling and the clock, to make sure I am not doing something bad with them that would affect the frame rate.

It looks like the C++ operations are the bottleneck, because the frame rate is the same even if I comment out the SFML draw lines. Yet the exact same, unchanged logic runs just fine in the SDL solution with SDL calls.


How should I try to tackle this problem, and what do you guys think causes it?
Let me know if you need my settings, parts of my code, or anything else.

Thanks in advance

EDIT:
Here are the frame rates of the sample pong.cpp included with SFML, and with my project
All are compiled from the same VS 2010 solution file

pong | Debug configuration | Started with F5 (Start Debugging) : 130 FPS
pong | Debug configuration | Started with CTRL + F5 (Start without Debugging) : 175 FPS
pong | Release configuration | Started with F5 (Start Debugging) : 700 FPS
pong | Release configuration | Started with CTRL + F5 (Start without Debugging) : 1600 FPS

myGame | Debug configuration | Started with F5 (Start Debugging) : 0 FPS, didn't draw another frame for 20 seconds
myGame | Debug configuration | Started with CTRL + F5 (Start without Debugging) : ~1 FPS
myGame | Release configuration | Started with F5 (Start Debugging) : 2-3 FPS
myGame | Release configuration | Started with CTRL + F5 (Start without Debugging) : 27 FPS



myGame with SDL calls, 60FPS fixed

Edited by chondee, 16 January 2012 - 11:57 PM.


#2 fastcall22   Crossbones+   -  Reputation: 4461

Posted 16 January 2012 - 11:13 PM

If you know beforehand how many elements will be added to the vector, use reserve (or resize) to grow the vector once, instead of multiple times in the loop. Note that dynamic arrays, in order to ensure that elements are contiguous in memory, must allocate a new, larger array and copy the existing elements over whenever you add an element while the array is at full capacity.
// number of particles to add; here the same count as in your explosion() loop
int ct = 120 + (rand() % 100) - 80;
// reserve() takes the total capacity, so include the elements already stored
v_particles.reserve( v_particles.size() + ct );
for ( int idx = 0; idx < ct; ++idx ) {
    int xx = x + rand() % 10 - 5;
    int yy = y + rand() % 10 - 5;

    v_particles.push_back( Particle(xx, yy, 6) );
}
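
To see the effect of reserve in isolation, here is a minimal, self-contained sketch that counts how often a vector's capacity changes while elements are appended. The Particle struct and the count_reallocations helper are made up for illustration; they are not the classes from the game.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the game's particle; the real class is not shown in the thread.
struct Particle {
    int x;
    int y;
    int type;
};

// Appends `count` particles to `v` and returns how many times the vector's
// capacity changed (that is, how many reallocations happened) while doing so.
std::size_t count_reallocations(std::vector<Particle>& v, std::size_t count,
                                bool reserve_first) {
    if (reserve_first) {
        // reserve() takes the total capacity, so add the current size.
        v.reserve(v.size() + count);
    }
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        Particle p = {static_cast<int>(i), static_cast<int>(i), 6};
        v.push_back(p);
        if (v.capacity() != capacity) {
            ++reallocations;
            capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set the function returns 0; without it the count depends on the standard library's growth policy. Because vectors typically grow geometrically, the number of reallocations stays small even for thousands of elements, so this alone is unlikely to explain an order-of-magnitude frame rate drop.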

Also, if you're using Visual Studio, grab AMD CodeAnalyst, profile your code, and find out where exactly your bottleneck lies.
c3RhdGljIGNoYXIgeW91cl9tb21bMVVMTCA8PCA2NF07CnNwcmludGYoeW91cl9tb20sICJpcyBmYXQiKTs=

#3 chondee   Members   -  Reputation: 135

Posted 16 January 2012 - 11:27 PM

I have tried using vector.reserve before, setting it to a large number, but it made no difference in performance.
Either way, the exact same code runs faster in my SDL version than in the SFML one, even if SFML's draw isn't called and only the logic is running...

I'll check out AMD CodeAnalyst, thanks!

#4 fastcall22   Crossbones+   -  Reputation: 4461

Posted 16 January 2012 - 11:33 PM

You mentioned you changed it to the "Release" configuration. Does your "Release" configuration link with the debug versions of your libraries or the release versions?

#5 chondee   Members   -  Reputation: 135

Posted 16 January 2012 - 11:38 PM

It links the release versions (the clean ones, without -d or -s or -s-d)

Edited by chondee, 16 January 2012 - 11:57 PM.


#6 fastcall22   Crossbones+   -  Reputation: 4461

Posted 17 January 2012 - 12:18 AM

I'd have a look at what CodeAnalyst has to say about your code...

#7 chondee   Members   -  Reputation: 135

Posted 17 January 2012 - 12:49 AM

Thank you!
This is the first time I have used CodeAnalyst; I guess this is what we need.
This one is for the SFML code:

CS:EIP Symbol + Offset Timer samples
0xd33560 Particle::show 73.98
0xd31a70 Player::show_particles 13.12
0xd36150 std::_Remove_if<Particle *,bool (__cdecl*)(Particle)> 9.28
0xd37290 std::_Uninit_copy<Particle *,Particle *,std::allocator<Particle> > 0.9
0xd32800 World::show_stars 0.68
0xd36ed0 std::_Find_if<Star *,bool (__cdecl*)(Star)> 0.68
0xd363c0 std::_Remove_if<Star *,bool (__cdecl*)(Star)> 0.45
0xd32630 World::show_enemies 0.23
0xd32790 dead_s 0.23
0xd33410 Star::show 0.23
0xd34d60 main 0.23


11 functions, 82 instructions, Total: 442 samples, 100.00% of shown samples, 2.27% of total session samples
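
A side note on the profile above: it lists std::_Remove_if instantiated with a predicate of type bool (__cdecl*)(Particle), that is, a predicate that takes each Particle by value. If Particle holds an sf::Sprite by value, every predicate call would copy one. Below is a minimal sketch of the erase-remove idiom with a const-reference predicate instead. The Particle fields and the names is_dead and remove_dead are assumptions for illustration, since the original predicate is not shown in the thread.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical particle; only the `alive` flag matters for this sketch.
struct Particle {
    int x;
    int y;
    bool alive;
};

// Taking the argument by const reference avoids copying the whole object
// each time the predicate is evaluated.
bool is_dead(const Particle& p) {
    return !p.alive;
}

// Erase-remove idiom: remove_if moves the survivors to the front and
// erase() trims the leftover tail in one step.
void remove_dead(std::vector<Particle>& particles) {
    particles.erase(
        std::remove_if(particles.begin(), particles.end(), is_dead),
        particles.end());
}
```

Whether this matters in practice depends on how expensive Particle is to copy; for a small struct like the one here the difference is probably negligible.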

The way I tested performance: since there aren't many objects in this early version of my game, I increased the number of particles coming out of the player's thruster (the same number in both SDL and SFML).

In the SFML one it seems like this takes up everything.

Here is the code for Particle::show:
void Particle::show()
{
    //Show image
    if (initialized == false){
        if      (type == 1) part = p1;
        else if (type == 2) part = p2;
        else if (type == 3) part = p3;
        else if (type == 4) part = p4;
        else if (type == 5) part = bparticle;
        else if (type == 6){
            //draw the random number once, so exactly one of the four sprites is picked
            switch (rand() % 4){
                case 0:  part = p1; break;
                case 1:  part = p2; break;
                case 2:  part = p3; break;
                default: part = p4; break;
            }
        }
        initialized = true;
    }

    if(alive)
    {
        part.SetPosition(x,y);
        App.Draw(part);

        //apply_surface( x, y, part, screen );
    }

    //Animate
    //stuff here is not relevant, and it's exactly the same in both
}

EDIT:
Since then I tested it with part.SetPosition(x,y); and App.Draw(part); commented out, so they are not drawn, and still Particle::show takes up 61% of the resources (was 74 previously)... with that commented out, the SDL and SFML Particle::show() are identical, except in SDL they actually get drawn.


I'll put the same code here for SDL for comparison, so you won't have to scroll back and forth.
void Particle::show()
{
    //Show image
    if (part == NULL){
        if      (type == 1) part = p1;
        else if (type == 2) part = p2;
        else if (type == 3) part = p3;
        else if (type == 4) part = p4;
        else if (type == 5) part = bparticle;
        else if (type == 6){
            //draw the random number once, so exactly one of the four surfaces is picked
            switch (rand() % 4){
                case 0:  part = p1; break;
                case 1:  part = p2; break;
                case 2:  part = p3; break;
                default: part = p4; break;
            }
        }
    }

    if(alive)
        apply_surface( x, y, part, screen );

    //Animate
    //(omitted)
}


And here is the CodeAnalyst result for the SDL code; Particle::show doesn't strain it nearly as much.


CS:EIP Symbol + Offset Timer samples
0xc05ab0 Particle::show 7.09
0xc08530 std::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator++ 4.8
0xc04fd0 apply_surface 4.77
0xc0a2a0 std::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator== 4.64
0xc08170 std::vector<Particle,std::allocator<Particle> >::end 4.39
0xc0bd00 std::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator++ 4.27
0xc0bc10 std::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > > 4.23
0xc0a140 std::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > > 4.16
0xc0bcc0 std::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator* 4.15
0xc02cb0 dead_p 4.1
0xc0bc80 std::_Iterator_base0::_Adopt 3.98
0xc084e0 std::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator-> 3.95
0xc0a1f0 std::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator++ 3.73
0xc0bd50 std::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::_Compat 3.57
0xc0a1a0 std::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator* 3.53
0xc085e0 std::_Vector_const_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >::operator!= 3.41
0xc0f1e0 std::_Remove_if<Particle *,bool (__cdecl*)(Particle)> 3.15
0xc11dd0 std::_Move<Particle &> 2.75
0xc13d7f _RTC_CheckStackVars 2.65
0xc017fd ILT+2040(__RTC_CheckEsp) 1.69
0xc0f8d0 std::forward<Particle const &> 1.65
0xc105d0 std::_Construct<Particle,Particle const &> 1.61
0xc03dc0 Player::show_particles 1.32
0xc02e70 Particle::Particle 1.11
0xc09df0 std::vector<Particle,std::allocator<Particle> >::_Inside 1.01
0xc0def0 std::_Cons_val<std::allocator<Particle>,Particle,Particle const &> 1.01
0xc0de50 std::addressof<Particle const > 0.95
0xc13030 std::_Destroy<Particle> 0.92
0xc081e0 std::vector<Particle,std::allocator<Particle> >::push_back 0.9
0xc12a00 std::allocator<Particle>::destroy 0.85
0xc0ed60 std::allocator<Particle>::construct 0.8
0xc13d54 _RTC_CheckEsp 0.8
0xc09ff0 std::vector<Particle,std::allocator<Particle> >::_Orphan_range 0.72
0xc120c0 std::_Dest_val<std::allocator<Particle>,Particle> 0.71
0xc0f860 operator new 0.63
0xc0190b ILT+2310(?dead_pYA_NVParticleZ) 0.22
0xc05890 Star::show 0.22
0xc0c860 std::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator* 0.22
0xc013d4 ILT+975(_RTC_CheckStackVars 0.21
0xc09210 std::_Vector_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator++ 0.21
0xc047a0 dead_s 0.19
0xc08e40 std::vector<Star,std::allocator<Star> >::end 0.18
0xc0c8a0 std::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator++ 0.18
0xc0102d ILT+40(??D?$_Vector_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQBEAAVParticleXZ) 0.16
0xc0147e ILT+1145(?_Compat?$_Vector_const_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQBEXABV12Z) 0.16
0xc10190 std::_Destroy_range<std::allocator<Particle> > 0.16
0xc01785 ILT+1920(??E?$_Vector_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQAEAAV01XZ) 0.14
0xc01915 ILT+2320(?showParticleQAEXXZ) 0.14
0xc0c7f0 std::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > >::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > > 0.14
0xc0140b ILT+1030(??8?$_Vector_const_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQBE_NABV01Z) 0.13
0xc0c8f0 std::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > >::_Compat 0.13
0xc011bd ILT+440(??9?$_Vector_const_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQBE_NABV01Z) 0.11
0xc011e0 ILT+475(??0?$_Vector_const_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQAEPAVParticlePBU_Container_base0 0.11
0xc0164f ILT+1610(?apply_surfaceYAXHHPAUSDL_Surface 0.11
0xc01a87 ILT+2690(??E?$_Vector_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQAE?AV01HZ) 0.11
0xc092c0 std::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator!= 0.11
0xc0ad40 std::_Vector_iterator<std::_Vector_val<Star,std::allocator<Star> > >::_Vector_iterator<std::_Vector_val<Star,std::allocator<Star> > > 0.11
0xc01384 ILT+895(??C?$_Vector_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQBEPAVParticleXZ) 0.1
0xc0171c ILT+1815(??$_MoveAAVParticlestdYA$$QAVParticleAAV1Z) 0.1
0xc091c0 std::_Vector_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator-> 0.1
0xc0ada0 std::_Vector_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator* 0.1
0xc01875 ILT+2160(??0?$_Vector_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQAEPAVParticlePBU_Container_base0 0.08
0xc047f0 World::show_stars 0.08
0xc13610 std::forward<Particle> 0.08
0xc13ce8 SDL_UpperBlit 0.08
0xc013b1 ILT+940(??E?$_Vector_const_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQAEAAV01XZ) 0.06
0xc0153c ILT+1335(?end?$vectorVParticleV?$allocatorVParticlestdstdQAE?AV?$_Vector_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstd 0.06
0xc08b50 std::_Vector_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::operator-> 0.06
0xc09930 std::_Vector_const_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > >::operator!= 0.06
0xc01122 ILT+285(??2YAPAXIPAXZ) 0.05
0xc01320 ILT+795(??$_ConstructVParticleABV1stdYAXPAVParticleABV1Z) 0.05
0xc01627 ILT+1570(?_Adopt_Iterator_base0stdQAEXPBXZ) 0.05
0xc0a740 std::_Vector_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::_Vector_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > > 0.05
0xc0adf0 std::_Vector_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator++ 0.05
0xc0b340 std::_Vector_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > >::_Vector_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > > 0.05
0xc0b4a0 std::_Vector_const_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > >::operator== 0.05
0xc11f80 std::_Move<Star &> 0.05
0xc01230 ILT+555(??D?$_Vector_const_iteratorV?$_Vector_valVParticleV?$allocatorVParticlestdstdstdQBEABVParticleXZ) 0.03
0xc013e3 ILT+990(??$_DestroyVParticlestdYAXPAVParticleZ) 0.03
0xc014b5 ILT+1200(?_Inside?$vectorVParticleV?$allocatorVParticlestdstdIBE_NPBVParticleZ) 0.03
0xc019ec ILT+2535(??$_Cons_valV?$allocatorVParticlestdVParticleABV3stdYAXAAV?$allocatorVParticle 0.03
0xc01a9b ILT+2710(_SDL_UpperBlit) 0.03
0xc01ab4 ILT+2735(?_Orphan_range?$vectorVParticleV?$allocatorVParticlestdstdIBEXPAVParticle 0.03
0xc02760 Enemy::move 0.03
0xc037e0 World::show 0.03
0xc07930 SDL_main 0.03
0xc09830 std::_Vector_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > >::operator-> 0.03
0xc09880 std::_Vector_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > >::operator++ 0.03
0xc09f20 std::vector<Particle,std::allocator<Particle> >::_Tidy 0.03
0xc0aea0 std::_Vector_const_iterator<std::_Vector_val<Star,std::allocator<Star> > >::operator== 0.03
0xc0bb70 std::allocator<Particle>::allocator<Particle> 0.03
0xc0c320 std::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::_Compat 0.03
0xc0ce30 std::_Vector_const_iterator<std::_Vector_val<Enemy,std::allocator<Enemy> > >::operator* 0.03
0xc11780 std::vector<Particle,std::allocator<Particle> >::begin 0.03
0xc12b80 std::_Cons_val<std::allocator<Particle>,Particle,Particle> 0.03
0xc0115e ILT+345(?construct?$allocatorVParticlestdQAEXPAVParticle$$QAV3Z) 0.02
0xc011db ILT+470(??0ParticleQAEHHHZ) 0.02
0xc01217 ILT+530(?enemy_moveWorldQAEXXZ) 0.02
0xc013c5 ILT+960(?_Compat?$_Vector_const_iteratorV?$_Vector_valVBulletV?$allocatorVBulletstdstdstdQBEXABV12Z) 0.02
0xc01451 ILT+1100(??0?$_Vector_const_iteratorV?$_Vector_valVStarV?$allocatorVStarstdstdstdQAEPAVStarPBU_Container_base0 0.02
0xc0145b ILT+1110(?destroy?$allocatorVParticlestdQAEXPAVParticleZ) 0.02
0xc01672 ILT+1645(??C?$_Vector_iteratorV?$_Vector_valVEnemyV?$allocatorVEnemystdstdstdQBEPAVEnemyXZ) 0.02
0xc017ad ILT+1960(??C?$_Vector_iteratorV?$_Vector_valVStarV?$allocatorVStarstdstdstdQBEPAVStarXZ) 0.02
0xc01893 ILT+2190(?_Compat?$_Vector_const_iteratorV?$_Vector_valVStarV?$allocatorVStarstdstdstdQBEXABV12Z) 0.02
0xc018d9 ILT+2260(??D?$_Vector_const_iteratorV?$_Vector_valVStarV?$allocatorVStarstdstdstdQBEABVStarXZ) 0.02
0xc03bf0 World::show_player 0.02
0xc04550 World::show_enemies 0.02
0xc06e30 Bullet::show 0.02
0xc07430 Collision_Detection 0.02
0xc08060 std::vector<Particle,std::allocator<Particle> >::~vector<Particle,std::allocator<Particle> > 0.02
0xc08100 std::vector<Particle,std::allocator<Particle> >::begin 0.02
0xc08c50 std::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::operator!= 0.02
0xc09d80 std::vector<Particle,std::allocator<Particle> >::_Destroy 0.02
0xc0a510 std::vector<Bullet,std::allocator<Bullet> >::_Tidy 0.02
0xc0a7f0 std::_Vector_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::operator++ 0.02
0xc0a8a0 std::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::operator== 0.02
0xc0abf0 std::vector<Star,std::allocator<Star> >::_Orphan_range 0.02
0xc0bfc0 std::vector<Bullet,std::allocator<Bullet> >::size 0.02
0xc0c220 std::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > > 0.02
0xc0c290 std::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > >::operator* 0.02
0xc0ea20 std::_Allocate<Bullet> 0.02
0xc0ef70 std::_Unchecked<std::_Vector_val<Bullet,std::allocator<Bullet> > > 0.02
0xc0f0b0 std::_Rechecked<std::_Vector_val<Bullet,std::allocator<Bullet> > > 0.02
0xc0f100 std::find_if<std::_Vector_iterator<std::_Vector_val<Particle,std::allocator<Particle> > >,bool (__cdecl*)(Particle)> 0.02
0xc0f190 std::_Unchecked<std::_Vector_val<Particle,std::allocator<Particle> > > 0.02
0xc0f600 std::_Remove_if<Star *,bool (__cdecl*)(Star)> 0.02
0xc0fb50 std::forward<Bullet const &> 0.02
0xc117f0 std::vector<Particle,std::allocator<Particle> >::end 0.02
0xc11d40 std::_Find_if<Particle *,bool (__cdecl*)(Particle)> 0.02
0xc11ef0 std::_Find_if<Star *,bool (__cdecl*)(Star)> 0.02
0xc128a0 std::vector<Bullet,std::allocator<Bullet> >::_Ucopy<std::_Vector_const_iterator<std::_Vector_val<Bullet,std::allocator<Bullet> > > > 0.02
0xc13210 std::allocator<Particle>::construct 0.02

132 functions, 610 instructions, Total: 6224 samples, 100.00% of shown samples, 20.72% of total session samples

Edited by chondee, 17 January 2012 - 01:16 AM.


#8 fastcall22   Crossbones+   -  Reputation: 4461

Posted 17 January 2012 - 09:22 AM

EDIT:
Since then I tested it with part.SetPosition(x,y); and App.Draw(part); commented out, so they are not drawn, and still Particle::show takes up 61% of the resources (was 74 previously)... with that commented out, the SDL and SFML Particle::show() are identical, except in SDL they actually get drawn.


This would suggest that the API is the bottleneck (as indicated earlier in the thread). If I recall correctly, SFML uses OpenGL 1.1 immediate-mode calls, which would mean for every particle rendered, there's a call to glBindTexture, glPushMatrix/glPopMatrix, and glBegin/glEnd. For something like a particle system, the overhead in each of these calls, while not significant on their own, can snowball. To reduce the overhead from texture switching, place all of your particle textures on one sheet. Since SFML doesn't seem to have any feature that will allow us to assign a part of an Image to a Sprite, you'll need to do the rendering yourself through raw OpenGL calls. By doing so, you can optimize out some OpenGL calls and will allow you to do batching, among other things:

glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA ); // sf::BlendMode::Alpha
Particle::particleSheet.Bind(); // sf::Image::Bind, essentially calls glBindTexture( GL_TEXTURE_2D, particleSheet.handle )
glBegin( GL_QUADS ); {
    for ( Particle& p : v_particles ) {
        const Rect2f& texRect = getTextureRect( p.getTextureIdx() );

        // opposite corners of the quad, centred on the particle position
        Vector2f coord[2] = {
            p.getPosition() + Vector2f( -1, -1 ) * (p.getScale() / 2.f ),
            p.getPosition() + Vector2f(  1,  1 ) * (p.getScale() / 2.f )
        };

        glColor4ub( p.color().r, p.color().g, p.color().b, p.color().a );
        glTexCoord2f( texRect.left,  texRect.bottom ); glVertex2f( coord[0].x, coord[0].y );
        glTexCoord2f( texRect.right, texRect.bottom ); glVertex2f( coord[1].x, coord[0].y );
        glTexCoord2f( texRect.right, texRect.top    ); glVertex2f( coord[1].x, coord[1].y );
        glTexCoord2f( texRect.left,  texRect.top    ); glVertex2f( coord[0].x, coord[1].y );
    }
} glEnd();
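
The snippet above calls getTextureRect without defining it. One possible way to write such a helper is sketched below, assuming the particle sheet is a regular grid of equally sized cells. The Rect2f layout, the extra columns and rows parameters, and the cell ordering (left to right, top to bottom) are all assumptions made for this sketch, not something specified in the post.

```cpp
// Normalized texture coordinates (0..1) of one cell in a sprite sheet.
struct Rect2f {
    float left;
    float top;
    float right;
    float bottom;
};

// Returns the texture rectangle of cell `index` in a sheet that is
// `columns` cells wide and `rows` cells tall, counting cells left to right,
// top to bottom. The grid layout is an assumption made for this sketch.
Rect2f getTextureRect(int index, int columns, int rows) {
    const float cellWidth = 1.0f / static_cast<float>(columns);
    const float cellHeight = 1.0f / static_cast<float>(rows);
    const int column = index % columns;
    const int row = index / columns;

    Rect2f rect;
    rect.left = column * cellWidth;
    rect.top = row * cellHeight;
    rect.right = rect.left + cellWidth;
    rect.bottom = rect.top + cellHeight;
    return rect;
}
```

For a 2x2 sheet holding p1 to p4, index 3 would map to the rectangle from (0.5, 0.5) to (1.0, 1.0). Whether "top" corresponds to the smaller or the larger texture coordinate depends on how the image is uploaded, so the vertical order may need flipping.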

For further optimizations, you can use VBOs, use the CPU to update all the vertices of all the particles, then send the entire buffer to the GPU in one call.
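
The CPU side of that idea can be sketched without any OpenGL at all: fill one contiguous array with the vertices of every particle, so that it can later be handed to the GPU in a single upload. The Particle fields, the build_quads name, and the interleaved x, y, u, v layout below are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle: centre position and edge length of its quad.
struct Particle {
    float x;
    float y;
    float size;
};

// Rebuilds `out` with one quad per particle. Each quad has 4 vertices and
// each vertex is stored as x, y, u, v (16 floats per particle).
// For simplicity every quad maps the full texture (0..1).
void build_quads(const std::vector<Particle>& particles,
                 std::vector<float>& out) {
    out.clear();
    out.reserve(particles.size() * 16);
    for (std::size_t i = 0; i < particles.size(); ++i) {
        const Particle& p = particles[i];
        const float h = p.size / 2.0f;
        const float quad[16] = {
            p.x - h, p.y - h, 0.0f, 0.0f,
            p.x + h, p.y - h, 1.0f, 0.0f,
            p.x + h, p.y + h, 1.0f, 1.0f,
            p.x - h, p.y + h, 0.0f, 1.0f
        };
        out.insert(out.end(), quad, quad + 16);
    }
}
```

The resulting array could then be uploaded with one glBufferData call and drawn with one glDrawArrays call; that part is left out here because it needs a live OpenGL context.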

#9 Serapth   Crossbones+   -  Reputation: 5756

Posted 17 January 2012 - 10:29 AM

This is exactly the problem, and somewhat the solution. See here.

Question: Because my particle engine can't draw more than 3000 particles before it starts to lag...

SFML 1.6 is definitely too slow for this.
SFML 2.0 (current) is better if you draw multiple sprites that use the same texture.
SFML 2.0 (future) will be much better, just wait a little bit


#10 Serapth   Crossbones+   -  Reputation: 5756

Posted 17 January 2012 - 10:33 AM

To reduce the overhead from texture switching, place all of your particle textures on one sheet. Since SFML doesn't seem to have any feature that will allow us to assign a part of an Image to a Sprite, you'll need to do the rendering yourself through raw OpenGL calls. By doing so, you can optimize out some OpenGL calls and will allow you to do batching, among other things:



As a corollary to my other post, this is why it doesn't work in 1.6 and sorta works in 2.0. Instead of using sf::Image, you use sf::Texture in 2.0, which allows you to load your image from a single sprite sheet.

#11 chondee   Members   -  Reputation: 135

Posted 17 January 2012 - 02:54 PM

Thank you fastcall22 and Serapth, this is really helpful.

I am wondering, is SFML 2.0 stable enough to base my whole project on it?
Also, if it is not, are there many things in my code that would need to change in an SFML 1.6 <-> SFML 2.0 switch, or do most of the calls remain the same, with the changes being "behind the scenes" implementation details?

Thanks for the sprite sheet implementation too. In SDL I just used an SDL_Rect[] as a possible offset to apply_surface, which was quite convenient, but this way I can do it in SFML too, and it's time I start getting familiar with some raw OpenGL as well.

#12 Serapth   Crossbones+   -  Reputation: 5756

Posted 17 January 2012 - 03:04 PM

The differences aren't too major; you should be able to port without too much trouble. You don't need to do an SDL_Rect implementation. Load an image as you would normally, then, rather than handing the sf::Image to the sprite directly, use an sf::Texture, which you can populate using LoadFromImage. The majority of the other changes are small but annoying. The coordinate system for Sprites/Textures has been updated (to make sense; the old naming convention was stupid), and Texture was added largely because of the kind of problems you've run into. Otherwise the biggest changes are input related: sf::Input is gone, replaced by two classes with static functions, sf::Keyboard and sf::Mouse.

#13 chondee   Members   -  Reputation: 135

Posted 17 January 2012 - 03:11 PM

I see, I'll port this early version to SFML 2.0, learn how input is working, and when everything seems comfortable I'll port my full project.

#14 Serapth   Crossbones+   -  Reputation: 5756

Posted 17 January 2012 - 03:24 PM

Input changed from using sf::RenderWindow::GetInput() to two separate classes with static functions, sf::Keyboard and sf::Mouse.

So, before you would go

myAppWindow->GetInput().IsKeyDown(someKey);

You now do:

sf::Keyboard::IsKeyPressed(someKey);


Ditto for the mouse functions, which have also been split out. The change does make sense, but it will potentially require a number of changes in your code.

#15 chondee   Members   -  Reputation: 135

Posted 17 January 2012 - 07:22 PM

Thanks,

Fortunately I haven't even started looking into SFML 1.6's input either, so it won't make much of a difference to me, I'll just start learning 2.0's input.

#16 chondee   Members   -  Reputation: 135

Posted 18 January 2012 - 03:12 AM

So, I have built the SFML 2.0 libraries, and converted my code for SFML 2.0.
There was quite a huge performance gain, everything looked fine, so I started writing the keyboard input functions.

I am polling for events in the main loop, every frame like this:
  while (App.PollEvent (Event))
  {
   //myWorld.myPlayer.handle_input();
   //if (Event.Type == sf::Event::Closed)
   //{
   // App.Close();
   //}
  }
Even if the loop is empty, like above, the performance drops significantly, the frame rate fluctuates between 20-60.
Am I doing the polling wrong?

EDIT:
SOLVED
I got the answer on SFML forums

Can you try to comment line 121 of src/SFML/Window/WindowImpl.cpp (ProcessJoystickEvents();) and recompile SFML?


Edited by chondee, 18 January 2012 - 03:31 AM.


#17 BeerNutts   Crossbones+   -  Reputation: 3000

Posted 18 January 2012 - 12:29 PM

Well, I had a suggestion about why show was so slow using v1.6, when using SFML or SDL (even commenting out the SFML draw).

It looks like, in SDL, "part" is a pointer, while in SFML it is not. Try changing the SFML version's part from sf::Sprite part to sf::Sprite *part.

It's probably all the copying it has to do.
My Gamedev Journal: 2D Game Making, the Easy Way

---(Old Blog, still has good info): 2dGameMaking
-----
"No one ever posts on that message board; it's too crowded." - Yoga Berra (sorta)

#18 chondee   Members   -  Reputation: 135

Posted 18 January 2012 - 04:39 PM

Well, in SDL you have Surfaces that are associated with the image files (in my case)
To avoid having each particle load the same image that they share, I just used a pointer to the same Surface.

In SFML, there is a separate Image (or Texture in 2.0) that contains the actual image, and there is a Sprite, that (the way I understand it) is kind of like a pointer to the image.
I can set the sprite to an image, but it will only point to that one image, it won't contain the image file's data.

That's why in a traditional sense I wasn't using * pointer, but conceptually I was.
Either way, thanks for your comment, so far I seemed to have had the performance issues solved with using SFML 2.0.

btw I am really new to SFML, so please correct me if my understanding of this seems to be wrong

#19 BeerNutts   Crossbones+   -  Reputation: 3000

Posted 18 January 2012 - 07:53 PM


Typically, you load the image once, and you create an sf::Sprite from that image. But that has nothing to do with it being a pointer or not. You can create each sf::Sprite once and have your particles hold a pointer to it; then, when you assign part, you're just copying the pointer (4 bytes, 1 CPU operation) instead of the whole Sprite object (much larger, taking a memcpy).

So, the point I was making really doesn't have anything to do with SFML or SDL; rather, with the speed it takes to copy a pointer, versus copying a whole class structure.
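
The difference can be made visible with a small stand-in class that counts its own copies. This is only a sketch: FakeSprite, its size, and the two particle structs are invented for illustration and say nothing about how large or expensive a real sf::Sprite is.

```cpp
#include <cstddef>

// Stand-in for sf::Sprite: a moderately large object that counts how often
// it is copied. The size and layout are invented for this sketch.
struct FakeSprite {
    float data[32];
    static std::size_t copies;

    FakeSprite() : data() {}
    FakeSprite(const FakeSprite& other) { copy_from(other); }
    FakeSprite& operator=(const FakeSprite& other) {
        if (this != &other) copy_from(other);
        return *this;
    }

private:
    void copy_from(const FakeSprite& other) {
        for (std::size_t i = 0; i < 32; ++i) data[i] = other.data[i];
        ++copies;
    }
};

std::size_t FakeSprite::copies = 0;

// Holds a full sprite, like `sf::Sprite part;`.
struct ParticleByValue {
    FakeSprite part;
    void set(const FakeSprite& shared) { part = shared; }   // copies the object
};

// Holds only a pointer, like `sf::Sprite *part;`.
struct ParticleByPointer {
    const FakeSprite* part;
    ParticleByPointer() : part(0) {}
    void set(const FakeSprite& shared) { part = &shared; }  // copies an address
};
```

Calling set on a ParticleByValue increments FakeSprite::copies, while calling it on a ParticleByPointer does not. One consequence of the pointer variant is that all particles share the same sprite object, so its position has to be set right before each draw.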

#20 chondee   Members   -  Reputation: 135

Posted 18 January 2012 - 10:50 PM

Thank you for the explanation. I was only considering the memory difference between the actual Image and the Sprite, but you are right. With a high number of particles being created and displayed constantly, the difference in CPU and memory cost between whole Sprite objects and plain pointers might be significant enough to consider.

I will try switching to Sprite pointers instead of sprites now.



