Speed of Euler, Verlet and RK4 with loads of objects

Started by
23 comments, last by Raghar 16 years, 10 months ago
Hi guys, here's a little question I'm sure some of you will be able to respond in under 5 secs... I'm going to implement game that resembels Gish physics (you know, Jacobsen-like Verlets with relaxation etc), and now I'm prototyping some physics code. First, I solved FPS independent movement using technique presented by http://www.gaffer.org/game-physics/fix-your-timestep/ and everything works fine using Euler, RK4 and Verlet integration schemes. Objects move on the screen smoothly, without any visible jerking etc. So far so good. However, problems arise when I throw a "few" more objects - the framerate drops very badly at even moderate numbers. Obviously, I'm testing the stuff in release mode and here are some sample values:

Euler:      350 objects at 30 fps 
Verlet:     330 objects at 30fps
RK4:        70 objects at 30fps, drops below 5 with ~100 objects
It runs at ~500 fps with 1 object for all integration schemes. (yup, no vsync) Here's code I use for integration:

  // ------------- RK4
  {
    previous = current;

    Derivative2d a = Evaluate(current, t);
    Derivative2d b = Evaluate(current, t, dt*0.5f, a);
    Derivative2d c = Evaluate(current, t, dt*0.5f, b);
    Derivative2d d = Evaluate(current, t, dt, c);

    const float dxdt = 1.0f/6.0f * (a.d_pos.x + 2.0f*(b.d_pos.x + c.d_pos.x) + d.d_pos.x);
    const float dydt = 1.0f/6.0f * (a.d_pos.y + 2.0f*(b.d_pos.y + c.d_pos.y) + d.d_pos.y);

    const float dvxdt = 1.0f/6.0f * (a.d_vel.x + 2.0f*(b.d_vel.x + c.d_vel.x) + d.d_vel.x);
    const float dvydt = 1.0f/6.0f * (a.d_vel.y + 2.0f*(b.d_vel.y + c.d_vel.y) + d.d_vel.y);

    current.pos.x = current.pos.x + dxdt*dt;
    current.pos.y = current.pos.y + dydt*dt;

    current.vel.x = current.vel.x + dvxdt*dt;
    current.vel.y = current.vel.y + dvydt*dt;
   }

  // ------------- Euler
   {
    previous = current;

    Derivative2d a = Evaluate(current, t);
    
    current.pos.x = current.pos.x + a.d_pos.x*dt;
    current.pos.y = current.pos.y + a.d_pos.y*dt;

    current.vel.x = current.vel.x + a.d_vel.x*dt;
    current.vel.y = current.vel.y + a.d_vel.y*dt;
   }    

 // ------------- Verlet
 

 {
  Vector2 tmp = state.pos_current;
  state.pos_current = (state.pos_current * 2 - state.pos_previous) + state.acceleration * (dt * dt);	
  state.pos_previous = tmp;
  state.acceleration.Set(0,0);
 }


 // --- for curious ones

Derivative2d TEntity :: Evaluate( const State2d & _initial, float t )
 {
	Derivative2d output;
	output.d_pos.x = _initial.vel.x;
	output.d_pos.y = _initial.vel.y;

	Acceleration(_initial, t, output ); // now does nothing
	return output;
 }



Here's the main loop:

  float new_time = timer->GetTime();
  float delta_time = new_time - current_time;
  current_time = new_time;

  accumulator += delta_time;

  while (accumulator >= dt) // dt = 0.01
   {
    for (EntitiesItor itor = entities.begin(); itor != entities.end(); ++itor)           
     (*itor)->Integrate(t, dt);
      
    t += dt;
    accumulator -= dt;          
   }

  const float alpha = accumulator / dt;

  State2d state;

  for (EntitiesItor itor = entities.begin(); itor != entities.end(); ++itor)
   {         
    (*itor)->Update();
    (*itor)->Render(renderer, (*itor)->Interpolate(alpha));
   } 


Does anyone have any ideas why this works so slow? I don't believe that today powerful computers that run things like Unreal 3 or Crysis on >30 fps could have problems moving a few dozens of objects, without any collision detection / collision resolving / any other activity taking place, so there's either sth inherently bad with my code, or that's the hard truth I'll have to accept. I tried removing the rendering of objects, and it raised fps by only a few percent, so obviously I'm not GPU but CPU bound. Also, I have AMD 3500+ with plenty of DDR2 RAM, Radeon X1600 which is not that bad, so IMHO slow computer config is not the case.
Advertisement
The RK4 method seems to be performing quite well, considering how much arithmetic is involved. Rough calculations show that in your case, it is only 5x slower than Euler (5 == 350/70).

What you could do is run each integration method repeatedly for n seconds, tallying up how many iterations are performed. I'm going to guess that the number of Euler iterations will be also around 5x as many as that given by RK4.

Have you tried RK2?

When numerically integrating, you should also consider whether Euler is sufficient. Unless you're simulating non-linear velocities and accelerations, Euler is sufficient. ex: Quake's characters either stand run or walk. They do not have a range of velocities, so Euler integration is sufficient.

Also, the numerical imprecision of the floating point format that you are using may end up providing more instability than the actual integration method. ex: When modelling the orbit of Mercury, trajectory errors would be as much a fault of imprecision as naive integration.
Since the OP mentioned Gish i'd say that Euler may not be the best choice, however "newton-stoermer-verlet" might be a good choice. as fast as Euler but way better-behaving:

v += a*dt
x += v*dt

See this page for more info (the "Choosing an Integrator for Games and Simulations: Use a Symplectic Integrator" section): http://www.brightland.com/physics/


I'd also like to know why the numbers are so low -- for Euler that's only 350*4*30 = 42k adds + 42k multiplies per second.. it should surely be orders of magnitude faster than that, no?

Possibly your method of timing is incorrect?

A speed of a SGDI (or NSV) and a sixth order symplectic integrator are about... Sixth order integrator is seven times slower than SGDI (NSV, symplectic euler, or whatever you are calling that simple algorithm.).

So you might throw away RK4 from the window, it's crap anyway, and use sixth order integrator.

BTW what's your timestep, how much copying are you doing, and where is that O(n^2) algorithm? 70 - 100 : 30 - 5 it's an increase of amount of object by half followed by decrease of speed by 6. It looks more like O(n^3) algorithm, even if I'd account for increased memory allocation.
RK4 does take about 5 times longer but you should be able to run with a dt that is 10 to 100 times larger and still have acceptable accuracy, that's really the point of using higher order integration techniques.
Quote:Original post by Raghar
A speed of a SGDI (or NSV) and a sixth order symplectic integrator are about... Sixth order integrator is seven times slower than SGDI (NSV, symplectic euler, or whatever you are calling that simple algorithm.).

So you might throw away RK4 from the window, it's crap anyway, and use sixth order integrator.

BTW what's your timestep, how much copying are you doing, and where is that O(n^2) algorithm? 70 - 100 : 30 - 5 it's an increase of amount of object by half followed by decrease of speed by 6. It looks more like O(n^3) algorithm, even if I'd account for increased memory allocation.


I'm just curious, but why do you say RK4 is crap? And I'm not real familiar with numerical methods as they apply specifically to games but I've never heard of using a 6th order method for, well anything really.

I want to see the timing methods used by the OP. I'm betting they're at best 1ms resolution and if you're going to be picky that's just not enough.

"Those who would give up essential liberty to purchase a little temporary safety deserve neither liberty nor safety." --Benjamin Franklin

just a note: I would change the order from this
  while (accumulator >= dt) // dt = 0.01   {    for (EntitiesItor itor = entities.begin(); itor != entities.end(); ++itor)                (*itor)->Integrate(t, dt);          t += dt;    accumulator -= dt;             }

to this
    for (EntitiesItor itor = entities.begin(); itor != entities.end(); ++itor)    {      for( float time_left = accumulator; time_left >= dt; time_left -= dt )      {               (*itor)->Integrate(t, dt);      }    }    while (accumulator >= dt) // dt = 0.01    {      t += dt;      accumulator -= dt;              }


This won't make a difference for a small number of objects, or when you are only doing 1 step, but should help the rest of the time. Have you tried profiling your code to see where the time is actually going?

(why do my increment operators (plus equals) not show up in the preview window?)
Quote:Original post by lonesock
just a note: I would change the order from this
*** Source Snippet Removed ***
to this
*** Source Snippet Removed ***

This won't make a difference for a small number of objects, or when you are only doing 1 step, but should help the rest of the time. Have you tried profiling your code to see where the time is actually going?

(why do my increment operators (plus equals) not show up in the preview window?)


Except that then you're passing the wrong time to the integrator.

--www.physicaluncertainty.com
--linkedin
--irc.freenode.net#gdnet

Quote:Original post by jjd
Except that then you're passing the wrong time to the integrator.

Good catch! I was just hacking the order and didn't pay attention...my bad.

This topic is closed to new replies.

Advertisement