How to optimize this?
In my game, I'm calling the function sqrt(), but I've got a feeling this isn't the fastest method. Is there anything that does the same but faster?
I use sqrt for calculating velocity:
velocity = sqrt(x^2 + y^2)
______________________________________________
You know your game is in trouble when your AI says, in a calm, soothing voice, "I'm afraid I can't let you do that, Dave"
Take a look here:
http://www.df.lth.se/%7Ejohn_e/fr_contrib.html
And here:
http://www.agner.org/assem/
Maybe those will point you in the right direction.
Are you sure that the sqrt needs to be optimised? I suggest you find out where the executable is spending most of its time - you'll get a much better speed increase by optimising that code instead.
The functions in the math library are usually already optimized. Optimizing sqrt yourself is probably reinventing the wheel, so I suggest you invest your time in optimizing other parts of the code instead.
I can also suggest that if you are using the pow() function to raise x and y to the power of 2, use x*x and y*y instead. Calling pow() is usually less efficient than those two multiplications.
peace
Here's a URL that shows how sqrt might be implemented in C.
http://minnie.tuhs.org/UnixTree/V7/usr/src/libm/sqrt.c.html
It says nothing about this implementation being faster than the crt version.
I think BeerHunter hits the nail on the head though: find where the exe is spending its time and go after that.
I'm actually using x*x. So you're saying the sqrt() is already optimized? That's all I need to know. I was just checking because I wasn't sure about this.
______________________________________________
You know your game is in trouble when your AI says, in a calm, soothing voice, "I'm afraid I can't let you do that, Dave"
quote: Original post by weesi
The functions in the math library are usually already optimized.
On common machines, they just use the corresponding FPU instruction, so how are they optimised? Anyway, by sacrificing accuracy, it's sometimes possible to come up with a function that's faster than the corresponding FPU instruction. My fast_sine() function is somewhat faster than sin(). The question is: are such functions going to give a noticeable speed improvement?
Where you can get serious speed improvements is when you perform multiple sin/cos/tan/etc. calls on the same data. There are x86 opcodes that compute more than one of these at once (FSINCOS, for example, returns both the sine and the cosine), so instead of calling cos and sin separately, which would take about 200 ticks, you can get both at once in ~120 ticks.
SSE or SSE2 (or 3DNow!) might have instructions to perform the same operation on multiple data elements as well, so instead of calling sin four times on four floats, you pack all the floats on top of the SSE stack and invoke the 'do4sin' opcode, and the 4 sin values are put on top of the stack. (I'm guessing that's how it would work; I haven't done any SSE coding yet.)
And (obviously, I hope) none of this matters if you only call cos/sin once in a while. If you have a pile of data and can batch process it, then the time difference can be significant.
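As far as I know SSE has no packed sine instruction, but it does have a packed square root, which is closer to the original question anyway. A sketch of the batching idea, assuming an x86 compiler with SSE support and the standard xmmintrin.h intrinsics (sqrt4 is a made-up name):

```c
#include <xmmintrin.h>  /* SSE intrinsics, x86 only */

/* Computes the square roots of four floats with a single packed
   instruction (SQRTPS). Unaligned load/store are used so the arrays
   do not need 16-byte alignment. */
void sqrt4(const float in[4], float out[4])
{
    __m128 v = _mm_loadu_ps(in);
    _mm_storeu_ps(out, _mm_sqrt_ps(v));
}
```

This only pays off when you really have a batch of values to process; for one sqrt per frame it makes no difference.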
Here is what I do when I have a situation like that.
I assume that somewhere in your program you are going to compare that velocity to something, say maxVelocity.
Later you might be doing this:
if (velocity > maxVelocity)
{
velocity = maxVelocity;
}
or something like that.
Here is what I do.
I precompute maxVelocity^2, so say:
double maxVelocity = 100;
double maxVelocitySquared = maxVelocity*maxVelocity;
So now, instead of calculating the sqrt to obtain the velocity, I can compare the squared values:
velocitySquared = x*x + y*y;
if (velocitySquared > maxVelocitySquared)
{
velocity = maxVelocity;
}
and you never need to compute the sqrt for the comparison.
"I pity the fool, thug, or soul who tries to take over the world, then goes home crying to his momma."
- Mr. T