Some optimizations

Started by
5 comments, last by Crispy 21 years, 1 month ago
Hi, I will refrain, for the time being, from saying what I need these optimizations for (no, not for a homework assignment). Anyway, I need to optimize the following:
1) -6 / x^2 (if possible)
2) 1 / x (if possible)
3) rotate x into the range -PI <= x <= PI
where x is a floating-point variable. My biggest concern is the third part: x can have any value, but I need to handle it sanely (and quickly)... I know this can be done really fast, but I can't figure it out on my own. Optimizations for 1 and 2 most probably involve raw assembly code, something I am not at all adept at writing. So, any help would be very much appreciated. Crispy
"Literally, it means that Bob is everything you can think of, but not dead; i.e., Bob is a purple-spotted, yellow-striped bumblebee/dragon/pterodactyl hybrid with a voracious addiction to Twix candy bars, but not dead."- kSquared
I think you can use modulus on your third one, with the constraints, to get the remainder. Or I just do something like this:

(I rarely use C++, so this is pseudo/real code)

do
{
    if (x < -pi) x += 2 * pi;
    if (x > pi) x -= 2 * pi;
}
while (x < -pi || x > pi);


OK, I guess that's just real code.
You could speed up that code a lot by doing

while (x < -pi)
    x += 2 * pi;
while (x > pi)
    x -= 2 * pi;

which eliminates half the conditionals and one branch. Make sure 'pi' is the same type as x for optimal speed.

I think there is an FPU assembler instruction to compute 1/x, but I have no idea what it is. If you can get number 2 fast, you can do something like "y = 1/x" and then "-6 * y * y" to calculate -6/x^2, since multiplication is faster than division.
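The reciprocal trick above can be sketched in C as follows. This is only an illustration: the function name neg_six_over_x2 is made up for this example, and whether it actually beats what the compiler generates for -6 / (x * x) would need measuring on the target machine.

```c
/* Hypothetical helper (name invented for this sketch):
   computes -6 / x^2 using one division and two multiplications. */
static inline double neg_six_over_x2(double x)
{
    double y = 1.0 / x;   /* the only division */
    return -6.0 * y * y;  /* -6 * (1/x) * (1/x) */
}
```

For example, neg_six_over_x2(2.0) evaluates to -1.5. Because the reciprocal is rounded before it is squared, the result can differ from -6 / (x * x) in the last bit for some inputs.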
"Walk not the trodden path, for it has borne it's burden." -John, Flying Monk
Those loops are nowhere near efficient.

Try this on for size:

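// reduce_twopi.h

#include <math.h> // for floor()

#ifndef M_PI // some math.h headers already define this
#define M_PI 3.14159265358979
#endif
#define TWO_PI (2. * M_PI)
#define one_over_two_pi (1. / TWO_PI)

// Wraps arg into the range [-pi, pi) without branching.
inline double reduce_twopi( double arg )
{
    arg = (arg + M_PI) * one_over_two_pi; // shift and scale so one period maps to [0, 1)
    arg -= floor( arg );                  // keep only the fractional part
    return arg * TWO_PI - M_PI;           // scale back and undo the shift
}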


Note that a good compiler (with the right options) will turn "floor" into the appropriate inline assembly, thus this function is branchless, pipelines reasonably well, and can get inlined into almost any caller (and thus scheduled with it, to reduce latency penalties).

Note: you can't use static const double in headers, unfortunately, hence the #defines.
Why would they need to be static?

just do:
const double M_PI = 3.14159265358979;
//and so on


I am a signature virus. Please add me to your signature so that I may multiply.
chromecode.com - software with source code
Or just do
x=fmod(x+PI,2*PI)-PI;
if (x < -PI) x += 2*PI; // fmod keeps the sign of x+PI, so fix up negative inputs
if you want to be quick about it.

As for the first two things, I don't think there are any asm instructions that will do them faster than what the compiler would generate.
Okay, thanks for all of your replies. I think I'll go with what AP2 suggested; loops are simply too slow... Didn't know about fmod() either, gonna check that out as well.

