# cosinus function

I'm not sure which forum this should go in, blah blah, but here it is.. I made a function that calculates the cosine of an angle:

    // definitions:
    #define ONEPI 3.14159265358 // 97932384626433832795
    #define TWOPI 6.28318530717 // 95864769252867665590
    // didn't need ALL those decimals lol
    #define PRES 25 // Precision, 25 makes it as precise as the cmath cos function
                    // didn't gain so much speed by decreasing it (strange..)

    // function:
    double Cos( double ang )
    {
        bool cyc = true;
        int u = 2;
        double vrb[3];
        if( ang > TWOPI ) { while( ang > TWOPI ) { ang -= TWOPI; }}
        if( ang < -TWOPI ) { while( ang < -TWOPI ) { ang += TWOPI; }}
        vrb[2] = 1;
        while( u <= PRES )
        {
            vrb[0] = 1;
            vrb[1] = 1;
            for(int k = 1; k <= u; k++)
            {
                vrb[0] *= k;   // vrb[0] = u!
                vrb[1] *= ang; // vrb[1] = ang^u
            }
            if(cyc) { vrb[2] -= vrb[1] / vrb[0]; }
            else { vrb[2] += vrb[1] / vrb[0]; }
            cyc = !cyc;
            u += 2;
        }
        return vrb[2];
    }

What do you think of it? Can anyone think of a way to make it faster (without making it too inaccurate)? Right now it runs at about 99% of the speed of the cmath cos function, but since this thing is placed directly into the source file it should be possible to make it faster.. right?

Edit: trying to make it readable..

Edited by - Jesper T on November 2, 2001 4:17:39 AM

##### Share on other sites
Some ideas:

* restrict the range more, e.g. to 0 to pi/2. The smaller the angle the quicker the power series will converge. Then use the properties of cosine, such as cos(-a) == cos(a), to derive the cosine for angles outside this range from this.
* the initial two lines to limit the range will take a long time with large input angles. If this is a problem consider using mod.
* precalculate the factorials instead of calculating them each time. Store them as 1/factorial to eliminate the need for the division. You can also store them with alternating + and - signs, eliminating the need for 'cyc' to keep track of whether you are at an odd or even term (though this makes them less useful generally).
* as well as a fixed number of iterations, also test for the terms getting small enough, e.g. exit the loop if ang^n/n! is less than some limit of accuracy.
* use separate floats instead of an array for the locals, as this will let the compiler allocate them to registers (though this is compiler/platform dependent).
* if you can get the number of iterations down consider unrolling the loop (again how much this helps is compiler/processor dependent).
* finally consider doing it another way: if speed is important a method using lookup and interpolation can be a lot faster, at a cost of extra storage for the lookup tables.
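A minimal sketch pulling several of these suggestions together, assuming angles in radians. The names `fastCos`, `kInvFact`, `kPi` and the `1e-12` cutoff are illustrative choices, not something from the thread. It reduces the angle with `std::fmod` instead of a subtraction loop, folds it into [0, pi/2] using cosine's symmetries, uses precomputed signed reciprocal factorials, builds each power from the previous one, and stops early once a term is negligible.

```cpp
#include <cassert>
#include <cmath>

// Signed reciprocal factorials -1/2!, +1/4!, -1/6!, ... (hypothetical table).
static const double kInvFact[] = {
    -1.0 / 2.0,        1.0 / 24.0,        -1.0 / 720.0,
     1.0 / 40320.0,   -1.0 / 3628800.0,    1.0 / 479001600.0,
    -1.0 / 87178291200.0
};
static const int kTerms = sizeof(kInvFact) / sizeof(kInvFact[0]);

const double kPi = 3.14159265358979323846;

double fastCos(double ang)
{
    assert(std::isfinite(ang));
    // cos(-a) == cos(a), and fmod replaces the subtraction loop.
    ang = std::fmod(std::fabs(ang), 2.0 * kPi);   // now in [0, 2*pi)
    if (ang > kPi) ang = 2.0 * kPi - ang;          // cos(2*pi - a) == cos(a)
    bool negate = false;
    if (ang > 0.5 * kPi) {                         // cos(pi - a) == -cos(a)
        ang = kPi - ang;
        negate = true;
    }

    const double x2 = ang * ang;
    double power = 1.0;   // running value of ang^(2n)
    double sum = 1.0;
    for (int n = 0; n < kTerms; ++n) {
        power *= x2;
        const double term = power * kInvFact[n];
        sum += term;
        if (std::fabs(term) < 1e-12) break;   // later terms are smaller still
    }
    return negate ? -sum : sum;
}
```

With seven terms on the reduced range the truncation error stays below about 1e-10, so this is a starting point for experiments rather than a replacement for the library function.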

##### Share on other sites
GCC's cosine function is much much faster than yours. Besides that, take out those clamp loops (they slow it down a lot). Just make sure that the user passes clamped data. Here's an idea. Instead of your loops, try:

    #define FOURPI 12.5663706143
    // ...
    while( ang > TWOPI ) { ang -= FOURPI; }
    while( ang < -TWOPI ) { ang += TWOPI; }

That sped it up a lot in my tests (I was using a big number though). You probably don't need those ifs, either. Anyway, my advice is still to use the built-in cosine function.

[Resist Windows XP's Invasive Production Activation Technology!]

##### Share on other sites
ok, thanks for the tips

I tried this:

    #define HLFPI 1.57079632679
    #define ONEPI 3.14159265358
    #define TWOPI 6.28318530717

    double FACS[7] = {
        -0.5000000000000,  0.0416666666667, -0.0013888888889,
         0.0000248015873, -0.0000002755731,  0.0000000020876,
        -0.0000000000115   // -1/14!, unused: the loop below only runs six terms
    };

    // function:
    double Cos( double ang )
    {
        bool n = false;
        double t;
        double r = 1;
        int e = 2;
        if( ang < 0.0 ) { ang = -ang; }
        while( ang > TWOPI ) { ang -= TWOPI; }
        if( ang > ONEPI ) { ang = TWOPI - ang; }
        if( ang > HLFPI ) { ang = ONEPI - ang; n = true; }
        for(int f = 0; f < 6; f++)
        {
            t = ang;
            for(int k = 1; k < e; k++) { t *= ang; }
            r += t * FACS[f];
            e += 2;
        }
        if(n){ return -r; } else { return r; }
    }

By restricting the angle to 0 < angle < PI/2 I only need to run the loop six times.. it seems _slightly_ faster, but I think my system is a bit too unstable to do accurate tests on (lots of strange programs running in the background, I think, hehe).
Well, I'll probably use the built-in cosine/sine functions anyway, but it's fun trying to make your own versions as well.

edit: eye cat'n toype corrctly

Edited by - Jesper T on November 2, 2001 5:50:21 AM

##### Share on other sites

"GCC's cosine function"

What's GCC?

##### Share on other sites
.. actually, I get correct values even if I only run the loop 5 times.. strangely, I didn't notice any speed change.. hmmm
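One way to check how many terms are really needed is to measure the worst-case error of the truncated series against `std::cos` over the reduced range [0, pi/2]. The sketch below is only an illustration; the helper name `maxSeriesError` and the 1000-step sampling are assumptions, not part of the posted code.

```cpp
#include <cassert>
#include <cmath>

// Largest |series - std::cos| over [0, pi/2] when the Taylor series is cut
// after `terms` terms beyond the leading 1 (hypothetical helper).
double maxSeriesError(int terms)
{
    assert(terms >= 0);
    const double halfPi = 1.57079632679489661923;
    double worst = 0.0;
    for (int s = 0; s <= 1000; ++s) {
        const double x = halfPi * s / 1000.0;
        double term = 1.0;
        double sum = 1.0;
        for (int n = 1; n <= terms; ++n) {
            // term_n = -term_(n-1) * x^2 / ((2n-1) * 2n)
            term *= -x * x / ((2.0 * n - 1.0) * (2.0 * n));
            sum += term;
        }
        const double err = std::fabs(sum - std::cos(x));
        if (err > worst) worst = err;
    }
    return worst;
}
```

By this estimate five terms leave a worst-case error of a few times 1e-7 and six terms a few times 1e-9, so the difference does not show up when only six or so decimals are printed.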

quote:
Original post by Jesper T

"GCC's cosine function"

What's GCC?

GCC = "GNU Compiler Collection". It's a free compiler for multiple computer platforms, from the Free Software Foundation. It supports various languages including C, C++, Fortran, and others. You can easily write cross-platform console apps using GCC, and on free OSes such as Linux you can access windowing libraries, OpenGL, etc. But it may not have good support (if any) for Windows application development, using Windows OpenGL, DirectX, or the Windows graphical interface though.

Check out: http://www.gnu.org/software/gcc/gcc.html

Oh, and "GNU" means "GNU's Not Unix"

Someone else probably can fill in some blanks on this.

Graham Rhodes
Senior Scientist
Applied Research Associates, Inc.

##### Share on other sites
edit: /me posts two times just to be sure

Edited by - Jesper T on November 2, 2001 2:14:47 PM

##### Share on other sites
hmm ok, thanks.

(I want to guive yuo teh impresion that I am very intelgient)

##### Share on other sites
quote:
Original post by grhodes_at_work
But it may not have good support (if any) for Windows application development, using Windows OpenGL, DirectX, or the Windows graphical interface though.

The GCC compiler works fine for Windows development if you have a port of the Win32 libraries to GCC. Most people use the MinGW32 port, with Dev C++ as their IDE.


##### Share on other sites
Here's a fast angle clamperizer

    float ClampAngle(float fAngle)
    {
        if (fAngle > PI)
            fAngle -= floorf( (fAngle+PI)/PI2 ) * PI2;
        if (fAngle < -PI)
            fAngle -= ceilf( (fAngle-PI)/PI2 ) * PI2;
        return fAngle;
    }

Edited by - Thrump on November 2, 2001 10:15:53 PM

##### Share on other sites
Thrump, if you want something to be fast NEVER call floor or ceil. They can take over 100 clock cycles on an x86 CPU (that's worse than a single cos, sin, or sqrt). Use integer truncation instead.
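As a sketch of what integer truncation could look like here, the function below wraps an angle into roughly [-PI, PI] with a cast instead of `floorf`/`ceilf`. The name `ClampAngleTrunc` and the constant definitions are assumptions for illustration, and the approach only works while the number of whole turns fits in an `int`.

```cpp
#include <cassert>
#include <cmath>

const float PI  = 3.14159265f;
const float PI2 = 6.28318531f;

// Wraps fAngle into roughly [-PI, PI] using integer truncation
// (hypothetical variant of ClampAngle; assumes |fAngle / PI2| fits in an int).
float ClampAngleTrunc(float fAngle)
{
    assert(std::fabs(fAngle) < 1.0e9f);
    // A cast truncates toward zero, so bias by half a turn away from zero
    // to get round-to-nearest behaviour for both signs.
    const float turns = fAngle / PI2;
    const int whole = (int)(turns + (turns >= 0.0f ? 0.5f : -0.5f));
    return fAngle - (float)whole * PI2;
}
```

Whether the cast is actually cheaper than `floorf` depends on the compiler and CPU, so it is worth timing on the target platform.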


##### Share on other sites
Good to know. Those 2 are from the ps2 libs. Yeah, if I'd done that on the pc, I would have been in for a rude (and chunky) surprise.

##### Share on other sites
I don't know much about MIPS processors, so I couldn't say anything about them. I guess they switch rounding modes faster, or something.


##### Share on other sites
For interest's sake, I just timed them. Integer casting is still quite a bit faster.

1000 casts = 2100 cycles
1000 ceilf = 15000 cycles
1000 floorf = 24000 cycles

Not sure if my testing methods are sound. I did this. (T1 is a built in bus counter)

    *T1_COUNT = 0;
    for(i=0;i<1000;i++)
    {
        //x = (int)(x+1.0f);
        //x = ceilf(x);
        x = floorf(x);
    }
    duration = *T1_COUNT;
    printf("counter %f\n", duration);

One question. Why do you think doing both ceilf and floorf would only take 28000 cycles?
15000 + 24000 = 28000?
I could hazard a few guesses, but I'm not sure.
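One possible weakness of the loop above is that it feeds `floorf` its own output, so after the first pass the input is already a whole number. A hedged sketch of a harness that supplies a fresh non-integral value on each pass and keeps the result observable is shown below. It uses `std::chrono::steady_clock` as a stand-in for the hardware counter, and `timeFloorf` is a made-up name.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Times `iterations` calls of floor on changing float inputs and returns the
// elapsed nanoseconds. `sink` receives a checksum so the work cannot be
// discarded as unused (hypothetical helper).
long long timeFloorf(int iterations, float& sink)
{
    assert(iterations > 0);
    volatile float input = 0.25f;   // volatile read: not a compile-time constant
    float x = input;
    float sum = 0.0f;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sum += std::floor(x);   // float overload of std::floor
        x += 0.5f;              // 0.25, 0.75, 1.25, ... never a whole number
    }
    const auto stop = std::chrono::steady_clock::now();
    sink = sum;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

The extra add per iteration is included in the measurement, so timing an otherwise identical loop without the `floor` call gives a baseline to subtract.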

Edited by - Thrump on November 2, 2001 11:59:30 PM

##### Share on other sites
Weird. Doing all 3 takes 51000 cycles.

##### Share on other sites
Your compiler probably realizes that it only has to switch the FPU's rounding mode once before running the many ceils, floors, or casts. If you do all three in one test it has to do 3 switches per loop.


##### Share on other sites
Is that on the ps2 you timed them?

On 586+ architectures, ifs are evil. In tight loops it can be faster to do more work if you can eliminate an if.
Next, destroy loops if possible.

For instance:

    //this
    ((DWORD*)&ang)[1] &= 0x7FFFFFFF;
    //can replace this
    if(ang<0.0) {ang=-ang;}

That shaves off 6 ticks
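The cast in the snippet above relies on `double` being a 64-bit IEEE 754 value stored little-endian, so that element `[1]` is the high 32 bits containing the sign bit. It also sidesteps C++ aliasing rules. A more portable sketch of the same bit-mask idea, using the hypothetical name `absByMask`, copies the bits through `std::memcpy`:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Clears the sign bit of an IEEE 754 double (hypothetical helper; assumes
// double is 64 bits wide, which the static_assert checks).
double absByMask(double value)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);   // well-defined way to read the bits
    bits &= 0x7FFFFFFFFFFFFFFFULL;             // bit 63 is the sign bit
    std::memcpy(&value, &bits, sizeof value);
    assert(!(value < 0.0));                    // sanity check: never negative
    return value;
}
```

In practice `std::fabs` usually compiles down to an equivalent mask or a single instruction, so this is mostly useful for seeing what the trick does.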

If you have a number of values that you need to take the cosine of, make a cosine function that takes an array of doubles, takes the cosine then sticks the results back into the array.

You could try making that array of coefficients constant.

If you really want to try to outdo the math library's cos, get the Intel instruction set reference manuals and learn x86.

On my K6-3, I have your Cos function clocked at 973 ticks and the math library's cos at 124. There is an opcode for cosines (fcos); I think it was introduced with the 387.

Magmai Kai Holmlor
- Not For Rent

##### Share on other sites
Ok, not that I actually understood this:

    ((DWORD*)∠)[1] &= 0x7FFFFFFF;

But is it like a bit mask or something?

And this: ∠ ..caused trouble when I tried it

my compiler (msvc) doesn't recognize that character or something..


##### Share on other sites
What do you think?

    // Taylor coefficients of cos: (-1)^n / (2n)!
    double cos_table[6]=
    {
        1,
        -0.5,
        0.041666666666666666666666666666667,
        -0.0013888888888888888888888888888889,
        2.4801587301587301587301587301587e-5,
        -2.7557319223985890652557319223986e-7
    };

    // Taylor coefficients of sin: (-1)^n / (2n+1)!
    double sin_table[6]=
    {
        1,
        -0.16666666666666666666666666666667,
        0.0083333333333333333333333333333333,
        -0.0001984126984126984126984126984127,
        2.7557319223985890652557319223986e-6,
        -2.5052108385441718775052108385442e-8
    };

    // x raised to the non-negative integer power y
    double fast_int_power(double x,int y)
    {
        double r=1;
        for (int i=1;i<=y;i++)
            r*=x;
        return (r);
    }

    double _cos(double x)
    {
        double res;
        res = 0;
        for (int i=0;i<6;i++)
        {
            res += fast_int_power(x,i<<1)*cos_table[i];
        }
        return res;
    }

    double _sin(double x)
    {
        double res;
        res = 0;
        for (int i=0;i<6;i++)
        {
            res += fast_int_power(x,(i<<1)+1)*sin_table[i];
        }
        return res;
    }

##### Share on other sites
(i<<1) == (i * 2)

right ?

..but there is no angle checking
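The power helper recomputes every power of x from scratch for each term. A sketch of Horner's rule in x*x, using the same six cosine coefficients, needs only one multiply and one add per term. The names `cosHorner` and `cosCoeff` are hypothetical, and since there is still no range reduction the function is only meaningful for small |x|, roughly [-pi/2, pi/2].

```cpp
#include <cassert>
#include <cmath>

// Taylor coefficients of cos in powers of x^2: 1, -1/2!, 1/4!, ...
static const double cosCoeff[6] = {
    1.0, -1.0 / 2.0, 1.0 / 24.0, -1.0 / 720.0, 1.0 / 40320.0, -1.0 / 3628800.0
};

// Evaluates c0 + x2*(c1 + x2*(c2 + ...)) with x2 = x*x (hypothetical helper).
// No angle checking: the caller must pass |x| of about pi/2 or less.
double cosHorner(double x)
{
    assert(std::fabs(x) <= 1.6);
    const double x2 = x * x;
    double res = cosCoeff[5];
    for (int i = 4; i >= 0; --i) {
        res = res * x2 + cosCoeff[i];
    }
    return res;
}
```

The same loop works for the sine table if the final result is multiplied by x once.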


##### Share on other sites
For sine/cosine, something I tried was to create two rather large arrays of 3600 floats each, filled with the appropriate trig function values for 0.0 to 359.9 in 0.1 increments. Then, to find the sines, I just used a macro that multiplied the argument by ten and cast it to an integer to get the right array index.

It could probably be sped up considerably by reducing the "resolution" to 1/8 or maybe even 1/4 increments and then instead of multiplying by ten, just shifting to the left a bit.
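A sketch of the power-of-two variant, assuming degrees in [0, 360) and a resolution of 1/8 degree, so the index is the angle times 8, truncated. The names `gSinTable`, `InitSinTable`, and `TableSin` are illustrative only.

```cpp
#include <cassert>
#include <cmath>

const int kStepsPerDegree = 8;                 // 1/8 degree resolution
const int kTableSize = 360 * kStepsPerDegree;  // 2880 entries
static float gSinTable[kTableSize];

void InitSinTable()
{
    const double degToRad = 3.14159265358979323846 / 180.0;
    for (int i = 0; i < kTableSize; ++i) {
        gSinTable[i] = (float)std::sin(degToRad * i / kStepsPerDegree);
    }
}

// Looks up the nearest lower table entry. Requires 0 <= degrees < 360 and a
// prior call to InitSinTable().
float TableSin(float degrees)
{
    assert(degrees >= 0.0f && degrees < 360.0f);
    return gSinTable[(int)(degrees * kStepsPerDegree)];
}
```

With a float argument the scaling is still a multiply; an actual left shift only applies if the angle is already stored as an integer or fixed-point value.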

##### Share on other sites
Cosine is also defined as an infinite series I can't think of right now... don't use it, because it won't be accurate unless you expand it to the first five terms, and then it's slow to calculate.

##### Share on other sites
Just had a thought.... How about you store the sin and cosine values in an array from 0 to 360 (in 1 degree increments)? Then, in your function, linearly interpolate between the two closest angles in the array. So, you'd do something like:

    double _cos[361]; // 0..360 inclusive, so _cos[index1+1] is valid for index1 == 359

    void InitCosArray()
    {
        for( int i = 0; i <= 360; i++ )
        {
            double r = PI/180 * i;
            _cos[i] = cos( r );
        }
    }

    // constraints... 0 <= angle < 360
    // ( the "d_" stands for degrees, a function could also be written for radians ).
    double d_Cosine( double angle )
    {
        int index1 = (int)angle; // can't get rid of this, at least I don't know how to
        return _cos[index1] + ( ( angle - index1 ) * ( _cos[index1+1] - _cos[index1] ) );
    }

The accuracy isn't as good as cos(), but it's accurate to about 4 decimal places. I timed 1000000 cos and d_Cosine calls with the angle "179.235754". cos() produced "-0.999911" and timed at 160ms, d_Cosine produced "-0.999884" and timed at 130ms.

Marginally faster. If you're using whole degrees, just use a look-up table...
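The 0 <= angle < 360 constraint can be lifted by wrapping the input first. The sketch below shows one way to do that with `std::fmod`; `WrappedCosine`, `InitCosTable`, and `gCosTable` are hypothetical names, and the table has 361 entries so the upper neighbour always exists. For a 1 degree step the linear interpolation error of cosine is bounded by h*h/8 with h = pi/180, about 3.8e-5, which is consistent with the roughly four decimal places reported above.

```cpp
#include <cassert>
#include <cmath>

static double gCosTable[361];   // entry 360 repeats entry 0 for interpolation

void InitCosTable()
{
    const double degToRad = 3.14159265358979323846 / 180.0;
    for (int i = 0; i <= 360; ++i) {
        gCosTable[i] = std::cos(degToRad * i);
    }
}

// Linear interpolation on a 1 degree table for any finite angle in degrees.
// Requires a prior call to InitCosTable().
double WrappedCosine(double degrees)
{
    assert(std::isfinite(degrees));
    double a = std::fmod(degrees, 360.0);      // result is in (-360, 360)
    if (a < 0.0) a += 360.0;                   // now in [0, 360]
    const int index = (int)a;
    if (index >= 360) return gCosTable[360];   // guards the a == 360.0 rounding case
    const double frac = a - index;
    return gCosTable[index] + frac * (gCosTable[index + 1] - gCosTable[index]);
}
```

The `fmod` call adds cost of its own, so whether the wrapped version still beats `cos()` would need to be measured.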

[edit: spelling mistake]

Edited by - python_regious on November 7, 2001 11:51:23 AM

##### Share on other sites
Yeah, that looks fast, but the main problem (I think) is to check if the angle is greater than 2PI/360 and then reduce it. That requires some time, but maybe not so much if checking integers.. dunno.. oh, and btw, does anybody know if it is possible to do this:

    value = -value

by just changing the sign bit?

I tried this:

    ((DWORD*)∠)[1] &= 0x7FFFFFFF;

.. couldn't make it work.. I have no clue what those signs are.. well some, I guess the F's are hex..
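In general terms: for IEEE 754 doubles, negation is a flip of the top bit, so XOR with a mask that has only bit 63 set does it, whereas AND with `0x7FFF...` clears that bit and gives the absolute value instead. The Fs are indeed hexadecimal digits. A minimal sketch, with the hypothetical name `negateByXor` and assuming a 64-bit `double`:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Flips the sign bit of an IEEE 754 double (hypothetical helper; assumes
// double is 64 bits wide, which the static_assert checks).
double negateByXor(double value)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
    const bool wasNegative = std::signbit(value);
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    bits ^= 0x8000000000000000ULL;   // toggle bit 63, the sign bit
    std::memcpy(&value, &bits, sizeof value);
    assert(std::signbit(value) != wasNegative);   // sanity check: sign changed
    return value;
}
```

A plain `value = -value` typically compiles to the same kind of single instruction, so there is usually nothing to gain from doing it by hand.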
