# Faster Sin and Cos

Hi

I'm curious, how do you do the fitting/checking? Do you use something like Mathematica? Or another software you can recommend? Or do you use some test routines you write yourself?

##### Share on other sites

I wrote my own tests. Mathematica and other tools are good for getting a result that is mathematically near-exact, but once you feed those numbers into a computer you get different values.
Fitting the values to floats or doubles means adjusting the coefficients to account for the truncation that happens when they are converted to floats or doubles.
Accounting for this requires custom tools.

Later I will describe how I check the accuracy, and I will post all of the results along with benchmark tests.
I will additionally benchmark Adam_42’s version, as well as a version formatted the way the paper linked by Alberth shows.

L. Spiro

##### Share on other sites

Interesting endeavour, but I'll stay with Garrett's implementation (see post by Alberth). It is good enough (within 1 digit of float precision), it is "proven", and it is consistently the fastest in my tests (about 22.5 times faster than the C library).

My results, if anyone is interested:

-O3 -march=skylake -mfpmath=sse:
sin               : 37470663    [0.120873]
sin_garrett_c11   : 1585131     [0.005113]
sin_garrett_c11_s : 2178746     [0.007028]
sin_spiro_s       : 3010460     [0.009711]
sin_spiro         : 2755805     [0.008890]
ratio: 23.6388431

with -ffast-math also:
sin               : 37378012    [0.120574]
sin_garrett_c11   : 1585772     [0.005115]
sin_garrett_c11_s : 2179679     [0.007031]
sin_spiro_s       : 3010939     [0.009713]
sin_spiro         : 2783895     [0.008980]
ratio: 23.570861

sin               : 36788379    [0.118672]
sin_garrett_c11   : 1633470     [0.005269]
sin_garrett_c11_s : 2140569     [0.006905]
sin_spiro_s       : 2981539     [0.009618]
sin_spiro         : 2742520     [0.008847]
ratio: 22.521613

sin               : 36832914    [0.118816]
sin_garrett_c11   : 1636284     [0.005278]
sin_garrett_c11_s : 2141416     [0.006908]
sin_spiro_s       : 2982003     [0.009619]
sin_spiro         : 2740198     [0.008839]
ratio: 22.510098

You can make those functions significantly faster [...] I tested this in a VS 2015 x64 release build. YMMV.

This one is funny. On my machine, with GCC 6.1 (64-bit), it takes about twice as long. I wonder how it can be so different.

##### Share on other sites
You are comparing his 11th-degree version to my 15th-degree version, which will naturally be slower.
I would only be interested in comparing his 15th-degree to my 15th-degree. I will derive better 11th-degree constants later; these take time.
With those, performance should be exactly equal to his (unless re-ordering the expression matters), but the fit should be tighter (more accurate).

L. Spiro

##### Share on other sites

Yes, that's true. Running from -PI to +PI in a million steps, and looking at the difference to the C lib function, we get:

sin                : emax=0.000000000000000     eavg=0.000000000000000  sse=0.000000000000000  // unsurprising
sin_garrett_c11    : emax=0.000000291691886     eavg=0.000000051244958  sse=0.000000003505368
sin_garrett_c11_s  : emax=0.000000472727176     eavg=0.000000058468439  sse=0.000000005403125
sin_spiro_s        : emax=0.000000350934626     eavg=0.000000040066997  sse=0.000000003403369
sin_spiro          : emax=0.000000019798559     eavg=0.000000008708427  sse=0.000000000126031
sin_adam42         : emax=0.000000019798559     eavg=0.000000008708427  sse=0.000000000126031

The clear winner is the double-precision version of your function (the one you posted is single precision). The double-precision version runs faster than the single-precision version anyway. Interestingly, there is no observable difference between your version and Adam_42's. I would have expected one: after all, they perform the operations in a different order, so the results should differ very slightly.

It will be interesting to see how well the C11 version fares. The question is which metric is most important; I'm almost inclined to think "max error".

Edited by samoth

##### Share on other sites

I can see L. Spiro is doing a good job of getting fast approximations to sin and cos, but I wonder why this is an important problem. What are people doing (in particular in games) that makes the calls to sin and cos take a noticeable chunk of time?

It would be nice if the results were never larger than one (e.g., Sin(1.57083237171173f) gives me 1.00000011920929f). Can the coefficient optimization be constrained by that?

EDIT: This forum is too smart for its own good. I am making two unrelated comments and I am trying to post them in two separate posts, but the forum combines them. Annoying.

Edited by Álvaro