error = sine (f64vec4* result, const f64vec4* angle);
error = cosine (f64vec4* result, const f64vec4* angle);
The routines compute all 4 elements simultaneously and in parallel with SIMD/AVX/FMA instructions. The more generalized function looks like this:
error = sincos (f64vec4* result, const f64vec4* angle, int select);
Here the low 4 bits of the select argument specify whether the sine or cosine is desired for each of the f64 elements in the angle argument. And yes, believe it or not, the routine computes any arbitrary combination of sines and cosines, and it computes them all simultaneously and in parallel. I didn't even realize that was possible (efficiently) until I wrote these routines, but mostly thanks to the vcmppd and vpcmov instructions, it is!
Anyway, my problem is testing the results! I have several versions of these functions based upon 7, 8, 9, 10 and 11 coefficient Chebyshev polynomials. Obviously the fewer terms the faster they run... though the difference is strangely tiny. I need to know what errors each version produces to make a final decision on which routines to "keep" and make standard.
I can't trust the results of the FPU sine and cosine instructions or fancy math libraries to be precise to the final bit... partly because I don't trust anyone or anything, but partly because my results imply they are not precise. For example, my 9, 10, 11 coefficient routines often differ from those "standard" results by several bits, yet agree closely with each other. Hmmmm. And I learned that some of those "standard" computations are done with 7 and 8 coefficient Chebyshev routines, which should be on the low end of the precision scale. My suspicion is further reinforced by observing that those "standard" computations give results much closer to my 7 and 8 coefficient routines than to my 9, 10, 11 coefficient routines.
But the above is purely "reasonable inference" and "educated guesswork" on my part. I need someone to tell me how to create a "gold routine" to test ALL of them against. I'm plenty math-savvy to take an existing equation (like the Chebyshev equations for sine and cosine) plus a table of coefficients and make working routines. I am NOT math-savvy enough to figure out how to make "gold standard" sine and cosine routines to verify the precision of all these "standard/common" routines, and my routines.
So, is anyone out there enough of a math genius to face this challenge?
Edited by maxgpgpu, 18 July 2012 - 07:51 PM.