pow alternative
first off, this is actually two questions in one (all c/c++ related):
1) suppose i'm calling this a few million times:
pow(x,.125);
given what you know about numbers (probably a whole lot more than me), how can this be sped up? or can it?
2) also, i need an general purpose alternative to float pow(float,float);. i've searched the web and found some very elaborate versions that supposedly offer better performance at a reduced precision (which is EXACTLY what i need), however none of them seem to produce faster results in my benchmark tests. if you've got a nice alternative pow function in your library, i'd love to see it!
inline assembly versions warmly welcomed :)
Quote:Original post by fluke
1) suppose i'm calling this a few million times:
pow(x,.125);
given what you know about numbers (probably a whole lot more than me), how can this be sped up? or can it?
Can you change your algorithm to call it fewer times?
What properties does x have?
Just a thought really, have you profiled against:
sqrt(sqrt(sqrt(x)));
It might be possible, though I wouldn't know how, for someone else here optimise for the octic root of a number.
because that function call would easily approach accuracy problems with IEEE754 would an approximation be sufficient? the most obvious one I can think of is Newton Raphson; see the Quake3 source code for Carmackien/Abrashien implementation (which you can probably bet is fast..)
see: http://www.beyond3d.com/content/articles/8/
[EDIT: might also give you some ideas about how you could implement a faster pow(float,float) ?]
see: http://www.beyond3d.com/content/articles/8/
[EDIT: might also give you some ideas about how you could implement a faster pow(float,float) ?]
Quote:Original post by dmatterQuote:Original post by fluke
1) suppose i'm calling this a few million times:
pow(x,.125);
given what you know about numbers (probably a whole lot more than me), how can this be sped up? or can it?
Can you change your algorithm to call it fewer times?
What properties does x have?
Just a thought really, have you profiled against:
sqrt(sqrt(sqrt(x)));
It might be possible, though I wouldn't know how, for someone else here optimise for the octic root of a number.
dmatter, unfortunately the algorithm cannot be called fewer times.
i tried sqrt(sqrt(sqrt(x))); and it actually offered a slight performance boost.
where a typical result might yeild a turn-around time of 38891ms, your method resulted in a 32907ms turn-around. i know it's all out of context here, but while that's not the optimal performance increase i was looking for - it is certainly much better than what i had!
and silvermace, yes, approximations are quite fine. i'll take a look at the link you provided. thanks!
this is actually a pretty interesting optimisation problem.. I wonder if you could use SSE to calculate the square roots 4 elements at time? I think there is an unbreakable circular dependency there though, because x is a function of its previous value...
A first attempt seems to be about twice as fast as std::pow(x,.125):
#include <cmath>struct EighthRoot { double input; double f_over_f_prime(double x) { double x2=x*x; double x4=x2*x2; double x6=x4*x2; double x7=x6*x; double x8=x4*x4; return (x8-input)/(8.0*x7); }public: EighthRoot(double input) : input(input) { } operator double () { union { long long as_long_long; double as_double; } hackish_union; hackish_union.as_double = input; hackish_union.as_long_long -= 0x3ff0000000000000ll; hackish_union.as_long_long >>= 3; hackish_union.as_long_long += 0x3ff0000000000000ll; double x = hackish_union.as_double; for(unsigned i=0;i<3;++i) { x -= f_over_f_prime(x); } return x; }};int main() { std::cout << EighthRoot(1000.0) << '\n';}
Quote:Original post by flukeIn that case then, besides getting an actual octic rooter function, you can be looking into finding optimised/approximated sqrt functions - of which I'm sure there are plenty.
i tried sqrt(sqrt(sqrt(x))); and it actually offered a slight performance boost.
Of course, if x has any usable properties then there could be far simpler optimisations. For example, if x only ever takes on a handful of different values, then you can perform the calculations in advance and just conditionally select the correct one. I suspect that's hoping for too much though.
This is a bit better.
#include <iostream>#include <cmath>class EighthRoot { double input; double f_over_f_prime(double x) { double x2=x*x; double x4=x2*x2; double x8=x4*x4; return .125*x*(1.0-input/x8); }public: EighthRoot(double input) : input(input) { } operator double () { // Find first guess by basically shifting the exponent part of the // double by 3 (I don't really care much about the rest of the double) union { long long as_long_long; double as_double; } hackish_union; hackish_union.as_double = input; hackish_union.as_long_long >>= 3; hackish_union.as_long_long += 0x37F2000000000000ll; // I am sure one could find a slightly better constant double x = hackish_union.as_double; x -= f_over_f_prime(x); x -= f_over_f_prime(x); // Repeat more times for extra accuracy return x; }};int main() { std::cout << EighthRoot(1000.0) << '\n';}
Ah, nevermind. What I posted is only marginally faster than sqrt(sqrt(sqrt(x))) and significantly less precise. A better magic constant is 0x37f1504f333fa37ell, but it's not much better.
fiddled with compiler settings? VC++ 7.1/9/10 are pretty capable...
http://msdn.microsoft.com/en-us/library/aa289157(VS.71).aspx
http://msdn.microsoft.com/en-us/library/aa289157(VS.71).aspx
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement