# pow alternative

This topic is 3830 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

first off, this is actually two questions in one (all c/c++ related): 1) suppose i'm calling this a few million times: pow(x,.125); given what you know about numbers (probably a whole lot more than me), how can this be sped up? or can it? 2) also, i need an general purpose alternative to float pow(float,float);. i've searched the web and found some very elaborate versions that supposedly offer better performance at a reduced precision (which is EXACTLY what i need), however none of them seem to produce faster results in my benchmark tests. if you've got a nice alternative pow function in your library, i'd love to see it! inline assembly versions warmly welcomed :)

##### Share on other sites
Quote:
 Original post by fluke1) suppose i'm calling this a few million times: pow(x,.125);given what you know about numbers (probably a whole lot more than me), how can this be sped up? or can it?

Can you change your algorithm to call it fewer times?
What properties does x have?

Just a thought really, have you profiled against:

sqrt(sqrt(sqrt(x)));

It might be possible, though I wouldn't know how, for someone else here optimise for the octic root of a number.

##### Share on other sites
because that function call would easily approach accuracy problems with IEEE754 would an approximation be sufficient? the most obvious one I can think of is Newton Raphson; see the Quake3 source code for Carmackien/Abrashien implementation (which you can probably bet is fast..)

see: http://www.beyond3d.com/content/articles/8/

[EDIT: might also give you some ideas about how you could implement a faster pow(float,float) ?]

##### Share on other sites
Quote:
Original post by dmatter
Quote:
 Original post by fluke1) suppose i'm calling this a few million times: pow(x,.125);given what you know about numbers (probably a whole lot more than me), how can this be sped up? or can it?

Can you change your algorithm to call it fewer times?
What properties does x have?

Just a thought really, have you profiled against:

sqrt(sqrt(sqrt(x)));

It might be possible, though I wouldn't know how, for someone else here optimise for the octic root of a number.

dmatter, unfortunately the algorithm cannot be called fewer times.

i tried sqrt(sqrt(sqrt(x))); and it actually offered a slight performance boost.

where a typical result might yeild a turn-around time of 38891ms, your method resulted in a 32907ms turn-around. i know it's all out of context here, but while that's not the optimal performance increase i was looking for - it is certainly much better than what i had!

and silvermace, yes, approximations are quite fine. i'll take a look at the link you provided. thanks!

##### Share on other sites
this is actually a pretty interesting optimisation problem.. I wonder if you could use SSE to calculate the square roots 4 elements at time? I think there is an unbreakable circular dependency there though, because x is a function of its previous value...

##### Share on other sites
A first attempt seems to be about twice as fast as std::pow(x,.125):
#include <cmath>struct EighthRoot {  double input;  double f_over_f_prime(double x) {    double x2=x*x;    double x4=x2*x2;    double x6=x4*x2;    double x7=x6*x;    double x8=x4*x4;    return (x8-input)/(8.0*x7);  }public:  EighthRoot(double input) : input(input) {  }  operator double () {    union {      long long as_long_long;      double as_double;    } hackish_union;    hackish_union.as_double = input;    hackish_union.as_long_long -= 0x3ff0000000000000ll;    hackish_union.as_long_long >>= 3;    hackish_union.as_long_long += 0x3ff0000000000000ll;    double x = hackish_union.as_double;    for(unsigned i=0;i<3;++i) {      x -= f_over_f_prime(x);    }    return x;  }};int main() {  std::cout << EighthRoot(1000.0) << '\n';}

##### Share on other sites
Quote:
 Original post by flukei tried sqrt(sqrt(sqrt(x))); and it actually offered a slight performance boost.
In that case then, besides getting an actual octic rooter function, you can be looking into finding optimised/approximated sqrt functions - of which I'm sure there are plenty.

Of course, if x has any usable properties then there could be far simpler optimisations. For example, if x only ever takes on a handful of different values, then you can perform the calculations in advance and just conditionally select the correct one. I suspect that's hoping for too much though.

##### Share on other sites
This is a bit better.
#include <iostream>#include <cmath>class EighthRoot {  double input;  double f_over_f_prime(double x) {    double x2=x*x;    double x4=x2*x2;    double x8=x4*x4;    return .125*x*(1.0-input/x8);  }public:  EighthRoot(double input) : input(input) {  }  operator double () {    // Find first guess by basically shifting the exponent part of the    // double by 3 (I don't really care much about the rest of the double)    union {      long long as_long_long;      double as_double;    } hackish_union;    hackish_union.as_double = input;    hackish_union.as_long_long >>= 3;    hackish_union.as_long_long += 0x37F2000000000000ll;    // I am sure one could find a slightly better constant        double x = hackish_union.as_double;    x -= f_over_f_prime(x);    x -= f_over_f_prime(x);    // Repeat more times for extra accuracy    return x;  }};int main() {  std::cout << EighthRoot(1000.0) << '\n';}

##### Share on other sites
Ah, nevermind. What I posted is only marginally faster than sqrt(sqrt(sqrt(x))) and significantly less precise. A better magic constant is 0x37f1504f333fa37ell, but it's not much better.

• ### What is your GameDev Story?

In 2019 we are celebrating 20 years of GameDev.net! Share your GameDev Story with us.

• 9
• 13
• 9
• 9
• 15
• ### Forum Statistics

• Total Topics
634076
• Total Posts
3015350
×