Archived

This topic is now archived and is closed to further replies.

q2guy

Optimizing reciprocal

Recommended Posts

I saw some time ago a trick to optimize a 1.0/something operation, but I have forgot where I see it and how it was done, anyone know it, and have a link or time to post it here ? thanks!

Share this post


Link to post
Share on other sites
no, nothing to do with sqrt aprox, It was a method to do a 1.0f/something ( 1.0/x^2, 1.0/(x+x^2+2*x), 1.0/(a-b) ...) faster than with a fldz/fdiv ...

Share this post


Link to post
Share on other sites
sse has one in.




If that''s not the help you''re after then you''re going to have to explain the problem better than what you have. - joanusdmentia

davepermen.net

Share this post


Link to post
Share on other sites
no, it was reciprocal optimization, like sse instruction will do internally, only I want to know the method to do it (aproximated I think)

thanks

Share this post


Link to post
Share on other sites
well, check out carmacks revsqrt, based on this, you can create a 1/x approx as well.

the carmack one approximates x^-(1/2), you want x^-(1/1).. so all you need to take away is the shift right (wich is division by two), sort of.

i''ve had one once, but thats about 6hd''s back:D




If that''s not the help you''re after then you''re going to have to explain the problem better than what you have. - joanusdmentia

davepermen.net

Share this post


Link to post
Share on other sites
If it''s not based only on a lut, it''s necessarilly the same concept as the used in the IEE754 + NR trick some name it as "Carmack''s" sqrt. But that was found much before Q3. Well Newton in one sense. You have to understand differentials and the IEEE754 format. That''s all. And yes anyway 1/x is rqsrt(x) squared. This to show you how everything is linked.

Share this post


Link to post
Share on other sites
I think I''ve a solution to your problem.

It uses the Newton iteration and an initial guess I came up completely by myself (But perhaps some others have discovered similar methods for the initial guess).


float frcp_estm(float x) // the initial guess

{
unsigned long rcp = 0x7ef311ef - *(unsigned long*)&x;
return *(float*)&rcp;
}

float frcp_aprox(float x) // approximation with 1 Newton iteration

{
float y = frcp_estm(x);
y *= 2-x*y; // Newton iteration

return y;
}


The maximum error of the initial guess is about 5,76%.
This error will be halfed by each Newton iteration.

But... The initial guess does NOT work for:
(i) +0; -0; +Inf; -Inf; NaN
(ii) denormal numbers
(iii) if the result would be a denormal number

Share this post


Link to post
Share on other sites