Could you tell me how the code works ?
I have got some code which calculates the length of a vector, but i don't understand it.
The code is showed below:
========================================================
int Fast_Distance_2D(int x, int y)
{
// this function computes the distance from 0,0 to x,y with 3.5% error
// first compute the absolute value of x,y
x = abs(x);
y = abs(y);
// compute the minimum of x,y
int mn = MIN(x,y);
// return the distance
return(x+y-(mn>>1)-(mn>>2)+(mn>>4));
} // end Fast_Distance_2D
///////////////////////////////////////////////////////////////////////////////
float Fast_Distance_3D(float fx, float fy, float fz)
{
// this function computes the distance from the origin to x,y,z
int temp; // used for swaping
int x,y,z; // used for algorithm
// make sure values are all positive
x = fabs(fx) * 1024;
y = fabs(fy) * 1024;
z = fabs(fz) * 1024;
// sort values
if (y < x) SWAP(x,y,temp)
if (z < y) SWAP(y,z,temp)
if (y < x) SWAP(x,y,temp)
int dist = (z + 11 * (y >> 5) + (x >> 2) );
// compute distance with 8% error
return((float)(dist >> 10));
} // end Fast_Distance_3D
======================================================
The author said it's base on Taylor Series. I know Taylor Series, but I don't know how the code works. Could you help me? Thank you!
I'm not a native English speaker, and I hope my words are clear enough.
It is simply an approximation. In the 2D version, the function d = sqrt(a2 + b2) is approximated by the function d = a + .375b(where a is the length of the longest side and b is the length of the shortest side). Using shortest and longest reduces the error. The 3D version is similar.
Here is a graph of the two functions (a is constant):
[Edited by - JohnBolton on January 20, 2006 3:49:09 PM]
Here is a graph of the two functions (a is constant):
[Edited by - JohnBolton on January 20, 2006 3:49:09 PM]
With all those branches that 'fast approximation' is probably slower than doing it the normal way (and without the 8% error). Maybe someone less lazy than me can whip up a quick benchmark.
i think it's called the manhattan distance.
it uses maclaurin/taylor series to approximate, however i don't know how it's derived since it has two variables and i'm very curious. if anyone can explain?
it uses maclaurin/taylor series to approximate, however i don't know how it's derived since it has two variables and i'm very curious. if anyone can explain?
Quote:Original post by OrangyTang
With all those branches that 'fast approximation' is probably slower than doing it the normal way (and without the 8% error). Maybe someone less lazy than me can whip up a quick benchmark.
I did a quick test, I got the following results:
Fast_Distance_3D: 10.3533 secs, 2^32/10 iterations
sqrt(x*x+y*y+z*z): 7.68405 secs, 2^32/10 iterations
Fast_Distance_2D: 40.7154 secs, 2^32 iterations
int_sqrt(x*x+y*y): 1.39683E-006 secs, 2^32 iterations
For integer sqrt I used the following function:
unsigned int isqrt(unsigned int input){ unsigned int rest = input; unsigned int step = (1<<30); unsigned int result = 0; if (step <= rest) result |= (1<<15), rest -= step; step = (result << 15) + (1<<28); if (step <= rest) result |= (1<<14), rest -= step; step = (result << 14) + (1<<26); if (step <= rest) result |= (1<<13), rest -= step; step = (result << 13) + (1<<24); if (step <= rest) result |= (1<<12), rest -= step; step = (result << 12) + (1<<22); if (step <= rest) result |= (1<<11), rest -= step; step = (result << 11) + (1<<20); if (step <= rest) result |= (1<<10), rest -= step; step = (result << 10) + (1<<18); if (step <= rest) result |= (1<<9), rest -= step; step = (result << 9) + (1<<16); if (step <= rest) result |= (1<<8), rest -= step; step = (result << 8) + (1<<14); if (step <= rest) result |= (1<<7), rest -= step; step = (result << 7) + (1<<12); if (step <= rest) result |= (1<<6), rest -= step; step = (result << 6) + (1<<10); if (step <= rest) result |= (1<<5), rest -= step; step = (result << 5) + (1<<8); if (step <= rest) result |= (1<<4), rest -= step; step = (result << 4) + (1<<6); if (step <= rest) result |= (1<<3), rest -= step; step = (result << 3) + (1<<4); if (step <= rest) result |= (1<<2), rest -= step; step = (result << 2) + (1<<2); if (step <= rest) result |= (1<<1), rest -= step; step = (result << 1) + (1<<0); if (step <= rest) result++; if (input > result*result+result) result++; //for correct rounding return result;}
I did the test on an AMD Athlon64 3200+ with 1024 MB DDR RAM in Release mode with MSVC2005. Also I used QueryPerformanceCounter to get time.
Here is the complete test code:
#include <iostream>#include <cmath>#include <limits>#include <windows.h>#define SWAP(i,j,k) k = i; i = j; j = k#define MIN(x,y) x>y?x:yunsigned int isqrt(unsigned int input){ unsigned int rest = input; unsigned int step = (1<<30); unsigned int result = 0; if (step <= rest) result |= (1<<15), rest -= step; step = (result << 15) + (1<<28); if (step <= rest) result |= (1<<14), rest -= step; step = (result << 14) + (1<<26); if (step <= rest) result |= (1<<13), rest -= step; step = (result << 13) + (1<<24); if (step <= rest) result |= (1<<12), rest -= step; step = (result << 12) + (1<<22); if (step <= rest) result |= (1<<11), rest -= step; step = (result << 11) + (1<<20); if (step <= rest) result |= (1<<10), rest -= step; step = (result << 10) + (1<<18); if (step <= rest) result |= (1<<9), rest -= step; step = (result << 9) + (1<<16); if (step <= rest) result |= (1<<8), rest -= step; step = (result << 8) + (1<<14); if (step <= rest) result |= (1<<7), rest -= step; step = (result << 7) + (1<<12); if (step <= rest) result |= (1<<6), rest -= step; step = (result << 6) + (1<<10); if (step <= rest) result |= (1<<5), rest -= step; step = (result << 5) + (1<<8); if (step <= rest) result |= (1<<4), rest -= step; step = (result << 4) + (1<<6); if (step <= rest) result |= (1<<3), rest -= step; step = (result << 3) + (1<<4); if (step <= rest) result |= (1<<2), rest -= step; step = (result << 2) + (1<<2); if (step <= rest) result |= (1<<1), rest -= step; step = (result << 1) + (1<<0); if (step <= rest) result++; if (input > result*result+result) result++; //for correct rounding return result;}int Fast_Distance_2D(int x, int y){// this function computes the distance from 0,0 to x,y with 3.5% error// first compute the absolute value of x,yx = abs(x);y = abs(y);// compute the minimum of x,yint mn = MIN(x,y);// return the distancereturn(x+y-(mn>>1)-(mn>>2)+(mn>>4));} // end Fast_Distance_2Dfloat Fast_Distance_3D(float fx, float fy, float fz){ int temp; int x,y,z; // used for algorithm // make sure values are all positive x = static_cast<int>(fabs(fx) * 1024.0F); y = static_cast<int>(fabs(fy) * 1024.0F); z = static_cast<int>(fabs(fz) * 1024.0F); // sort values if (y < x) SWAP(x,y,temp); if (z < y) SWAP(y,z,temp); if (y < x) SWAP(x,y,temp); int dist = (z + 11 * (y >> 5) + (x >> 2) ); // compute distance with 8% error return((float)(dist >> 10));}float Distance_3D(float fx, float fy, float fz){ return sqrtf(fx*fx+fy*fy+fz*fz);}int Distance_2D(int fx, int fy){ return isqrt(fx*fx+fy*fy);}#undef maxint main(){ const unsigned int num_iter = std::numeric_limits<unsigned int>::max()/10; const unsigned int num_iter2 = std::numeric_limits<unsigned int>::max(); float result = 0.0F; int result2 = 0; __int64 freq; __int64 time0; __int64 time1; QueryPerformanceFrequency( reinterpret_cast<LARGE_INTEGER*>(&freq)); QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time0)); for(unsigned int i = 0; i < num_iter; ++i) { result += Fast_Distance_3D( result, result, result); } QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time1)); std::cout << "Time to use \"fast\" distance function: " << static_cast<double>(time1 - time0)/static_cast<double>(freq) << std::endl; QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time0)); for(unsigned int i = 0; i < num_iter; ++i) { result += Distance_3D( result, result, result); } QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time1)); std::cout << "Time to use distance function: " << static_cast<double>(time1 - time0)/static_cast<double>(freq) << std::endl; QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time0)); for(unsigned int i = 0; i < num_iter2; ++i) { result2 += Fast_Distance_2D( result2, result2); } QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time1)); std::cout << "Time to use \"fast\" 2d distance function: " << static_cast<double>(time1 - time0)/static_cast<double>(freq) << std::endl; QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time0)); for(unsigned int i = 0; i < num_iter2; ++i) { result2 += Distance_2D( result2, result2); } QueryPerformanceCounter( reinterpret_cast<LARGE_INTEGER*>(&time1)); std::cout << "Time to use 2d distance function: " << static_cast<double>(time1 - time0)/static_cast<double>(freq) << std::endl; std::cout << "Garbage:" << result <<std::endl; std::cin.get(); return 0;}
Well, I know, the discussion about sqrt has nothing new (I have checked all the threads before). I just feel curious in how the formula was derived (though i prefer carmack sqrt in fact). I think the method used in the code i posted may get well used in other calculating.
To CTar:
The isqrt function you posted i have found in glu lib before, but i don't understand it, too. Could you or anyone else do me a favor to explain it ?
Thank you very much !
To CTar:
The isqrt function you posted i have found in glu lib before, but i don't understand it, too. Could you or anyone else do me a favor to explain it ?
Thank you very much !
Ok, no more discussion is needed. I have found all the answers here:
All You Ever Want to Know About Square Root by Paul Hsieh
and here:
Computing Integer Square Root by James Ulery (PDF)
All You Ever Want to Know About Square Root by Paul Hsieh
and here:
Computing Integer Square Root by James Ulery (PDF)
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement