SQRT(); Another Option? give me your ideas....

Started by
54 comments, last by OpenGL_Guru 19 years, 10 months ago
So the thread title says it all. Let me elaborate a moment. You need to find the distance between 2 points, let's say an asteroid and a ship. To get this distance (probably to check for some sort of collision) you need to use sqrt, like so:

sqrt(x*x + y*y) - 2D check
sqrt(x*x + y*y + z*z) - 3D check

where x, y and z are the differences between the two points' coordinates.

Here's the problem, and I have witnessed it first hand: the sqrt function takes anywhere from 60 - 80 CPU cycles. This is incredibly SLOW! Probably about 55 times too slow if you ask me. So I ask you, what's the best option? Yes, there are a few, and I have my own idea, such as doing fast addition... but I was wondering if anyone else had a similar or better approach. One sqrt really isn't that big of a deal, but at 60 FPS or more you are eating away at your CPU. Thanks all in advance for your ideas! --guru
heh
Much of the time you don't actually need to do the square root in distance calculations. For example, collision detection is usually handled by comparing the computed distance against a known threshold. In those cases you can instead compare the computed square distance against the threshold squared.
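A minimal sketch of that idea in C, assuming circular colliders. The `Vec2` type and the function names are hypothetical, made up for this illustration and not taken from any engine:

```c
#include <stdbool.h>

/* Hypothetical 2D point type, used only for this sketch. */
typedef struct {
    float x, y;
} Vec2;

/* Squared distance between two points: no sqrt needed. */
static float dist_sq(Vec2 a, Vec2 b)
{
    float dx = a.x - b.x;
    float dy = a.y - b.y;
    return dx * dx + dy * dy;
}

/* True when two circles of radius ra and rb overlap or touch.
   Compares the squared distance with the squared sum of the radii. */
static bool circles_collide(Vec2 a, float ra, Vec2 b, float rb)
{
    float r = ra + rb;
    return dist_sq(a, b) <= r * r;
}
```

Since both sides of the comparison are non-negative, squaring them does not change the outcome, so the result matches the sqrt-based check.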
isqrt.h:
#ifndef __ISQRT_H
#define __ISQRT_H

#ifdef __cplusplus
extern "C" {
#endif

unsigned short isqrt(unsigned long a);
unsigned short iisqrt(unsigned long a);
unsigned short ihypot (unsigned long dx, unsigned long dy);

#ifdef __cplusplus
}
#endif

#endif


isqrt.c:
#include "isqrt.h"

/* Bit-by-bit integer square root.
   Note: assumes unsigned long is 32 bits (a >> 30 takes the top two bits). */
unsigned short isqrt(unsigned long a)
{
    unsigned long rem = 0;
    unsigned long root = 0;
    int i;

    for (i = 0; i < 16; i++) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 30));   /* bring down the next two bits */
        a <<= 2;
        root++;
        if (root <= rem) {
            rem -= root;
            root++;
        }
        else
            root--;
    }
    return (unsigned short) (root >> 1);
}

/* Integer hypotenuse; dx * dx + dy * dy must fit in 32 bits. */
unsigned short ihypot (unsigned long dx, unsigned long dy)
{
    return isqrt (dx * dx + dy * dy);
}

/* Same method, with the trial divisor kept in its own variable. */
unsigned short iisqrt(unsigned long a)
{
    unsigned long rem = 0;
    unsigned long root = 0;
    unsigned long divisor = 0;
    int i;

    for (i = 0; i < 16; i++) {
        root <<= 1;
        rem = ((rem << 2) + (a >> 30));
        a <<= 2;
        divisor = (root << 1) + 1;
        if (divisor <= rem) {
            rem -= divisor;
            root++;
        }
    }
    return (unsigned short)(root);
}
_________
karx11erx
Visit my Descent site or see my case mod.
What about the sin() and cos() functions?
Any idea to speed these up?
You will not really need sqrt unless you want to normalize vectors.
For this, there's a pretty good approximation of 1/sqrt(x) using the following function:

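#include <stdint.h>
#include <string.h>

float inv_sqrt(float x) {
    float xhalf = 0.5f * x;
    int32_t i;
    memcpy(&i, &x, sizeof i);        /* read the float's bits as an integer */
    i = 0x5f3759df - (i >> 1);       /* initial guess from the magic constant */
    memcpy(&x, &i, sizeof x);        /* back to float */
    x = x * (1.5f - xhalf * x * x);  /* one Newton-Raphson refinement step */
    return x;
}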
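As a rough usage sketch, here is how such an approximation might be used to normalize a vector. The `Vec3` type and the names `rsqrt_approx` and `normalize_approx` are hypothetical; the approximation is repeated under a different name only so the snippet compiles on its own. It assumes a 32-bit IEEE 754 `float` and a non-zero input vector:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 3D vector type, used only for this sketch. */
typedef struct {
    float x, y, z;
} Vec3;

/* Approximate 1/sqrt(x) for x > 0 (same method as in the post above). */
static float rsqrt_approx(float x)
{
    float xhalf = 0.5f * x;
    int32_t i;
    memcpy(&i, &x, sizeof i);           /* float bits -> integer */
    i = 0x5f3759df - (i >> 1);          /* initial guess */
    memcpy(&x, &i, sizeof x);           /* integer bits -> float */
    return x * (1.5f - xhalf * x * x);  /* one Newton-Raphson step */
}

/* Scale v to approximately unit length without calling sqrt. */
static Vec3 normalize_approx(Vec3 v)
{
    float len_sq = v.x * v.x + v.y * v.y + v.z * v.z;
    float inv_len = rsqrt_approx(len_sq);
    Vec3 out = { v.x * inv_len, v.y * inv_len, v.z * inv_len };
    return out;
}
```

With a single refinement step the result is only accurate to a fraction of a percent, which is often fine for lighting or steering but may not be for physics. A second Newton-Raphson step would tighten it at the cost of a few more multiplies.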


SaM3d!, a cross-platform API for 3d based on SDL and OpenGL.
The trouble is that things never get better, they just stay the same, only more so. -- (Terry Pratchett, Eric)
ever heard about lookup-tables?

if not, take either the red pill, or google...
sin() and cos() are typically computed with a polynomial approximation such as a truncated Taylor series (look it up). Newton's method is the root-finding technique you'd use for something like sqrt().

So if you don't want to do the lookup table:

You can do this yourself and evaluate the approximation to a lesser degree of precision, thus fewer calculations. (Of course, to get max performance you would do this in assembly.)

EDIT: misspelled newton

[edited by - snisarenko on May 25, 2004 4:15:20 PM]
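A minimal sketch of the "lesser degree of precision" idea. Note that this uses a truncated Taylor series rather than Newton's method, since Newton's method is a root-finding technique and does not apply to sin() directly. The function name `sin_poly` is hypothetical, and the input is assumed to be already reduced to [-pi/2, pi/2]:

```c
/* Approximate sin(x) for x in [-pi/2, pi/2] using the first three
   Taylor terms, x - x^3/6 + x^5/120, evaluated in Horner form. */
static float sin_poly(float x)
{
    float x2 = x * x;
    return x * (1.0f - x2 * (1.0f / 6.0f - x2 * (1.0f / 120.0f)));
}
```

The error grows toward the ends of the interval and is roughly 0.005 at pi/2. Adding the next term (x^7/5040) or switching to minimax coefficients would reduce it; dropping a term makes it cheaper and rougher.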
I know assembly would be faster, just because it's right there next to the CPU and you don't have to worry about buses, but in a couple of years we are looking at 5 - 7 GHz machines. CPUs keep getting faster and faster. I mean, when will it be worth anyone's time to even use assembly?
heh
Currently, the new Prescott (or was it Northwood?) chipset supports up to 6 GHz processors.

Also, most game consoles have processors < 1 GHz. That's where you would use assembly. Processor power should also not be measured in clock speeds alone.
- fyhuang [ site ]
Erm... you still have to worry about the bus when writing assembly. In fact your C compiler compiles to machine code, and your assembler also produces machine code. It's just that humans can usually code better than a compiler can if they really work at it.

The Xbox has a 733 MHz processor and the GameCube a 485 MHz one, yet the GameCube sometimes looks better. You see, the speed of the processor does have an impact, but so does everything else. I think it's fine to use asm on a 200 GHz machine if the task calls for it.

This topic is closed to new replies.
