Some kind of fast unit vector normalization?
I've been doing some research, and I've come across references to a fast vector normalization algorithm that uses Newton-Raphson approximations for the inverse square root.
Unfortunately, I can't actually find the algorithm. Does anyone know it or another method for quickly finding the unit vector for a vector?
you can also use the rsqrtss assembly instruction for processors that have SSE. It gives 12-bits of precision and runs with just 4 cycles of latency, so the slowest part of the normalization would be dotting the vector on itself.
that InvSqrt is slower than 1.0/sqrt(x), atleast on my computer anyway =)
try this one:
try this one:
float __declspec(naked) __fastcall isqrt ( float x ){ __asm { mov eax, 0be6eb508h mov dword ptr [esp-12], 03fc00000h sub eax, dword ptr [esp+4] sub dword ptr [esp+4], 800000h shr eax, 1 mov dword ptr [esp-8], eax fld dword ptr [esp-8] fmul st, st fld dword ptr [esp-8] fxch st(1) fmul dword ptr [esp+4] fld dword ptr [esp-12] fld st(0) fsub st, st(2) fld st(1) fxch st(1) fmul st(3), st fmul st(3), st fmulp st(4), st fsub st, st(2) fmul st(2), st fmul st(3), st fmulp st(2), st fxch st(1) fsubp st(1), st fmulp st(1), st ret 4 }}
In case vector is already very close to unit length, you can use that:
v=v*(1.5-0.5*(v.x*v.x+v.y*v.y+v.z*v.z))
clickster
Note: if vector is quite non-unit-length, this thing will "explode", in sense that it'll get further and further from 1 and will quickly leave the floating point range.
Other things that is _actually_ closely based on Newton's method (interactive demonstration how Newton's method works) usually does not "explode" so badly, but often are slower.
v=v*(1.5-0.5*(v.x*v.x+v.y*v.y+v.z*v.z))
clickster
Note: if vector is quite non-unit-length, this thing will "explode", in sense that it'll get further and further from 1 and will quickly leave the floating point range.
Other things that is _actually_ closely based on Newton's method (interactive demonstration how Newton's method works) usually does not "explode" so badly, but often are slower.
Quote:Original post by glSmurf
that InvSqrt is slower than 1.0/sqrt(x), atleast on my computer anyway =)
try this one:
*** Source Snippet Removed ***
Yes, on my computer too (athlon XP). InvSqrt gives less precision and is slower than the standard sqrt. Maybe it uses SSE when I compile something with full optimization. I don't know.
...and look out for all those invalid benchmarktests.
v=v*(1.5-0.5*(v.x*v.x+v.y*v.y+v.z*v.z));
this is actually slower than my simple normalize function
I use doubles for all sqrt/cos/sin and so on ...as for instance sqrt is alot faster than sqrtf.
and using my isqrt asm function then this is as fast as I could get it:
this is actually slower than my simple normalize function
inline void Vector::normalize(){ double d = x * x + y * y + z * z; if ( d == 1.0 || d == 0.0 ) return; d = 1.0 / sqrt(d); x *= d; y *= d; z *= d;}
I use doubles for all sqrt/cos/sin and so on ...as for instance sqrt is alot faster than sqrtf.
and using my isqrt asm function then this is as fast as I could get it:
inline void Vector::normalize(){ float d = x * x + y * y + z * z; if ( d == 1.0f || d == 0.0f ) return; d = isqrt(d); x *= d; y *= d; z *= d;}
well maybe itss not he best benchmark test ever made but al functions geet tested under the same conditions:
dont know if my result are due to hardware or faulty benchmark tests.
As I said my simple normalize() was the fastest on my computer ..never said it was the best for YOU.
int time = glutGet(GLUT_ELAPSED_TIME);for (int j=0; j<10000000; ++j){ v.normalize(); //v = v*(1.5-0.5*(v.x*v.x+v.y*v.y+v.z*v.z));}time = glutGet(GLUT_ELAPSED_TIME) - time;printf("time: %i\n", time);
dont know if my result are due to hardware or faulty benchmark tests.
As I said my simple normalize() was the fastest on my computer ..never said it was the best for YOU.
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement