optimize square root

Started by
13 comments, last by bzroom 20 years, 1 month ago
quote:Original post by honayboyz
and at that, do i even need to square them? or cout i just get absolute value, which ever is faster?


You may need to square them because usually when you''re looking at distances:

d^2 = x^2 + y^2

You need to square both values and then add them to get the distance squared. Otherwise this will skew your math:

if x1 = 6 and y1 = 6
and x2 = 10 and y2 = 0

then if you don''t square them and just add them, your first point will appear farther away than your second point. Which it isn''t.
Advertisement
I would suggest using a handheld calculator thus releaving the burnden on the processor.
By far the fastest square root you can get,
are those SIMD instructions that are provided with SSE or 3DNow.
They give you 12 bit precision and eat only 2 cycles.

23 bit precision are possible with a few extra instructions.
On SSE you can even do 4 squareroots in parallel.

The drawback is, you have to write the same
for SSE, 3DNow and perhabs Fpu.

Onother option would be to use the faboulus VectorC
compiler, where you hint the compiler to do
lower precision square roots and divisions and reziprocal
square roots. It creates those SIMD instructions for you.


Chris
quote:Original post by Enigma
If you find you do need a square root, here's an optimised sqrt function based on the Carmack InvSqrt. The magic number can probably be tweaked a little more if anybody fancies playing with it. I tested it on the full range of normalised floating point numbers. The maximum percentage error was 2.84658% and the average was 1.73863%.
float mysqrt(float x){	union{		int intPart;		float floatPart;	} convertor;	union{		int intPart;		float floatPart;	} convertor2;	convertor.floatPart = x;	convertor2.floatPart = x;	convertor.intPart = 0x1FBCF800 + (convertor.intPart >> 1);	convertor2.intPart = 0x5f3759df - (convertor2.intPart >> 1);	return 0.5f*(convertor.floatPart + (x * convertor2.floatPart));}     


I think the accuracy can be slightly improved by changing the 0.5f in the return statement into a 0.4999f because the function always overestimates slightly. Again this number can probably be tweaked. Here's the results I got for timing 500 000 000 sqrts:

sqrt (standard): >38s
fsqrt (asm): >38s
mysqrt: <20s

That includes function call and loop overheads.

Enigma


Re: 0x5f3759df

You may wish to take a look at this website: Homepage of Chris Lomont, who has a paper "explaining the magic constant 0x5f3759df floating around the net for inverse square root code, and he shows that 0x5f375a86 is probably a better choice".

Jason Doucette - my online résumé page: www.jasondoucette.com
projects, real-time graphics, artificial intelligence, world records, wallpapers/desktops
"Great minds discuss ideas, average minds discuss events, small minds discuss people." - Admiral Hyman G. Rickover

[edited by - Jason Doucette on May 23, 2003 8:24:52 AM]
Jason Doucette / Xona.comDuality: ZF — Xbox 360 classic arcade shmup featuring Dual Play
So far, the fastest square root algorithm I have tested is this one. Posted by shadow12345 on 18 February 2004 1:43:25 PM:
http://www.gamedev.net/community/forums/topic.asp?topic_id=208248

quote:Original post by shadow12345
EDIT: here is another one from tricks of the 3d game programming gurus (in the far back of the book). It's based on the same idea. I think it's equivalent to the carmack inverse sqrt. It's ridiculously fast, but only within 5% accuracy.

/*	-From tricks of the 3d game programming gurus	I guess this makes me a 3d game programming guru then...right...*/float	Faster_Sqrtf(float f){	float	result;	_asm	{		mov eax, f		sub eax, 0x3f800000		sar eax, 1		add eax, 0x3f800000		mov result, eax	}	return result;}



matthewdoucette.com
- GFX (DOS/3Dfx/Rendered), Music, A.I., Inspirational Quotes, Wallpapers, Chess
- Contact
jasondoucette.com
pureproductivity.com - Pure Productivity Inc. Web Development & Hosting

"Discussion is an exchange of intelligence; argument is an exchange of ignorance." - Unknown

[edited by - Doucette on March 17, 2004 2:26:02 PM]

Matthew Doucette, Xona Games

This topic is closed to new replies.

Advertisement