Back to General and Gameplay Programming

Fast sqrt()?

General and Gameplay Programming Programming

Started by d000hg January 31, 2003 06:13 AM

18 comments, last by d000hg 21 years, 1 month ago

d000hg

1,208

Author

January 31, 2003 06:13 AM

I know this must have come up often but I''m in a rush! Can I do in software square roots to ~2% accuracy faster than calling sqrt() in MSVC++ thankyou! Read about my game, project #1 NEW (18th December)2 new screenshots, one from the engine and one from the level editor John 3:16

civguy

308

January 31, 2003 06:16 AM

Depends on your hardware. Maybe if you do a look-up-table that contains approximate sqrts and then use one newton's iteration step to make it more accurate. Should work fine if the range isn't too large.

edit: linearly interpolating in the LUT would be pretty good too

But of course, using a LUT requires you to convert a float/double into int. I don't really know how slow that operation is in general.

[edited by - civguy on January 31, 2003 7:21:34 AM]

JonnyQuest

331

January 31, 2003 06:22 AM

What range of values are you looking at?
If it''s small, you might be able to do a taylor approximation. Else use a lookup table and interpolate between it, using either linear or cubic spline interpolation.

- JQ
#define NULL 1

~phil

Julio

116

January 31, 2003 06:38 AM

check out the downloads section on my site. there''s an assembly optimized square root. it''s pretty quick.

My Homepage

My HomepageSome shoot to kill, others shoot to mame. I say clear the chamber and let the lord decide. - Reno 911

Jan Wassenberg

1,000

January 31, 2003 08:19 AM

NVidia''s fastmath sqrt is a LUT based approach; I see no asm here

Also, blowing all your L2 cache on a *drumroll* sqrt table is not conducive to performance, if you''re doing anything other than sqrt() (=> cache miss every time => pretty much anything is faster)

copy+paste from a previous post:

quote:
If you can''t avoid the sqrt, calculate it as x * rsqrt(x):
- on Athlons: pfrsqrt + pfmul (7 clock latency)
- otherwise, use a clever bit of code credited to Carmack:

  float InvSqrt (float x){	float xhalf = 0.5f*x;	int i = *(int*)&x	i = 0x5f3759df - (i >> 1);	x = *(float*)&i	x = x*(1.5f - xhalf*x*x);	return x;}

What this does in a nutshell is approximate via one iteration of Newton''s method (the line with the magic number computes an initial guess, important for convergence).

HTH
Jan

E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3

d000hg

1,208

Author

February 07, 2003 06:11 AM

That looks weird - I should know Newton-Raphson but it''s a while since I did it last - what''s the formula?

One situation I need to get a square root is solely for numbers in the range 0-1. This must be easy! I could use a table with 100 entries and do sqrt(x)=table[(int)(x*100)] but I have a feeling that casting is a slow operation? Would this be faster or slower, more or less accurate than that method above?

How accurate generally is the method above?

Read about my game, project #1
NEW (18th December)2 new screenshots, one from the engine and one from the level editor

John 3:16

civguy

308

February 07, 2003 06:34 AM

quote:Original post by d000hg
How accurate generally is the method above?

Uh, perhaps you should try it out yourself instead of wait for breast feeding.

Also, caches aren''t so small that a tiny LUT would kill it. If the LUT is accessed often enough, it''ll stay in cache. If it''s not, then it''s not performance critical.

Jan Wassenberg

1,000

February 08, 2003 08:39 AM

Solve F(y) = 1/y^2 - x, which is obviously 0 <==> |y| = x^(-1/2); Newton''s method will get you the positive root.

> How accurate generally is the method above?
Depends on your initial guess (I think the Taylor expansion for approximating the mantissa, i.e. the magic number step could be better). If it ain''t good enough, just add iterations.

> One situation I need to get a square root is solely for numbers in the range 0-1.
You could approximate by piecewise linear functions (3 ought to cut it).

> I could use a table with 100 entries and do sqrt(x)=table[(int)(x*100)]
That''ll work, but get rid of the cast (ends up as a call to _ftol(), which really sucks).
a) enable -QIfist compiler option (your code will break if compiler != VC)
b) write your own round() (either with fist or by extracting the mantissa from a double)

>> Would this be faster or slower, more or less accurate than that method above?
> Uh, perhaps you should try it out yourself instead of wait for breast feeding.
Agreed

quote:Also, caches aren''t so small that a tiny LUT would kill it. If the LUT is accessed often enough, it''ll stay in cache. If it''s not, then it''s not performance critical.

Of course. My comment referred to the NVidia fastmath table (256 KB), which is not so tiny

E8 17 00 42 CE DC D2 DC E4 EA C4 40 CA DA C2 D8 CC 40 CA D0 E8 40E0 CA CA 96 5B B0 16 50 D7 D4 02 B2 02 86 E2 CD 21 58 48 79 F2 C3

MDI

266

February 08, 2003 11:21 AM

quote:

That looks weird - I should know Newton-Raphson but it''s a while since I did it last - what''s the formula?

Xnew = X - ((f''(x) / f(x))

Why do old statues have small penises?

d000hg

1,208

Author

February 10, 2003 07:35 AM

What is it with people not wanting to give specific information? Thanks for the help but if someone knows how good a result is from experience there''s no need for me to test it is there - if you don''t know without testing yourself then just say so! I''m working to finish a major project in a limited period and don''t have the time to fiddle around unless I need to!

Fast sqrt()?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Fast sqrt()?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines