Fast sqrt

Started by Zakwayda October 27, 2004 02:39 AM

12 comments, last by grhodes_at_work 19 years, 6 months ago

2,389

Author

October 27, 2004 02:39 AM

I've been looking through the various threads on this forum about fast sqrt and inverse sqrt approximations, and have also been looking at the relevant code in the Doom 3 SDK, and I have a couple of questions.

1. I understand enough about the methods to know that they rely on the specifics of the 32-bit IEEE floating-point representation. So I assume that means the code breaks if the representation changes? What happens on 64-bit machines like the Mac G5?

2. Is it worth it? Or would you generally be better off (and close to as fast) using the standard functions, which I assume are guaranteed to be portable and consistent from platform to platform?

(Please excuse me if the question seems naive, but it involves some areas I don't know much about, e.g. floating-point architecture.)

oliii

2,202

October 27, 2004 02:52 AM

There should be no need for it nowadays, and note that it's a fast inverse sqrt. Use the normal stuff and see if it becomes a bottleneck in your application. Nothing prevents you from having a float finvsqrt(float x) { ... } function in your math lib; the implementation is then left to you and the platform it's running on (so SSE / sqrt+div / Carmack for PC, and whatever for Macs).
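The wrapper idea above could be sketched roughly as follows. This is only an illustration: the name finvsqrt comes from the post, but the `__SSE__` check, the choice of `_mm_rsqrt_ss` plus one refinement step, and the positive-input precondition are assumptions made for this example, not something the thread specifies.

```c
#include <assert.h>
#include <math.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* _mm_rsqrt_ss and friends */
#endif

/* Sketch of a math-lib wrapper: callers use finvsqrt() and never see
   which implementation was picked for the platform. */
static inline float finvsqrt(float x)
{
    assert(x > 0.0f);   /* this sketch only handles positive inputs */
#if defined(__SSE__)
    /* Hardware approximate reciprocal square root (roughly 12 bits),
       refined with one Newton-Raphson step. */
    float y = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
    return y * (1.5f - 0.5f * x * y * y);
#else
    /* Portable fallback: plain sqrt + divide from the standard library. */
    return 1.0f / sqrtf(x);
#endif
}
```

Because the choice is hidden behind one function, swapping in a different implementation later (or after profiling) does not touch any calling code.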

I don't think you should worry about it too much :)

Everything is better with Metal.

Nice Coder

366

October 27, 2004 04:32 AM

Look

float InvSqrt(float x)
{
    float xhalf = 0.5f * x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i >> 1);
    x = *(float*)&i;
    x = x * (1.5f - xhalf * x * x);
    return x;
}

And I got it from here.

Yes, it breaks when the representation changes. Use the #defines!
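One way to make that dependence explicit is to have the compiler reject platforms where the assumptions fail. The following is a minimal sketch, not the code from the post: the function name inv_sqrt_guarded is hypothetical, the checks use C11 static_assert rather than #defines, and memcpy replaces the pointer casts. Note that on mainstream 64-bit platforms float is still a 32-bit IEEE 754 type, so pointer width alone does not break the trick.

```c
#include <assert.h>   /* static_assert (C11) */
#include <float.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Refuse to compile where the trick's assumptions do not hold. */
static_assert(sizeof(float) == 4 && CHAR_BIT == 8,
              "float must be 32 bits wide");
static_assert(FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128,
              "float must look like IEEE 754 binary32");

/* Hypothetical guarded variant of the posted routine. */
float inv_sqrt_guarded(float x)
{
    uint32_t bits;
    float half_x = 0.5f * x;

    memcpy(&bits, &x, sizeof bits);      /* read the bit pattern safely */
    bits = 0x5f3759dfu - (bits >> 1);    /* magic-constant initial guess */
    memcpy(&x, &bits, sizeof x);         /* reinterpret as float again */

    return x * (1.5f - half_x * x * x);  /* one Newton-Raphson step */
}
```

Using a fixed-width unsigned type also removes the original's reliance on int being exactly 32 bits.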

You usually never need it. But if you need a lot of sqrts, use the above code.
From,
Nice coder

shadow12345

100

October 27, 2004 10:03 AM

It typically takes LOTS of sqrt calls to mean anything in terms of slowdown.

I suggest that anyone who has a problem with using sqrt put a loop in their code somewhere that keeps calling sqrt. You can typically call it thousands of times before you even see an FPS drop.

Why don't alcoholics make good calculus teachers? Because they don't know their limits! Oh come on, Newton wasn't THAT smart...

Anonymous

October 27, 2004 01:35 PM

I messed around with the functions a little and did some completely unscientific tests (if I remember correctly, in earlier threads on the subject people did some very rigorous comparisons).

I compared three versions: 1 / sqrtf(), the code from the above post, and the code from Doom 3. Doom 3 uses the same principle, but calculates the seed on the fly using some constants and a lookup table.

Amazingly enough (and unless I messed up somewhere) DoomInvSqrt() returned exactly the same results as 1 / sqrtf(). So no accuracy problems there. And Q3InvSqrt() was plenty close.

I just did a brute-force test - 10,000,000 calls to each function. The Q3 and Doom versions were about 1.8 times as fast as 1 / sqrtf().

I would use the Doom version, but I imagine the code is copyrighted, and I don't understand it well enough to recreate it for myself. But I suppose the other code is fair game.

Zakwayda

2,389

Author

October 27, 2004 01:36 PM

Uh...AP = jyk.

Eelco

301

October 27, 2004 01:48 PM

Can you copyright just a few lines of code? It's probably just an implementation of something discovered 200 years ago, if not longer ago.

squicklid

132

October 27, 2004 02:05 PM

Is it Newton's Method you want to copyright?

Zakwayda

2,389

Author

October 27, 2004 02:21 PM

Yes, I believe both versions (Q3 and Doom) use Newton's method. However, the Doom version uses some intricate and specific bit manipulation using lookup tables to find the seed. That's what I'm suspecting may be copyrighted.
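For reference, the Newton step in the Q3 style code posted earlier can be derived as follows. This is a short worked derivation, with y_n denoting the current guess for 1/sqrt(x); the symbols f and y_n are introduced here for illustration and do not appear in the thread.

```latex
% We want y = 1/\sqrt{x}, i.e. a root of f(y) = y^{-2} - x.
\begin{align*}
f(y)    &= \frac{1}{y^{2}} - x, \qquad f'(y) = -\frac{2}{y^{3}} \\
y_{n+1} &= y_n - \frac{f(y_n)}{f'(y_n)}
         = y_n + \frac{y_n^{3}}{2}\left(\frac{1}{y_n^{2}} - x\right) \\
        &= y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^{2}\right)
\end{align*}
```

The last line matches the statement x = x*(1.5f - xhalf*x*x) in the posted code, where xhalf holds 0.5 times the input and x holds the current guess. The two versions differ only in how the initial guess (the seed) is produced.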

Extrarius

1,412

October 27, 2004 02:45 PM

I did a benchmark on various float-type sqrt routines, and the fastest one was the 'sqrtf' in MSVC's standard library (equal to inline assembly). The post is here somewhere on gamedev, but I have no idea where.

The only way you'll get a speedup is by using the inverse square root formula to actually calculate the inverse square root, and iirc even then it was barely a win.

Well, you could also get a win using one of the functions that returns much less precision than sqrtf, but in that case you're not really comparing like things.

"Walk not the trodden path, for it has borne its burden." -John, Flying Monk

This topic is closed to new replies.
