# floating point consistency and SSE

So according to what I've read, the x87 FPU has 80 bit registers which means that floating point math returns different results depending on the whims of the compiler and seemingly unrelated factors.

I've actually noticed visible differences in the behavior of my supposedly deterministic program between debug and release builds and I assume this is the root cause. Currently, I'm compiling the physics (Box2d) without optimization and the rest of the program with regular optimization, which has made the problems disappear. However, this seems like a haphazard measure and I am trying to look for better approaches.

I've read that even if all variables are stored to memory, there are still differences due to subtle rounding issues. Is it possible to get around this by using SSE2 for all floating point operations? What sort of performance penalties does this incur? Also, what about transcendental functions?

bump

Even then, you don't get around those "rounding errors" because some of them actually aren't rounding errors (though some are).

What floating point numbers do is let you represent values across a much, much greater range, from very small to very large, but with limited precision. Put differently, floats can represent something close to (almost) every number, but not necessarily as exactly as you want.

Try this, for example:
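```c
#include <stdio.h>

int main(void)
{
    int i = 16800000;
    float f = 16800000.0f;

    ++i;  /* exact: an int can hold 16800001 */
    ++f;  /* 16800001 is not representable as a float, so the result rounds back */

    printf("%d\n%f\n", i, f);
    return 0;
}
```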

This will, on a "typical" machine, give you:
```
16800001
16800000.000000
```

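This is not accumulated rounding error; it is simply because you can't represent 16800001 as a float. Above 2^24 = 16777216, adjacent floats are 2 apart, so every `++f` lands back on 16800000. You can add 1 a million times, and it won't do a thing. Now does that mean floats are useless? Certainly not.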

It depends what you need. If your world is 16 kilometers across and you need a 1 mm resolution for your physics, then, well... there you have your answer. Your physics will work just fine near the origin and for the most part, but will mysteriously fail on the far end (where the numbers are big).

On the other hand, there are many cases where a little drift doesn't matter at all, and then it is bliss not having to care how big or small numbers actually are. Float will make it "magically" work.

I'm talking about consistency, not accuracy.

Google gave me the following thread: http://www.gamedev.net/community/forums/topic.asp?topic_id=499435. Read hplus0603's posts.

Quote:
 Original post by Storyyeller
 So according to what I've read, the x87 FPU has 80 bit registers which means that floating point math returns different results depending on the whims of the compiler and seemingly unrelated factors.

The 80-bit extended format of the x87 (assuming that is what your compiler targets) affects your calculations such that, given two builds or machines (one keeping intermediates in 80-bit registers and one not), the same operation can produce different results. This is because an 80-bit register carries a 64-bit significand, so it keeps low-order bits that would be rounded away in a 32-bit or 64-bit variable. It also depends on how your compiler flags are set. If you set the strict/precise floating-point flags on your compiler, it will typically generate code much like:

```
fld  dword ptr [value1]
fadd dword ptr [value2]
fstp dword ptr [value1]  // note this: the store rounds to 32 bits
fld  dword ptr [value1]  // reload the rounded value
fadd dword ptr [value3]
```

Where the store in the middle rounds the intermediate result down to the size of the storage (32 bits here). Whereas when you set the fast flag on the compiler for floating-point code, the output will tend to look more like:

```
fld  dword ptr [value1]
fadd dword ptr [value2]
fadd dword ptr [value3]
```

Which doesn't round the intermediate, and therefore all of the calculations are done in the x87's extended precision (up to 80 bits, depending on the precision-control setting).
Quote:
 I've actually noticed visible differences in the behavior of my supposedly deterministic program between debug and release builds and I assume this is the root cause. Currently, I'm compiling the physics (Box2d) without optimization and the rest of the program with regular optimization, which has made the problems disappear. However, this seems like a haphazard measure and I am trying to look for better approaches.

Floats are not very accurate, and their actual accuracy varies with the magnitude of the value. In general, a 32-bit float (24-bit significand) holds about 6 to 7 significant decimal digits. Doubles are larger, 64 bits according to IEEE 754 (53-bit significand), and hold about 15 to 16 significant decimal digits.
For example, 6192069.999999991 and 6192069.999999999... agree to 15 digits. So don't be surprised when you get 6192070.

Quote:
 I've read that even if all variables are stored to memory, there are still differences due to subtle rounding issues. Is it possible to get around this by using SSE2 for all floating point operations? What sort of performance penalties does this incur? Also, what about transcendental functions?

Switching to SSE/SSE2 will improve consistency (not accuracy!) between calculations on different machines and between builds, because SSE arithmetic is carried out at the precision of the operand type (32 bits for float, 64 bits for double) with no wider intermediate. It will also tend to be FASTER than x87 code. But: if you are looking for exactness and guaranteed determinism in your calculations, then you're using the WRONG type. Use either a fixed-precision (fixed-point) type or a bignum.

Many times in game programming (as opposed to general-purpose programming), when you compare against 0 you actually mean "within some epsilon of 0".
