Deterministic simulation using floats?

26 comments, last by ddyer 15 years, 9 months ago
It seems that whenever the Itanium fetches a float/double, it always converts it to the internal 82-bit format. However, only half of the FPUs actually work with all 82 bits; the other FPUs only use 32 of the 82 bits. I assume the compiler gets to choose which FPU type to use when it's writing the assembly?

It definitely doesn't sound as scary as I was making it out to be, but still, in theory moving a float from the CPU to RAM (and even vice versa) can change its bitwise representation, which could end up changing its value.
Quote:Original post by Hodgman
It definitely doesn't sound as scary as I was making it out to be, but still, in theory moving a float from the CPU to RAM (and even vice versa) can change its bitwise representation, which could end up changing its value.
That's correct. Now, the dominant calling conventions push floating point numbers on the (execution) stack, rather than leaving them in the FPU, and the FPU register file is pretty roomy compared to x86, so this will often happen at easy-to-anticipate times. However, the bottom line (as sc4freak mentioned) is that you can't expect consistency across builds... both because of this, and because for best efficiency you want to allow the compiler to mess with your floating-point algebra. (Thanks to the lack of control flow and the large amount of past research, compilers are really, really good in this area... but changing (a*b)+(a*c) to a*(b+c) can produce different results, even without wacky extra internal precision.)
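The reassociation point is easy to demonstrate with a two-line experiment (Python here for brevity, but the effect belongs to IEEE-754 doubles, not the language):

```python
# Floating-point addition is not associative: the same three
# constants grouped differently round to different doubles.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)          # False on IEEE-754 doubles

# Nor is multiplication distributive over addition, which is why
# a compiler rewriting (a*b)+(a*c) as a*(b+c) can change results.
a, b, c = 0.1, 0.3, 0.4
print((a * b) + (a * c) == a * (b + c))   # False
```

Every individual operation here rounds correctly; it is only the grouping that differs, and that alone is enough to flip the low bits.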

Now, RTS games have the advantage of patches being seen as mandatory. If you have 1.08 installed, you probably can't play with someone with 1.07 installed.... but that guy wouldn't be allowed online anyway, because of the "imploding narwhal" exploit in 1.07. So it works out pretty well.
It's basically impossible to guarantee reproducible results
when using floats to influence any control path. Even though
the floats themselves have deterministic behavior, the overall
program does not. An easy way to see this is if you cache any
results, or sort any arrays, then the next time through the "same"
code you'll take a different path, and if any floating point operations
were skipped or added, you'll upset the deterministic floating point
apple cart. Remember that floating point numbers are not associative
or distributive either.

The same issue arises in alpha-beta, if evaluators use floating point
numbers.
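The sort-induced divergence described above can be reproduced with nothing fancier than a reordered sum (a sketch; the constants are picked to make the rounding visible immediately rather than after millions of operations):

```python
# Summing the same values in a different order gives a different
# total. One ulp at 1e16 is 2.0, so 1e16 + 1.0 rounds back to 1e16.
values = [1.0] * 10 + [1e16]

small_first = 0.0
for v in values:              # the ten 1.0s accumulate before the big term
    small_first += v

big_first = 0.0
for v in reversed(values):    # the big term swallows each 1.0 it meets
    big_first += v

print(small_first - big_first)   # 10.0: same inputs, different order
```

Any cache hit or sort that changes the order of evaluation is doing exactly this, just less visibly.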

---visit my game site http://www.boardspace.net - free online strategy games

Quote:Original post by ddyer
It's basically impossible to guarantee reproducible results
when using floats to influence any control path. Even though
the floats themselves have deterministic behavior, the overall
program does not. An easy way to see this is if you cache any
results, or sort any arrays, then the next time through the "same"
code you'll take a different path, and if any floating point operations
were skipped or added, you'll upset the deterministic floating point
apple cart.

The "basically impossible" does not follow. The solution to such caching problems is to not use such caching systems. Also, can you give an example of what you're talking about WRT alpha-beta pruning? Floating point operations may not be associative, but they are certainly strictly ordered.
By "basically impossible" I mean impossible to verify or guarantee.
Results from simulations are hard enough to trust without adding
the additional qualifier "and assuming my floating point logic is
flawless".

In alpha-beta searching, it's invaluable to be able to re-run
the same search to investigate a suspected bug in some heuristic.
You won't be able to do so if there are floats in your evaluator.
Evaluators are especially likely to have caches or sorts that
induce floating point divergence.

---visit my game site http://www.boardspace.net - free online strategy games

Quote:Original post by ddyer
By "basically impossible" I mean impossible to verify or guarantee.
Results from simulations are hard enough to trust without adding
the additional qualifier "and assuming my floating point logic is
flawless".
Flawless and deterministic are not the same thing.
Quote:In alpha-beta searching, it's invaluable to be able to re-run
the same search to investigate a suspected bug in some heuristic.
You won't be able to do so if there are floats in your evaluator.
Evaluators are especially likely to have caches or sorts that
induce floating point divergence.
Having seen large amounts of scientific-grade hydrodynamics simulation code which requires--and gets--full determinism from quadrillions of floating point operations, I'm disinclined to put much stock in this without specifics. Where, in detail, have you run into problems such as this?
Quote:Original post by implicit
By the way, has anyone successfully managed to use floats in a multi-platform RTS?

Yes (Universe at War). With full bit-for-bit synchronization on the results of all calculations (floating-point or otherwise) across all clients, including interop play (i.e. PC players and 360 players playing together in the same game). Unfortunately I wasn't involved on this part of the project so I don't know all the details, however I know it wasn't easy. Essentially (IIRC), all floating-point operations had to be performed in the exact same order on both CPUs, even when compiled into their different instruction sets. All the control modes had to be set properly as well. Certain instructions had to be added (or avoided) on one platform versus the other, and plenty of hackery went into making it work. That's about all I know [smile]
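The "bit-for-bit" part matters more than it sounds: two values can print identically and still differ in the low bits. A minimal sketch of the kind of desync check lockstep games run (the `state_checksum` helper name is hypothetical, not from any shipped engine):

```python
import struct
import zlib

def state_checksum(floats):
    """Hypothetical desync check: hash the raw 64-bit patterns, so two
    clients agree only if every double matches bit for bit."""
    raw = b"".join(struct.pack("<d", x) for x in floats)
    return zlib.crc32(raw)

# Both of these print as 0.6-ish, but they are different doubles,
# and a bitwise checksum catches exactly that.
client_a = [(0.1 + 0.2) + 0.3]   # 0.6000000000000001
client_b = [0.1 + (0.2 + 0.3)]   # 0.6
print(state_checksum(client_a) == state_checksum(client_b))   # False
```

Comparing printed or epsilon-rounded values would paper over exactly the divergences that eventually snowball into a visible desync.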
This is the advice column. You're free to ignore my advice, which is based
on multiple instances of spending long hours debugging a problem which turned
out to be confounded by unexpected floating point results.

There are innumerable ways to get floating point operations to produce
slightly different results. Anything which alters the exact sequence of
flops is likely to trigger low-bit differences due to rounding, and they
will, over time, be magnified until you notice them: way, way beyond where
the actual problem occurred. The source of such problems can be extremely
hard to find. Another way to lose is that helpful compilers will optimize
floating point expressions in slightly different ways depending on considerations
completely outside your purview, so the same expression will generate different
code depending on the expression's (and the compiler's) exact context.

There is a certain thrill of the chase in working against this kind of problem,
but if your basic algorithms depend on zero tolerance, I would find better
ways to spend my time than this kind of brinksmanship.

---visit my game site http://www.boardspace.net - free online strategy games

