Quote:Original post by JohnBSmall
Quote:Original post by frob
That's often done for speed reasons, as float operations take less time than doubles.
Really? I was under the impression that there wasn't a noticeable difference between the two, speed-wise. In fact, I was under the impression that internally, all floating point operations are performed on 80-bit floats anyway (although that would obviously be processor dependent - maybe we're both right, but for different processors)
Using 32-bit floats instead of 64-bit floats would (or might, depending on how various structures get aligned) reduce the size of things in memory though, which could have an effect due to cache issues.
If I remember correctly (and I might not - you can check it easily enough by looking at the code though), the Lua value union's largest item is the double that it contains (64-bit), with everything else being the size of a pointer (32-bit... for a 32-bit system, at least), so using 32-bit floats instead of 64-bit doubles could reduce the size of Lua's value structure, which means less data to pass around the place.
Anyway, does anyone have any links to information about speed differences between 32-bit and 64-bit floats?
John B
Yes, all floating point operations take place in the same 80-bit 'extended double' registers (effectively 79 bits of information, since the explicit integer bit is redundant for normalized values). There are speed differences because double arithmetic is carried out for several more steps. It is not noticeable for casual FPU use, but in a 3D game you WILL see a performance difference.
To prove that to yourself, create some D3D devices and do some math-intensive work on them (such as software T&L). Set the D3DCREATE_FPU_PRESERVE flag on creation, set the FPU to use doubles, and watch your performance plummet.
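If you don't want to set up a D3D device, a rougher check is to time the same loop in float and in double. The following is only a minimal sketch in C: the function names are made up for illustration, it does not touch the FPU precision mode at all, and the numbers you get depend entirely on your compiler, its flags and your CPU.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: sums 1/i in single precision.
   The volatile accumulator stops the compiler from deleting the loop. */
float work_float(size_t n)
{
    volatile float acc = 0.0f;
    for (size_t i = 1; i <= n; ++i)
        acc = acc + 1.0f / (float)i;
    return acc;
}

/* Same loop in double precision. */
double work_double(size_t n)
{
    volatile double acc = 0.0;
    for (size_t i = 1; i <= n; ++i)
        acc = acc + 1.0 / (double)i;
    return acc;
}

/* CPU seconds spent in the float loop, as reported by clock(). */
double time_float(size_t n)
{
    clock_t start = clock();
    (void)work_float(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* CPU seconds spent in the double loop, as reported by clock(). */
double time_double(size_t n)
{
    clock_t start = clock();
    (void)work_double(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_float and time_double with the same large n (tens of millions, say) several times and compare. Treat it as a starting point, not a proper benchmark: clock() is coarse, and a divide-heavy loop will behave differently from a loop that is mostly adds and multiplies.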
Also, changing the mode of the FPU from double to float, setting other flags, or using FPU exceptions are all fairly long operations. Doing any of them frequently WILL show up in profiling.
Most libraries offer the ability to use double or float, or will constantly reset the FPU state to their preferred state.
Libraries like ODE have double and float versions available. They do it so that you (or the library) don't have to keep changing mode. The Direct3D flag D3DCREATE_FPU_PRESERVE tells Direct3D to assume that you have preserved the FPU state ... with a warning that moving to double-precision mode will degrade performance and changing other FPU states will give undefined behavior. Without the flag, it will reset the FPU states every time it does work, causing a measurable performance hit.
Passing floats or doubles around may or may not be an issue, depending on how you do it. If you pass just one by value, it will be passed using the FP register stack, so there is no performance issue. If you pass a pointer to them, the CPU can load them quickly if they are in the data cache; otherwise the cache miss time will be roughly the same for either type. Yes, there will be some cache issues involved, but those are things you need to measure and figure out specifically for your own system.
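On the size question, you can check on your own system whether switching to float actually shrinks a value union. This is a minimal sketch in C using hypothetical stand-in types; they are NOT Lua's real definitions, just a pointer, an int and a number sharing a union, plus a type tag.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a Lua-style value; not Lua's actual types. */
typedef union { void *p; int b; double n; } ValueDouble;
typedef union { void *p; int b; float  n; } ValueFloat;

/* The same unions with a type tag attached, as a tagged value would have. */
typedef struct { ValueDouble v; int tt; } TaggedDouble;
typedef struct { ValueFloat  v; int tt; } TaggedFloat;

/* Small accessors so the sizes can be printed or logged from anywhere. */
size_t value_double_size(void)  { return sizeof(ValueDouble); }
size_t value_float_size(void)   { return sizeof(ValueFloat); }
size_t tagged_double_size(void) { return sizeof(TaggedDouble); }
size_t tagged_float_size(void)  { return sizeof(TaggedFloat); }
```

Print the four sizes with your real compiler and flags. On a typical 32-bit build the float version comes out smaller, but on a 64-bit build the pointer is usually 8 bytes already, so the union does not shrink and the saving disappears. Padding and alignment rules differ between compilers, so check rather than assume.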
Finally, the Intel developer centers have more information than any individual could ever use about the exact performance of all the stages of the pipeline.
frob.