# Lua precision problems?


I've been running some tests recently involving the precision of floating point numbers and have encountered some startling results. Lua is supposed to have, by default, 64-bit internal floating point precision. However, the numbers seem to act as 32-bit when arithmetic is performed on them.

Exhibit A: The following C++ code was executed:

    double b1 = 0.12345678901234;
    for (int i = 0; i <= 15; ++i)
    {
        printf("%.20f\n", b1);  // printf comes from <cstdio>
        b1 = b1 * 10;
    }


Which, appropriately enough, produces this output:

    0.12345678901234000000
    1.23456789012340010000
    12.34567890123400100000
    123.45678901234001000000
    1234.56789012340000000000
    12345.67890123400100000000
    123456.78901234001000000000
    1234567.89012340010000000000
    12345678.90123400100000000000
    123456789.01234001000000000000
    1234567890.12340020000000000000
    12345678901.23400100000000000000
    123456789012.34001000000000000000
    1234567890123.40010000000000000000
    12345678901234.00200000000000000000
    123456789012340.02000000000000000000

Exhibit B: Now, I mirror this code in Lua:
    local b1 = 0.12345678901234
    local i = 0
    while (i <= 15) do
        print(string.format('%.20f', b1))
        b1 = b1 * 10.0
        i = i + 1
    end


Which produces this output:

    0.12345678901234000000
    1.23456788063049320000
    12.34567832946777300000
    123.45678710937500000000
    1234.56787109375000000000
    12345.67871093750000000000
    123456.78906250000000000000
    1234567.87500000000000000000
    12345679.00000000000000000000
    123456792.00000000000000000000
    1234567936.00000000000000000000
    12345679872.00000000000000000000
    123456798720.00000000000000000000
    1234567954432.00000000000000000000
    12345680068608.00000000000000000000
    123456804880384.00000000000000000000

The only way to produce this sort of output is to use a float instead of a double in the above C++ code. This leads me to believe something somewhere is being converted to a float, yet I know not where. I need large numbers to remain stable, which is why the internal double precision fits my purpose so well. Argh. Am I just missing something?

##### Share on other sites
From the Lua Manual:
Quote:
 Number represents real (double-precision floating-point) numbers. (It is easy to build Lua interpreters that use other internal representations for numbers, such as single-precision float or long integers.)

If I remember correctly, there is a line in a Lua header file with

    typedef double number;

or something like that. If you downloaded your version of Lua as precompiled binaries, it's possible that this was changed to float. If this is indeed the problem, you could fix it by downloading the Lua source and building it for your machine.

##### Share on other sites

As you pointed out, the Lua version is clearly behaving like a float and not a double, since you only get about six decimal digits of precision with it.

It could be that your version of Lua was compiled to use float instead of double for the Number basic type.

That's often done for speed reasons, as float operations take less time than doubles.

See how LUA_NUMBER is defined.

frob.

##### Share on other sites
Quote:
Original post by frob
That's often done for speed reasons, as float operations take less time than doubles.

Really? I was under the impression that there wasn't a noticeable difference between the two, speed-wise. In fact, I was under the impression that internally, all floating point operations are performed on 80-bit floats anyway (although that would obviously be processor dependent - maybe we're both right, but for different processors)
Using 32-bit floats instead of 64-bit floats would (or might, depending on how various structures get aligned) reduce the size of things in memory though, which could have an effect due to cache issues.
If I remember correctly (and I might not - you can check it easily enough by looking at the code though), the Lua value union's largest item is the double that it contains (64-bit), with everything else being the size of a pointer (32-bit... for a 32-bit system, at least), so using 32-bit floats instead of 64-bit doubles could reduce the size of Lua's value structure, which means less data to pass around the place.

Anyway, does anyone have any links to information about speed differences between 32-bit and 64-bit floats?

John B

##### Share on other sites
The thing is, however, that if you check the first line of the Lua code output, the original value is stored as a true double (whereas a float would appear to store 0.12345678901234 as 0.12345679104328156). Once one arithmetic operation occurs on the value, even though the other operand is a double as well, it does a single precision math operation.

The first thing I thought to check would be how a floating point number is stored, and indeed, I built LuaPlus (build 1084) as double precision. I even made sure that lua_Number was a double, and sizeof(lua_Number) does indeed return 8.

Can anybody else verify this issue with Lua?

Regarding the speed of float vs double, internal computation speed is unaffected, since the FPU works at even higher precision internally and dropping bits costs nothing; the only thing you'll really notice being affected is memory bandwidth. Doubles are a tad slower than floats, but nothing that is going to make you want to change the entire structure of your code for performance reasons.

##### Share on other sites
Perhaps someone changed the x87's floating-point control word to 32-bit precision. Direct3D likes to do this by default for instance.

##### Share on other sites
Thanks a whole heap load. Direct3D was switching the FPU to single precision because I wasn't setting the D3DCREATE_FPU_PRESERVE flag. The side effects listed for it were unspecific but made to sound nasty, so I had stayed clear of it to avoid causing unnecessary problems. I'll have to see what kinds of issues come up, but I don't anticipate anything terrible.

I thank you again.

[Edited by - sordid on November 9, 2005 8:30:01 PM]

##### Share on other sites
Quote:
Original post by JohnBSmall
Quote:
Original post by frob
That's often done for speed reasons, as float operations take less time than doubles.

Really? I was under the impression that there wasn't a noticeable difference between the two, speed-wise. In fact, I was under the impression that internally, all floating point operations are performed on 80-bit floats anyway (although that would obviously be processor dependent - maybe we're both right, but for different processors)
Using 32-bit floats instead of 64-bit floats would (or might, depending on how various structures get aligned) reduce the size of things in memory though, which could have an effect due to cache issues.
If I remember correctly (and I might not - you can check it easily enough by looking at the code though), the Lua value union's largest item is the double that it contains (64-bit), with everything else being the size of a pointer (32-bit... for a 32-bit system, at least), so using 32-bit floats instead of 64-bit doubles could reduce the size of Lua's value structure, which means less data to pass around the place.

Anyway, does anyone have any links to information about speed differences between 32-bit and 64-bit floats?

John B

Yes, all floating point operations take place in the same 80-bit (actually 79-bit) register size of 'extended double'. There are speed differences since double arithmetic is carried out several more steps. It is not noticeable for casual FPU use, but in a 3D game you WILL see a performance difference.

To prove that to yourself, create some D3D devices and do some math-intensive work on them (such as software T&L). Set the D3DCREATE_FPU_PRESERVE flag on creation, set the FPU to use doubles, and watch your performance plummet.

However, changing the mode of the FPU from double to float, setting other flags, or using FPU exceptions are all fairly long operations. Doing any of them frequently WILL show up in profiling.

Most libraries offer the ability to use double or float, or will constantly reset the FPU state to their preferred state.

Libraries like ODE have double and float versions available. They do it so that you (or the library) don't have to keep changing mode. The Direct3D constant D3DCREATE_FPU_PRESERVE will assume that you have preserved the FPU state ... with a warning that moving to double-precision mode will degrade performance and changing other FPU states will give undefined behavior. Without the flag, it will reset the FPU states every time it does work, causing a measurable performance hit.

Passing floats/doubles around may or may not be an issue, depending on how you do it. If it is just one value, it will be passed using the FP register stack, so there is no performance issue. If you pass a pointer to them, the CPU can load them quickly if they are in the data cache; otherwise there will still be roughly the same cache miss time. Yes, there will be some cache issues involved, but those are things you need to measure and figure out specifically for your own system.

Finally, the Intel developer centers have more information than any individual could ever use about the exact performance of all the stages of the pipeline.

frob.

##### Share on other sites
Quote:
Original post by frob
Quote:
Original post by JohnBSmall
Quote:
 Original post by frob...

...

Yes, all floating point operations take place in the same 80-bit (actually 79-bit) register size of 'extended double'. There are speed differences since double arithmetic is carried out several more steps. It is not noticeable for casual FPU use, but in a 3D game you WILL see a performance difference.

That's interesting. Thanks for the explanation.

John B

##### Share on other sites
Wouldn't the performance critical parts of Direct3D's software pipeline be implemented in SSE code nowadays (which is unaffected by the fpu precision control)?
