'native' 32bit and 64bit floating point precision...and round off error

This topic is 3816 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Under C/C++ on a 64-bit architecture, 'float' precision is reportedly the same as 'double' precision on a 32-bit architecture. However, I've encountered what appears to be round-off error in calculations with 'float' on 64-bit architectures that does not occur with 'double' on 32-bit architectures. These round-off errors also disappear when 'double' is used on 64-bit. I'm obviously missing something here. Any help?

Checked it, you're right SpoonB...as:

float: 32bit: 1.2e-38 to 3.4e38 : 64bit: 1.2e-38 to 3.4e38
double: 32bit: 2.2e-308 to 1.8e308 : 64bit: 2.2e-308 to 1.8e308

So will the decimal precision also be the same? I've noted that double is good to approximately 16 decimal digits, whereas float is good to about 7 decimal digits. Correct?
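One way to check this on a given machine is to ask the compiler directly. The snippet below is only a sketch: `describe` and `describe_all` are made-up helper names, and it assumes a C++11 compiler. Note that `digits10` is the number of decimal digits guaranteed to survive a round trip, which is 6 for an IEEE 754 float and 15 for a double, while the commonly quoted "about 7" and "about 16" come from the 24-bit and 53-bit significands (log10(2^24) is roughly 7.2, log10(2^53) roughly 15.95).

```cpp
#include <cstdio>
#include <limits>

// Hypothetical helper (not from the thread): prints what the compiler
// reports for one floating-point type.
template <typename T>
void describe(const char* name) {
    using Lim = std::numeric_limits<T>;
    std::printf("%-6s bytes=%zu digits10=%d max_digits10=%d min=%g max=%g\n",
                name, sizeof(T), Lim::digits10, Lim::max_digits10,
                static_cast<double>(Lim::min()),   // smallest normalized value
                static_cast<double>(Lim::max()));  // largest finite value
}

// Reports both types; intended to be called from main().
void describe_all() {
    describe<float>("float");
    describe<double>("double");
}
```

Calling `describe_all()` from `main()` in both a 32-bit and a 64-bit build should print the same figures if both builds use IEEE 754 formats for these types.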

Floating-point numbers on Intel and AMD CPUs (and many others) follow the IEEE 754 standard, in both 32-bit and 64-bit versions, so there should not be a difference.
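The storage formats being identical does not mean the results of a calculation have to match. A possible explanation, which nothing in this thread confirms, is that 32-bit x86 builds often evaluate intermediates on the x87 FPU in extended precision, while 64-bit builds normally use SSE and round every 'float' operation to single precision. The sketch below (helper names `sum_tenths`, `abs_error` and `eval_method` are made up for illustration) accumulates 0.1 many times in either type so the round-off can be compared, and exposes the standard `FLT_EVAL_METHOD` macro, which reports how the compiler evaluates floating-point expressions.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical helper: adds 0.1 to a running total n times in type T.
template <typename T>
T sum_tenths(int n) {
    assert(n >= 0);
    const T step = static_cast<T>(0.1);  // 0.1 is not exactly representable
    T total = 0;
    for (int i = 0; i < n; ++i) {
        total += step;  // each addition may round to T's precision
    }
    return total;
}

// Absolute difference between the accumulated sum and the exact n / 10.
template <typename T>
double abs_error(int n) {
    return std::fabs(static_cast<double>(sum_tenths<T>(n)) - n / 10.0);
}

// 0: operations are evaluated in the type's own precision,
// 1: float is evaluated as double, 2: everything as long double,
// negative: indeterminate.
int eval_method() {
    return FLT_EVAL_METHOD;
}
```

Comparing `abs_error<float>(1000000)` with `abs_error<double>(1000000)`, and printing `eval_method()`, in both builds is one way to see whether the difference comes from how intermediates are evaluated rather than from the types themselves.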
