#### Archived

This topic is now archived and is closed to further replies.

# doubles and floats

Are doubles more precise than floats? If so, how much more precise? -CProgrammer

##### Share on other sites
I believe float offers at least 6 significant decimal digits, whereas double is required to provide at least 10, although 15 significant digits is what most implementations actually give you.

##### Share on other sites
Doubles are a lot more accurate than floats. And don't think that because they are twice the size they are only twice as accurate. A long is generally 4 times the size of a char, but it can hold 16777216 (2^24) times as many possible values.

##### Share on other sites
From a game programmer's point of view, most games deal mostly with floats, because floats are supported natively by the graphics hardware.

The actual sizes of float, double and long double vary between compilers, so they are not fixed.

As for the precision of double and float, the C++ standard states the following (not the exact wording; I've simplified it, but if you're interested, look up section 3.9.1, paragraph 8 of the standard):
float - Type float is the smallest floating type.

double - Type double is a floating type that is larger than or equal to type float, but smaller than or equal to type long double.

long double - Type long double is a floating type that is larger than or equal to type double.

Here is some more information about data types and Visual C++. The C++ standard information is above while the actual sizes used in Visual C++ are below.

[edited by - deepdene on January 6, 2004 8:25:26 AM]

##### Share on other sites
quote:
Original post by deepdene
The actual sizes of float, double and long double vary between compilers, so they are not fixed.
I don't think that is the case. There is an IEEE standard (IEEE 754), which I think lays out the exact bit structure.

##### Share on other sites
quote:
Original post by CWizard
quote:
Original post by deepdene
The actual sizes of float, double and long double vary between compilers, so they are not fixed.
I don't think that is the case. There is an IEEE standard (IEEE 754), which I think lays out the exact bit structure.

But the C++ Standard doesn't force you to use the IEEE Standard in your implementation, does it now? (It doesn't.)

##### Share on other sites
As I was implying above, 6 significant digits is the minimum for a float and 10 significant digits is the minimum for a double. It does vary across implementations. In the case of a float it would be (where x is the number of digits):

6 <= x < 10

for double:

10 <= x
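
If you want to see what x actually is on your own compiler, something like the following should do it. This is only a sketch: `guaranteed_digits` and `describe_precision` are names made up for this post, and the only library feature it relies on is `std::numeric_limits<T>::digits10`. As far as I know the standard only pins down the lower bounds, so treat any upper bound as a property of your implementation.

```cpp
#include <limits>
#include <string>

// Number of decimal digits that T can hold without change
// (text -> T -> text round trip), as reported by the implementation.
template <typename T>
constexpr int guaranteed_digits() {
    return std::numeric_limits<T>::digits10;
}

// Hypothetical helper: builds a one-line summary for float and double.
inline std::string describe_precision() {
    return "float: " + std::to_string(guaranteed_digits<float>()) +
           " digits, double: " + std::to_string(guaranteed_digits<double>()) +
           " digits";
}
```

Printing `describe_precision()` from `main` shows the values for your platform; on an IEEE 754 implementation I would expect 6 and 15, but check rather than assume.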

##### Share on other sites
quote:
Original post by CWizard
quote:
Original post by deepdene
The actual sizes of float, double and long double vary between compilers, so they are not fixed.
I don't think that is the case. There is an IEEE standard (IEEE 754), which I think lays out the exact bit structure.
Only so far as saying sizeof(short)<=sizeof(int)<=sizeof(long)

##### Share on other sites
quote:
CWizard posted the following:
I don't think that is the case. There is an IEEE standard (IEEE 754), which I think lays out the exact bit structure.

Yeah, as someone mentioned earlier, C++ doesn't necessarily follow the IEEE standard.

This is the EXACT text from the C++ standard -- can't get any more official than this:
3.9.1.8 - There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined. Integral and floating types are collectively called arithmetic types. Specializations of the standard template numeric_limits (_lib.support.limits_) shall specify the maximum and minimum values of each arithmetic type for an implementation.
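
Since the representation is implementation-defined, the practical move is to ask the compiler. Here is a rough sketch of how that could look; `layout_summary` is a hypothetical helper name, while `std::numeric_limits<T>::digits` and `is_iec559` are the standard members that report the mantissa width and whether the type claims IEEE 754 (IEC 559) conformance.

```cpp
#include <limits>
#include <string>

// Hypothetical helper: summarises how this implementation represents T.
// 'name' is just a label for the output, e.g. "float".
template <typename T>
std::string layout_summary(const char* name) {
    using lim = std::numeric_limits<T>;
    return std::string(name) + ": " +
           std::to_string(sizeof(T)) + " bytes, " +
           std::to_string(lim::digits) + " mantissa bits, IEEE 754: " +
           (lim::is_iec559 ? "yes" : "no");
}
```

Calling `layout_summary<float>("float")`, `layout_summary<double>("double")` and `layout_summary<long double>("long double")` and printing the results gives a quick picture of one particular compiler; the numbers are not guaranteed to carry over to another.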

[edited by - deepdene on January 6, 2004 11:41:14 AM]

Hey thanks guys.

##### Share on other sites
quote:
Original post by deepdene
This is the EXACT text from the C++ standard -- can't get any more official than this:
3.9.1.8 - There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double.

I was not aware that long double was actually standard. Learn something new every month.

##### Share on other sites
Here's a recent thread that discusses floating point numbers and some issues involved:

http://www.gamedev.net/community/forums/topic.asp?topic_id=198236

cheers
sam

##### Share on other sites
I had a problem the other day...

I need to restrict a float to 2 decimal places of precision. I didn't know how to do it (it is not possible with masking), so I multiplied by 1000 and hoped for the best (so far).

Is there a standard method to solve this problem?

BTW...

[42702.658].[DoInit]........................sizeof(float) = 4
[42702.658].[DoInit]........................sizeof(double) = 8
[42702.658].[DoInit]........................sizeof(int) = 4
[42702.658].[DoInit]........................sizeof(long) = 4
[42702.658].[DoInit]........................sizeof(DWORD) = 4

... doubles use 8 bytes.

R

EDIT: pardon me... for a VC++6 Win32 application.

[edited by - reaction on January 7, 2004 9:19:25 AM]
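
On the two-decimal-places question above, one common approach (not the only one) is to scale by 100, round, and either divide back or keep the scaled integer. The sketch below assumes a compiler with C++11's `std::round` and `std::lround`; `round_to_hundredths` and `to_hundredths` are hypothetical names. Note that most two-decimal values (3.14, for instance) cannot be stored exactly in a binary float, so the first function returns the nearest representable float, not an exact decimal.

```cpp
#include <cmath>

// Rounds to 2 decimal places. The result is the nearest float to the
// rounded value, which is usually not exactly representable in binary.
inline float round_to_hundredths(float value) {
    return std::round(value * 100.0f) / 100.0f;
}

// Alternative: keep the value as a whole number of hundredths.
// Integers are exact, so no further rounding error creeps in.
inline long to_hundredths(double value) {
    return std::lround(value * 100.0);
}
```

If the two decimals really matter (money, fixed-step coordinates), storing the integer from `to_hundredths` and only converting back for display tends to be the safer choice.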
