double Precision

Math and Physics Programming

Started by Naise Mayo March 23, 2002 01:19 PM

13 comments, last by Naise Mayo 22 years ago

122

Author

March 23, 2002 01:19 PM

What is wrong here? Multiplying with type double variables (less than 1 and greater than 10^10) is affecting the precision. The deviations in astronomical numbers are very noticeable over time, and this seems to be the problem. How can I fix this? Thanks

double A = 1;
double B = 0.1;
double C;

C = B;              // C = 1.000000000000000e-001
C = 1 * B;          // C = 1.000000014901161e-001
C = 1 * 0.1;        // C = 1.000000000000000e-001
C = A * B;          // C = 1.000000014901161e-001
C = pow( 10, 10 );  // C = 1.000000000000000e+010
C = pow( 10, 11 );  // C = 9.999999795200000e+010
C = pow( 10, 20 );  // C = 1.000000020040877e+020

_stprintf( Text, "C = %.015e", C );

Mathematix

259

March 23, 2002 02:27 PM

Have you tried floats? These were designed to cope with exponential numbers and have a much greater range of values.

Regards,
Mathematix.

siaspete

208

March 23, 2002 02:33 PM

doubles are more precise than floats.

I can't speak for all platforms, but a float is 4 bytes on Win32, and a double is 8 bytes.

Helpful links:
How To Ask Questions The Smart Way | Google can help with your question | Search MSDN for help with standard C or Windows functions

Saurus Games, home of the Independent Game Engine

Beer Hunter

712

March 23, 2002 05:04 PM

#include <stdio.h>
#include <math.h>

int main() {
  double A = 1;
  double B = 0.1;
  double C;

  C = B;           printf("%.015e\n", C);
  C = 1 * B;       printf("%.015e\n", C);
  C = 1 * 0.1;     printf("%.015e\n", C);
  C = A * B;       printf("%.015e\n", C);
  C = pow(10, 10); printf("%.015e\n", C);
  C = pow(10, 11); printf("%.015e\n", C);
  C = pow(10, 20); printf("%.015e\n", C);

  getchar();
}

Outputs these numbers under gcc and borland:

1.000000000000000e-01
1.000000000000000e-01
1.000000000000000e-01
1.000000000000000e-01
1.000000000000000e+10
1.000000000000000e+11
1.000000000000000e+20

...I don't see any problems here.

Dredge-Master

175

March 26, 2002 04:48 PM

-edit- Note: This is off topic, it's a retort.

quote:Original post by siaspete
doubles are more precise than floats.

I can't speak for all platforms, but a float is 4 bytes on Win32, and a double is 8 bytes.

doubles are more precise but the range is smaller than a float.

the reason why a float is called a float is because the decimal point floats around so it makes the range bigger.

besides, on most modern computers (at least with Sparc, Intel and AMD, not sure about PowerPC chips though, but probably the same if not better) floats are faster.

well anyway, so when dealing with really screwy large or small numbers, go with a float. Best way is to make your own format though, so you get a more precise float. You do not want to be using doubles for really extreme maths though. Maybe if it was a 32-byte double, but again, why use a 32-byte double when you can use an 8-byte or even your normal 4-byte float?

Beer - the love catalyst
good ol' homepage

[edited by - Dredge-Master on March 26, 2002 5:50:23 PM]

jenova

122

March 26, 2002 05:01 PM

k, i don't know what kind of smack you guys are on....

from the MSDN for VS.NET.

C Language Reference

Floating-point variables are represented by a mantissa, which contains the value of the number, and an exponent, which contains the order of magnitude of the number.

The following table shows the number of bits allocated to the mantissa and the exponent for each floating-point type. The most significant bit of any float or double is always the sign bit. If it is 1, the number is considered negative; otherwise, it is considered a positive number.

Lengths of Exponents and Mantissas

Type     Exponent length   Mantissa length
float    8 bits            23 bits
double   11 bits           52 bits

Range of Floating-Point Types

Type     Minimum value               Maximum value
float    1.175494351 E-38            3.402823466 E+38
double   2.2250738585072014 E-308    1.7976931348623158 E+308

If precision is less of a concern than storage, consider using type float for floating-point variables. Conversely, if precision is the most important criterion, use type double.

Floating-point variables can be promoted to a type of greater significance (from type float to type double). Promotion often occurs when you perform arithmetic on floating-point variables. This arithmetic is always done in as high a degree of precision as the variable with the highest degree of precision. For example, consider the following type declarations:

therefore "double" IS MORE PRECISE than "float". EOD.

To the vast majority of mankind, nothing is more agreeable than to escape the need for mental exertion... To most people, nothing is more troublesome than the effort of thinking.

TerranFury

142

March 26, 2002 05:20 PM

Double is a floating-point data type.

Doubles are floats with larger mantissas and larger exponents. The only thing that doesn't get more bits is the sign - and if you need more than one for that I don't know what kind of crazy math you're doing.

In other words: There is nothing you can do with a float that you can't do with a double. Doubles are more accurate.

The internal FPU often has higher precision than doubles, but aside from that, doubles are about as good as it gets, unless your compiler supports a long double type (and actually makes it something better than a synonym for double).

[edited by - TerranFury on March 26, 2002 6:24:27 PM]

Beer Hunter

712

March 26, 2002 07:16 PM

quote:Original post by Dredge-Master
doubles are more precise but the range is smaller than a float.

wtf? Doubles dedicate 3 more bits to the range of the number than a float does. Are you thinking of fixed-point numbers or something?

timexfish

122

March 27, 2002 09:54 AM

quote:Original post by Beer Hunter

...I don't see any problems here.

I don't see any problems using MSVC++ 6.0 SP5 either.

johnb

352

March 27, 2002 10:40 AM

quote:Original post by TerranFury
In other words: There is nothing you can do with a float that you can't do with a double.

You can't write fast games: if we replaced all our float code with double code the game would probably drop from 60 to well under 30 fps.

Floats are accurate to better than 1 part in a million, i.e. errors are less than 1 mm per km. I can't think of any gaming application that needs better precision than this. For comparison, science experiments are usually accurate to no more than 1 part in a thousand, and are often a lot less accurate.

John Blackburne, Programmer, The Pitbull Syndicate

This topic is closed to new replies.