Safe C++ Float To Fixed Point Conversion

For Beginners

Started by Storyyeller January 11, 2011 01:09 AM

9 comments, last by SiCrane 13 years, 3 months ago

Storyyeller

215

Author

January 11, 2011 01:09 AM

I have some double precision floating point numbers which I would like to convert to fixed point in the form (signed int) + (unsigned int) * 2^-32. I would also like to emit a warning or error of some kind when the double cannot be exactly represented in this form.

I am not sure how to do this in a safe and relatively portable manner. The obvious method is to just cast it to an integer directly, but I am worried about all sorts of corner cases such as rounding and precision issues, NaNs, etc.

I trust exceptions about as far as I can throw them.

nfries88

1,154

January 11, 2011 01:38 AM

isnan() is standard C; every C compiler should have it.

as for safely rounding:



double round(double d)
{
  return floor(d + 0.5);
}

that should do you fine; right?

precision issues are a result of math; not rounding or casting to integers. So unfortunately it's really hard to work around.
If you require low, constant precision on large numbers or high, constant precision on small numbers; the best way is to make a wrapper class around a large signed integer. When you want the float value; just cast to a float and divide by pow(10, precision)
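
The wrapper idea in the post above could look roughly like the sketch below. The class name DecimalFixed, its members, and the choice of a 64-bit integer are assumptions made for illustration; none of them come from the thread. Note that this uses a decimal scale (10^Digits), not the binary 2^-32 scale the original question asks about.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical wrapper (illustrative names): stores value * 10^Digits
// in a 64-bit signed integer.
template <int Digits>
class DecimalFixed {
public:
    // Scale factor 10^Digits, computed with integer arithmetic.
    static std::int64_t scale() {
        std::int64_t s = 1;
        for (int i = 0; i < Digits; ++i) s *= 10;
        return s;
    }

    // Rounds to the nearest representable step. The caller is assumed to
    // pass a finite value whose scaled form fits in 64 bits.
    static DecimalFixed fromDouble(double d) {
        return DecimalFixed(std::llround(d * static_cast<double>(scale())));
    }

    // "Cast to a float and divide by pow(10, precision)", as described above.
    double toDouble() const {
        return static_cast<double>(raw_) / static_cast<double>(scale());
    }

    std::int64_t raw() const { return raw_; }

private:
    explicit DecimalFixed(std::int64_t raw) : raw_(raw) {}
    std::int64_t raw_;
};
```

With three decimal digits, DecimalFixed<3>::fromDouble(2.5) stores the raw integer 2500, and toDouble() gives back 2.5.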

Storyyeller

215

Author

January 11, 2011 01:50 AM

isnan() is standard C; every C compiler should have it.

as for safely rounding:
double round(double d) { return floor(d + 0.5); }

that should do you fine; right?

precision issues are a result of math; not rounding or casting to integers. So unfortunately it's really hard to work around.
If you require low, constant precision on large numbers or high, constant precision on small numbers; the best way is to make a wrapper class around a large signed integer. When you want the float value; just cast to a float and divide by pow(10, precision)

I'm not worried about precision issues as a result of math. What I'm worried about is the same float value might round to different answers depending on the internal details of the FPU and the whims of the compiler.

I trust exceptions about as far as I can throw them.

nfries88

1,154

January 11, 2011 01:58 AM

Well, I don't know how you'd detect that. I assume you'd need to make separate cases for each compiler/FPU set, which would be a major pain and require tons of research and ultimately would probably be so inefficient that you'd be better off ignoring it.

Or you could use the method I described.

Storyyeller

215

Author

January 11, 2011 02:28 AM

Would this work?



#include <cstdint>
#include <cmath>
#include <cassert>

typedef int32_t sint;
typedef uint32_t uint;

void DoubleToInts(double arg, sint& whole, uint& frac)
{
	whole = (sint) floor(arg);

	const double fracbias = 4294967296.0;
	const double r = arg - whole;

	assert(r >= 0.0 && r < 1.0);
	frac = (uint) floor(r * fracbias);

	assert(whole + (frac/fracbias) == arg);
}
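
For comparison, here is one possible hardened variant of the function above, offered as a sketch and not as a definitive answer. The names DoubleToFixed and FixedStatus are hypothetical. It assumes the goal is floor semantics (frac is always non-negative), checks for NaN and range before any integer cast, and reports inexactness through a return code instead of an assert. It uses modf so that every floating point step is exact: subtracting floor(arg) from a tiny negative arg (for example -2^-60) can round up to 1.0, which this version avoids.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical result codes; the names are illustrative only.
enum class FixedStatus { Exact, Inexact, OutOfRange, NotANumber };

// Splits arg into whole + frac * 2^-32, rounding toward negative infinity.
// No double is cast to an integer type unless it is known to be in range.
FixedStatus DoubleToFixed(double arg, std::int32_t& whole, std::uint32_t& frac)
{
    if (std::isnan(arg)) return FixedStatus::NotANumber;

    const double fl = std::floor(arg);          // exact; infinities stay infinite
    if (fl < -2147483648.0 || fl > 2147483647.0)
        return FixedStatus::OutOfRange;
    whole = static_cast<std::int32_t>(fl);      // in range, so well-defined

    double ip;                                  // integer part, unused here
    const double fp = std::modf(arg, &ip);      // exact; fp has arg's sign, |fp| < 1
    const double s = std::fabs(fp) * 4294967296.0;  // power-of-two scale, exact
    const double sFloor = std::floor(s);
    const bool exact = (s == sFloor);           // no bits below 2^-32

    if (fp >= 0.0) {
        frac = static_cast<std::uint32_t>(sFloor);  // sFloor is in [0, 2^32)
    } else {
        // arg = whole + (1 - |fp|), so frac = 2^32 - ceil(s), ceil(s) in [1, 2^32].
        const std::uint64_t c = static_cast<std::uint64_t>(std::ceil(s));
        frac = static_cast<std::uint32_t>(4294967296ULL - c);
    }
    return exact ? FixedStatus::Exact : FixedStatus::Inexact;
}
```

Under these assumptions, 1.5 yields whole = 1 and frac = 2147483648 with status Exact, -0.25 yields whole = -1 and frac = 3221225472 (that is, -1 + 0.75), and 0.1 is reported as Inexact.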

I trust exceptions about as far as I can throw them.

frob

46,223

January 11, 2011 05:15 AM

I'm not worried about precision issues as a result of math. What I'm worried about is the same float value might round to different answers depending on the internal details of the FPU and the whims of the compiler.

That is the very nature of floating point. You are guaranteed a minimum precision for everything. For all operations you can perform the same operation multiple times and get different results, but within the proper precision it will be the same. Go read and learn about floating point precision.

For conversion to/from storage, the precision is at least 6 decimal digits for floats. It can be different, specified as either FLT_DIG (for the C-style constant) or numeric_limits<float>::digits10 (for the C++ edition of the same thing).

Anything outside that precision is up to the implementation.

So back to your question about identifying if it can be directly stored in your data entry, just look at the text version. If it contains more than the specified number of digits you can emit your warning. So for a float, 1.2345 is okay. 1.234567 is not. 1234.56 is acceptable, 1234.567 should generate a warning. 0.000123456 is acceptable, 0.0001234567 is warning material.
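
The constants named in this post can be queried as in the short sketch below. The helper function names are hypothetical. Keep in mind that these decimal digit counts describe round-tripping through text; whether a value fits a binary fixed point form depends on its binary digits, which numeric_limits<double>::digits reports (53 on IEEE 754 implementations).

```cpp
#include <cfloat>
#include <limits>

// Hypothetical helpers that expose the precision constants discussed above.
constexpr int floatDecimalDigits()  { return std::numeric_limits<float>::digits10; }
constexpr int doubleDecimalDigits() { return std::numeric_limits<double>::digits10; }
constexpr int doubleBinaryDigits()  { return std::numeric_limits<double>::digits; }

// The C macro and the C++ trait describe the same quantity.
static_assert(floatDecimalDigits() == FLT_DIG, "FLT_DIG matches digits10");
static_assert(doubleDecimalDigits() == DBL_DIG, "DBL_DIG matches digits10");

// The C standard guarantees at least 6 decimal digits for float
// and at least 10 for double.
static_assert(FLT_DIG >= 6, "minimum float precision");
static_assert(DBL_DIG >= 10, "minimum double precision");
```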

Storyyeller

215

Author

January 11, 2011 06:06 AM

Go read and learn about floating point precision.

I've already read a lot about floating point precision. Why can't you read my posts?

I trust exceptions about as far as I can throw them.

frob

46,223

January 11, 2011 06:35 AM

I'm not worried about precision issues as a result of math. What I'm worried about is the same float value might round to different answers depending on the internal details of the FPU and the whims of the compiler.

I've already read a lot about floating point precision. Why can't you read my posts?

Right back at you.

"frob" said:
That is the very nature of floating point. You are guaranteed a minimum precision for everything. For all operations you can perform the same operation multiple times and get different results, but within the proper precision it will be the same.

As mentioned, the rules are already established. The system operates to very specific, well known, and easily discoverable precision. Anything outside that precision is completely outside your control. That applies to rounding (the one you are concerned about) just as much as any other operation.

SiCrane

11,840

January 11, 2011 01:35 PM

C++ conversion from a floating point type to an integral type is well-defined as long as the value in the floating point type fits inside the range of the integral type: it always truncates (discards the fractional part). If the value doesn't fit inside the range of the integral type the behavior is undefined.
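
A minimal sketch of a conversion that respects the rule above, assuming a 32-bit target type. The helper name TruncateToInt32 is hypothetical. It tests the range before casting, so the cast itself is never reached with an out-of-range value or a NaN.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: converts only when the truncated value fits in int32_t.
// Returns false (and leaves out untouched) for out-of-range values and NaN.
bool TruncateToInt32(double x, std::int32_t& out)
{
    // Truncation toward zero fits exactly when -2^31 - 1 < x < 2^31.
    // Both bounds are exactly representable as doubles. The test is written
    // so that NaN fails it, since every comparison with NaN is false.
    if (!(x > -2147483649.0 && x < 2147483648.0))
        return false;

    out = static_cast<std::int32_t>(x);   // discards the fractional part
    return true;
}
```

Here 2.9 converts to 2 and -2.9 converts to -2, while 2147483648.0 and NaN are rejected without ever reaching the cast.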

Storyyeller

215

Author

January 11, 2011 05:49 PM

So would it be appropriate to just use assert(fabs(x) < 0x80000000); ?

I trust exceptions about as far as I can throw them.

This topic is closed to new replies.
