
Double to float C++


Oh no! Please use proper bit manipulation, no strings of ones and zeroes. I beg you…


It’s only for proof-of-concept; a base reference. Once that’s working, then comes optimization, right? : )

Yes, but that's not optimization. It's ‘turning an eyesore of a clumsy and hard-to-read workaround into generic code’.

The only proper zero in that string is the string terminator. Which gives me a grain of uncertainty and doubt, besides the desire to vomit all over the place. :D

But seriously, as far as i can read it, all it does is set the n rightmost bits to zero, so it's exactly what i proposed yesterday? Are you sure it makes a difference?

As you can tell, I’m not a bit twiddler. There are people who are better than me at it…. Such as yourself.

Yeah, maybe because that's the first thing the C64 manual taught me. It's still super useful, mostly to pack multiple uints or bools into one.
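For anyone who hasn't done that kind of packing before, a minimal sketch (the field widths are made up purely for illustration):

#include <cstdint>
#include <cassert>

int main()
{
	// Pack a 10-bit x, a 10-bit y and a bool flag into one uint32_t.
	uint32_t x = 513, y = 300;
	bool flag = true;

	uint32_t packed = (x & 0x3FFu) | ((y & 0x3FFu) << 10) | (uint32_t(flag) << 20);

	// Unpack by shifting back down and masking off the other fields.
	uint32_t ux = packed & 0x3FFu;
	uint32_t uy = (packed >> 10) & 0x3FFu;
	bool uflag = ((packed >> 20) & 1u) != 0;

	assert(ux == x && uy == y && uflag == flag);
	return 0;
}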

Your code should give the same result as this:

double value = PI;
uint64_t bits = (uint64_t&) value;
bits = bits & 0b1111111111111111111111111111111110000000000000000000000000000000ull;

double truncated = (double&) bits;

Clearing the least 31 bits, if i got the counting right.
Binary numbers are quite intuitive to use for bit math, sometimes. : )
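Side note: the reference casts work in practice on the usual compilers, but if the aliasing rules worry anyone, the same truncation can be written with std::memcpy (or C++20's std::bit_cast). A minimal sketch:

#include <cstdint>
#include <cstring>

// Same truncation as above, but punning through std::memcpy,
// which stays well defined regardless of aliasing rules.
double truncate_low_bits(double value)
{
	uint64_t bits;
	std::memcpy(&bits, &value, sizeof bits);   // double -> uint64_t
	bits &= ~((uint64_t(1) << 31) - 1);        // clear the 31 least significant bits
	std::memcpy(&value, &bits, sizeof value);  // uint64_t -> double
	return value;
}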

I tried this, and it's super close to perfection, except in some cases:

#include <iostream>
#include <iomanip>
#include <string>
#include <bitset>
using namespace std;


void get_double_bit_string(double d, string& s)
{
	s = "";

	for (int i = 63; i >= 0; i--)
		s += to_string((reinterpret_cast<uint64_t&>(d) >> i) & 1);
}


double truncate_normalized_double(double d)
{
	//return static_cast<double>(static_cast<float>(d));

	double value = d;
	uint64_t bits = (uint64_t&)value;
	bits = bits & 0b1111111111111111111111111111111110000000000000000000000000000000ull; // clear the 31 least significant bits

	double truncated = (double&)bits;

	string sd = "";
	get_double_bit_string(truncated, sd);
	cout << sd << endl;

	//std::bitset<64> Bitset64(sd);

	//uint64_t value = Bitset64.to_ullong();

	//double dv = reinterpret_cast<double&>(value);
	//string sdv = "";
	//get_truncated_bit_string(dv, sdv);
	//cout << sdv << endl;

	double df = static_cast<double>(static_cast<float>(d));
	string sdf = "";
	get_double_bit_string(df, sdf);
	cout << sdf << endl;

	return truncated;
}	



int main(void)
{
	cout << setprecision(20) << endl;

	for(double d = 0.0; d < 1.0; d += 0.1)
		cout << truncate_normalized_double(d) << endl << endl;

	return 0;
}

The results are:

0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0

0011111110111001100110011001100110000000000000000000000000000000
0011111110111001100110011001100110100000000000000000000000000000
0.099999994039535522461

0011111111001001100110011001100110000000000000000000000000000000
0011111111001001100110011001100110100000000000000000000000000000
0.19999998807907104492

0011111111010011001100110011001100000000000000000000000000000000
0011111111010011001100110011001101000000000000000000000000000000
0.29999995231628417969

0011111111011001100110011001100110000000000000000000000000000000
0011111111011001100110011001100110100000000000000000000000000000
0.39999997615814208984

0011111111100000000000000000000000000000000000000000000000000000
0011111111100000000000000000000000000000000000000000000000000000
0.5

0011111111100011001100110011001100000000000000000000000000000000
0011111111100011001100110011001101000000000000000000000000000000
0.59999990463256835938

0011111111100110011001100110011000000000000000000000000000000000
0011111111100110011001100110011001100000000000000000000000000000
0.69999980926513671875

0011111111101001100110011001100110000000000000000000000000000000
0011111111101001100110011001100110100000000000000000000000000000
0.79999995231628417969

0011111111101100110011001100110010000000000000000000000000000000
0011111111101100110011001100110011000000000000000000000000000000
0.89999985694885253906

0011111111101111111111111111111110000000000000000000000000000000
0011111111110000000000000000000000000000000000000000000000000000
0.99999976158142089844

Seems you need 2 bits more, so

bits = bits & 0b1111111111111111111111111111111110000000000000000000000000000000ull;

should become:

bits = bits & 0b1111111111111111111111111111111111100000000000000000000000000000ull;

Only the last output does not fit the pattern, but it should work.
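Counting it out: a float stores 23 mantissa bits, so matching float precision means clearing the 52 − 23 = 29 low bits, and the mask keeps 1 sign + 11 exponent + 23 mantissa = 35 bits. A sketch that builds the mask from the number of bits to drop instead of spelling out the literal (the function name and default parameter are just illustrative):

#include <cstdint>
#include <cstring>

// Keep only the top `keep_mantissa_bits` of the 52 stored mantissa bits.
// keep_mantissa_bits = 23 mimics float's stored mantissa width.
double reduce_mantissa(double value, int keep_mantissa_bits = 23)
{
	const int drop = 52 - keep_mantissa_bits;            // 29 bits for float-like precision
	const uint64_t mask = ~((uint64_t(1) << drop) - 1);  // ones everywhere except the low `drop` bits

	uint64_t bits;
	std::memcpy(&bits, &value, sizeof bits);
	bits &= mask;
	std::memcpy(&value, &bits, sizeof value);
	return value;
}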

I have a memory corruption bug happening during multithreading, so the crash tells me nothing about the cause.
Thus, after fruitless guessing, i decided to use debug mode. Which i usually can't, because it's slow AF.

After more than two hours of execution, a window popped up. I had just come back from eating and saw it.
It said it was an out-of-bounds write to a std::vector. Great, now i only need to click the button and i'll see where it is, i thought.

While my hand was moving towards the mouse, the corruption caused a real crash from another thread. :O

Now i can't get back to the first crash. I'm fucked.

That sucks. Sorry to hear about the issues. :( Bugs suck.

Thank you again for all of your ideas. You're an ideas guy, as well as the coder guy. That's really rare.

Turned out the debugger was at the right place, and i can prevent further crashes now. : ) But figuring out the true origin of the problem will take me some more time… >:(

Not sure if zeroing out bits is an idea. That's just the primary way to reduce precision. Actually i wanted to propose it in my first reply, but there was a more detailed response already, and then confusion slipped in.

Maybe it's not the precision that matters.
Besides that, you always make the value a little bit smaller, since there is no rounding.
I still think you're either missing a term, or the values from the solar system textbook are not accurate enough to serve as ground truth. Maybe the gravity of other planets also affects the results enough to cause the error you see.
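If the one-sided error from truncation matters, one common trick is to add half of the discarded range before masking, which rounds to the nearest reduced value instead of always chopping toward zero magnitude. A sketch only (ties and edge cases like infinity/NaN are not handled, and the names are illustrative):

#include <cstdint>
#include <cstring>

// Rounded precision reduction: add half of the range that will be
// cleared, then mask. A carry out of the mantissa bumps the exponent,
// which is still the correctly rounded result (e.g. 0.999... -> 1.0).
double reduce_mantissa_rounded(double value, int keep_mantissa_bits = 23)
{
	const int drop = 52 - keep_mantissa_bits;
	const uint64_t mask = ~((uint64_t(1) << drop) - 1);

	uint64_t bits;
	std::memcpy(&bits, &value, sizeof bits);
	bits += uint64_t(1) << (drop - 1);   // round half up in magnitude
	bits &= mask;
	std::memcpy(&value, &bits, sizeof value);
	return value;
}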

Btw, how can we even tell something is moving at the speed of light, or not moving at all?
We require some reference, like a global world space. But that does not exist, or does it?
Would the movement of the sun itself also affect your results?

I'm sure that you'll figure it out, after more thought.

Yes, Mach's principle is what you're looking for.

Edit: I'm giving up. Thanks for all of your help!!!

Edit: The code is:

#include <iostream>
#include <iomanip>
#include <string>
#include <bitset>
using namespace std;



void get_double_bit_string(double d, string& s)
{
	s = "";

	for (int i = 63; i >= 0; i--)
		s += to_string((reinterpret_cast<uint64_t&>(d) >> i) & 1);
}


double truncate_normalized_double(double d)
{
	if (d <= 0.0)
		return 0.0;
	else if (d >= 1.0)
		return 1.0;

	//////return static_cast<double>(static_cast<float>(d));

	string s;
	get_double_bit_string(d, s);
	cout << s << endl;

	const int64_t mantissa_size = 52;
	uint64_t max = static_cast<uint64_t>(-1); // 2^64 - 1

	uint64_t bits = reinterpret_cast<uint64_t&>(d);
	bits = bits & 0b1111111111111111111111111111111111100000000000000000000000000000ull; // keep sign + exponent + top 23 mantissa bits
	double reduced = reinterpret_cast<double&>(bits);

	get_double_bit_string(reduced, s);
	cout << s << endl;

	double df = static_cast<double>(static_cast<float>(d));
	string sdf = "";
	get_double_bit_string(df, sdf);
	cout << sdf << endl;

	return reduced;
}


int main(void)
{
	cout << setprecision(30) << endl;

	for(double d = 0; d < 1.0; d += 0.1)
		cout << truncate_normalized_double(d) << endl << endl;


	return 0;
}
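As a sanity check on the masking approach: once only 23 mantissa bits are left, the value should survive a round trip through float unchanged, as long as its exponent is within float range. A minimal self-contained sketch (the helper name is just for illustration):

#include <cassert>
#include <cstdint>
#include <cstring>

// Clear the 29 low mantissa bits, leaving float-like precision.
static double reduce(double value)
{
	uint64_t bits;
	std::memcpy(&bits, &value, sizeof bits);
	bits &= ~((uint64_t(1) << 29) - 1);
	std::memcpy(&value, &bits, sizeof value);
	return value;
}

int main()
{
	for (double d = 0.0; d < 1.0; d += 0.1)
	{
		const double reduced = reduce(d);
		// A double with only 23 mantissa bits and an in-range exponent
		// is exactly representable as a float, so this must hold.
		assert(static_cast<double>(static_cast<float>(reduced)) == reduced);
	}
	return 0;
}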