Conversion of Pascal real48 (48-bit float) to C++ double


Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.


11 replies to this topic

#1 Xentropy   Members   

140

Posted 18 June 2005 - 01:10 PM

I have a need to load data from a Pascal data file using C++, and some of the fields in the file are 48-bit floating point values. Since C++ uses 32- and 64-bit floats, I know of no way to directly convert the data and obtain accurate results. I've been fiddling around with really messy code manually calculating the exponent and mantissa bit by bit and so on, but I'm sure there's got to be a far more elegant way to do this. Does anyone have any suggestions? Thanks in advance.

#2 Dave   Members   

2173

Posted 18 June 2005 - 01:38 PM

If you were to use a plain C function like fscanf to read the data in, and made the receiving type and format specifier long, then it might well chop the end off the incoming float/long.

ace

#3 iMalc   Members   

2466

Posted 18 June 2005 - 02:30 PM

Is the data stored in binary, or textual form?

If it's stored in binary (and by the sounds of it, it is) then you've got a little bit of 48-bit float to double conversion code to write.
What you need to know is how many bits are the mantissa, how many are the exponent, and what the base and bias for the exponent are.

My sources tell me that it's probably 40-bits (5 bytes) for the mantissa, 7-bits for the exponent, and 1-bit for the sign. The exponent is base 8.

IEEE 754 doubles use 52 (+1 implied) bits for the mantissa, 11 bits for the exponent, and 1 bit for the sign. The exponent is base 2 and is biased by 1023.

Doesn't sound like a terribly easy conversion, but I'd give it a shot. I'll give you a hand if you like.

#4 Michalson   Members   

1657

Posted 18 June 2005 - 02:43 PM

The Pascal "real" format is rather old, dating back to the pre-FPU days. It's maintained for backwards compatibility; I think I have some documentation on how it's formatted. It was obviously designed for fast integer emulation of floating point values.

#5 Xentropy   Members   

140

Posted 18 June 2005 - 03:30 PM

Yeah, in this case it's being used as basically a large int. I googled around and discovered the format is actually 1 sign bit, followed by 39-bit mantissa, followed by 8-bit exponent, with a bias of 129. It's not close enough to IEEE to just tack on a couple of zero bytes and call it a double. :)
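
Written out as a formula, the layout described above would read roughly as follows. This is only a sketch: s, m and e are labels introduced here for the sign bit, the 39-bit mantissa field (read as an unsigned integer) and the exponent byte, and the zero case is the one the code posted in this thread tests for.

```latex
x =
\begin{cases}
0, & e = 0,\\[4pt]
(-1)^{s}\left(1 + \dfrac{m}{2^{39}}\right) 2^{\,e-129}, & 1 \le e \le 255.
\end{cases}
```

For example, s = 0, m = 0, e = 129 gives x = 1, and setting only the top mantissa bit (m = 2^38) with the same exponent gives x = 1.5.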

It's stored in a binary file, but I'm loading it into a char [6] and then dealing with it from there. From the responses so far, I guess calculating it manually is pretty much my only option. For now I'm not coming up with correct values, but I'm sure that's just an issue with my code and math. Is there any way to do the mantissa conversion without a lot of expensive calls to pow()?

Edit: Well, I found my error. I wasn't testing for zero and zero was turning into 1 * 2^127. Thought I was way off. But after correcting for that, all of the test values I've tried have come out on the money. Note that the way the data is stored in the file, the most significant byte is the 6th, and it works its way backwards from there, which is why there's all the weird modulus stuff to work on the bits in the right order. Here's my messy code:


double real48ToDouble(char *realValue) {

    double mantissa = 1.0;
    for (int i = 46; i >= 8; i--) {
        if ((realValue[i / 8] >> (i % 8)) & 0x01)
            mantissa += pow(2.0, i - 47);
    }

    char exponent = realValue[0] - 129;

    if (mantissa == 1.0 && exponent == 127) // Test for null value
        return 0.0;

    if (realValue[5] & 0x80) // Sign bit check
        mantissa *= -1.0;

    return mantissa * pow(2.0, exponent);
}

It works perfectly, I just don't like the LOOK of it. Too many magic numbers abound and such. Oh well, at least since I know it works it can lurk and I don't have to look at the function again.

[Edited by - Xentropy on June 18, 2005 9:30:14 PM]

#6 iMalc   Members   

2466

Posted 18 June 2005 - 09:12 PM

What you have does indeed look correct. However, to remove the calls to pow in the loop, see the changes below:

double real48ToDouble(const char *realValue) {
    double exponent = double((unsigned char)realValue[0]) - 129.0;
    double mantissa = 1.0;
    double power = 1.0;

    for (int i = 46; i >= 8; i--) {
        power *= 0.5;
        if ((realValue[i >> 3] >> (i & 7)) & 0x01)
            mantissa += power;
    }

    if (mantissa == 1.0 && realValue[0] == 0) // Test for null value
        return 0.0;

    if (realValue[5] & 0x80) // Sign bit check
        mantissa = -mantissa;

    return mantissa * pow(2.0, exponent);
}

Take note of the other minor changes I made too.
There are more optimisations that could be done too, I think, but this would have already made a huge impact.

[Edited by - iMalc on June 19, 2005 3:12:46 AM]

#7 Zahlman   Members   

1682

Posted 18 June 2005 - 11:34 PM

Instead of working with pow(), might I suggest bit-shifting things around within a 64-bit integer type (depending on your platform you may need to do some work to acquire such a beast) and then reinterpret-casting that bit pattern to a double?

#8 Xentropy   Members   

140

Posted 19 June 2005 - 08:15 AM

Thank you, iMalc! That's so simple I feel like an idiot for not thinking of it.

I also considered the alternative of reading each byte instead of each bit and dividing the most significant byte by 256, the next most by 256 twice, the next most by 256 three times, and so on. It would come to the same result with fewer iterations through the loop. I suspect compiler optimizations may have already taken care of things like changing / 8 to >> 3, but I'll still implement those in case it doesn't.

I wasn't too concerned with speed since these real48 fields in the files I'm reading are not very common. Most of the fields are ints, and maybe 2% of the fields are real48s. But hey, free speed is always nice, so I thank you!

#9 Extrarius   Members   

1412

Posted 19 June 2005 - 08:22 AM

Quote:
Original post by Xentropy
[...]It's not close enough to IEEE to just tack on a couple of zero bytes and call it a double. :)[...]
You know the format of a double is a sign bit, the exponent (11 bits) and the mantissa (52 bits), right? All you have to do is swap a few bits around, prepend some zeroes to the exponent, and append some zeroes to the mantissa.

#10 Xentropy   Members   

140

Posted 19 June 2005 - 09:18 AM

The bias is also different (129 instead of 1023 for a double, IIRC), so I'd still have to do some math on it or it'd end up off by 2, basically. The order of the fields is also different, exponent at the most significant bits of a double, least significant of a real48.

You're right, though, it may be even faster to work directly with bit shifts on an int64. The question becomes what is the best way to cast a char [6] to an int64, and what kind of speed penalty might that incur? Since the fields are 6 bytes in the files, I can't just change the data types to int64 in the record structures without messing up all my offsets and making the file loading more complex (likely removing any speed improvement I could get on these conversions in the first place).

Err, never mind that question, I can just append two nulls to the end of the char array and then reinterpret_cast those 8 bytes of memory to an int64. I'll mess around with it and profile the speed difference.
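
For reference, one way the bit-shifting idea could look is sketched below. It is not the code that was profiled in this thread. It assumes that double is an IEEE 754 64-bit type and that doubles and 64-bit integers share the same byte order on the host. The name real48BitsToDouble is made up for illustration, and std::memcpy stands in for the reinterpret_cast mentioned above to sidestep aliasing problems. The integer is assembled with shifts rather than by casting the char array, so the host's endianness does not matter when reading the six bytes. The exponent is rebiased by 1023 - 129 = 894, and the 39-bit mantissa is shifted up 13 places to fill the 52-bit field.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: converts a 6-byte Pascal real48 (least significant
// byte first, exponent in byte 0, sign in the top bit of byte 5) to a double.
// Assumes an IEEE 754 binary64 double with the same byte order as uint64_t.
double real48BitsToDouble(const unsigned char bytes[6]) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch needs a 64-bit double");

    if (bytes[0] == 0)  // exponent byte of 0 means the value is zero
        return 0.0;

    // Gather the 39 mantissa bits: bytes 1..4 plus the low 7 bits of byte 5.
    std::uint64_t mantissa =
        (std::uint64_t(bytes[5] & 0x7F) << 32) |
        (std::uint64_t(bytes[4]) << 24) |
        (std::uint64_t(bytes[3]) << 16) |
        (std::uint64_t(bytes[2]) << 8) |
         std::uint64_t(bytes[1]);

    std::uint64_t sign = std::uint64_t(bytes[5] >> 7) << 63;
    // Rebias: subtract 129 (real48), add 1023 (double), i.e. add 894.
    std::uint64_t exponent = (std::uint64_t(bytes[0]) + 894) << 52;

    // 39-bit mantissa -> 52-bit fraction field: shift left by 13.
    std::uint64_t bits = sign | exponent | (mantissa << 13);

    double result;
    std::memcpy(&result, &bits, sizeof result);  // copy the bit pattern
    return result;
}
```

Every non-zero real48 exponent lands between 895 and 1149 after rebiasing, which is inside the normal range of a double, so no special handling for denormals or infinities should be needed under these assumptions.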

#11 iMalc   Members   

2466

Posted 19 June 2005 - 09:02 PM

I was going to try doing it using the __int64 data type with shifts/and/or etc. However, I figured that although file loading speed is important, it isn't necessarily worth the effort to go further, unless of course you've got 100000 to do. It's also a little more difficult than Extrarius suggests.

The beauty of the method coded thus far is that it doesn't have to know anything about the internals of the double data type, and doesn't have endian issues etc, and is thus probably the most portable.

I have a habit of always using shifts and ands instead of div and mod for things like this. Most of the time the compiler should optimise it, but not always, and not with every compiler, particularly when using signed types. But this point isn't worth debating, I just always do it like that.


#12 Xentropy   Members   

140

Posted 20 June 2005 - 10:23 AM

Yep, for my purposes the bit manipulation method only profiled marginally faster, since it has to allocate 8 bytes of memory for the __int64 and copy the 6 bytes to the new memory before doing anything else. It isn't faster by enough to make up for the loss of portability and ease of understanding and maintenance.

That change to dividing bytes by 256 instead of bits by 2 worked out nicely, though. 3 million conversions took 716ms using the bit-by-bit method and 425ms using byte-by-byte. I tried some other minor tweaks to squeeze out more speed and only managed to make it slightly slower again, so I've nailed it down to a nice method. Considering the single pow() call left at the end of the function to calculate 2^exponent takes about 300ms of that 425ms, there isn't a lot of speed left to squeeze out of the mantissa formulation.

Here's the final fastest code, for future reference (if anyone finds this thread in a search for real48 or something):


double Real48ToDouble(unsigned char realValue[6]) {

    if (realValue[0] == 0)
        return 0.0; // Null exponent = 0

    double exponent = realValue[0] - 129.0;
    double mantissa = 0.0;

    for (int byte = 1; byte <= 4; byte++) {
        mantissa += realValue[byte];
        mantissa *= 0.00390625; // mantissa /= 256
    }
    mantissa += (realValue[5] & 0x7F);
    mantissa *= 0.0078125; // mantissa /= 128
    mantissa += 1.0;

    if (realValue[5] & 0x80) // Sign bit check
        mantissa = -mantissa;
    return mantissa * pow(2.0, exponent);
}


I do the most significant byte calculation outside the loop since that byte also contains the sign bit and thus needs to be treated slightly differently. This code gets the exact same results as the other code in a little over half the time. A 70% speed savings if you don't include the time spent in the final pow. :)
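
Since most of the remaining time is spent in the final pow(), one thing that might be worth profiling is std::ldexp from <cmath>, which scales a double by an integer power of two. The sketch below is the same byte-by-byte loop with only the last line changed. The function name Real48ToDoubleLdexp is hypothetical, and no timing claim is made here, since this variant was not measured in the thread.

```cpp
#include <cmath>

// Hypothetical variant of the byte-by-byte conversion: identical mantissa
// handling, but the final scaling uses std::ldexp instead of pow(2.0, e).
double Real48ToDoubleLdexp(const unsigned char realValue[6]) {
    if (realValue[0] == 0)
        return 0.0;  // exponent byte of 0 means the value is zero

    double mantissa = 0.0;
    for (int byte = 1; byte <= 4; ++byte) {
        mantissa += realValue[byte];
        mantissa *= 0.00390625;  // mantissa /= 256
    }
    mantissa += (realValue[5] & 0x7F);  // top 7 mantissa bits, sign masked off
    mantissa *= 0.0078125;              // mantissa /= 128
    mantissa += 1.0;                    // implied leading 1

    if (realValue[5] & 0x80)  // sign bit check
        mantissa = -mantissa;

    // ldexp(x, n) computes x * 2^n; the exponent is the byte minus the bias.
    return std::ldexp(mantissa, int(realValue[0]) - 129);
}
```

Because the exponent is already an integer, ldexp avoids going through a general-purpose power function, though whether that is actually faster would depend on the compiler and library.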

[Edited by - Xentropy on June 20, 2005 4:23:11 PM]



