LonelyStar

Endian independent way of taking apart a float


Hello everyone, To send floats over the network, I want to take a float apart into exponent, mantissa and sign and put it back together on the other side, so that I can compress it (for example by using fewer bits for the exponent and mantissa). On a big-endian system, I would get the exponent of a float the following way:
//f is the float we want to take apart,
//exp the "unsigned char" we want to put the exponent into
exp=((*((int*)&f))>>23) & 0xFF;

I reinterpret the float as an integer and then use the shift operator. On a little-endian processor it would work a little differently (and be a little more complicated). But I want a way that works on both little- and big-endian systems. Any ideas on how to do it? Thanks! Nathan

You could possibly create a struct that looks the same on both sides. Given two floats on two systems of different endianness, the method of extracting the mantissa is the same regardless: bit shifting left is the same as bit shifting left no matter what endianness the system has. What I mean is, that routine will extract the mantissa on any machine. I also suggest using unsigned int when you're doing bit extraction, because a signed int uses arithmetic right shifts, which is probably not what you want. You want logical right shifts. I don't think it matters for your code there, but it could in other circumstances.

Tim
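To illustrate the point about right shifts, here is a minimal C sketch. It assumes the bit pattern is held in a `uint32_t`; the helper name `top_bits` is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: returns the top `count` bits of a 32-bit pattern.
   Because `bits` is unsigned, >> is a logical shift and zeros come in
   from the left. For a negative signed int the result of >> is
   implementation-defined and is usually an arithmetic shift that
   copies the sign bit instead. */
static uint32_t top_bits(uint32_t bits, unsigned count)
{
    /* count must be in the range 1..32 */
    return bits >> (32u - count);
}
```

With the pattern of -1.0f (0xBF800000), `top_bits(bits, 9)` yields the sign bit followed by the 8 exponent bits, with nothing smeared in from the sign.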

IMO it works EXACTLY the same on LE as on BE.
It's only when storing data to or reading data from memory that you need to take care.
If the data is read back at the same size it was written, you don't need to care either.
For instance, take an int (32 bits) 0x12345678.
Store it to memory:
On LE you get the following 4 bytes: 0x78 0x56 0x34 0x12
On BE you get the following 4 bytes: 0x12 0x34 0x56 0x78
Now if you read the same int back (on the same machine), you get 0x12345678 no matter what.
However, if you read fewer than 32 bits, for instance a 16-bit short, you get:
0x5678 on the LE system and 0x1234 on the BE system.
IMHO LE is more "correct" but BE is more intuitive. Correct in the sense that if you write a 32-bit int with the value 17 to address X and then read back an 8-bit integer from address X, it's still going to be 17, while the BE system would return zero.
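The byte layout described above can be observed directly. This is a small C sketch, not part of the original post; the function names are invented for illustration, and it assumes `uint32_t` is available.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest address of a 32-bit value. */
static unsigned char first_byte_in_memory(uint32_t value)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);  /* copy the in-memory representation */
    return bytes[0];
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    /* 0x78 is the least significant byte of the probe value. */
    return first_byte_in_memory(0x12345678u) == 0x78;
}
```

On an LE host `first_byte_in_memory(17)` is 17, on a BE host it is 0, which is the "read back 8 bits" case from the paragraph above.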

Now your example with the float:
Suppose that the float contains the bits 0x12345678 (never mind mantissas, exponents etc.). I assume the float is 32 bits.
Processors hold this float in a register, on an FPU stack or in memory.
Since you're using the & (address-of) operator, the compiler needs to store the float into memory (even if it was on a stack or in a register).
So now we have the float in memory as 0x78 0x56 0x34 0x12 for LE and 0x12 0x34 0x56 0x78 for BE.
Then you read it back as an int (also assumed to be 32 bits), and you get 0x12345678 on both LE and BE. In essence you've just made an exact copy of the bits.
Some compilers on some CPUs are smart enough to avoid the temporary store and use a dedicated instruction to move between floating-point registers and integer registers, but the result is still the same: an exact copy of the bits.

The above is true as long as sizeof(int) == sizeof(float), so working in 64 bits is also fine.

So my bet is that your code snippet will work on both LE and BE as long as sizeof(int) == sizeof(float).
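As a rough sketch of that idea in C: the code below copies the float's bits into a `uint32_t` with `memcpy` (which sidesteps the strict-aliasing problem of the pointer cast) and then extracts the fields with shifts and masks. It assumes `float` is a 32-bit IEEE 754 single and that floats and integers share the same byte order on the host, which holds on common platforms but is not guaranteed by the language. The struct and function names are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three IEEE 754 single-precision fields. */
struct float_parts {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits, implicit leading 1 not included */
};

static struct float_parts split_float(float f)
{
    uint32_t bits;
    struct float_parts p;
    memcpy(&bits, &f, sizeof bits);       /* exact copy of the bits */
    p.sign     = bits >> 31;
    p.exponent = (bits >> 23) & 0xFFu;
    p.mantissa = bits & 0x7FFFFFu;
    return p;
}

static float join_float(struct float_parts p)
{
    uint32_t bits = (p.sign << 31) | (p.exponent << 23) | p.mantissa;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For 1.0f this gives sign 0, exponent 127 and mantissa 0, the values used elsewhere in this thread.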

You probably already know this, but if you send 32 bits of data from an LE to a BE system as-is, things will get scrambled. The same happens if you write a struct to disk on one system and then read it on the other (assuming the struct contains elements wider than 8 bits).
The easiest way to handle this, IMHO, is to break all data down into bytes and then reassemble it.
I.e:

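void
WriteInt(unsigned char* dest, int i)
{
    // Shift as unsigned so negative values behave the same everywhere.
    unsigned int u = (unsigned int)i;
    dest[0] = (unsigned char)(u >> 24); // most significant byte first
    dest[1] = (unsigned char)(u >> 16);
    dest[2] = (unsigned char)(u >> 8);
    dest[3] = (unsigned char)u;
}

int
ReadInt(const unsigned char* source)
{
    // Assemble in an unsigned int: shifting into the sign bit of an int is not portable.
    unsigned int u = source[0];
    u <<= 8;
    u |= source[1];
    u <<= 8;
    u |= source[2];
    u <<= 8;
    u |= source[3];
    return (int)u;
}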




Do the same for floats, short etc.
Then use ONLY these functions to read/write data to your network packet or shared disc-data.
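For floats, one possible version of "do the same" is sketched below in C. `WriteFloat` and `ReadFloat` are names chosen by analogy with the functions above, not something from the original post, and the sketch assumes a 32-bit IEEE 754 `float` on both ends.

```c
#include <stdint.h>
#include <string.h>

/* Writes the bit pattern of f to dest[0..3], most significant byte first. */
static void WriteFloat(unsigned char *dest, float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* exact copy of the float's bits */
    dest[0] = (unsigned char)(bits >> 24);
    dest[1] = (unsigned char)(bits >> 16);
    dest[2] = (unsigned char)(bits >> 8);
    dest[3] = (unsigned char)bits;
}

/* Rebuilds a float from four bytes written by WriteFloat. */
static float ReadFloat(const unsigned char *source)
{
    uint32_t bits = ((uint32_t)source[0] << 24)
                  | ((uint32_t)source[1] << 16)
                  | ((uint32_t)source[2] << 8)
                  |  (uint32_t)source[3];
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because the bytes are produced by shifting a value rather than by copying memory, the buffer holds 3F 80 00 00 for 1.0f regardless of the host's byte order.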

My 2c

Couldn't you just use frexp() and scale the mantissa into a fixed point number?
The standard C library is full of goodies =)
#include <math.h>

typedef struct { int exponent, mantissa; } pair;

pair encode(double value) {
    pair result;
    /* frexp returns a fraction with magnitude in [0.5, 1) and stores the power of two */
    result.mantissa = (int)(frexp(value, &result.exponent) * 65536.0);
    return result;
}

double decode(pair value) {
    return ldexp(value.mantissa / 65536.0, value.exponent);
}
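Since the original goal was to spend fewer bits on the mantissa, here is a hedged C sketch of how the number of kept bits affects the round trip. The function `quantize` and its `mantissa_bits` parameter are made up for this illustration; it performs the encode and decode steps back to back and returns what the receiver would see.

```c
#include <math.h>

/* Hypothetical helper: keeps `mantissa_bits` fractional bits of the
   frexp() fraction (1..30 is assumed) and returns the reconstructed
   value, i.e. what arrives after encode + decode. */
static double quantize(double value, int mantissa_bits)
{
    int exponent;
    double scale = ldexp(1.0, mantissa_bits);       /* 2^mantissa_bits */
    double m = frexp(value, &exponent);             /* 0.5 <= |m| < 1, or 0 */
    long fixed = (long)(m * scale);                 /* integer that would be sent; truncates toward zero */
    return ldexp((double)fixed / scale, exponent);  /* receiver's reconstruction */
}
```

Values whose fraction fits in the kept bits, such as 1.0, survive exactly. Others lose the bits that were truncated, so with 16 bits the relative error stays below 2^-15.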

I think it is not the same on a BE system as on an LE system. Let me show you why.
Suppose we have the float 1.0f (exponent=127, mantissa=0). In memory, that is:
0011 1111 1000 0000 0000 0000 0000 0000
or better
3F 80 00 00
Reinterpreted as an int on BE, this is:
0x3F800000
Shifting 23 to the right:
0x0000007F = 127
What I want!

But on LE, the integer would be:
0x0000803F
Shifting 23 to the right:
0x0 = 0
:(

This is how I thought it would be, but I could be completely wrong ...
The "frexp" is a great idea! I think I will use that one!
Thanks to all of you!
Nathan

Since most systems (at least the ones I know of) use the same IEEE floating-point formats, the simplest way would be to decide on a standard byte ordering for your network protocol; then one of the two systems involved would just have to byte swap on send/receive. If you wanted to be really clever, you could check whether both sides share the same non-standard ordering and then swap on neither side.
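A minimal C sketch of that approach follows. It picks big-endian as the wire order purely as an assumption for the example, and all function names are invented here. On POSIX and Winsock systems `htonl`/`ntohl` already do this job for 32-bit values.

```c
#include <stdint.h>

/* Reverses the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

/* Returns 1 if the host stores the least significant byte first. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Big-endian is the assumed wire order: only LE hosts need to swap. */
static uint32_t to_wire32(uint32_t host)
{
    return is_little_endian() ? swap32(host) : host;
}

static uint32_t from_wire32(uint32_t wire)
{
    return is_little_endian() ? swap32(wire) : wire;
}
```

The float's bit pattern would be passed through `to_wire32` before sending and `from_wire32` after receiving, so each side swaps only if its own order differs from the agreed one.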
