
Byteshifting

This topic is 4355 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


OK i got a couple of questions:
int value = 457;
char data[4];
memset(data,0,4);
data[0] = (char)value;
data[1] = (char)(value >> 8);
data[2] = (char)(value >> 16);
data[3] = (char)(value >> 24);

This doesn't work: the first byte comes out as ffffffc9 for some reason, while the other bytes are OK. Why is this? When I put another value such as 1337 into data, everything is fine. My other question is how to do the same thing with floats: how do I serialize a 4-byte float into char binarydata[4] and back to a float? Thx Aidamina

The bit patterns are
for 457:  00000001.11001001 (== 01.c9 hex)
for 1337: 00000101.00111001 (== 05.39 hex)

The low byte has a leading 1 for 457 but a leading 0 for 1337. If char is signed (as it evidently is with your compiler), this bit is interpreted as the sign. Converting the (signed) char back to a (signed) int extends the sign to the left, which produces the ffffffc9 you mention (for a 32-bit int).

Mask the byte with 0xff after reading it from the char. You could also use unsigned chars instead.
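To make the sign extension visible, here is a minimal sketch. The helper names are made up for illustration, and it assumes a two's complement machine, where the bit pattern c9 stored in a signed char is the value -55.

```c
/* Widen one stored byte to int without masking. A negative signed char
   stays negative, so the sign bit is copied into all the upper bits. */
static int widen_unmasked(signed char byte)
{
    return byte;
}

/* Widen with masking: only the low 8 bits survive the conversion. */
static int widen_masked(signed char byte)
{
    return byte & 0xff;
}
```

With a 32-bit int, widen_unmasked(-55) converted to unsigned int is ffffffc9 in hex, while widen_masked(-55) is plain c9.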


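To do something like that with floats, you could store the value in a float variable but access that variable through a cast, like here:
float floatValue = anything;
unsigned int intValue = *((unsigned int*)&floatValue);
which is, hmm, a little bit ugly. Strictly speaking, the pointer cast also violates the language's aliasing rules; copying the bytes with memcpy is the safe way to get the same bit pattern.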

If your language has unions (e.g. C or C++), you could use
union {
float floatValue;
unsigned char asChars[4];
};
instead: write floatValue and read the bytes from asChars, or the other way round. But remember the endianness problem...

Thx for the quick response, but I still don't get how to solve the first problem. Maybe some code?

And how do I use that union to convert the data to float and vice versa?

Yes, the first problem is caused by the char being sign extended. It doesn't matter when converting from int to char, but when going from char to int it obviously makes a difference, so you should probably use unsigned char.

As for floating point, look into frexp, but it's probably just as portable (assuming int is 32 bits) to reuse the int code from above after reinterpreting the float's bits, like so:

float fvalue = 7.0f;
int value = *(int*)&fvalue;

This relies on the platform using the IEEE 754 floating-point format, which as far as I know practically all current ones do. Note that the pointer cast formally breaks the aliasing rules; a memcpy between the two variables yields the same bits without that problem.
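As a hedged alternative to the pointer cast, the same bit pattern can be obtained with memcpy. This is only a sketch: the function names are invented here, and it assumes float is 32 bits wide and that uint32_t from <stdint.h> is available.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float as a 32-bit unsigned integer. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* byte-wise copy, no aliasing issue */
    return bits;
}

/* Inverse direction: rebuild the float from its bit pattern. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On an IEEE 754 platform float_to_bits(7.0f) is 0x40e00000, and feeding that back into bits_to_float returns 7.0f again.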

You can do it with a union (nasty hack) like so:
union
{
int value;
unsigned char data[4];
} conversion;

conversion.value = 457;


At this point the bytes can be retrieved from conversion.data[0] through conversion.data[3].

True, but that won't be endian-portable, whereas converting to int and bit shifting will be.

Edit: that applies to both int and float versions.
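The difference can be sketched as follows. Both function names are hypothetical; store_host32 uses memcpy as a stand-in for reading the bytes through the union, since both expose the value's in-memory layout.

```c
#include <stdint.h>
#include <string.h>

/* Shift-based store: out[0] always receives the least significant byte,
   no matter how the host lays the value out in memory. */
static void store_le32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Raw copy of the object's bytes: the order depends on the host CPU. */
static void store_host32(unsigned char out[4], uint32_t v)
{
    memcpy(out, &v, 4);
}
```

For 457, store_le32 yields c9 01 00 00 everywhere, while store_host32 yields the same on a little endian host but 00 00 01 c9 on a big endian one.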

To sum it up:

For int, you could use unsigned char, as in the following code snippet. To be on the safe side, you can mask the values to restrict them to the range of an octet.

int intValue;
unsigned char data[4];
// memset(data,0,4); <-- not needed here, every element is assigned below

// convert int to 4 chars, LSB in char at index 0
intValue = 457;
data[0] = (unsigned char)(intValue&0xff);
data[1] = (unsigned char)((intValue >> 8)&0xff);
data[2] = (unsigned char)((intValue >> 16)&0xff);
data[3] = (unsigned char)((intValue >> 24)&0xff);

// convert 4 chars to int, LSB in char at index 0
// (assembled in an unsigned temporary so no set bit is shifted into the sign)
unsigned int bits = data[3];
bits = (bits<<8)|data[2];
bits = (bits<<8)|data[1];
bits = (bits<<8)|data[0];
intValue = (int)bits;

// to read a single char value, e.g. for a printf
int index = 0; // any index from 0 to 3
int singleOctet = data[index]&0xff;



For floats, you can convert to an int with the same bit pattern (note that this does not mean the same or even a similar value) and back again. Then you can apply the methods shown above.

// convert float to 4 chars, LSB in char at index 0
float floatValue = 4.5f;
memcpy(&intValue, &floatValue, sizeof intValue); // copies the bit pattern; needs <string.h>
// ... int to char[4] from above

// convert 4 chars to float, LSB in char at index 0
// ... char[4] to int from above, result in intValue
memcpy(&floatValue, &intValue, sizeof floatValue);
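Put together, the float round trip might look like the sketch below. The function names are invented for illustration, float is assumed to be a 32-bit IEEE 754 value, and the byte order in the buffer is fixed to LSB first, as in the snippets above.

```c
#include <stdint.h>
#include <string.h>

/* Serialize a float into 4 bytes, least significant byte at index 0. */
static void write_float_le(unsigned char out[4], float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);            /* float -> bit pattern */
    out[0] = (unsigned char)(bits & 0xff);
    out[1] = (unsigned char)((bits >> 8) & 0xff);
    out[2] = (unsigned char)((bits >> 16) & 0xff);
    out[3] = (unsigned char)((bits >> 24) & 0xff);
}

/* Rebuild the float from 4 bytes written by write_float_le. */
static float read_float_le(const unsigned char in[4])
{
    uint32_t bits = (uint32_t)in[0]
                  | ((uint32_t)in[1] << 8)
                  | ((uint32_t)in[2] << 16)
                  | ((uint32_t)in[3] << 24);
    float f;
    memcpy(&f, &bits, sizeof f);               /* bit pattern -> float */
    return f;
}
```

For 4.5f the buffer holds 00 00 90 40 on an IEEE 754 platform, and read_float_le returns 4.5f again.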




Endianness is not a question of how many bits are used to store an int, but of the order in which the bytes are stored as soon as more than a single byte is needed. So in this respect it doesn't help to use int, int32_t, or __int32 (or whatever else exists). The two most common byte orders are called "little endian" and "big endian" (there are some other exotic ones). Little endian means that the less significant bytes come at lower memory addresses, as used by Intel and compatible CPUs. Big endian means that the less significant bytes come at higher memory addresses, as used by PPC, SPARC, and some others.

The OS and the compiler set-up normally provide ways to determine the endianness at runtime or even at compile time, and byte swap routines are often available as well, so it is not really a problem. However, this stuff is OS and/or compiler dependent, and hence itself not portable (maybe there is a portable way, but I don't know one). So you should consider sticking to the methods above.
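For what it's worth, a runtime check and a byte swap can also be written in plain C without OS or compiler extensions. This is a sketch with made-up function names, not a replacement for whatever routines a given platform offers.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the least significant byte is stored first in memory. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);    /* look at the lowest-addressed byte */
    return first == 1;
}

/* Reverse the byte order of a 32-bit value using shifts and masks only. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}
```

swap32 is its own inverse, so applying it twice returns the original value.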

BTW: in practice float is 32 bits wide and double is 64 bits wide, since today these types follow the IEEE 754 standard on practically all platforms, so something like __float32 isn't necessary. (The language standard itself does not guarantee these sizes.)
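If you would rather verify that assumption than trust it, the standard headers <float.h> and <limits.h> expose enough to check the layout. The two function names below are hypothetical, and the check only covers the size and the mantissa/exponent parameters, not every detail of IEEE 754.

```c
#include <float.h>
#include <limits.h>

/* 1 if float looks like IEEE 754 binary32: 32 bits, 24-bit significand. */
static int float_is_binary32(void)
{
    return sizeof(float) * CHAR_BIT == 32
        && FLT_RADIX == 2
        && FLT_MANT_DIG == 24
        && FLT_MAX_EXP == 128;
}

/* 1 if double looks like IEEE 754 binary64: 64 bits, 53-bit significand. */
static int double_is_binary64(void)
{
    return sizeof(double) * CHAR_BIT == 64
        && FLT_RADIX == 2
        && DBL_MANT_DIG == 53
        && DBL_MAX_EXP == 1024;
}
```

Calling both once at program start and refusing to serialize if either returns 0 is one simple way to use them.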
