Servant of the Lord

Serializing floats to bytes, and byte ordering issues


Is this code, barring byte order issues, sufficient to write floats in a cross-platform and cross-hardware way?

#include <cstring>   //for std::memcpy
#include <limits>    //for std::numeric_limits

void ConvertFloatToBytes(float mcFloaty, char *buffer)
{
	//Asserts that floats are in IEC 559 (aka IEEE 754) format, which is the most common format.
	static_assert(std::numeric_limits<float>::is_iec559, "The code requires we use the IEEE 754 floating point format for binary serialization of floats and doubles.");
	
	//Convert to bytes.
	unsigned char *bytes = reinterpret_cast<unsigned char*>(&mcFloaty);
	
	//We're assuming 'buffer' has enough space.
	std::memcpy(buffer, bytes, sizeof(float));
}

 
Also, when and how do I need to handle byte order?
If I have an array of bytes, do I need to swap every four bytes in that array, regardless of what I put in the array, and regardless of whether the code is running on 64 bit or 32 bit hardware?
 
So, if I write the string, "0123456789", and I'm on a LittleEndian machine, and I want it converted to Network Byte Order (which I want to use for game file formats, for cross-platform use), the resulting order should be: "3210765498"?
 
Or do I just need to worry about integers and floats that are larger than one byte?

So using my byte-ordering code:

//Swaps between Big and Little endian types. Returns the result.
//(The extra parentheses around 'value' keep the macros safe for compound expressions;
// assumes <cstdint> for the fixed-width types below.)
#define ChangeEndian16(value)    ((((value) & 0xFF00) >> 8) | \
                                  (((value) & 0x00FF) << 8))

#define ChangeEndian32(value)    ((((value) & 0xFF000000ul) >> 24) | \
                                  (((value) & 0x00FF0000ul) >>  8) | \
                                  (((value) & 0x0000FF00ul) <<  8) | \
                                  (((value) & 0x000000FFul) << 24))

#define ChangeEndian64(value)    ((((value) & 0xFF00000000000000ull) >> 56) | \
                                  (((value) & 0x00FF000000000000ull) >> 40) | \
                                  (((value) & 0x0000FF0000000000ull) >> 24) | \
                                  (((value) & 0x000000FF00000000ull) >>  8) | \
                                  (((value) & 0x00000000FF000000ull) <<  8) | \
                                  (((value) & 0x0000000000FF0000ull) << 24) | \
                                  (((value) & 0x000000000000FF00ull) << 40) | \
                                  (((value) & 0x00000000000000FFull) << 56))

inline int16_t LocalToBigEndian(int16_t value)   {   return (BigEndianOrder? value:ChangeEndian16(value));   }
inline uint16_t LocalToBigEndian(uint16_t value) {   return (BigEndianOrder? value:ChangeEndian16(value));   }

inline int32_t LocalToBigEndian(int32_t value)   {   return (BigEndianOrder? value:ChangeEndian32(value));   }
inline uint32_t LocalToBigEndian(uint32_t value) {   return (BigEndianOrder? value:ChangeEndian32(value));   }

inline int64_t LocalToBigEndian(int64_t value)   {   return (BigEndianOrder? value:ChangeEndian64(value));   }
inline uint64_t LocalToBigEndian(uint64_t value) {   return (BigEndianOrder? value:ChangeEndian64(value));   }

#define LocalToNetworkOrder(value)    LocalToBigEndian(value)
#define NetworkOrderToLocal(value)    BigEndianToLocal(value)
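(BigEndianOrder is defined elsewhere in my code; purely as an illustrative sketch, it could be backed by a minimal runtime check like this:)

#include <cstdint>

//Hypothetical definition, not part of the snippet above: detects the native
//byte order at runtime by inspecting the first byte of a known 16-bit value.
inline bool DetectBigEndian()
{
	const uint16_t probe = 0x0001;
	//On a big-endian machine the low-order byte is stored last, so the first byte is 0.
	return *reinterpret_cast<const unsigned char*>(&probe) == 0;
}

static const bool BigEndianOrder = DetectBigEndian();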


 

Am I guaranteed this code will work on all Little and Big Endian architectures where IEEE 754 is used?

void ConvertFloatToBytes(float myFloat, char *buffer)
{
	//Asserts that floats are in IEC 559 (aka IEEE 754) format, which is the most common format.
	static_assert(std::numeric_limits<float>::is_iec559, "The code requires we use the IEEE 754 floating point format for binary serialization of floats and doubles.");
	
	//Convert to network byte order.
	uint32_t networkOrdered = LocalToNetworkOrder(reinterpret_cast<uint32_t>(myFloat));
	
	//Convert to bytes.
	unsigned char *bytes = reinterpret_cast<unsigned char*>(&networkOrdered);
	
	//We're assuming 'buffer' has enough space.
	std::memcpy(buffer, bytes, sizeof(float));
}
Sadly, while IEEE 754 specifies bit order, it doesn't specify anything about byte order. It's possible to find hardware that is otherwise little-endian where floats appear as you would expect on a big-endian machine, and vice versa. If you want to support multiple hardware architectures, you're going to have to be prepared to special-case your floating point conversions.

For strings, you generally don't re-arrange byte orders for pretty much the same reason you don't do any conversion on text files between machines.

Also, reinterpret_cast<uint32_t>(myFloat) won't do what you expect; reinterpret_cast can't convert a float value to an integer, so that line won't even compile. You probably wanted to reinterpret the address rather than the value.
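For getting at the bits portably, the usual approach is to memcpy the float into an integer. A minimal sketch (FloatToBits is just an illustrative name, not an existing function):

#include <cstdint>
#include <cstring>

//Copies the float's bytes into a uint32_t so they can be byte-swapped safely.
//memcpy sidesteps the strict-aliasing problems of casting between unrelated pointer types.
inline uint32_t FloatToBits(float value)
{
	static_assert(sizeof(uint32_t) == sizeof(float), "expects 32-bit floats");
	uint32_t bits;
	std::memcpy(&bits, &value, sizeof(bits));
	return bits;
}

The result can then be passed through LocalToNetworkOrder like any other uint32_t.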

Sadly, while IEEE 754 specifies bit order, it doesn't specify anything about byte order. It's possible to find hardware that is otherwise little-endian where floats appear as you would expect on a big-endian machine, and vice versa. If you want to support multiple hardware architectures, you're going to have to be prepared to special-case your floating point conversions.


All I care about supporting is iOS, Android, Mac, Linux, and Windows. I know some of these have different byte orders.
So do I do anything different to floats than what I do to integers?

For strings, you generally don't re-arrange byte orders for pretty much the same reason you don't do any conversion on text files between machines.


So:
A) I only need to handle basic types larger than 1 char, such as floats, uint16_t, uint32_t, doubles, etc...?
B) I handle floats and doubles the exact same way I handle uint32_t and uint64_t?
C) I handle uint64_t by entirely mirroring the order of the bytes? So bytes [01234567] becomes [76543210]?

Sadly, while IEEE 754 specifies bit order, it doesn't specify anything about byte order. It's possible to find hardware that is otherwise little-endian where floats appear as you would expect on a big-endian machine, and vice versa. If you want to support multiple hardware architectures, you're going to have to be prepared to special-case your floating point conversions.


All I care about supporting is iOS, Android, Mac, Linux, and Windows. I know some of these have different byte orders.
So do I do anything different to floats than what I do to integers?

 
errm.
 
actually, it has more to do with the hardware and CPU architecture than with the OS.
 
on x86 and x86-64 targets (PCs, laptops, etc...), little endian is used pretty much exclusively (regardless of Windows vs Linux vs ...).
 
iOS and Android generally run on ARM targets, where ARM also defaults to little-endian.
(could require further verification though, so can't say conclusively that they are LE...).
 
OTOH: other architectures, such as PowerPC, tend to default to big-endian (IOW: XBox360, PS3, Wii).
 
 
I generally prefer, though, to write endianness-independent code rather than use explicit conditional swapping. basically, endianness-independent code is code written in such a way that the bytes will be read/written in the intended order regardless of the CPU's native endianness.
 
in some cases, I have used typedefs to represent endianness specific values, typically represented as a struct:
typedef struct { byte v[4]; } u32le_t; //unsigned int 32-bit little-endian
typedef struct { byte v[8]; } s64be_t; //signed int 64-bit big-endian
typedef struct { byte v[8]; } f64le_t; //little-endian double
...

typically, these are handled with some amount of wrapper logic, and being structs more or less prevents accidental mixups (they also help avoid target-specific padding and access-alignment issues).
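for example (purely a sketch, with illustrative names), a get/set pair for u32le_t might look like:

typedef unsigned char byte;   //assuming 'byte' is an unsigned 8-bit type

//reads a u32le_t regardless of the CPU's native byte order:
//the shifts fix each byte's significance, so no conditional swap is needed.
inline uint32_t u32le_get(const u32le_t *p)
{
    return  (uint32_t)p->v[0]        |
           ((uint32_t)p->v[1] <<  8) |
           ((uint32_t)p->v[2] << 16) |
           ((uint32_t)p->v[3] << 24);
}

inline void u32le_set(u32le_t *p, uint32_t val)
{
    p->v[0] = (byte)(val      );
    p->v[1] = (byte)(val >>  8);
    p->v[2] = (byte)(val >> 16);
    p->v[3] = (byte)(val >> 24);
}

the same code produces the same byte layout on both LE and BE machines, which is the "endianness independent" idea mentioned above.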

some target-specific "optimizations" may also be used (say, on x86, directly loading/storing the little-endian values rather than messing around with bytes and shifts).

note that these types are generally more used for storage, and not for working with data values (values are typically converted to/from their native forms).
 
 
while it is true that floating-point and integer types don't share the same endianness on all hardware, relatively few architectures like this are still in current use AFAIK.
 
one option FWIW is to detect various targets and, when possible, use a fast direct-conversion path, with a fallback case that resorts to arithmetic to perform the conversions (the arithmetic strategy will still work regardless of the actual internal representation).


note that, in general though, endianness is handled explicitly per-value or per-type, rather than by some sort of generalized byte-swapping.
 

for many of my custom file-formats, I actually prefer the use of variable-width integer and floating-point values (typically because they are on-average more compact, with each number effectively encoding its own length).

typically a floating-point value will be encoded as a pair of signed variable-length integers, as base,exponent where value=base*2.0^exp, with base=0 as a special case for encoding 0/Inf/NaN/etc. (this also works well for things like encoding floating-point numbers and vectors into an entropy-coded bitstream).
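as a rough sketch of the base/exponent split just mentioned (the helper name and the 53-bit scale are illustrative choices here, not part of any particular format):

#include <cmath>
#include <cstdint>

//splits a double into (base, exp) such that value == base * 2.0^exp,
//with base an integer suitable for a signed variable-length encoding.
inline void EncodeFloatAsPair(double value, int64_t &base, int64_t &exp)
{
	if(value == 0.0) { base = 0; exp = 0; return; }   //base==0 reserved as the special case
	int e;
	double m = std::frexp(value, &e);    //value == m * 2^e, with 0.5 <= |m| < 1
	base = (int64_t)std::ldexp(m, 53);   //scale the mantissa up to an integer (a double holds 53 bits)
	exp  = e - 53;                       //compensate, so base * 2^exp == value
}

decoding is then just std::ldexp((double)base, (int)exp).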
 
but, this is its own topic (there are many options and tradeoffs for variable-length numbers, and even more so when entropy-coding is thrown into the mix...).


otherwise, when designing formats, I tend to prefer little-endian, but will use big-endian if it is already in use in the context (such as when designing extension features for an existing file-format).

the common reason to prefer little-endian is that it is what the most common CPU architectures use at this point.

the common reason to prefer big-endian is that it is the established "network" byte order.
 
 

For strings, you generally don't re-arrange byte orders for pretty much the same reason you don't do any conversion on text files between machines.


So:
A) I only need to handle basic types larger than 1 char, such as floats, uint16_t, uint32_t, doubles, etc...?
B) I handle floats and doubles the exact same way I handle uint32_t and uint64_t?
C) I handle uint64_t by entirely mirroring the order of the bytes? So bytes [01234567] becomes [76543210]?


I think it's more that ASCII text is byte-order agnostic by its basic nature.

if we see something like:
"foo: value=1234, ..."
it is pretty well settled how the digits are organized (otherwise people are likely to start rage-facing).

similarly, it would just be weird if one machine would print its digits in one order, but another machine uses another.


generally, for binary file-formats, it is preferable if they "choose one". most file-formats do so, stating their endianness explicitly as part of the format spec.

some file-formats leave this issue up in the air though (leaving the endianness as a per-file, or worse, per-structure-type, matter...). similarly annoying are formats which use file-specific field sizes (so it is necessary, say, to determine whether the file is using 16 or 32 bits for its 'int' or 'word' type, ...). luckily, these sorts of things are relatively uncommon.


it is worth noting that there is also a fair bit of a "grey area", namely binary formats which are stream-based and are byte-order agnostic, for similar reasons to ASCII text.


this is sort of also true of bitstreams, despite them introducing a new notion:
the relevance of how bits are packed into bytes.

interestingly, word endianness naturally arises as a result of this packing (start packing from the LSB using the low-order bit, and you get little-endian, or from the MSB using the high-bit, and you get big-endian). granted, it is technically possible to mix these, effectively getting bit-transposed formats, but these cases tend to be harder to encode/decode efficiently (it tends to involve either reading/writing a bit at a time, or using a transpose-table, *1).

*1: Deflate is partly an example of this: it packs bits LSB-first for the most part, but Huffman codes are packed starting at the high bit, resulting in the use of a transpose table when setting up the Huffman tables (but not during the actual main encoding/decoding process).
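as a tiny sketch of LSB-first packing (an illustrative helper, not from any particular codec), showing how little-endian byte order falls out of the packing direction:

#include <cstdint>

//appends the 'count' low bits of 'bits' to a zero-initialized buffer,
//starting at bit position *pos; the lowest-order bit of 'bits' goes into
//the lowest unused bit of the current byte.
inline void PutBitsLSB(uint8_t *buf, int *pos, uint32_t bits, int count)
{
	for(int i = 0; i < count; i++)
	{
		if(bits & (1u << i))
			buf[*pos >> 3] |= (uint8_t)(1u << (*pos & 7));
		(*pos)++;
	}
}

packing 0x1234 as 16 bits this way gives buf[0]=0x34 and buf[1]=0x12, i.e. the value lands in memory in little-endian order.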

granted, in bitstream formats, it isn't really uncommon to find a wide range of various forms of funkiness.
