# Writing Endian Independent Code in C++

By Promit Roy | Published Jul 24 2013 11:13 PM in General Programming
Peer Reviewed by (jbadams, jjd, Dragonsoulj)


## What does "endian" mean?

Endians are a confusing topic for many people. Hopefully, by reading this article you will understand both what endian means and how to write code to deal with it. So I might as well start with the obvious question—"What does endian mean?"

The actual origin of the term "endian" is a funny story. In Gulliver's Travels by Jonathan Swift, one of the places Gulliver visits has two groups of people who are constantly fighting. One group believes that a hard-boiled egg should be eaten from the big, round end (the "big endians") and the other group believes that it should be eaten from the small, pointy end (the "little endians"). Endian no longer has anything to do with hard-boiled eggs, but in many ways, the essence of the story (two groups fighting over a completely pointless subject) still remains.

Note:
This article was originally published to GameDev.net in 2004. It was revised by the original author in 2008 and published in the book Advanced Game Programming: A GameDev.net Collection, which is one of 4 books collecting both popular GameDev.net articles and new original content in print format.

Suppose we start with an unsigned 2 byte (16 bit) long number; we'll use 43707. If we look at the hexadecimal version of 43707, it's 0xAABB. Now, hexadecimal notation is convenient because it neatly splits up the number into its component bytes. One byte is 'AA' and the other byte is 'BB'. But how would this number look in the computer's memory (not hard drive space, just regular memory)? Well, the most obvious way to keep this number in memory would be like this:

| AA | BB |

The first byte is the "high" byte of the number, and the second one is the "low" byte of the number. High and low refer to the value of the byte in the number; here, AA is high because it represents the higher digits of the number. This order of keeping things is called MSB (most significant byte) or big endian. The most popular processors that use big endian are the PPC family, used by Macs. This family includes the G3, G4, and G5 that you see in most Macs nowadays.

So what is little endian, then? Well, a little endian version of 0xAABB looks like this in memory:

| BB | AA |

Notice that it's the reverse of the other one. This is called LSB (least significant byte) or little endian. There are a lot of processors that use little endian, but the most well known are the x86 family, which includes the entire Pentium and Athlon lines of chips, as well as most other Intel and AMD chips. The actual reason for using little endian is a question of CPU architecture and outside the scope of this article; suffice to say that little endian made compatibility with earlier 8-bit processors easier when 16-bit processors came out, and 32-bit processors kept the trend.

It gets a little more complicated than that. If you have a 4 byte (32 bit) long number, it's completely backwards, not switched every two bytes. Also, floating-point numbers are handled a little differently as well. This means that you can't arbitrarily change the endian of a file; you have to know what data is in it and what order it's in. The only good news is that if you're reading raw bytes that are only 8 bits at a time, you don't have to worry about endians.
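To make the layouts above concrete, here is a minimal sketch that copies a value's bytes out of memory in address order. `BytesOf` and `PrintBytes` are hypothetical helpers, not part of the article's code, and the fixed-width types from `<cstdint>` are an assumption about the toolchain.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical helper: copies the object representation of v into an array,
// lowest address first. std::memcpy is used so no pointer cast is needed.
template <typename T>
std::array<unsigned char, sizeof(T)> BytesOf(T v)
{
    std::array<unsigned char, sizeof(T)> out{};
    std::memcpy(out.data(), &v, sizeof(T));
    return out;
}

// Prints the bytes of v in the order they sit in memory.
template <typename T>
void PrintBytes(T v)
{
    for (unsigned char b : BytesOf(v))
        std::printf("%02X ", static_cast<unsigned>(b));
    std::printf("\n");
}
```

Calling `PrintBytes<std::uint16_t>(0xAABB)` prints `BB AA` on a little endian machine and `AA BB` on a big endian one. Calling `PrintBytes<std::uint32_t>(0xAABBCCDD)` prints `DD CC BB AA` on little endian: the whole value is reversed, not each pair of bytes.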

## When does the endian affect code?

The short answer is, endians come into play whenever your code assumes that a certain endian is being used. The most frequent time this happens is when you're reading from a file. If you are on a big endian system and write, say, 10 long integers to a file, they will all be written in big endian. If you read them back, the system will assume you're reading big endian numbers. The same is true of little endian systems; everything will work with little endian. This causes problems when you have to use the same file on both little and big endian systems. If you write the file in little endian, big endian systems will get screwed up. If you write it in big endian, little endian systems get screwed up. Sure, you could keep two different versions of the file, one big, one small, but that's going to get confusing very quickly, and annoying as well.

The other major time that endians matter is when you use a type cast that depends on a certain endian being in use. I'll show you one example right now (keep in mind that there are many different type casts that can cause problems):

unsigned char EndianTest[2] = { 1, 0 };
short x;

x = *(short *) EndianTest;


So the question is, what's x? Let's look at what this code is doing. We're creating an array of two bytes, and then casting that array of two bytes into a single short. What we've done by using an array is basically forced a certain byte order, and we're going to see how the system treats those two bytes. If this is a little endian system, the first byte in memory (1) is treated as the low byte and the second (0) as the high byte, so x will be equal to 1. On the other hand, if it's a big endian system, the first byte is treated as the high byte, so the high byte will be 1 and the value of x will be 256. We'll use this trick later to determine what endian our system is using without having to set any conditional compilation flags.
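The same test can be sketched without the pointer cast by copying the two bytes with `std::memcpy`. This is a substitution, not the code the article uses. `IsLittleEndianHost` is a hypothetical name, and the sketch assumes `std::uint16_t` is available.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true when the host stores the low byte first.
bool IsLittleEndianHost()
{
    const unsigned char test[2] = { 1, 0 };
    std::uint16_t x = 0;
    static_assert(sizeof(x) == sizeof(test), "expects a 2-byte integer");
    std::memcpy(&x, test, sizeof(x)); // copy the bytes instead of casting the pointer
    return x == 1;                    // 1 on little endian, 256 on big endian
}
```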

## Writing Endian Independent Code

So we finally get down to the most important part; how does one go about writing code that isn't bound to a certain endian? There are many different ways of doing this; the one I'm going to present here was used in Quake 2, and most of the code you'll see here is somewhat modified code out of the Quake 2 source code. It's mostly geared towards fixing files that are written in a certain endian, since the type casting problem is much harder to deal with. The best thing to do is to avoid casts that assume a certain byte order.

So the basic idea is this. Our files will be written in a certain endian without fail, regardless of what endian the system is. We need to ensure that the file data is in the correct endian when read from or written to file. It would also be nice to avoid having to specify conditional compilation flags; we'll let the code automatically determine the system endian.

Step 1: Switching Endians

The first step is to write functions that will automatically switch the endian of a given parameter. First, ShortSwap:

short ShortSwap( short s )
{
    unsigned char b1, b2;

    b1 = s & 255;
    b2 = (s >> 8) & 255;

    return (b1 << 8) + b2;
}


This function is fairly straightforward once you wrap your head around the bit math. We take apart the two bytes of the short parameter s with some simple bit math and then glue them back together in reverse order. If you understand bit shifts and bit ANDs, this should make perfect sense. As a companion to ShortSwap, we'll have ShortNoSwap, which is very simple:

short ShortNoSwap( short s )
{
    return s;
}


This seems utterly pointless at the moment, but you'll see why we need this function in a moment.

Next, we want to swap longs:

int LongSwap( int i )
{
    unsigned char b1, b2, b3, b4;

    b1 = i & 255;
    b2 = ( i >> 8 ) & 255;
    b3 = ( i >> 16 ) & 255;
    b4 = ( i >> 24 ) & 255;

    return ((int)b1 << 24) + ((int)b2 << 16) + ((int)b3 << 8) + b4;
}

int LongNoSwap( int i )
{
    return i;
}


LongSwap is more or less the same idea as ShortSwap, but it switches around 4 bytes instead of 2. Again, this is straightforward bit math.
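As a worked example of the bit math, the sketch below redoes both swaps on unsigned fixed-width types. `Swap16` and `Swap32` are hypothetical names. Using `std::uint16_t` and `std::uint32_t` is an assumption made here so that the sizes are explicit and right shifts never involve a sign bit.

```cpp
#include <cstdint>

// Hypothetical unsigned counterpart of ShortSwap: exchange the two bytes.
std::uint16_t Swap16(std::uint16_t s)
{
    return static_cast<std::uint16_t>((s << 8) | (s >> 8));
}

// Hypothetical unsigned counterpart of LongSwap: mask out each byte
// and move it to the mirrored position.
std::uint32_t Swap32(std::uint32_t i)
{
    return ((i & 0x000000FFu) << 24) |
           ((i & 0x0000FF00u) << 8)  |
           ((i & 0x00FF0000u) >> 8)  |
           ((i & 0xFF000000u) >> 24);
}
```

With these, `Swap16(0xAABB)` yields `0xBBAA` and `Swap32(0xAABBCCDD)` yields `0xDDCCBBAA`. Swapping twice gives back the original value.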

Lastly, we need to be able to swap floats:

float FloatSwap( float f )
{
    union
    {
        float f;
        unsigned char b[4];
    } dat1, dat2;

    dat1.f = f;
    dat2.b[0] = dat1.b[3];
    dat2.b[1] = dat1.b[2];
    dat2.b[2] = dat1.b[1];
    dat2.b[3] = dat1.b[0];
    return dat2.f;
}

float FloatNoSwap( float f )
{
    return f;
}


This looks a little different than the previous two. There are three major steps here. First, we set one of our unions, dat1, equal to f. The union automatically allows us to split the float into 4 bytes, because of the way unions work. Second, we set each of the bytes in dat2 to be backwards of the bytes in dat1. Lastly, we return the floating-point component of dat2. This union trick is necessary because of the slightly more complex representation of floats in memory (see the IEEE documentation). The same thing can be done for doubles, but I'm not going to show the code here, as it simply involves adding more bytes, changing to double, and doing the same thing.
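For doubles, or any other fixed-size type, one possible sketch is a generic helper that reverses the bytes through `std::memcpy` and `std::reverse` rather than a union. `SwapBytes` is a hypothetical name, and this is an alternative technique, not the article's union trick.

```cpp
#include <algorithm>
#include <cstring>
#include <type_traits>

// Hypothetical generic helper: reverses the bytes of any trivially copyable value.
template <typename T>
T SwapBytes(T value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte reversal only makes sense for trivially copyable types");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));   // take the value apart
    std::reverse(bytes, bytes + sizeof(T));  // flip the byte order
    std::memcpy(&value, bytes, sizeof(T));   // put it back together
    return value;
}
```

`SwapBytes(3.5)` operates on all eight bytes of the double, and `SwapBytes(2.5f)` on the four bytes of the float, with no per-type code.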

Step 2: Set function pointers to use the correct Swap function

The next part of our implementation is the clever twist that defines this method of endian independence. We're going to use function pointers to automatically select the correct endian for us. We'll put their declarations in a header file:

extern short (*BigShort) ( short s );
extern short (*LittleShort) ( short s );
extern int (*BigLong) ( int i );
extern int (*LittleLong) ( int i );
extern float (*BigFloat) ( float f );
extern float (*LittleFloat) ( float f );


Remember to put them in a C or C++ file without extern so that they are actually defined, or you'll get link errors. Each one of these functions is going to point to the correct Swap or NoSwap function it needs to invoke. For example, if we are on a little endian system, LittleShort will use ShortNoSwap, since nothing needs to change. But if we are on a big endian system, LittleShort will use ShortSwap. The opposite is true of BigShort.

Step 3: Initialization

In order to initialize all of these function pointers, we'll need a function to detect the endian and set them. This function will make use of the byte array cast I showed you earlier as an example of a byte cast that assumes a certain endian.

bool BigEndianSystem;  //you might want to extern this

void InitEndian( void )
{
    unsigned char SwapTest[2] = { 1, 0 };

    if( *(short *) SwapTest == 1 )
    {
        //little endian
        BigEndianSystem = false;

        //set func pointers to correct funcs
        BigShort = ShortSwap;
        LittleShort = ShortNoSwap;
        BigLong = LongSwap;
        LittleLong = LongNoSwap;
        BigFloat = FloatSwap;
        LittleFloat = FloatNoSwap;
    }
    else
    {
        //big endian
        BigEndianSystem = true;

        //Big* data is already in the right order; Little* data must be swapped
        BigShort = ShortNoSwap;
        LittleShort = ShortSwap;
        BigLong = LongNoSwap;
        LittleLong = LongSwap;
        BigFloat = FloatNoSwap;
        LittleFloat = FloatSwap;
    }
}


Let's examine what's going on here. First we use the cast to check our endian. If we get a 1, the system is little endian. All of the Little* functions will point to *NoSwap functions, and all of the Big* functions will point to *Swap functions. The reverse is true if the system is big endian. This way, we don't need to know the endian of the system we're on, only the endian of the file we're reading or writing.
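The selection mechanism can be condensed into a small self-contained sketch. All names here (`Swap16`, `NoSwap16`, `LittleU16`, `BigU16`, `InitEndianSketch`, `ReadLittleU16`) are hypothetical. The sketch uses unsigned 16-bit values and `std::memcpy` where the article uses `short` and a pointer cast.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical unsigned variants of the swap / no-swap pair.
std::uint16_t Swap16(std::uint16_t s)   { return static_cast<std::uint16_t>((s << 8) | (s >> 8)); }
std::uint16_t NoSwap16(std::uint16_t s) { return s; }

// Function pointers, selected once at startup.
std::uint16_t (*LittleU16)(std::uint16_t) = nullptr;
std::uint16_t (*BigU16)(std::uint16_t)    = nullptr;

void InitEndianSketch()
{
    const unsigned char test[2] = { 1, 0 };
    std::uint16_t probe = 0;
    std::memcpy(&probe, test, sizeof(probe));
    const bool hostIsLittle = (probe == 1);

    // Data already in the host's order needs no swap; the other order does.
    LittleU16 = hostIsLittle ? NoSwap16 : Swap16;
    BigU16    = hostIsLittle ? Swap16   : NoSwap16;
}

// Interprets two raw bytes, as read from a little endian file, as a number.
std::uint16_t ReadLittleU16(const unsigned char* fileBytes)
{
    std::uint16_t raw = 0;
    std::memcpy(&raw, fileBytes, sizeof(raw)); // bytes land in host order
    return LittleU16(raw);                     // corrected for the file's order
}
```

After `InitEndianSketch()` has run, the file bytes `34 12` decode to `0x1234` through `ReadLittleU16` on either kind of machine. The caller only states the file's endian and never checks the host's.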

## A Practical Demonstration

Ok, let's suppose we have some sort of structure we need to read from file and write to file, maybe a vertex. Our structure looks like this:

struct Vertex
{
    float Pos[3];
    float Normal[3];
    long Color;
    float TexCoords[2];
};


Nothing special here, just a typical vertex structure. Now we're going to decide that vertices will always be stored to file in little endian. This is an arbitrary choice; either endian works as long as everything that reads or writes the file agrees on it. What we're going to do is add a function to this struct that will fix the endian after loading or before saving it:

struct Vertex
{
    float Pos[3];
    float Normal[3];
    long Color;
    float TexCoords[2];

    void Endian()
    {
        for(int i = 0; i < 3; ++i) //our compiler will unroll this
        {
            Pos[i] = LittleFloat( Pos[i] );
            Normal[i] = LittleFloat( Normal[i] );
        }
        Color = LittleLong( Color );
        TexCoords[0] = LittleFloat( TexCoords[0] );
        TexCoords[1] = LittleFloat( TexCoords[1] );
    }
};


Let's be honest here; it's not exactly the easiest thing ever. You're going to have to write one of those painfully boring functions for each and every structure that goes in and out of files. After those functions are written, though, writing endian independent code elsewhere is going to be a breeze. Notice that we used Little* functions here because our files are all little endian. If we had decided to use big endian, we could have simply used the Big* functions.

Now what will the actual functions for working with the vertices be like? Well, to read a vertex:

void ReadVertex( Vertex* v, FILE* f )
{
    if( v == NULL || f == NULL )
        return;

    fread( v, sizeof(Vertex), 1, f );

    //now, our not quite magical endian fix
    v->Endian();
}


Notice the simplicity of correcting the endian; although our structure definitions are slightly messy, the loading code has become very simple. The code to save is equally simple:

void WriteVertex( Vertex* v, FILE* f )
{
    if( v == NULL || f == NULL )
        return;

    v->Endian();   //convert to the file's endian
    fwrite( v, sizeof(Vertex), 1, f );
    v->Endian();   //convert back so the caller's vertex is still usable
}
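Reading and writing a whole struct this way quietly assumes that the struct has no padding and that its fields have the same size on every platform. The sketch below makes those assumptions explicit on a trimmed-down vertex and round-trips it through a byte buffer that stands in for the file. `SmallVertex`, `LittleInPlace`, `FixEndian`, `WriteToBuffer`, and `ReadFromBuffer` are all hypothetical names, and the in-place byte reversal is a substitute for the article's function pointers.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Assumption: a trimmed-down vertex with fixed-width fields and no padding.
struct SmallVertex
{
    float        Pos[3];
    std::int32_t Color;
};
static_assert(sizeof(float) == 4, "sketch assumes 4-byte floats");
static_assert(sizeof(SmallVertex) == 16, "sketch assumes no padding");

bool HostIsLittleEndian()
{
    const unsigned char test[2] = { 1, 0 };
    std::uint16_t probe = 0;
    std::memcpy(&probe, test, sizeof(probe));
    return probe == 1;
}

// Reverses one field in place through its bytes; a no-op on little endian hosts.
template <typename T>
void LittleInPlace(T& field)
{
    if (HostIsLittleEndian())
        return;
    unsigned char* b = reinterpret_cast<unsigned char*>(&field);
    std::reverse(b, b + sizeof(T));
}

void FixEndian(SmallVertex& v)
{
    for (float& p : v.Pos)
        LittleInPlace(p);
    LittleInPlace(v.Color);
}

// Stand-in for WriteVertex. Taking v by value leaves the caller's copy untouched.
void WriteToBuffer(SmallVertex v, unsigned char* out)
{
    FixEndian(v);
    std::memcpy(out, &v, sizeof(v));
}

// Stand-in for ReadVertex.
SmallVertex ReadFromBuffer(const unsigned char* in)
{
    SmallVertex v;
    std::memcpy(&v, in, sizeof(v));
    FixEndian(v);
    return v;
}
```

Whatever the host, a `Color` of `0x11223344` ends up in the buffer as the bytes `44 33 22 11`, and reading the buffer back returns the original field values.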


And that's all there is to it. I'm including a sample C++ header/source file pair that you're free to use in your own programs; it's GPL code though (since it's actually part of a GPL project I'm working on) so you'll have to write your own if you don't want to release your code.

As interesting as this article is (and it really is interesting with good information), is byte ordering actually still an issue today on modern platforms? As I understand it just about everything is in LSB ordering (x86, ARM, etc.). Or are there popular devices with cross-platform applications where this can still be an issue?

EDIT:

Just answered my own question -- turns out it could still be relevant, particularly when it comes to network traffic and legacy file formats. As I understand it, network byte ordering is still big-endian, so that ought to come into play when considering endianness issues. It also appears that Oracle's Java uses big-endian byte ordering, which may play into how it handles files (I don't use Java much, so someone more experienced could fill that in).

This:

  union
{
float f;
unsigned char b[4];
} dat1, dat2;

dat1.f = f;
dat2.b[0] = dat1.b[3];
dat2.b[1] = dat1.b[2];
dat2.b[2] = dat1.b[1];
dat2.b[3] = dat1.b[0];
return dat2.f;


is undefined behaviour in C++. You are allowed to read only from the union member that has been assigned last (with few exceptions not applicable here).

This pointer cast:

byte SwapTest[2] = { 1, 0 };

if( *(short *) SwapTest == 1 )


is also undefined behaviour. Additionally, it assumes sizeof(short) == 2 and that short uses two's complement encoding, neither of which is guaranteed.

What you could do is use std::memcpy:

float f1, f2;
std::array<char, sizeof(float)> buf;
std::memcpy(buf.data(), &f1, sizeof(f1));
std::reverse(buf.begin(), buf.end());
std::memcpy(&f2, buf.data(), sizeof(f2)); 

However, it seems awkward that you would keep "reversed" floats around. This should be handled by your deserialization layer, so you should never end up with a need to swap bytes in primitive types.

Yes, all of the PowerPC-based consoles are big-endian (Xbox360, PS3, Wii).

Actually, while technically true by the C++ standard, it is not true in practice. All of the major compilers (VC, GCC, Clang) have an explicit statement about this case as being allowed in those compilers. I looked this up to be certain when using this in an example in my article about bit fields. You can double check this yourself: look up the strict aliasing rules for each of the compilers as applied to unions, and all three have a specific exception for this usage listed.

There's a little problem with calling FloatSwap, at least on x86. It will only cause a problem with a few numbers but they can be a real pain to track down:

Not all floats are numbers, as some bit combinations are NaNs. The bit pattern for a NaN is:
s1111111 1axxxxxx xxxxxxxx xxxxxxxx

It's the "a" bit that's causing problems. It tells the x86 if operations on this NaN can cause a float exception. If an operation on the NaN can cause an exception, it's a signalling NaN (sNaN), if not it's a quiet NaN (qNaN).

When the x86 encounters an sNaN it will trigger a float exception (if enabled), and then turn the sNaN into a qNaN.

This won't usually cause a problem because a NaN can't really be any other number. As long as it's treated as a float, all NaNs are NaNs, signalling or not.

The problem is that when FloatSwap is called, it will return the swapped value as a float. A valid float number can be returned as an sNaN, because the bit in the float that says whether it is an sNaN is loaded from the bottom end of the mantissa. If it is an sNaN, the x86 will flip the "a" bit to make it a qNaN.

Two FloatSwap's in a row won't always give you the same number back on x86. Since the flipped bit has a fairly low significance, it'll only be a small change in the value.

E.g. this :
#include <stdio.h>
#include <math.h>

union floatintbyte
{
    float f;
    unsigned char b[4];
    unsigned int u;
};

float FloatSwap(float f)
{
    floatintbyte dat1, dat2;

    dat1.f = f;
    for(int i = 0; i < 4; ++i)
        dat2.b[i] = dat1.b[3 - i];

    return dat2.f;
}

int main(int argc, char *argv[])
{
    floatintbyte start;
    start.u = 0x3f8080ff; // 1.0f + sNaN in the lower bits.

    float once = FloatSwap(start.f);
    float twice = FloatSwap(once);
    printf("start : %f\nonce  : %f\ntwice : %f\n", start.f, once, twice);
    return 0;
}


will output

start : 1.003937
once  : -1.#QNAN0
twice : 1.005890

-Morten-
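One way to sidestep the problem described above is to never hold the swapped pattern in a `float` at all: keep it as an integer, and only turn it back into a `float` once the bytes are in host order. The sketch below assumes 4-byte floats. `Swap32`, `SwappedBitsOf`, and `FloatFromSwappedBits` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 4-byte floats");

std::uint32_t Swap32(std::uint32_t i)
{
    return (i << 24) | ((i & 0x0000FF00u) << 8) |
           ((i & 0x00FF0000u) >> 8) | (i >> 24);
}

// Returns the byte-swapped bit pattern of f as an integer, never as a float,
// so the FPU has no chance to quiet a pattern that happens to look like an sNaN.
std::uint32_t SwappedBitsOf(float f)
{
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    return Swap32(bits);
}

// Rebuilds a float from a byte-swapped bit pattern.
float FloatFromSwappedBits(std::uint32_t swapped)
{
    const std::uint32_t bits = Swap32(swapped);
    float f = 0.0f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

For the bit pattern `0x3f8080ff` from the example, `SwappedBitsOf` returns `0xff80803f`, and `FloatFromSwappedBits` on that value restores the original bits exactly.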

I do not see those assumptions stated in the article anywhere.

Then it should be mentioned in the article that the code is not C++ compliant and will only work with given compilers that support such extensions.

A lot of low-level data-manipulation algorithms have specifications which contain stuff like "and the output shall be written as a 64-bit big-endian unsigned integer", so endianness still matters a lot. Though I prefer to detect and use the htobe*/htole*/etc. functions, this approach is interesting too.

Nice article!

Though the function pointers may be convenient, wouldn't it be better if the choice were made at compile time instead of at runtime?

No system should change its endianness, and it should be known at compile time, right?

I'm not sure how to check for it though...

I believe he would have to compile two executables if it was decided while compiling.

Wouldn't you have to do that anyway, to support the new system?

The binary has to be compiled to match the system endianness, so you should know then if you need to swap or not.

Good point. I'm not a console developer so I never considered the case of consoles, only computing platforms (PCs and Macs) and common mobile platforms.

The standard solution to this problem is to rearrange the characters in place before reading them (or after writing them) by calling a function that either swaps them or does nothing, depending on a preprocessor directive or some other compile-time mechanism. For network (big-endian) order, this is often done by calling functions like htons().
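As a sketch of the network-order functions mentioned above, the helpers below wrap `htonl()` and `ntohl()`. The `<arpa/inet.h>` header is an assumption that the platform is POSIX. `StoreNetworkU32` and `LoadNetworkU32` are hypothetical names.

```cpp
#include <arpa/inet.h> // POSIX; Windows provides these functions through Winsock instead
#include <cstdint>
#include <cstring>

// Writes value into out[0..3] in network (big endian) order.
void StoreNetworkU32(std::uint32_t value, unsigned char* out)
{
    const std::uint32_t net = htonl(value); // host to network byte order
    std::memcpy(out, &net, sizeof(net));
}

// Reads a network-order 32-bit value from in[0..3].
std::uint32_t LoadNetworkU32(const unsigned char* in)
{
    std::uint32_t net = 0;
    std::memcpy(&net, in, sizeof(net));
    return ntohl(net); // network to host byte order
}
```

Storing `0x12345678` produces the bytes `12 34 56 78` on any host, and loading them back returns `0x12345678`.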

Another solution, which is kind of cute, is reading an integer from a char * as follows:

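//unsigned char avoids sign extension for bytes above 127
int read_int_from_little_endian_char_array(unsigned char const *a) {
    return a[0] + (a[1]<<8) + (a[2]<<16) + (a[3]<<24);
}

int read_int_from_big_endian_char_array(unsigned char const *a) {
    return (a[0]<<24) + (a[1]<<16) + (a[2]<<8) + a[3];
}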


That will "magically" work on both little-endian and big-endian machines. I mentioned this to a friend that used to work implementing network protocols and he said it's just too slow. But I am pretty sure using function pointers is going to be slower.
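A slightly more defensive sketch of the shift-based idea uses unsigned fixed-width types and adds the matching writer. `ReadU32BE`, `ReadU32LE`, and `WriteU32BE` are hypothetical names. Because the code only describes where each byte sits in the buffer, it never needs to know the host's byte order.

```cpp
#include <cstdint>

// Reads a 32-bit big endian value from four bytes, independent of the host.
std::uint32_t ReadU32BE(const unsigned char* a)
{
    return (static_cast<std::uint32_t>(a[0]) << 24) |
           (static_cast<std::uint32_t>(a[1]) << 16) |
           (static_cast<std::uint32_t>(a[2]) << 8)  |
            static_cast<std::uint32_t>(a[3]);
}

// Reads a 32-bit little endian value from four bytes, independent of the host.
std::uint32_t ReadU32LE(const unsigned char* a)
{
    return  static_cast<std::uint32_t>(a[0])        |
           (static_cast<std::uint32_t>(a[1]) << 8)  |
           (static_cast<std::uint32_t>(a[2]) << 16) |
           (static_cast<std::uint32_t>(a[3]) << 24);
}

// Writes v as four big endian bytes.
void WriteU32BE(std::uint32_t v, unsigned char* out)
{
    out[0] = static_cast<unsigned char>(v >> 24);
    out[1] = static_cast<unsigned char>(v >> 16);
    out[2] = static_cast<unsigned char>(v >> 8);
    out[3] = static_cast<unsigned char>(v);
}
```

For the bytes `12 34 56 78`, `ReadU32BE` returns `0x12345678` and `ReadU32LE` returns `0x78563412`.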

Another possible strategy is wrapping an array of bytes in a struct, such that the struct represents a fixed-size, fixed-endianness value ("u32be_t" = "unsigned 32-bit big-endian", "s64le_t" = "signed 64-bit little-endian", ...).

The conversion functions may then either directly cast the value (if applicable for a given target), or shift the bytes into place.

This also helps avoid cases where values can leak through without getting swapped, and helps avoid mixing up the conversions (using the wrong-sized conversion, ...), since any "mishap" results in a compiler error.
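A minimal sketch of such a wrapper, assuming the shift-based conversion rather than a direct cast. Only the `u32be_t` name comes from the comment above; the members and the `get()` accessor are hypothetical.

```cpp
#include <cstdint>

// Hypothetical fixed-endianness type: always stored as four big endian bytes.
struct u32be_t
{
    unsigned char bytes[4];

    u32be_t() : bytes{ 0, 0, 0, 0 } {}

    // explicit: a plain integer cannot silently become a big endian value.
    explicit u32be_t(std::uint32_t v)
        : bytes{ static_cast<unsigned char>(v >> 24),
                 static_cast<unsigned char>(v >> 16),
                 static_cast<unsigned char>(v >> 8),
                 static_cast<unsigned char>(v) } {}

    // Conversion back to a native integer must be requested by name.
    std::uint32_t get() const
    {
        return (static_cast<std::uint32_t>(bytes[0]) << 24) |
               (static_cast<std::uint32_t>(bytes[1]) << 16) |
               (static_cast<std::uint32_t>(bytes[2]) << 8)  |
                static_cast<std::uint32_t>(bytes[3]);
    }
};
static_assert(sizeof(u32be_t) == 4, "wrapper should add no padding");
```

A struct that mirrors a file header can then declare its fields as `u32be_t`. Code that tries to use such a field as a plain integer without calling `get()` fails to compile.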

@Alvaro: your solution is significantly faster and generally better than the one suggested in the article (I'm too lazy to do benchmarking, but function call - with all the pushing/register-freeing/popping/... starts from around 20 clocks on x86, and an extra read from L1 is like 3 clocks); see also Byte order fallacy

An alternative (which I like a bit better) is to provide a generic endianness-agnostic parser like yours, and to have specializations for specific platforms of interest, as described here: 64 Network DO’s and DON’Ts for Game Engine Developers. Part IIa: Protocols and APIs  (look for item 8g). This one will work everywhere, and on those specific platforms of interest it will be the fastest possible, so your friend won't be able to tell anything bad about it ;-)

NB (and shameless bragging ;-) ): another article with a supposed improvement over this approach has been published on GDNet: http://www.gamedev.net/page/resources/_/technical/general-programming/writing-efficient-endian-independent-code-in-c-r4080
