Endianness

Started by
3 comments, last by aregee 10 years, 1 month ago
Coming from a big-endian platform, the good old Amiga computers with their Motorola 680x0 processors, I always found the seemingly backward little-endian order of Intel x86 weird. I always thought that the choice of endianness was arbitrary, and a weird one at that, and according to Wikipedia it indeed is. But today I was thinking about some code that I was writing, and wondering "why is this really working"?
First, let me show you the code I am talking about:

NSInputStream *iStream = [[NSInputStream alloc] initWithFileAtPath:@"<somefile>"];
[iStream open];

uint64_t value = 0;

[iStream read:(uint8_t *)&value maxLength:1];
uint64_t myValue1 = value;

[iStream read:(uint8_t *)&value maxLength:2];
uint64_t myValue2 = value;

[iStream read:(uint8_t *)&value maxLength:4];
uint64_t myValue3 = value;

[iStream read:(uint8_t *)&value maxLength:8];
uint64_t myValue4 = value;
This works well on little endian platforms.
Hint: Look at where in the uint64_t I am reading the smaller values into and how I am storing the values afterwards.
If you were to do this on a big endian platform, you would need to write something like this (to my understanding):

// Cast to uint8_t * first so the offset is counted in bytes, not in uint64_t units.
[iStream read:((uint8_t *)&value + 7) maxLength:1];
//store the value
[iStream read:((uint8_t *)&value + 6) maxLength:2];
//store the value
[iStream read:((uint8_t *)&value + 4) maxLength:4];
//store the value
[iStream read:(uint8_t *)&value maxLength:8];
//store the value
Unless there is some magic voodoo that I don't understand going on here, then I have found one good reason for the choice of little endian on an architecture, and to me it is not so arbitrary any more, even though it might really be...
Little endian, I do understand your existence a little bit more now... ;)
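For comparison, here is a minimal sketch of how the same reads could be made independent of the host's byte order: read the bytes into a plain buffer and assemble the value with shifts instead of pointer casts. The helper names `load_le` and `load_be` are hypothetical (they are not part of NSInputStream or any library), and the sketch assumes the file stores its values least-significant byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble an unsigned integer from `len` bytes (1..8) that are stored
 * least-significant byte first. Shifts operate on values, not on memory
 * layout, so the result is the same on any host byte order. */
uint64_t load_le(const uint8_t *bytes, size_t len)
{
    uint64_t value = 0;
    for (size_t i = 0; i < len; i++) {
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}

/* Same idea for data stored most-significant byte first. */
uint64_t load_be(const uint8_t *bytes, size_t len)
{
    uint64_t value = 0;
    for (size_t i = 0; i < len; i++) {
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

With the stream from the code above, that might look like `uint8_t buf[8]; [iStream read:buf maxLength:2]; uint64_t myValue2 = load_le(buf, 2);`, assuming the read actually returned the requested number of bytes.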
EDIT:
Oh yes, Wikipedia also mentions this realisation:
"The little-endian system has the property that the same value can be read from memory at different lengths without using different addresses (even when alignment restrictions are imposed). For example, a 32-bit memory location with content 4A 00 00 00 can be read at the same address as either 8-bit (value = 4A), 16-bit (004A), 24-bit (00004A), or 32-bit (0000004A), all of which retain the same numeric value. Although this little-endian property is rarely used directly by high-level programmers, it is often employed by code optimizers as well as by assembly language programmers."

Yeah, big endian is nice in that it matches the way we usually write numbers, e.g.
0x12345678 -- 0x12 is the most-significant byte, 0x78 is the least significant byte. Big endian data appears in that same order - MSB first.
01 02 03 04 -- address
12 34 56 78 -- data

Little endian is harder to read in a memory debugger (assuming you're still using left-to-right page ordering), but it's really nice that overlapping variables share the same address!
01 02 03 04 -- address
78 56 34 12 -- data


int8*  a = (int8* )0x1; assert( *a == 0x78 );
int16* b = (int16*)0x1; assert( *b == 0x5678 && (int8)*b == 0x78 );
int32* c = (int32*)0x1; assert( *c == 0x12345678 && (int16)*c == 0x5678 && (int8)*c == 0x78 );
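The asserts above use a literal address for illustration, so they cannot be run as written. A runnable sketch of the same idea, assuming a small byte array stands in for memory at that address, could look like this. The function names are made up for this example, and `memcpy` replaces the pointer casts to sidestep alignment and aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the least-significant byte first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Read the start of `mem` as an 8-, 16- and 32-bit value, all from the
 * same address. `mem` must point to at least 4 readable bytes. */
void read_at_widths(const uint8_t *mem,
                    uint8_t *v8, uint16_t *v16, uint32_t *v32)
{
    memcpy(v8, mem, sizeof *v8);
    memcpy(v16, mem, sizeof *v16);
    memcpy(v32, mem, sizeof *v32);
}
```

Given the bytes 78 56 34 12, a little-endian host yields 0x78, 0x5678 and 0x12345678, matching the asserts; a big-endian host would instead yield 0x78, 0x7856 and 0x78563412 from the same address.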

If you use right-to-left writing, little endian looks more sensible, but I don't think I've ever seen a debugger do this, e.g.

04 03 02 01 -- address
12 34 56 78 -- data

P.S. Little endian is superior, COME TO THE DARK SIDE!

Most debuggers on little-endian systems do support groupings, so you can see 2, 4, 8 byte sequences (and sometimes longer depending on the debugger) as "natural" orderings.
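As a rough illustration of such a grouped view (not how any particular debugger implements it), the sketch below formats raw bytes as 4-byte little-endian groups, so each group reads in the "natural" most-significant-digit-first order. The function name `format_le32_words` and its return convention are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format `count` consecutive 4-byte little-endian groups from `mem` as
 * space-separated 8-digit hex words in `out`.
 * Returns 0 on success, -1 if `out` is too small. */
int format_le32_words(const uint8_t *mem, size_t count,
                      char *out, size_t out_size)
{
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        const uint8_t *p = mem + 4 * i;
        /* Byte 0 is the least-significant byte of the group. */
        uint32_t word = (uint32_t)p[0]
                      | ((uint32_t)p[1] << 8)
                      | ((uint32_t)p[2] << 16)
                      | ((uint32_t)p[3] << 24);
        int n = snprintf(out + used, out_size - used, "%s%08lX",
                         i ? " " : "", (unsigned long)word);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return 0;
}
```

For the bytes 78 56 34 12 this produces the text 12345678, the same digits you would write for the value by hand.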



I was never able to decide whether big endian or little endian is better.

Though it seems to me that I just felt better when working on big endian, so today I am probably closer to the opinion that big endian is better.

Yet one argument: storing something like pi, 3.14159..., as ...95141.3 would be worse.

P.S. Little endian is superior, COME TO THE DARK SIDE!

I think I have already been convinced to join the dark side lol :D - but I guess there are pros and cons with both approaches.

