|
||||||||||||||||||
| Writing Endian Independent Code in C++ |
|
randomZ Member since: 8/17/2002 From: Germany |
||||
|
|
||||
| A flaw of this article, if I can say that, is that it makes assumptions about the sizes of the various data types. If we care to support systems of either endianness, we should also take care to support those with different data type sizes. Instead of int, float etc., one should probably use stdint.h's standard types, or other types whose size is known. Nice article, apart from that. I think portability is a very important issue; probably even more so as (some) people start thinking about alternatives to Windows. --- Just trying to be helpful. Sebastian Beschke Just some student from Germany http://zqf.de/randomz/ |
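One way to act on the stdint.h suggestion, as a minimal sketch rather than the article's code: the helper names `swap_u16` and `swap_u32` are made up for illustration, and the only assumption is that the exact-width types from `<cstdint>` are available on the target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not from the article): the width of each value is
// fixed by its type instead of depending on how wide short or int happen to be.
inline std::uint16_t swap_u16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

inline std::uint32_t swap_u32(std::uint32_t v)
{
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

// Usage: a swap reverses the bytes, and swapping twice restores the value.
inline void swap_example()
{
    assert(swap_u16(0x1234) == 0x3412);
    assert(swap_u32(0x12345678u) == 0x78563412u);
    assert(swap_u32(swap_u32(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```

Because the parameter types pin the width, the same source should behave identically whether long is 32 or 64 bits on the target.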
||||
|
||||
Anonymous Poster |
||||
|
||||
| Since you only need to worry about this during file reads, why not write a class (OK, a set of functions for you C types) that encapsulates all of the changes? That way the endian code does not propagate throughout the project, and is confined to one class. |
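A rough sketch of what such an encapsulating class could look like. `LittleEndianReader` is a hypothetical name, and the file format is assumed to be little-endian; values are rebuilt from individual bytes, so no host endianness check is needed anywhere else in the project.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical wrapper: every byte-order decision lives in this one class.
// Assumes the data in the stream was written least significant byte first.
class LittleEndianReader
{
public:
    explicit LittleEndianReader(std::istream& in) : in_(in) {}

    std::uint16_t read_u16()
    {
        const std::uint16_t lo = read_byte();
        const std::uint16_t hi = read_byte();
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }

    std::uint32_t read_u32()
    {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(read_byte()) << (8 * i);
        return v;
    }

    // False once any read has run past the end of the stream.
    bool ok() const { return !in_.fail(); }

private:
    std::uint8_t read_byte()
    {
        char c = 0;
        in_.get(c);
        return static_cast<std::uint8_t>(c);
    }

    std::istream& in_;
};

// Usage: six bytes holding 0x1234 and then 0x12345678, both little-endian.
inline void reader_example()
{
    std::istringstream in(std::string("\x34\x12\x78\x56\x34\x12", 6));
    LittleEndianReader reader(in);
    assert(reader.read_u16() == 0x1234);
    assert(reader.read_u32() == 0x12345678u);
    assert(reader.ok());
}
```

The rest of the code only ever sees host-order integers, which is the confinement the post is asking for.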
||||
|
||||
RamboBones Member since: 9/23/2002 From: Blacktown, Australia |
||||
|
|
||||
I have only one concern with your code, and that is here:
void WriteVertex( Vertex* v, FILE* f )
{
if( v == NULL || f == NULL )
return;
v->Endian();
fwrite( v, sizeof(Vertex), 1, f );
}
You are altering the Vertex that is given to the write function, so it cannot be used again afterwards. This may or may not be a problem depending on how your code is set up, but I still think it's better to create a temporary variable. On another minor point with that source: you shouldn't write the entire Vertex structure using sizeof(Vertex), because different compilers may add different amounts of padding to it. However, I personally think that if you're serious about endian-independent code then it's probably better to store most of your data as plain text files, like Doom 3 is doing, which eliminates the problem entirely. |
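Both points can be sketched together. The `Vertex` layout below is a hypothetical stand-in (the article's real struct is not shown in this thread), the on-disk order is assumed to be little-endian, and float is assumed to be a 32-bit IEEE 754 type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the article's Vertex.
struct Vertex
{
    float x, y, z;
};

const std::size_t kVertexBytes = 12; // three 32-bit floats, no padding on disk

// Stores one float as four bytes, least significant first, on any host.
inline void put_f32_le(float value, unsigned char* out)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float assumed");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((bits >> (8 * i)) & 0xFFu);
}

// Field-by-field serialisation: struct padding never reaches the output.
inline void SerializeVertex(const Vertex& v, unsigned char* out)
{
    put_f32_le(v.x, out);
    put_f32_le(v.y, out + 4);
    put_f32_le(v.z, out + 8);
}

// The vertex is taken by const pointer, so the caller's copy is untouched.
inline bool WriteVertex(const Vertex* v, std::FILE* f)
{
    if (v == nullptr || f == nullptr)
        return false;
    unsigned char buffer[kVertexBytes];
    SerializeVertex(*v, buffer);
    return std::fwrite(buffer, 1, sizeof buffer, f) == sizeof buffer;
}

// Usage: 1.0f is 0x3F800000 in IEEE 754, so its low bytes come first.
inline void vertex_example()
{
    const Vertex v = {1.0f, 2.0f, -1.0f};
    unsigned char bytes[kVertexBytes];
    SerializeVertex(v, bytes);
    assert(bytes[0] == 0x00 && bytes[3] == 0x3F);
}
```

The temporary here is the byte buffer rather than a second Vertex, which sidesteps both the in-place swap and the sizeof(Vertex) question.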
||||
|
||||
krez GDNet+ Member since: 10/10/2001 From: NJ - The Garbage State |
||||
|
|
||||
| only a fool would eat an egg from the big end! |
||||
|
||||
SnprBoB86 Member since: 5/29/2001 |
||||
|
|
||||
Here's another idea:
class EndianStream
{
protected:
stream _stream;
};
class EndianIStream : public istream { ... }
class EndianOStream : public ostream { ... }
class EndianIOStream : public EndianIStream, public EndianOStream { ... }
That should be fine if you provide some smart wrapper functions that handle endianness on the way out/in (if I got my iostream class hierarchy stuff right; it has been a while since I did any serious C++). You could define constructors with a bigEndian flag or something that sets which set of conversion functions to use. This would make for a general solution that works without having to code an Endian function per struct, right? -SniperBoB- [Edit: Formatting error.] [edited by - Oluseyi on April 27, 2004 9:52:36 PM] |
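A small sketch of the constructor-flag idea. It wraps a `std::ostream` instead of deriving from the iostream classes, which is a deliberate simplification and not what the post proposes verbatim; `EndianWriter` and `bytes_for` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: the byte order is chosen once, at construction,
// and every write goes through the same conversion.
class EndianWriter
{
public:
    EndianWriter(std::ostream& out, bool bigEndian)
        : out_(out), bigEndian_(bigEndian) {}

    void write_u16(std::uint16_t v) { write_bytes(v, 2); }
    void write_u32(std::uint32_t v) { write_bytes(v, 4); }

private:
    // Emits the low `count` bytes of v in the configured order.
    void write_bytes(std::uint32_t v, int count)
    {
        for (int i = 0; i < count; ++i)
        {
            const int shift = bigEndian_ ? 8 * (count - 1 - i) : 8 * i;
            out_.put(static_cast<char>((v >> shift) & 0xFFu));
        }
    }

    std::ostream& out_;
    bool bigEndian_;
};

// Usage: the same value serialised in either byte order.
inline std::string bytes_for(std::uint32_t v, bool bigEndian)
{
    std::ostringstream out;
    EndianWriter writer(out, bigEndian);
    writer.write_u32(v);
    return out.str();
}

inline void writer_example()
{
    assert(bytes_for(0x12345678u, true) == std::string("\x12\x34\x56\x78", 4));
    assert(bytes_for(0x12345678u, false) == std::string("\x78\x56\x34\x12", 4));
}
```

Composition keeps the sketch short; a version that really inherits from the stream classes would need its own stream buffer handling.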
||||
|
||||
Anonymous Poster |
||||
|
||||
| OK, I can understand why you want to avoid the preprocessor, but using it could have some advantages. For example, you could define a simple template version of these functions, where the template parameter says whether this system is little or big endian. A const would toggle this behaviour, so you wouldn't have any speed penalty. |
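One possible reading of that suggestion, sketched with a bool template parameter. `ByteOrder`, `HOST_IS_BIG_ENDIAN` and `FromLittleEndian` are hypothetical names; the macro is assumed to be set by the build system, and the file data is assumed to be little-endian.

```cpp
#include <cassert>
#include <cstdint>

inline std::uint32_t swap_u32(std::uint32_t v)
{
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

// Primary template: conversion requires a swap.
template <bool Swap>
struct ByteOrder
{
    static std::uint32_t u32(std::uint32_t v) { return swap_u32(v); }
};

// Specialisation: data is already in host order, so this is a no-op that
// the compiler can remove entirely.
template <>
struct ByteOrder<false>
{
    static std::uint32_t u32(std::uint32_t v) { return v; }
};

// Hypothetical build-time switch; the preprocessor is used in this one place.
#if defined(HOST_IS_BIG_ENDIAN)
const bool kHostIsBigEndian = true;
#else
const bool kHostIsBigEndian = false;
#endif

// Little-endian file data needs swapping only on big-endian hosts.
typedef ByteOrder<kHostIsBigEndian> FromLittleEndian;

inline void template_example()
{
    assert(ByteOrder<true>::u32(0x11223344u) == 0x44332211u);
    assert(ByteOrder<false>::u32(0x11223344u) == 0x11223344u);
}
```

Call sites would use `FromLittleEndian::u32(value)` and never test endianness at run time.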
||||
|
||||
Promit Moderator - Graphics Programming and Theory Member since: 7/29/2001 From: Baltimore, MD, United States |
||||
|
|
||||
quote: Fair enough, but this is merely an example of how to change the endianness. It's not meant to be ultra-perfect design-wise. quote: The sizeof operator includes padding, so this is a non-issue. SnprBoB86: Not a bad idea, but remember this came from Quake 2, which was written in C. Plus I'm personally not fond of streams. In general, I'm not that interested in criticisms of the method, because it isn't my method. I thought it would be valuable to show people because it came directly from the source of a major commercial title -- Quake 2. |
||||
|
||||
MENTAL Member since: 4/8/2000 From: Herne Bay, United Kingdom |
||||
|
|
||||
| For the SwapLong functions, shouldn't you use a long datatype and not an integer? Seeing as 64-bit processors are becoming more common, slip-ups like these are going to cost. Or, rename the functions to something like SwapI16 and SwapI32 (shorts and longs) and SwapF32 and SwapF64 (floats and doubles). Other than that, good article. |
||||
|
||||
RamboBones Member since: 9/23/2002 From: Blacktown, Australia |
||||
|
|
||||
quote: The problem is that the sizeof operator includes padding. For instance, on platform 1 compiler A adds 32 bytes of padding and writes the data. On platform 2 compiler B adds 1 byte of padding and reads the data. It's a minor point, but still worth pointing out; I have been tripped up by this in the past. |
||||
|
||||
dwarfsoft Moderator Member since: 6/5/2000 From: Toowoomba, Australia |
||||
|
|
||||
quote: That's why you write the length before any structures or complex data types ... then you have NO problem, and can actually pass the REAL sizeof data on to the reading/writing function. Great article BTW. Although a file manipulation class that encapsulates file writing in a specific endian form would be more up my alley ;P |
||||
|
||||
keen Member since: 3/25/2000 From: Goteborg, Sweden |
||||
|
|
||||
quote: This isn't enough. Think of a structure like this:
struct vars
{
long size;
char var0;
long var1;
char var2;
};
var0 could be padded to 4 bytes here, i.e. the trick you mentioned only works if the padding is done after the last variable, which might not be the case. |
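The interior padding can be made visible with `offsetof`. This is only a measuring sketch: the constant names are hypothetical, and the actual numbers depend on the compiler and target, which is the point being made.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct vars
{
    long size;
    char var0;
    long var1;
    char var2;
};

// Bytes the members would need with no padding at all.
constexpr std::size_t kPackedSize = 2 * sizeof(long) + 2 * sizeof(char);

// Padding inserted between var0 and var1, in the middle of the struct,
// so a length prefix in front of the struct cannot account for it.
constexpr std::size_t kGapAfterVar0 =
    offsetof(vars, var1) - (offsetof(vars, var0) + sizeof(char));

// All padding, interior and trailing together.
constexpr std::size_t kTotalPadding = sizeof(vars) - kPackedSize;

// Prints the layout chosen by the current compiler; values vary by platform.
inline void print_layout()
{
    std::printf("sizeof(vars)=%zu packed=%zu gap after var0=%zu total padding=%zu\n",
                sizeof(vars), kPackedSize, kGapAfterVar0, kTotalPadding);
}

inline void layout_example()
{
    // var1 must start on a boundary suitable for long.
    assert(offsetof(vars, var1) % alignof(long) == 0);
    assert(sizeof(vars) >= kPackedSize);
}
```

Running `print_layout()` on two different targets is a quick way to see whether raw struct dumps would be compatible between them.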
||||
|
||||
prowst Member since: 5/1/2002 From: Mesa, AZ, United States |
||||
|
|
||||
| Good article. Sure, there are critics pointing out ways to improve the code, but I'm sure the intent wasn't to give the ultimate universal solution to the problem. I never worried about endianness in the past, and now you've got me nervous in the service! I just have a few simple questions, as I am a complete noob with endianness. 1) Have x86 processors always been little endian? 2) Could you assume new x86 processors will remain little endian? 3) Endianness is processor specific, and nothing else in the system determines it? 4) Do I have to bother with endianness if my programs run only on Windows platforms? 5) Why would you eat the big end of the egg first? Thanks for replies! -prowst |
||||
|
||||
Promit Moderator - Graphics Programming and Theory Member since: 7/29/2001 From: Baltimore, MD, United States |
||||
|
|
||||
| 1, 2) x86 processors always have been, and always will be, little endian. Changing that would cause havoc in the computing world. 3) Yes, processor only. 4) Possibly. What people don't realize is that the Windows code is actually quite portable, and that there are versions of Windows for various MIPS and Alpha processors. I personally don't know the endianness of these chipsets -- maybe someone on the forums can fill you and me in. 5) It's easier to fit the spoon in the big end, usually. |
||||
|
||||
Shadowdancer Member since: 6/25/2000 |
||||
|
|
||||
| Nice code, but it doesn't handle PDP-endianness which may be outdated and completely f***ed up by today's standards. Why not just use the network {nh}to{hn}{ls}() functions which provide a *standardized* way to do this stuff? You always send or store your stuff through hton{sl}() and read through ntoh{sl}() without having to determine endianness first. |
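A short sketch of the hton/ntoh approach for 32-bit values. `store_u32` and `load_u32` are hypothetical helper names; the header shown is the POSIX one, and on Windows the same functions are declared in `<winsock2.h>`.

```cpp
#include <arpa/inet.h> // POSIX: htonl, ntohl
#include <cassert>
#include <cstdint>
#include <cstring>

// Stores a host-order value as four bytes in network (big-endian) order.
inline void store_u32(std::uint32_t host_value, unsigned char* out)
{
    const std::uint32_t net = htonl(host_value);
    std::memcpy(out, &net, sizeof net);
}

// Reads four network-order bytes back into a host-order value.
inline std::uint32_t load_u32(const unsigned char* in)
{
    std::uint32_t net;
    std::memcpy(&net, in, sizeof net);
    return ntohl(net);
}

// Usage: the byte sequence is the same on every host.
inline void network_order_example()
{
    unsigned char bytes[4];
    store_u32(0x12345678u, bytes);
    assert(bytes[0] == 0x12 && bytes[3] == 0x78);
    assert(load_u32(bytes) == 0x12345678u);
}
```

Nothing here asks which endianness the host has; the library functions are no-ops on big-endian machines and swaps on little-endian ones.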
||||
|
||||
keen Member since: 3/25/2000 From: Goteborg, Sweden |
||||
|
|
||||
quote: Agreed, it makes a lot more sense to use these standard functions. |
||||
|
||||
Promit Moderator - Graphics Programming and Theory Member since: 7/29/2001 From: Baltimore, MD, United States |
||||
|
|
||||
| I didn't write this code, nor did I design the method. I just adapted and explained it. I'm thinking of doing a second part which goes for a more robust approach, probably using network byte order, wrapping classes, and streams. |
||||
|
||||
atcdevil Member since: 2/12/2001 From: NJ |
||||
|
|
||||
| I'm happy I finally understand what endianness is all about. Thanks a lot. |
||||
|
||||
voodoo_john Member since: 5/2/2001 From: Glasgow, United Kingdom |
||||
|
|
||||
// SwapEndian.h
#ifndef SWAP_ENDIAN_H
#define SWAP_ENDIAN_H
namespace Platform
{
/******************************************************************************
FUNCTION: SwapEndian
PURPOSE: Swap the byte order of a structure (2, 4, 8 or 16 bytes)
EXAMPLE: float F=123.456f; SWAP_FLOAT(F);
NOTE: Returns a pointer to a static buffer, so it is not thread-safe
******************************************************************************/
// Macros ignore namespaces, so each call is qualified with Platform::
#define SWAP_SHORT(Var) Var = *(short*) Platform::SwapEndian((void*)&Var, sizeof(short))
#define SWAP_USHORT(Var) Var = *(unsigned short*)Platform::SwapEndian((void*)&Var, sizeof(short))
#define SWAP_LONG(Var) Var = *(long*) Platform::SwapEndian((void*)&Var, sizeof(long))
#define SWAP_ULONG(Var) Var = *(unsigned long*) Platform::SwapEndian((void*)&Var, sizeof(long))
#define SWAP_FLOAT(Var) Var = *(float*) Platform::SwapEndian((void*)&Var, sizeof(float))
#define SWAP_DOUBLE(Var) Var = *(double*) Platform::SwapEndian((void*)&Var, sizeof(double))
void* SwapEndian(void* Addr, const int Nb);
}
#endif
// SwapEndian.cpp
#include "SwapEndian.h"
namespace Platform
{
/******************************************************************************
FUNCTION: SwapEndian
PURPOSE: Swap the byte order of a structure (2, 4, 8 or 16 bytes)
EXAMPLE: float F=123.456f; SWAP_FLOAT(F);
******************************************************************************/
void* SwapEndian(void* Addr, const int Nb)
{
// Shared scratch buffer: the result must be copied out before the next call
static char Swapped[16];
switch (Nb)
{
case 2: Swapped[0]=*((char*)Addr+1);
Swapped[1]=*((char*)Addr );
break;
case 4: Swapped[0]=*((char*)Addr+3);
Swapped[1]=*((char*)Addr+2);
Swapped[2]=*((char*)Addr+1);
Swapped[3]=*((char*)Addr );
break;
case 8: Swapped[0]=*((char*)Addr+7);
Swapped[1]=*((char*)Addr+6);
Swapped[2]=*((char*)Addr+5);
Swapped[3]=*((char*)Addr+4);
Swapped[4]=*((char*)Addr+3);
Swapped[5]=*((char*)Addr+2);
Swapped[6]=*((char*)Addr+1);
Swapped[7]=*((char*)Addr );
break;
case 16:Swapped[0]=*((char*)Addr+15);
Swapped[1]=*((char*)Addr+14);
Swapped[2]=*((char*)Addr+13);
Swapped[3]=*((char*)Addr+12);
Swapped[4]=*((char*)Addr+11);
Swapped[5]=*((char*)Addr+10);
Swapped[6]=*((char*)Addr+9);
Swapped[7]=*((char*)Addr+8);
Swapped[8]=*((char*)Addr+7);
Swapped[9]=*((char*)Addr+6);
Swapped[10]=*((char*)Addr+5);
Swapped[11]=*((char*)Addr+4);
Swapped[12]=*((char*)Addr+3);
Swapped[13]=*((char*)Addr+2);
Swapped[14]=*((char*)Addr+1);
Swapped[15]=*((char*)Addr );
break;
}
return (void*)Swapped;
}
}
|
||||
|
||||
Combat Wombat Member since: 11/27/2003 From: Australia |
||||
|
|
||||
| I appreciate what you're trying to do here, and it's nice to see some programmers trying to give something back to the community. I realise how much one puts one's ass out in the wind when one posts stuff, so let me just say there's nothing particularly wrong with what you have, but just some suggestions on how to improve it:
There are a couple of minor problems here. Firstly, you're wrapping it up in a function, and secondly you're wrapping it up so that the compiler most likely can't inline the function (as you're setting function pointers). I would be extremely impressed if the compiler could optimise away or inline that setup function. Also, on a system where your data already has the correct endianness, you're calling a function each time for no reason at all.
To do cross-platform code, try this:
* Create yourself a platform.h file, and use whatever compiler options you need to get platform.h to uniquely identify the build environment you're using (e.g. Macs define "macintosh", VC++ defines WIN32; use this information):
#if defined(macintosh)
#define IS_NETWORKENDIAN
#elif defined(WIN32)
// ... don't define the symbol ...
#else
#error "I don't have a platform type, fix me!"
#endif
* Then use your IS_NETWORKENDIAN symbol to set up a bunch of macros (I also use platform.h to set up all my fundamental types, whose names you'll see in the stuff below):
inline uint16 gp_swap_uint16(uint16 arg)
{
    return (arg & 0xFF00) >> 8 | (arg & 0x00FF) << 8;
}
inline uint32 gp_swap_uint32(uint32 arg)
{
    return (arg & 0xFF000000) >> 24 | (arg & 0x00FF0000) >> 8 | (arg & 0x0000FF00) << 8 | (arg & 0x000000FF) << 24;
}
inline int16 gp_swap_int16(int16 arg)
{
    return (int16)gp_swap_uint16((uint16)arg);
}
inline int32 gp_swap_int32(int32 arg)
{
    return (int32)gp_swap_uint32((uint32)arg);
}
inline float32 gp_swap_float32(float32 arg)
{
    uint32 ui = *((uint32*)&arg);
    ui = gp_swap_uint32(ui);
    return *((float32*)&ui);
}
#if defined(IS_NETWORKENDIAN)
// Host is big endian: network and big-endian conversions are no-ops
#define HTN_INT16(arg) (arg)
#define HTN_INT32(arg) (arg)
#define HTN_UINT16(arg) (arg)
#define HTN_UINT32(arg) (arg)
#define HTN_FLOAT32(arg) (arg)
#define NTH_INT16(arg) (arg)
#define NTH_INT32(arg) (arg)
#define NTH_UINT16(arg) (arg)
#define NTH_UINT32(arg) (arg)
#define NTH_FLOAT32(arg) (arg)
#define HTB_INT16(arg) (arg)
#define HTB_INT32(arg) (arg)
#define HTB_UINT16(arg) (arg)
#define HTB_UINT32(arg) (arg)
#define HTB_FLOAT32(arg) (arg)
#define BTH_INT16(arg) (arg)
#define BTH_INT32(arg) (arg)
#define BTH_UINT16(arg) (arg)
#define BTH_UINT32(arg) (arg)
#define BTH_FLOAT32(arg) (arg)
#define HTL_INT16(arg) gp_swap_int16(arg)
#define HTL_INT32(arg) gp_swap_int32(arg)
#define HTL_UINT16(arg) gp_swap_uint16(arg)
#define HTL_UINT32(arg) gp_swap_uint32(arg)
#define HTL_FLOAT32(arg) gp_swap_float32(arg)
#define LTH_INT16(arg) gp_swap_int16(arg)
#define LTH_INT32(arg) gp_swap_int32(arg)
#define LTH_UINT16(arg) gp_swap_uint16(arg)
#define LTH_UINT32(arg) gp_swap_uint32(arg)
#define LTH_FLOAT32(arg) gp_swap_float32(arg)
#else
// Host is little endian: little-endian conversions are no-ops
#define HTN_INT16(arg) gp_swap_int16(arg)
#define HTN_INT32(arg) gp_swap_int32(arg)
#define HTN_UINT16(arg) gp_swap_uint16(arg)
#define HTN_UINT32(arg) gp_swap_uint32(arg)
#define HTN_FLOAT32(arg) gp_swap_float32(arg)
#define NTH_INT16(arg) gp_swap_int16(arg)
#define NTH_INT32(arg) gp_swap_int32(arg)
#define NTH_UINT16(arg) gp_swap_uint16(arg)
#define NTH_UINT32(arg) gp_swap_uint32(arg)
#define NTH_FLOAT32(arg) gp_swap_float32(arg)
#define HTB_INT16(arg) gp_swap_int16(arg)
#define HTB_INT32(arg) gp_swap_int32(arg)
#define HTB_UINT16(arg) gp_swap_uint16(arg)
#define HTB_UINT32(arg) gp_swap_uint32(arg)
#define HTB_FLOAT32(arg) gp_swap_float32(arg)
#define BTH_INT16(arg) gp_swap_int16(arg)
#define BTH_INT32(arg) gp_swap_int32(arg)
#define BTH_UINT16(arg) gp_swap_uint16(arg)
#define BTH_UINT32(arg) gp_swap_uint32(arg)
#define BTH_FLOAT32(arg) gp_swap_float32(arg)
#define HTL_INT16(arg) (arg)
#define HTL_INT32(arg) (arg)
#define HTL_UINT16(arg) (arg)
#define HTL_UINT32(arg) (arg)
#define HTL_FLOAT32(arg) (arg)
#define LTH_INT16(arg) (arg)
#define LTH_INT32(arg) (arg)
#define LTH_UINT16(arg) (arg)
#define LTH_UINT32(arg) (arg)
#define LTH_FLOAT32(arg) (arg)
#endif // IS_NETWORKENDIAN
// Code free for use by whoever wants it... |
||||
|
||||
randomZ Member since: 8/17/2002 From: Germany |
||||
|
|
||||
| There's one thing I don't quite get yet: Isn't s >> 8 always equivalent to dividing s by 256? I thought I'd heard so. And if that's the case, isn't the >> operator transparent to endianness? |
||||
|
||||
Anonymous Poster |
||||
|
||||
| I would rather define a "standard byte order" for the application; i.e. you say that "Everything stored to disk or network from this application will be stored in <THIS> endian". If you do that, all you have to do is create a set of read/write functions (/defines) for your basic types (this assumes you're also using a set of fixed-size integer typedefs; which you should)
// Convert "Standard Byte Order" to the byte order of the local platform
read_[ui]{64|32|16}
read_f32 // float
read_f64 // double
// Convert platform byte order to standard byte order
write_[ui]{64|32|16}
write_f32
write_f64
I left out the implementation - that's up to your platform-specific kludges. Note that this is a *significantly* smaller set of defines than Combat Wombat's solution. From a SwEng perspective, it is probably bad to make the coder figure out in each case "am I going from network/host/little/big or PDP endian? and am I going to network/host/little/big or PDP?" (as CW's solution will have you do) - often you'll only have "local" and "standard" byte order anyway, where standard is the one used by your network protocol and file formats. Re: serializing structs: fwriting the raw memory contents of a struct (even after endian conversion) is a really *bad* idea! In C++, a struct with virtual functions will also contain a virtual function table pointer. Imagine what would happen if the user or (heaven forbid!) a *remote computer* was able to alter the value of that pointer! Added to this is the fact that different compilers (or the same compiler on different platforms) add different amounts of padding, as previous posters have already been through. The right way to go is to use some kind of serialization framework. With Java or MFC you'll have this included (Java's even takes care of endianness too!). Otherwise, you can always roll your own (it isn't too hard, really), or google for an open-source one. An interesting note to add: when talking network protocols, we can add integer compression transparently (i.e. use a dynamic number of bytes depending on the size of the integer value) to the read/write_u* functions. One algorithm that does this is described in the VCDIFF RFC (http://www.faqs.org/rfcs/rfc3284.html): 7 bits of each byte are used to store the integer, and the highest bit is then used as a flag to say whether the integer is continued in the next byte. |
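The variable-length integer idea can be sketched as follows. `write_varint` and `read_varint` are hypothetical names, the value is limited to 32 bits for brevity, and emitting the most significant 7-bit group first is this sketch's choice of ordering; the RFC itself is the authority on the exact layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends v as 7-bit groups, most significant group first. Every byte except
// the last has its high bit set to say "the integer continues".
inline void write_varint(std::uint32_t v, std::vector<std::uint8_t>& out)
{
    std::uint8_t groups[5]; // 32 bits need at most five 7-bit groups
    int n = 0;
    do
    {
        groups[n++] = static_cast<std::uint8_t>(v & 0x7Fu);
        v >>= 7;
    } while (v != 0);
    while (n > 1)
        out.push_back(static_cast<std::uint8_t>(groups[--n] | 0x80u));
    out.push_back(groups[0]);
}

// Decodes one integer starting at pos and advances pos past it. Returns false
// on truncated input or on a value that does not fit in 32 bits.
inline bool read_varint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                        std::uint32_t& value)
{
    std::uint32_t v = 0;
    for (int i = 0; i < 5 && pos < in.size(); ++i)
    {
        const std::uint8_t b = in[pos++];
        if (v > (0xFFFFFFFFu >> 7))
            return false; // shifting in 7 more bits would overflow
        v = (v << 7) | (b & 0x7Fu);
        if ((b & 0x80u) == 0)
        {
            value = v;
            return true;
        }
    }
    return false;
}

// Usage: small values take one byte, and a round trip restores the value.
inline void varint_example()
{
    std::vector<std::uint8_t> buffer;
    write_varint(5u, buffer);
    assert(buffer.size() == 1);
    std::size_t pos = 0;
    std::uint32_t decoded = 0;
    assert(read_varint(buffer, pos, decoded) && decoded == 5u);
}
```

Since the encoding is defined byte by byte, it is independent of host endianness, which is why it slots into the read/write_u* functions without any extra swapping.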
||||
|
||||
Oluseyi Staff Member since: 5/14/2001 From: New York, NY, United States |
||||
|
|
||||
quote:Only on an MSB architecture. On an LSB, it's equivalent to multiplying by 256. |
||||
|
||||
randomZ Member since: 8/17/2002 From: Germany |
||||
|
|
||||
quote: Thanks for clarifying that. I was confused because so many people bring bit shifting up as an "optimization" for multiplying / dividing by powers of two. But if my code is ever to be compiled on a machine of different endianness, this would break if I cross byte boundaries, right? |
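On the byte-boundary worry, it may help to separate the value of an integer from its bytes in memory. The C++ shift operators are defined on values, so for an unsigned s, s >> 8 is s / 256 on machines of either endianness; byte order only becomes visible when the object is inspected as raw bytes. A minimal sketch of that distinction, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value view: identical on every host.
inline std::uint16_t high_byte_by_shift(std::uint16_t s)
{
    return static_cast<std::uint16_t>(s >> 8);
}

// Memory view: which byte is stored first depends on the host's byte order.
inline std::uint8_t first_byte_in_memory(std::uint16_t s)
{
    unsigned char bytes[sizeof s];
    std::memcpy(bytes, &s, sizeof s);
    return bytes[0];
}

inline bool host_is_little_endian()
{
    return first_byte_in_memory(0x0001) == 0x01;
}

inline void shift_example()
{
    // The shift agrees with division regardless of the host.
    assert(high_byte_by_shift(0x1234) == 0x1234 / 256);
    // Only the raw memory view differs between hosts.
    assert(first_byte_in_memory(0x1234) == (host_is_little_endian() ? 0x34 : 0x12));
}
```

This is also why serialisers written with shifts and masks, instead of pointer casts or raw struct dumps, need no endianness detection at all.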
||||
|
||||
crypticmind Member since: 11/20/2003 From: Argentina |
||||
|
|
||||
quote: I agree too. The Enginuity code made by Richard Fine (see http://www.gamedev.net/reference/programming/features/enginuity5/) used SDL_net to achieve the same. I used the {nh}to{hn}{ls}() functions along with the serialization design (it's C++, but that's OT). I think these functions are the most portable way to address this issue. |
||||
|
||||
|
All times are ET (US) |
|