Cornstalks

Posted 06 March 2013 - 11:34 PM

char isn't guaranteed to be one byte; it seems it's usually the size of whatever the processor processes things in, which is sometimes 16-bit (2-byte) increments

sizeof(char) is required to always be 1; in other words, a char is always 1 "byte," but this "byte" doesn't have to be 8 bits (it will be at least 8 bits, but could be more).
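To make that concrete, here's a minimal sketch (assuming a hosted C++ implementation; CHAR_BIT comes from <climits>) that prints how many bits make up this implementation's byte:

#include <climits> // CHAR_BIT: number of bits in a byte on this implementation
#include <cstdio>

int main()
{
    // sizeof(char) is 1 by definition, but that one byte has CHAR_BIT bits
    // (at least 8, possibly more on DSPs and other unusual hardware).
    std::printf("sizeof(char) = %zu, bits per byte = %d\n",
                sizeof(char), CHAR_BIT);
}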

 

integer can be 1 byte on some architectures, which would break this code

sizeof(char) <= sizeof(int). Additionally, int is required to be at least 16 bits, so it's possible for char and int to both be 16-bit (or greater) data types.
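If you want the build to fail when those guarantees are all you're relying on, a minimal sketch using C++11's static_assert (int's required minimum range corresponds to INT_MAX >= 32767 in <climits>) could be:

#include <climits>

// These fire at compile time if the platform violates the assumptions above.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(char) <= sizeof(int), "char is never larger than int");
static_assert(INT_MAX >= 32767, "int must cover at least 16 bits");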

 

I also have one question: is endianness byte ordering always based on 8-bit bytes, or would it be ordered in sections of 16 bits on architectures with 16-bit chars?

Endianness is about byte ordering, but here "byte" doesn't have to mean an 8-bit group. If you're on a system with 16-bit bytes, then a two-byte word (32 bits) still has endianness, because it's made up of two 16-bit bytes, and the order of those bytes in memory determines the endianness. A single byte (no matter how many bits are in it) does not have any endianness. Endianness is about the ordering of bytes, not the ordering of bits.
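As an illustration, here's a small sketch (assuming the common case of 8-bit bytes and a 32-bit std::uint32_t) that dumps the bytes of a multi-byte value so you can see the ordering:

#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main()
{
    std::uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value)); // inspect the raw object representation

    // A little endian machine prints "04 03 02 01" (least significant byte first);
    // a big endian machine prints "01 02 03 04".
    for (std::size_t i = 0; i < sizeof(value); ++i)
        std::printf("%02x ", static_cast<unsigned>(bytes[i]));
    std::printf("\n");
}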

I would suggest doing this check at compile time. If you still want some kind of runtime switching (for whatever insane reason), you can hide the compile-time code inside a function and still call that function at run time. There's no reason to be checking at runtime by messing with pointers.

Beware, however, that if you do use compile-time checking, you do it properly. If you are cross compiling your program and you rely on bad macros, you may end up detecting the endianness of the compiling platform instead of the target platform.
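For example, here's a minimal compile-time sketch assuming GCC or Clang, which predefine __BYTE_ORDER__ and the __ORDER_*_ENDIAN__ macros for the target of the compilation (so cross compiling is handled correctly); the MY_ENDIAN_* names are made up, and other compilers would need their own branch:

#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#   define MY_ENDIAN_LITTLE 1
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#   define MY_ENDIAN_BIG 1
#else
#   error "Unknown or unsupported target byte order"
#endif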

For what it's worth, a system may not have any endianness at all. For example, if sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long) == sizeof(long long), then the system doesn't really have any endianness, since everything is one byte. Thus, to be uber pedantic, you would check whether the system is big endian, little endian, or "no endian."

Anyway, like I said, I would check using compiler flags/preprocessing/macros (i.e. check at compile time). If you really want to check at runtime, you can do:

// This is, of course, assuming that a system is either big, little, or no endian. It's
// technically possible for a system to be middle endian: http://en.wikipedia.org/wiki/Endianness#Middle-endian
// Also note that it's technically possible for integer and floating point data types to
// have different endianness: http://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness
// To be honest, you just have to draw the line somewhere and say "We support A, B, and C."
// It's really not worth your time trying to support everything under the sun. Just
// come up with a sane set of basic requirements that you think are realistic for your
// target market.
enum Endianness
{
    BIG,
    LITTLE,
    NONE
};

Endianness checkEndianness()
{
    // Note: Before C11, unsigned char was the only data type guaranteed not to have any padding bits.
    // C11 changed that so that both signed char and unsigned char have no padding bits.
    // I'm not sure if C++11 says whether or not signed char may have padding bits, but unsigned char certainly
    // does not.

    // I would personally actually use macros to detect endianness at compile time, and just make this function
    // return the endianness as a compile time flag.
    unsigned long long a = 1;
    if (sizeof(a) == sizeof(unsigned char))
    {
        return NONE;
    }
    else if (*(unsigned char*)&a == 1)
    {
        return LITTLE;
    }
    else
    {
        return BIG;
    }
}
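
And a possible way to call it (just a usage sketch building on the function above):

#include <cstdio>

int main()
{
    switch (checkEndianness())
    {
        case LITTLE: std::printf("little endian\n"); break;
        case BIG:    std::printf("big endian\n");    break;
        case NONE:   std::printf("no endianness (everything is one byte)\n"); break;
    }
}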
