# Is the size of char 4?

Hi, please have a look at this sample code, built with Visual C++ 2005 Express Edition:

    #include <iostream>
    using namespace std;

    typedef unsigned char byte;

    struct Header
    {
        byte one;
        int two;
    };

    int main()
    {
        cout << sizeof(Header) << endl;
    }

I would expect this to print 5, but I get 8 instead. How is this possible?

##### Share on other sites
Your compiler is probably padding your struct, offsetting the int member by 3 bytes so that it starts on a 4-byte boundary.
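The padding can be made visible with `offsetof`. This is only a sketch: it assumes a C++11 compiler (for `constexpr`, `alignof` and `static_assert`, none of which Visual C++ 2005 has), and the helper name `padding_before_two` is made up for illustration.

```cpp
#include <cstddef>   // offsetof, std::size_t

typedef unsigned char byte;

struct Header
{
    byte one;
    int  two;
};

// Number of padding bytes the compiler placed between `one` and `two`.
constexpr std::size_t padding_before_two()
{
    return offsetof(Header, two) - sizeof(byte);
}

// `two` always begins at a multiple of the alignment of int.
static_assert(offsetof(Header, two) % alignof(int) == 0,
              "two must be suitably aligned for int");

// Padding only ever adds bytes, so the struct is never smaller than its members.
static_assert(sizeof(Header) >= sizeof(byte) + sizeof(int),
              "struct cannot be smaller than the sum of its members");
```

With a 4-byte, 4-byte-aligned int, as in the original post, `padding_before_two()` would be 3 and `sizeof(Header)` would be 8. Other platforms may give different numbers.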

##### Share on other sites
This is creating some problems when using Windows TCP/IP and the send/recv functions.

Do you suggest I forget about using bytes and use UINT everywhere, or should I try to disable that compiler optimization?

##### Share on other sites
Quote:
 Original post by dynameat
 This is creating some problems when using Windows TCP/IP and the send/recv functions.

It's generally a bad idea to send a struct directly over the wire. You can run into all sorts of trouble if you aren't careful. You would be much better served pulling the struct fields and manually packing them into a byte array in a fixed format. Aside from avoiding certain problems, it frees you up to more easily implement some form of packet compression later on down the road if you find that you need it. By separating the storage type from the transfer type of a value, you can do compression tricks like condensing several boolean values into a single 32-bit bit mask.
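As a rough illustration of the bit mask trick, here is a minimal sketch. It assumes C++11, and the flag names and the helpers `pack_flags` and `has_flag` are hypothetical, not taken from any library or from this thread.

```cpp
#include <cstdint>

// Hypothetical flags; each one owns a single bit of the mask.
constexpr std::uint32_t FLAG_JUMPING   = 1u << 0;
constexpr std::uint32_t FLAG_CROUCHING = 1u << 1;
constexpr std::uint32_t FLAG_FIRING    = 1u << 2;

// Condense three booleans (the storage form) into one 32-bit value
// (the transfer form).
constexpr std::uint32_t pack_flags(bool jumping, bool crouching, bool firing)
{
    return (jumping   ? FLAG_JUMPING   : 0u)
         | (crouching ? FLAG_CROUCHING : 0u)
         | (firing    ? FLAG_FIRING    : 0u);
}

// Test a single flag on the receiving side.
constexpr bool has_flag(std::uint32_t mask, std::uint32_t flag)
{
    return (mask & flag) != 0u;
}
```

Three separate bool members would usually take at least three bytes, and up to 32 flags fit in the same four bytes used here.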

##### Share on other sites
Some good links there from lessbread, but here are some things to take into consideration.
Padding structures should be your last choice as it adds a speed penalty and is platform specific.
Correctly aligning your own structures costs nothing.
Not sure if bitfields incur a speed penalty?
You can develop a serialisation method that takes care of everything for you; see hplus' post in this thread

Finally use types which are guaranteed to be the same on all peers, what is the size of the variable "two"?

##### Share on other sites
It's already been said that you shouldn't send your struct over the network. But if you do write the entire struct at once, be sure to read the entire struct at once too (i.e. match your reads and writes with the same types).

For example, let's say you have send( void* data, uint size ) and receive( void* data, uint size ).

You shouldn't do:

    // sender
    my_struct data;
    data.one = 128;
    data.two = 6552053;
    send( &data, sizeof(my_struct) );

    ....

    // receiver
    my_struct data;
    receive( &data.one, sizeof(char) );
    receive( &data.two, sizeof(int) );

You should do (this is better, but not portable):

    // sender
    my_struct data;
    send( &data, sizeof(my_struct) );

    ....

    // receiver
    my_struct data;
    receive( &data, sizeof(my_struct) );

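Or better yet (this works regardless of struct padding, as long as both sides agree on the size and byte order of each field):

    // sender
    my_struct data;
    send( &data.one, sizeof(char) );
    send( &data.two, sizeof(int) );

    ....

    // receiver
    my_struct data;
    receive( &data.one, sizeof(char) );
    receive( &data.two, sizeof(int) );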
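To try the field-by-field variant without a real socket, here is a small sketch. `Wire` is a hypothetical in-memory stand-in whose `send` and `receive` only mimic the signatures above. It is not the Winsock API. The layout of `my_struct` is assumed from the original post, and C++11 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed layout, mirroring the struct from the original post.
struct my_struct
{
    unsigned char one;
    int           two;
};

// Hypothetical in-memory "connection": send appends bytes, receive consumes them.
struct Wire
{
    std::vector<unsigned char> bytes;
    std::size_t read_pos = 0;

    void send(const void* data, std::size_t size)
    {
        assert(data != nullptr || size == 0);
        const unsigned char* p = static_cast<const unsigned char*>(data);
        bytes.insert(bytes.end(), p, p + size);
    }

    // Returns false if fewer than `size` unread bytes remain.
    bool receive(void* data, std::size_t size)
    {
        if (bytes.size() - read_pos < size)
            return false;
        std::memcpy(data, bytes.data() + read_pos, size);
        read_pos += size;
        return true;
    }
};

// Field by field on both ends: struct padding never reaches the wire.
inline bool round_trip_by_field(const my_struct& in, my_struct& out)
{
    Wire wire;
    wire.send(&in.one, sizeof in.one);
    wire.send(&in.two, sizeof in.two);
    return wire.receive(&out.one, sizeof out.one)
        && wire.receive(&out.two, sizeof out.two)
        && wire.read_pos == wire.bytes.size();   // nothing left unread
}
```

Here the wire carries `sizeof(char) + sizeof(int)` bytes rather than `sizeof(my_struct)`. A real peer would still have to agree on the width and byte order of `int`.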

##### Share on other sites
What I'd do is have

    struct foo
    {
        data d1;
        data d2;
        data d3;

        char* packetize()
        {
            // function that puts all members
            // into proper packet form
        }

        void fromPacket(char* packet)
        {
            // take packet from form in packetize()
            // and assign data to the members
        }
    };

Then you can just call packetize when sending and fromPacket when receiving.
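One way the two member functions might be filled in is sketched below. The member types, the 7-byte layout and the most-significant-byte-first order are all assumptions made for illustration, and a `std::vector` replaces the raw `char*` so the packet carries its own length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct foo
{
    // Assumed member types; the post above leaves `data` unspecified.
    std::uint8_t  d1;
    std::uint16_t d2;
    std::uint32_t d3;

    // Bytes on the wire: 1 + 2 + 4, with no padding.
    static const std::size_t packet_size = 7;

    // Write every member most-significant byte first.
    std::vector<std::uint8_t> packetize() const
    {
        std::vector<std::uint8_t> p;
        p.reserve(packet_size);
        p.push_back(d1);
        p.push_back(static_cast<std::uint8_t>(d2 >> 8));
        p.push_back(static_cast<std::uint8_t>(d2 & 0xFF));
        for (int shift = 24; shift >= 0; shift -= 8)
            p.push_back(static_cast<std::uint8_t>((d3 >> shift) & 0xFF));
        return p;
    }

    // Inverse of packetize(); expects exactly packet_size bytes.
    void fromPacket(const std::vector<std::uint8_t>& p)
    {
        assert(p.size() == packet_size);
        d1 = p[0];
        d2 = static_cast<std::uint16_t>((p[1] << 8) | p[2]);
        d3 = (static_cast<std::uint32_t>(p[3]) << 24)
           | (static_cast<std::uint32_t>(p[4]) << 16)
           | (static_cast<std::uint32_t>(p[5]) << 8)
           |  static_cast<std::uint32_t>(p[6]);
    }
};
```

Because the bytes are produced with shifts rather than by copying memory, the packet looks the same whatever the padding or byte order of the machine that built it.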

##### Share on other sites
For the record, sizeof(char) is always 1.

##### Share on other sites
Quote:
 Original post by CmpDev
 Some good links there from lessbread, but here are some things to take into consideration.

OK, let's consider them.

Quote:
 Padding structures should be your last choice

Impossible; the compiler will run its structure-padding algorithm automatically (although *if* the members are put in the correct order *and if* the sizes all add up right, it might turn out that the total padding amounts to zero bytes).

Quote:
 as it adds a speed penalty

Incorrect. The compiler does it *specifically because*, among other things, it is expected to *improve* performance. You might lose out in *size*, though, if you just put in your data members in any old order rather than trying to pair up your shorts and arrange bytes in sets of four. But those numbers are platform specific; and anyway, there is nothing that can be done in the OP's example since there is only one char-sized member.

Quote:
 and is platform specific.

Well, yes, but...

Quote:
 Correctly aligning your own structures costs nothing.

The manner of correctly aligning structures technically is platform specific as well (nothing prevents me from creating hardware with a "native" integer size of 24 bits, for example, upon which the reference C++ compiler would naturally provide 24-bit ints, and your efforts to align things to 32-bit barriers would be all for naught). And certainly it *does* cost a fair bit - in terms of your development time.

(It is worth noting that the compiler is not allowed to rearrange your data members so that they will pack better.)

Quote:
 Not sure if bitfields incur a speed penalty?

Depends what you mean by "penalty"; i.e. what you are comparing to which.

Quote:
 You can develop a serialisation method that takes care of everything for you; see hplus' post in this thread

Of course.

Quote:
 Finally use types which are guaranteed to be the same on all peers, what is the size of the variable "two"?

A good thought. Unfortunately, the language itself doesn't really provide you with any, except for char (which is guaranteed to have a sizeof() == 1). In order to keep things in agreement, therefore, one normally relies on typedefs - either (again) platform-specific ones, or a wrapper such as provided by Boost (which, in turn, asks the preprocessor what the platform is and selects the appropriate platform-specific typedefs... I'm sure there's a more clever, template meta-programming way to make this work, but they don't seem to be doing it).

##### Share on other sites
If you have an up-to-date compiler, it may support the fixed-representation integer types introduced in C99. They are: int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t, uint32_t, uint64_t. They have the exact size in bits that their names imply, and the u* variants are unsigned. All the bits are used, and the signed variants must use a two's complement representation.

You need to include <stdint.h> to see these typedefs. Sadly, there are no corresponding fixed-representation types for floats.
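Applied to the struct from the original post, that could look like the sketch below. It assumes a C++11 compiler and uses `<cstdint>`, the C++ counterpart of `<stdint.h>`. The exact-width types are optional in the standard, so on an unusual platform this may simply fail to compile. The constant `header_wire_octets` is a made-up name.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint8_t, std::int32_t

struct Header
{
    std::uint8_t one;   // exactly 8 bits
    std::int32_t two;   // exactly 32 bits, whatever `int` is on this platform
};

static_assert(sizeof(std::uint8_t) * CHAR_BIT == 8,  "uint8_t is 8 bits wide");
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is 32 bits wide");

// Octets the two fields occupy when written back to back.
// sizeof(Header) may still be larger because of padding.
constexpr unsigned header_wire_octets = (8 + 32) / 8;
```

The fixed widths settle the question of how big "two" is on every peer, but they do nothing about padding or byte order.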

(It occurs to me that since there's no guaranteed endianness for these types, they don't actually have a fixed-representation, rendering them considerably less useful for a portable description of a binary structure. There are extensions in most compilers for explicitly unpadded structures, but I don't know of any extensions for explicit endianness?)
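Without relying on any compiler extension, byte order can be pinned down by building the bytes with shifts. The following is a minimal sketch under that assumption; the helper names `host_is_little_endian`, `store_le32` and `load_le32` are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// True when the host stores the least-significant byte first.
inline bool host_is_little_endian()
{
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Store `value` as four bytes, least-significant first, on any host.
inline void store_le32(unsigned char* out, std::uint32_t value)
{
    assert(out != nullptr);
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
}

// Read back four bytes written by store_le32.
inline std::uint32_t load_le32(const unsigned char* in)
{
    assert(in != nullptr);
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return value;
}
```

The shifts operate on values rather than on memory, so `store_le32` and `load_le32` never need to consult `host_is_little_endian()`. That function is included only to show how the host order could be inspected.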

As an aside, C99 also includes special-case "generic macros" for overloading specific math functions for the various floating-point types. Delightfully, it is not possible to implement these macros in C99 itself.

##### Share on other sites
Quote:
 Original post by Zahlman
 A good thought. Unfortunately, the language itself doesn't really provide you with any, except for char (which is guaranteed to have a sizeof() == 1).

And even char is not to be trusted. C++ measures the size of a variable in bytes, but these are not necessarily machine bytes. The standard does not fix the size of a C++ byte, meaning that it can be 9 bits (of course, this is quite strange and rare), or 16, or 32, depending on what the compiler vendor considers the best way to handle a C++ byte.

It means that if a C++ byte is 8 bits on one platform and 16 bits on another, you can run into problems. Of course, this does not happen very often (although it can happen if one of the platforms can only address memory in 16-bit units).

In the PC world, a C++ byte is often defined to be equivalent to a machine byte (8 bits).

Regards,

##### Share on other sites
Quote:
Original post by Zahlman
Quote:
 Original post by CmpDev
 Some good links there from lessbread, but here are some things to take into consideration.

OK, let's consider them.
Quote:
 Padding structures should be your last choice

Impossible; the compiler will run its structure-padding algorithm automatically (although *if* the members are put in the correct order *and if* the sizes all add up right, it might turn out that the total padding amounts to zero bytes).

Zahlman, I'm pretty sure CmpDev's reference to "padding" meant manually padding structures yourself, not the compiler's automatic padding algorithm. His reference to "correct alignment" seems to suggest that instead of padding things, you should arrange them in the order most likely to make this algorithm happy.
No need to attack his well-intentioned post with a condescension-cannon...

##### Share on other sites
I know it's slightly OT, but :)

..if you want to maximize speed, another good heuristic is natural alignment and ordering members from largest to smallest in the struct. This works for local variables on the stack too.

The concept of 'natural alignment' is key here.

This ensures (in conjunction with compiler padding) that if the beginning of your structure or local variables is "naturally aligned" (ie. fastest access), then your first member will be naturally aligned, as will your second member, and so forth.
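The effect of ordering members from largest to smallest can be measured directly. This is a sketch assuming C++11 and the `<cstdint>` types; the struct names and the `padding_bytes` helper are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Members declared in an arbitrary order.
struct Unordered
{
    std::uint8_t  a;
    std::uint32_t b;
    std::uint8_t  c;
    std::uint16_t d;
};

// The same members, largest first.
struct Ordered
{
    std::uint32_t b;
    std::uint16_t d;
    std::uint8_t  a;
    std::uint8_t  c;
};

// Bytes the members need with no padding at all.
constexpr std::size_t payload_bytes =
    sizeof(std::uint8_t) * 2 + sizeof(std::uint16_t) + sizeof(std::uint32_t);

// Padding the compiler added to a given layout of those members.
template <typename T>
constexpr std::size_t padding_bytes()
{
    return sizeof(T) - payload_bytes;
}
```

On common desktop compilers `sizeof(Unordered)` is typically 12 and `sizeof(Ordered)` is 8, so `padding_bytes<Unordered>()` would be 4 and `padding_bytes<Ordered>()` would be 0. The standard guarantees neither figure.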

Of course if your data is quite large (spanning many cache lines and with complex data access), you need to keep in mind the access pattern for your structure to minimize cache misses too.

More links on natural alignment (in other languages, but the same concept applies)

http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.xlcpp8a.doc/proguide/ref/alignment.htm

http://www.intel.com/software/products/compilers/flin/docs/main_for/mergedprojects/optaps_for/fortran/optaps_prg_algn_f.htm

##### Share on other sites
Quote:
 ..if you want to maximize speed, another good heuristic is natural alignment and ordering members from largest to smallest in the struct. This works for local variables on the stack too.

you can even automate it....

    #define MAX 10
    #define NAME(name) \
        template<typename T> struct name { struct type { T name; }; };
    #define GEN_NONE(z, n, data) struct BOOST_PP_CAT(none, n) { struct type { }; };
    BOOST_PP_REPEAT(MAX, GEN_NONE, ~)

    struct has_smaller_size
    {
        template<typename T1, typename T2>
        struct apply
        {
            typedef boost::mpl::bool_<(sizeof(T1) < sizeof(T2))> type;
        };
    };

    #define SORT_LIST(z, n, data) \
        typename boost::mpl::at<typename boost::mpl::sort<data, \
                                                          has_smaller_size, \
                                                          boost::back_inserter<boost::mpl::vector<> > >::type, \
                                n>::type

    template<BOOST_PP_ENUM_BINARY_PARAMS(MAX, typename T, = None)>
    struct Data :
        BOOST_PP_ENUM(MAX, SORT_LIST, BOOST_PP_ENUM_BINARY_PARAMS(MAX, T, ::type BOOST_PP_INTERCEPT))
    { };

    NAME(myint);
    NAME(mychar);
    NAME(mychar2);

    // Example
    struct MyStruct :
        Data
        <
            myint<int>,
            mychar<char>,
            mychar2<char>
        >
    {
        int dostuff()
        {
            return myint * 2;
        }
    };

    // is the same as
    struct MyStruct
    {
        char mychar;
        char mychar2;
        int myint;
    };

    // Disclaimer: the code above assumes inherited things are placed in order at
    // the start of the struct's memory; I can't remember if the standard
    // guarantees that, so the above may not be portable.
    // It also assumes that your compiler implements the empty base class
    // optimization.

##### Share on other sites
Quote:
Original post by Emmanuel Deloget
Quote:
 Original post by Zahlman
 A good thought. Unfortunately, the language itself doesn't really provide you with any, except for char (which is guaranteed to have a sizeof() == 1).

And even char is not to be trusted. C++ measures the size of a variable in bytes, but these are not necessarily machine bytes. The standard does not fix the size of a C++ byte, meaning that it can be 9 bits (of course, this is quite strange and rare), or 16, or 32, depending on what the compiler vendor considers the best way to handle a C++ byte.

There are at least some restrictions on what it is allowed to pretend it thinks is the best way, though. In particular, there must be no "holes" in memory, i.e. bits that are not part of any C++ byte, and pointer types need to be able to address C++ bytes of memory. Oh, and it does have to be *at least* 8 bits.

Which is why,

Quote:
 In the PC world, a C++ byte is often defined to be equivalent to a machine byte (8 bits).

It's just easier that way.
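The guarantees mentioned in this thread can be written down as compile-time checks. The sketch below assumes C++11, and the helpers `byte_is_octet` and `bit_width_of` are hypothetical names.

```cpp
#include <climits>   // CHAR_BIT

// Both of these hold on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a C++ byte has at least 8 bits");

// True on platforms where a C++ byte is exactly an 8-bit octet.
constexpr bool byte_is_octet()
{
    return CHAR_BIT == 8;
}

// Width of a type in bits rather than in C++ bytes.
template <typename T>
constexpr unsigned bit_width_of()
{
    return static_cast<unsigned>(sizeof(T) * CHAR_BIT);
}
```

Networking code that silently assumes octets could state that assumption with `static_assert(byte_is_octet(), "this code assumes 8-bit bytes");` so that a port to an exotic platform fails loudly at compile time.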
