[c++] my struct gets byte stuffed?

4 comments, last by Tispe 12 years, 3 months ago
I am reading BYTEs, WORDs and DWORDs from a file which I also have open in a hex editor.

Here is a 16-byte excerpt from the file:

01 00 80 00 01 00 20 00 00 00 69 09 89 09 00 00


The struct I cast the data to is:

struct DataHeader
{
    BYTE id1; //is 0x01
    BYTE id2; //is 0x00
    WORD type1; //is 0x0080
    WORD type2; //is 0x0001
    DWORD OffsetData; //supposed to be 0x00000020, but is 0x09690000
};


Somehow I get 0x09690000 instead of 0x00000020 in OffsetData. The first bytes and words are read correctly, but the DWORD is offset by two bytes. Why?
Look up #pragma pack; it is what you are looking for.

Without pack, there is no guarantee your compiler won't pad out your struct.


By their very nature, pragmas are not standard, but pack is pretty universal. I can't say I've encountered a compiler it doesn't work on.



#pragma pack(push,1)
struct DataHeader
{
    BYTE id1; //is 0x01
    BYTE id2; //is 0x00
    WORD type1; //is 0x0080
    WORD type2; //is 0x0001
    DWORD OffsetData; //with packing set to 1, now reads 0x00000020
};
#pragma pack(pop)



The above will sort you out.

Essentially, the pack line at the top sets the packing alignment to 1 byte (i.e. no padding) and at the same time "pushes" the current packing setting. Then you declare your structure and call #pragma pack(pop), which restores the setting you pushed earlier, i.e. back to normal.

Put more simply, this directive tells the compiler not to pad your struct, then to go back to normal packing rules after your struct has been handled.
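If you want to convince yourself that the packed layout really lines up with the file, something along these lines might help. It is only a sketch: PackedHeader, kExcerpt and readHeader are made-up names, the BYTE/WORD/DWORD aliases are stand-ins for the Windows typedefs, and it assumes a little-endian machine (which the values in the question suggest).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-ins for the Windows typedefs (assumption: same widths as in <windows.h>).
using BYTE  = std::uint8_t;
using WORD  = std::uint16_t;
using DWORD = std::uint32_t;

#pragma pack(push, 1)
struct PackedHeader   // hypothetical name; same members as DataHeader
{
    BYTE  id1;
    BYTE  id2;
    WORD  type1;
    WORD  type2;
    DWORD OffsetData;
};
#pragma pack(pop)

// With packing set to 1 there is no padding: 1 + 1 + 2 + 2 + 4 = 10 bytes.
static_assert(sizeof(PackedHeader) == 10, "header should be 10 bytes when packed");
static_assert(offsetof(PackedHeader, OffsetData) == 6, "OffsetData should follow type2 directly");

// The 16 bytes quoted from the file in the question.
static const unsigned char kExcerpt[16] = {
    0x01, 0x00, 0x80, 0x00, 0x01, 0x00, 0x20, 0x00,
    0x00, 0x00, 0x69, 0x09, 0x89, 0x09, 0x00, 0x00
};

// Copies the first sizeof(PackedHeader) bytes into a header.
// memcpy sidesteps the alignment and aliasing problems a pointer cast can have.
PackedHeader readHeader(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    PackedHeader h;
    std::memcpy(&h, bytes, sizeof h);
    return h;
}
```

On a little-endian host, readHeader(kExcerpt).OffsetData should come out as 0x00000020, because the DWORD is now read from bytes 6 to 9 of the excerpt.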
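The compiler inserts padding between struct members so that each member sits at its natural alignment, which helps overall speed. Here the DWORD has to start on a 4-byte boundary, so 2 bytes of padding go in after type2 and OffsetData ends up at offset 8 instead of 6. If you want to disable this feature, you can use #pragma pack.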
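You can ask the compiler where it actually put things. A rough sketch, assuming the usual 4-byte alignment for a 32-bit integer; PaddedHeader, paddingBeforeOffsetData and printLayout are made-up names, and the aliases stand in for the Windows typedefs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Stand-ins for the Windows typedefs (assumption: same widths as in <windows.h>).
using BYTE  = std::uint8_t;
using WORD  = std::uint16_t;
using DWORD = std::uint32_t;

// Same members as DataHeader, default packing (hypothetical name).
struct PaddedHeader
{
    BYTE  id1;
    BYTE  id2;
    WORD  type1;
    WORD  type2;
    DWORD OffsetData;
};

// Number of padding bytes the compiler inserted between type2 and OffsetData.
constexpr std::size_t paddingBeforeOffsetData()
{
    return offsetof(PaddedHeader, OffsetData)
         - (offsetof(PaddedHeader, type2) + sizeof(WORD));
}

// Prints each member's offset and the total size of the struct.
void printLayout()
{
    // Each member lands on a multiple of its own alignment.
    assert(offsetof(PaddedHeader, OffsetData) % alignof(DWORD) == 0);

    std::printf("id1        at %zu\n", offsetof(PaddedHeader, id1));
    std::printf("id2        at %zu\n", offsetof(PaddedHeader, id2));
    std::printf("type1      at %zu\n", offsetof(PaddedHeader, type1));
    std::printf("type2      at %zu\n", offsetof(PaddedHeader, type2));
    std::printf("OffsetData at %zu\n", offsetof(PaddedHeader, OffsetData));
    std::printf("sizeof     = %zu\n", sizeof(PaddedHeader));
}
```

On the common desktop compilers I would expect OffsetData at 8 and a total size of 12, which matches the two-byte shift described in the question.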
Could it be that maybe you're discarding the missing word while loading the data in? (nvm :D)

Yo dawg, don't even trip.

Incidentally, if for some reason pragma pack doesn't work for you, you can also use C++ bit fields:





struct DataHeader
{
    unsigned int id1 : 8;
    unsigned int id2 : 8;
    unsigned int type1 : 16;
    unsigned int type2 : 16;
    unsigned int offsetData : 32;
};



This may also work, but there are a couple of caveats. Bit-field layout is implementation-defined. Typically id1, id2 and type1 will be packed into a single unsigned int. type2 then overflows that unsigned int, so the compiler starts a new one and uses 16 bits of it. The 32-bit offsetData does not fit in the remaining 16 bits, and most compilers will not split a bit field across two storage units, so they start yet another unsigned int for it and leave 2 bytes unused. In that case offsetData ends up at offset 8, just like in the unpacked struct, so check sizeof on your compiler before relying on this. It is possible a compiler is smart enough to pack it tighter, but I doubt it; it's a pretty big edge case. Also, bit fields are NOT endian safe: the order in which bits are allocated within a unit usually differs between big endian and little endian targets.
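Since the layout is up to the compiler, the safest thing is to measure it. A small sketch that loads the excerpt from the question into the bit-field version; BitFieldHeader, kExcerpt and loadBitFieldHeader are made-up names, and what you get back depends on your compiler and on endianness:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The bit-field struct from the post above (hypothetical name).
struct BitFieldHeader
{
    unsigned int id1        : 8;
    unsigned int id2        : 8;
    unsigned int type1      : 16;
    unsigned int type2      : 16;
    unsigned int offsetData : 32;
};

// The copy below must not run past the 16-byte excerpt.
static_assert(sizeof(BitFieldHeader) <= 16, "excerpt is only 16 bytes long");

// The 16 bytes quoted from the file in the question.
static const unsigned char kExcerpt[16] = {
    0x01, 0x00, 0x80, 0x00, 0x01, 0x00, 0x20, 0x00,
    0x00, 0x00, 0x69, 0x09, 0x89, 0x09, 0x00, 0x00
};

// Copies the raw bytes over the bit-field struct, whatever size it turned out to be.
BitFieldHeader loadBitFieldHeader(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    BitFieldHeader h;
    std::memcpy(&h, bytes, sizeof h);
    return h;
}
```

If sizeof(BitFieldHeader) comes out as 12 rather than 10, the compiler has left a 2-byte gap after type2, and on a little-endian machine loadBitFieldHeader(kExcerpt).offsetData will most likely be the same 0x09690000 the original poster was seeing.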



That said, this post is more a matter of general trivia, and not a recommendation!
Thank you, solved it.

