unsigned char = 4 Byte ???

8 comments, last by TheMummy 22 years, 5 months ago
Hi, I write this image description structure into a file:
  
struct IMGINFO                  // sizeof(IMGINFO) = 56
{
    char fileName[32];          // 32 bytes
    unsigned int numChannels;   // 4 bytes
    unsigned char encoding;     // 1 byte
    unsigned int offset;        // 4 bytes
    unsigned int width;         // 4 bytes
    unsigned int height;        // 4 bytes
    unsigned int numColors;     // 4 bytes
};                              // = 53 bytes !?

  
The program writes a 56-byte structure into the file, so the encoding byte takes up 4 bytes in the file. Why? I am using the ofstream class, and the program was compiled with Borland's C++ 5.3. Thanks in advance.
I think this is because of byte alignment. In VC++ you can control the alignment with #pragma pack(push, n) and #pragma pack(pop).
AP is right, it is due to alignment. Accessing dwords that aren't aligned on a 4-byte boundary is slow, so the compiler inserts 3 bytes of padding after your "encoding" char to line up the rest of the structure.
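
To see where the padding goes, something like the sketch below can help. It mirrors the struct from the question, but the name ImgInfo, the helper paddingAfterEncoding, and the use of std::uint32_t in place of unsigned int are assumptions made for illustration. The offsets it reports depend on the compiler and target.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::uint32_t

// Same field order as the struct in the question. std::uint32_t stands in
// for unsigned int so the field sizes are fixed (an assumption of this sketch).
struct ImgInfo {
    char          fileName[32];
    std::uint32_t numChannels;
    unsigned char encoding;
    std::uint32_t offset;
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t numColors;
};

// Number of padding bytes the compiler placed between encoding and offset.
constexpr std::size_t paddingAfterEncoding() {
    return offsetof(ImgInfo, offset)
         - (offsetof(ImgInfo, encoding) + sizeof(unsigned char));
}
```

On a compiler that aligns 32-bit integers to 4 bytes, paddingAfterEncoding() comes out as 3 and sizeof(ImgInfo) as 56, which matches the file size in the question.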

When you write the structure out to file, you are just dumping the data from memory, so the pad bytes get saved. If you don't want to save the padding, you could write a function to output the structure one field at a time (making sure to write encoding out as a byte). You'd need a corresponding load function too.

If you can reorder the fields in your struct, that might be a good idea too. It won't save you any bytes here, but say you add another char to the structure at the start. Both it and the encoding char get padded to 4 bytes, costing you 6 bytes. If both characters were together at the end of the struct, you would only get 2 bytes of padding.
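
The two layouts can be compared directly. In this sketch the struct names Scattered and Grouped and the extra field flag are hypothetical, and std::uint32_t is assumed for the integer fields; the sizes in the comments hold where 32-bit integers are 4-byte aligned.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint32_t

// Extra char at the start: it and encoding each cost 3 bytes of padding.
struct Scattered {
    unsigned char flag;          // hypothetical extra field
    char          fileName[32];
    std::uint32_t numChannels;
    unsigned char encoding;
    std::uint32_t offset;
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t numColors;
};

// Both chars together at the end: only 2 bytes of tail padding.
struct Grouped {
    char          fileName[32];
    std::uint32_t numChannels;
    std::uint32_t offset;
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t numColors;
    unsigned char encoding;
    unsigned char flag;          // hypothetical extra field
};

// Bytes of real data in either layout: 32 + 5 * 4 + 2 = 54.
constexpr std::size_t kPayloadBytes = 32 + 5 * sizeof(std::uint32_t) + 2;
```

With 4-byte alignment, sizeof(Scattered) is 60 and sizeof(Grouped) is 56, against 54 bytes of actual data.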

Edited by - Krunk on November 9, 2001 7:36:06 AM
Thanks!

So is 4-byte alignment a kind of standard?

So will loading and writing the struct work on every OS?
No, it varies per compiler/computer.

Also you have to watch out for little- and big-endianness if you're porting to a different system (e.g. Mac).
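
One common way around the byte-order problem is to pick an order for the file and build each integer from shifts, so the result does not depend on the host. The helpers storeLE32 and loadLE32 below are hypothetical names for a sketch that fixes the file format as little-endian.

```cpp
#include <cstdint>

// Store value into out[0..3], least significant byte first,
// regardless of the byte order of the machine running the code.
void storeLE32(std::uint32_t value, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>(value & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFF);
}

// Rebuild the value from four bytes stored least significant byte first.
std::uint32_t loadLE32(const unsigned char in[4]) {
    return  static_cast<std::uint32_t>(in[0])
         | (static_cast<std::uint32_t>(in[1]) << 8)
         | (static_cast<std::uint32_t>(in[2]) << 16)
         | (static_cast<std::uint32_t>(in[3]) << 24);
}
```

Writing the four bytes produced by storeLE32 to the file, and reading them back through loadLE32, gives the same value on little- and big-endian machines.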
If I remember correctly, the point is that addressing memory is always done in multiples of four. If you want to retrieve an integer (4 bytes) that sits at an address that is a multiple of four, it can be done in a single memory access. However, if it starts at, say, a multiple of four plus 1, then the first 3 bytes are retrieved on the first access, and a second access has to be made to retrieve the fourth byte.



You know, I never wanted to be a programmer...

Alexandre Moura
The reason that memory addressing is done in multiples of 4 is that (most) consumer systems are 32-bit systems (Win32); i.e. 4 * 8 = 32 bits. When 64-bit systems arrive in the consumer market, addressing will be done in multiples of 8 instead.
/pitchblack
I should point out that the OS has absolutely nothing to do with the word size of a machine. It is the x86 architecture that is a 32-bit architecture. Win32 has nothing to do with it at all.
What will happen on a 64-bit system when I use the #pragma pack() directive? Should I care about that or not? I thought about writing bytewise loading and writing methods, since #pragma pack() is VC++ specific...
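
For reference, a packed version of the struct might look like the sketch below. #pragma pack is a compiler extension rather than standard C++, although GCC and Clang accept this push/pop form as well as VC++. The name PackedImgInfo and the switch to std::uint32_t are assumptions of the example; fixed-width types keep the field sizes the same on 32-bit and 64-bit targets.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::uint32_t

#pragma pack(push, 1)   // compiler extension: no padding between members
struct PackedImgInfo {
    char          fileName[32];
    std::uint32_t numChannels;
    unsigned char encoding;
    std::uint32_t offset;      // now sits at a misaligned address
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t numColors;
};
#pragma pack(pop)       // restore the previous packing setting

// Fails to compile if the compiler did not honour the pragma.
static_assert(sizeof(PackedImgInfo) == 53, "expected a padding-free layout");
static_assert(offsetof(PackedImgInfo, offset) == 37, "offset follows encoding directly");
```

The trade-off is that the integer fields after encoding are no longer aligned, so access to them may be slower, and pointers taken to those fields should not be handed to code that expects aligned data.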
Addressing is NOT performed in multiples of 4. The issue here is alignment.

If you want to read a double word (4 bytes) from an odd address, you can do it. But X86 chips perform best if data is aligned on a double word boundary (an address which is a multiple of 4.)

On some processors (especially RISC chips), trying to access data on a misaligned address will cause an exception. I believe the X86 supports this feature (Alignment Checking bit in one of the control registers?) but it isn't used by any operating systems that I know of.
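
When a value has to be read from an address that may be misaligned (a field inside a byte buffer loaded from a file, for instance), copying the bytes into a properly aligned variable avoids the problem on any processor. A minimal sketch, with readU32 as a made-up helper name:

```cpp
#include <cstdint>
#include <cstring>   // std::memcpy

// Read a 32-bit value from p, which may point to any byte address.
// The bytes are copied into an aligned local instead of dereferencing
// a cast pointer, so no misaligned load is requested by the program.
std::uint32_t readU32(const unsigned char* p) {
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);
    return value;   // still in the host's byte order
}
```

For example, readU32(buffer + 37) would fetch a 32-bit field that starts at byte 37 of a buffer without relying on the hardware tolerating misaligned access.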

---
Bart

