This topic is now archived and is closed to further replies.

Abstraction

sizeof and its wrath

Abstraction    122
So I had this struct and I was pretty dang sure it took up 14 bytes. Take a look...
  
typedef struct MS3D_HEADER
{
     char id[10];
     int  version;
} MS3D_HEADER;
  
Then I call sizeof(MS3D_HEADER) and it says the thing is 20 bytes long. After a lot of recounting followed by breaking things and cursing I fumbled around the documentation of the sizeof operator and read this (more or less)... sizeof will return the size of the object in bytes taking into account bytes added for memory padding. What!? Why are bytes added and what the heck is the padding for? Is it done by the system or the compiler? Does the padding come at the beginning or the end? And what about this scenario -
  
MS3D_HEADER header;
BYTE *ptr;

ptr = (BYTE *)&header;  // take the address of the struct, not the struct itself
ptr += 10;
  
Where does ptr point to? Is it at header.version, at some random byte spliced in there for memory alignment, or what? I am completely self-taught and get blindsided by stuff like this all the time. So after you call me stupid and answer my questions (hopefully) could you suggest some books or other sources that would educate me? I'm getting tired of this sort of thing.

Draxis    122
Well I THINK I read about padding once before and I THINK it works like this:
It likes everything in the structure to fit into 'cells' of memory which are the same size as the largest variable in the structure.
So, for instance, in your structure, the largest variable is that 10 byte string, but instead of appending the 4 byte int to the end of that 10 byte string, it allocates ANOTHER 10 bytes and sticks it in there.
BASICALLY it likes to increment the memory allocated to a structure by the largest element in that structure.
I'm not good at explaining...

PS: Consequently, if you really want to control this, I THINK you can use bit fields to restrict the size of the structure, but you need to look that up because I have no idea what I'm talking about =P

Edited by - Draxis on February 6, 2002 10:30:54 PM

Guest Anonymous Poster   
Ok, so here is the deal.

Each processor has different alignment requirements. If the data is aligned properly, performance improves; if it is not, performance suffers.

Using the pack pragma (look up #pragma pack on MSDN) you can pack your data into the extra bytes that are used to pad each variable, or you can change the alignment in the compiler options (if you're using Visual C++). What you don't want to do is leave the data unaligned, as it will decrease your overall performance.

Spiral    122
A 3-second Google search came up with this, it should explain to you why your compiler pads structs.

If you want to change which boundary your compiler pads at, read your compiler's manual. If you're using MSVC, you can either use the /ZpX compiler switch (where X is the number of bytes to align to, 1 being no padding), or you can use the #pragma pack(push, X) directive and #pragma pack(pop). e.g:

    

#pragma pack(push, 1) // changes byte alignment to 1

struct myStruct {
    int myInt;
    long myLong;
};

#pragma pack(pop) // changes back to the default byte alignment




Edited by - Spiral on February 6, 2002 10:38:53 PM

Spiral    122
Just to illustrate why your struct came out at 20 bytes:

      
typedef struct MS3D_HEADER
{
    char id[10];  // Data size is 10, this will be padded to 16 bytes, so +6 extra
    int version;  // Data size is 4, this will have no padding
    // hence, 10 + 6 + 4 = 20
} MS3D_HEADER;


quote:
Original post by Abstraction
Where does ptr point to? Is it at header.version, at some random byte spliced in there for memory alignment, or what?



I've actually never thought about this, but it should point to the padding byte just after id[9]. Why don't you try it?

[edit] - Just to clarify, the data sizes are the ones on my computer (ie char is 1 byte, int is 4 bytes). They may be different on other platforms. I use win2k.

Edited by - Spiral on February 6, 2002 10:51:47 PM

Spiral    122
quote:
Original post by Draxis
Well I THINK I read about padding once before and I THINK it works like this:
It likes everything in the structure to fit into 'cells' of memory which are the same size as the largest variable in the structure.
So, for instance, in your structure, the largest variable is that 10 byte string, but instead of appending the 4 byte int to the end of that 10 byte string, it allocates ANOTHER 10 bytes and sticks it in there.
BASICALLY it likes to increment the memory allocated to a structure by the largest element in that structure.
I'm not good at explaining...

PS: Consequently, if you really want to control this, I THINK you can use bit fields to restrict the size of the structure, but you need to look that up because I have no idea what I'm talking about =P



There was absolutely nothing correct in this post. I've just explained your first point, and bit fields have nothing to do with alignment. They just allow you to specify the size/width of a variable.

EvilCrap    134
i think that it aligns to sizeof int, since the cpu works with ints best.

are u sure that 20 isn't a typo? it should be 16, i think, since ints are 4 bytes long.

actually, i just tried it in MSVC++, and it's 16: what compiler are u using?
this is probably over cryptic. but anyway,
    
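#include <iostream>
using namespace std;

const int len = 10;  // was "max", which can clash with std::max

struct S
{
    char c[len];
    int i;
};

int main()
{
    cout << sizeof(S) << endl;             // total size, padding included
    S s;
    for (int i = 0; i < len; i++)
        s.c[i] = i;
    s.i = 15;
    char* ptr = (char*)&s;
    cout << (int)ptr[len - 1] << endl;     // last array element, printed as a number
    // skip the padding after c[] (assumes int sits on a 4-byte boundary) and read s.i
    cout << *(int*)(&ptr[len] + sizeof(char) * ((4 - len % 4) % 4)) << endl;
    // distance in bytes from the start of the struct to s.i
    cout << (char*)&s.i - (char*)&s.c << endl;
    return 0;
}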

i wouldn't count on a compiler organizing the data to literally represent a struct, even though in this example, the pads are added before the int.

im pretty sure there is a way to turn off padding...

Edited by - evilcrap on February 6, 2002 11:03:47 PM

Abstraction    122
Thanks for the replies.

AP - you basically confirmed my suspicions, very reassuring.

Spiral - So it is my compiler and not some bizarre mechanism in memory. Thanks for the search. I have been doing the same thing, however, sometimes real live typing(?) people shed light on the little details and pitfalls that are hard to realize on your own. Anyway, #pragma pack(push, 1) and pack(pop) sounds like a good solution. I've seen that before and always wondered, I guess now I know. Thanks a bunch (all of you) for your most excellent help.

Just one more thing. If I use the solution above and take one of my objects out of alignment, will performance be poor just in sections dealing with instances of the object or will it throw the whole application out of alignment?

Oh, I do have MSVC 6.0 (Introductory Edition). It didn't come with a manual, just some book about MFC and a book on CD that teaches you how to program (tries to at least). Only basic (very basic) information is given about the compiler's settings and interface. I'll shut up now.

Abstraction    122
EvilCrap - It comes out 20 every time, I swear. I have MSVC++ 6.0. Once I started hearing "memory alignment" and padding I totally thought it should be 16 or by some stretch 24 but not 20. I'm thinking Spiral knows what he's talking about though.

Spiral    122
Wait a second... I could be wrong in my analysis of what would happen in your struct... I've just tried it out on my compiler (MSVC 6 Pro) and EvilCrap is right, it comes out as size 16. With some playing I've discovered it inserts 2 padding bytes after the char id[10].

Obviously one of those differences between the Introductory/Standard edition and Pro compilers, i suppose.

Anyway...

quote:

Just one more thing. If I use the solution above and take one of my objects out of alignment, will performance be poor just in sections dealing with instances of the object or will it throw the whole application out of alignment?



Wrapping whichever structs you want to change the alignment for with #pragma pack will just change the byte alignment for structs of that type... it will not affect the whole of your application. Using the /Zp switch will.

Spiral    122
Abstraction - perhaps you would be good enough to compile & run EvilCrap's code to see if I was right about where the padding takes place? This has got me stumped now.

Draxis    122
quote:
Original post by Spiral
There was absolutely nothing correct in this post. I've just explained your first point, and bit fields have nothing to do with alignment. They just allow you to specify the size/width of a variable.


Yeesh! Don't bite my head off! I said 'THINK' and I've only been programming C/C++ for a little under a year so I'm fairly newbish =P
I need to get better books...

Oluseyi    2112
quote:
Original post by Spiral
Obviously one of those differences between the Introductory/Standard edition and Pro compilers, i suppose.

Um, no. The data is word aligned, and each word is 4 bytes - meaning the next word-aligned boundary is 12 bytes (2 extra bytes). However, if for some reason the data was double word aligned (since the bus width is usually 64 bits on 32-bit platforms) then you'd get the 20-byte size.


[ GDNet Start Here | GDNet Search Tool | GDNet FAQ | MS RTFM [MSDN] | SGI STL Docs | Google! ]
Thanks to Kylotan for the idea!

Spiral    122
quote:
Original post by Draxis
Yeesh! Don't bite my head off! I said 'THINK' and I've only been programming C/C++ for a little under a year so I'm fairly newbish =P



Sorry, I didn't mean it to come out that way. I did throw in a smiley in the hope that you'd understand I wasn't biting anything

quote:

Original post by Oluseyi
Um, no. The data is word aligned, and each word is 4 bytes - meaning the next word-aligned boundary is 12 bytes (2 extra bytes). However, if for some reason the data was double word aligned (since the bus width is usually 64 bits on 32-bit platforms) then you'd get the 20-byte size.



Thanks for clearing that up.

Draxis    122
quote:
Original post by Spiral

Sorry, I didn't mean it to come out that way. I did throw in a smiley in the hope that you'd understand I wasn't biting anything



It's okie! I'm sorry, too. I'm just so used to getting my head bitten off by programmers for being wrong
And thanks for the explanation, everyone!
You set me straight and I learned something new.
Yay! Learning is fun!

Abstraction    122
Oh crap! You guys are going to be ticked.
Turns out...um...well...sizeof(MS3D_HEADER) really does return 16. *banging head on wall* Don't ask, I'm really sorry. I've been known to read things down up instead of up down.

Oluseyi - if padding depends on the bus size, then if my app is compiled on a system with a 64 bit bus what happens when it is executed on a system with a 32 bit bus? Assuming I don't suppress the padding and my compiler is optimizing things for my computer. I'm guessing the 32 bit bus system will just end up doing more work, right? Do games ever dynamically pad their critical structures to get the best performance?

Goodness I feel like a little kid...so many questions.

I have to go now so no big deal if you don't answer. I'm sure it'll hit me sooner or later. That site Spiral pointed me to has tons of good stuff in it (thanks for that).

Thanks for the positive feedback everyone, good night.

Oluseyi    2112
quote:
Original post by Abstraction
Oluseyi - if padding depends on the bus size, then if my app is compiled on a system with a 64 bit bus what happens when it is executed on a system with a 32 bit bus?

Generally, all processors of the same family/instruction set will have similar architecture. If you compile an application on a 64-bit bus processor, with processor-specific optimizations (like padding), it simply won't run on a lesser processor. For example, a Pentium-specific app generally won't run on a 486.


Gamekeeper    122
(have not read all replies)
The operating system adds some extra information after the vector that is used for memory management. If you, for example, create a pointer and initialize it to point to a vector like this:

char* vec = new char[number];

the system needs to add extra information for when you're going to delete the vector like this.

delete[] vec;

how would it otherwise recognise the end of the vector? This extra information is also added on vectors that are initialized this way:

char vec[number];

(or this way)

char vec[] = {...};




Message from above:
Damn, my hair is grey!
