Performance of #pragma pack(1)

This topic is 4557 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

What are the performance costs of using #pragma pack(1) when accessing structures? In addition, I've read that it pads every element to 4 bytes (I've also read that it doesn't). What does it actually do? E.g., is this correct?
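struct strObject
{
char chrHi;
char _Padding1, _Padding2, _Padding3; /* 3 bytes so intTest starts at offset 4 */

int intTest;

short intTesting;
short _Padding4; /* renamed: a second _Padding1 would be a duplicate member */
};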
My program accesses packed structures half a million times per frame (and runs at roughly 160 FPS). Could modifying my struct boost its performance?

Quote:
Original post by Thevenin
[...]My program accesses packed structures a half million times per frame (And runs at roughly 160FPS), could modifying my struct boost its performance?
Why not make such trivial changes and then benchmark them to see what difference they make? Such profiling would be infinitely more accurate than any guesses we could make about your code.
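
A minimal sketch of such a benchmark, under some loud assumptions: the struct names, the COUNT value and the helper functions are all hypothetical, and the member mix is only a guess at what the real structs look like. It builds the same struct twice, once under pack(1) and once with default packing, and times a pass over each with the standard clock().

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical test structs: same members, different packing. */
#pragma pack(push, 1)
struct packed_obj { char tag; int value; short extra; };
#pragma pack(pop)

struct natural_obj { char tag; int value; short extra; };

#define COUNT 500000 /* roughly the per-frame access count mentioned above */

static struct packed_obj packed_items[COUNT];
static struct natural_obj natural_items[COUNT];

/* Fill both arrays with identical values so the sums can be compared. */
static void fill_items(void)
{
    for (size_t i = 0; i < COUNT; ++i) {
        packed_items[i].value = (int)i;
        natural_items[i].value = (int)i;
    }
}

/* Sum the (possibly misaligned) int in every packed element. */
static long long sum_packed(void)
{
    long long total = 0;
    for (size_t i = 0; i < COUNT; ++i)
        total += packed_items[i].value;
    return total;
}

/* Same loop over the naturally aligned elements. */
static long long sum_natural(void)
{
    long long total = 0;
    for (size_t i = 0; i < COUNT; ++i)
        total += natural_items[i].value;
    return total;
}

/* Run fn `passes` times; return elapsed CPU seconds, store the sum in *result. */
static double time_passes(long long (*fn)(void), int passes, long long *result)
{
    clock_t start = clock();
    long long total = 0;
    for (int p = 0; p < passes; ++p)
        total += fn();
    *result = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call fill_items() once, then compare time_passes(sum_packed, 1000, &r) against time_passes(sum_natural, 1000, &r). Printing r keeps the work observable, but an optimizer may still reshape the loops, so treat the numbers as a starting point and profile the real program as well.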

Quite frankly, I don't know how to go about declaring some structs packed, and others non-packed.

My guess would be...
#pragma pack (1)
struct strHi
{
char blahblah;
};

#pragma pack
struct strBye
{
char blahblah;
};

No. Use it like this:


#pragma pack(show)
#pragma pack(push)
#pragma pack(1)
#pragma pack(show)
struct first
{
char a;
int b;
};
#pragma pack(pop)
#pragma pack(push)
#pragma pack(8)
#pragma pack(show)
struct second
{
char a;
int b;
};
#pragma pack(pop)


Regards,

Don't guess, look at the docs.

The docs are for MSVC. Structure padding and alignment requirements can vary from compiler to compiler and processor to processor. For MSVC the default is to align members to their natural alignment (e.g. 2-byte variables to 2-byte alignment, 4-byte to 4-byte, etc.) unless told otherwise.

Quote:
Original post by Thevenin
What are the performance costs for using #pragma pack(1) for accessing structures?

In addition, I've read it pads every element for 4 bytes (I've also read it doesn't), what does it actually do?

pack(1) causes structure members to be aligned on 1-byte boundaries, so there's never any padding between elements. pack(2) caps member alignment at 2 bytes, so a char gets one byte of padding after it only when the next member needs 2-byte (or larger) alignment. By default, members are typically aligned to their natural alignment, but you can't be sure of this. *edit: except by consulting your compiler documentation. The C and C++ standards don't specify how much padding is used.
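
To make the pack(1) versus pack(2) difference concrete, here is a small sketch. The struct names and print_layouts are hypothetical, and it assumes a compiler that accepts #pragma pack(push, n) (MSVC, GCC and Clang do). The same four members are declared under three settings and their offsets are printed with offsetof.

```c
#include <stddef.h> /* offsetof */
#include <stdio.h>

#pragma pack(push, 1)
struct layout_pack1 { char a; int b; char c; char d; };
#pragma pack(pop)

#pragma pack(push, 2)
struct layout_pack2 { char a; int b; char c; char d; };
#pragma pack(pop)

/* Whatever the compiler does when no pack pragma is in effect. */
struct layout_default { char a; int b; char c; char d; };

/* Print where b, c and d land, plus the total size, for each setting. */
static void print_layouts(void)
{
    printf("pack(1): b=%zu c=%zu d=%zu size=%zu\n",
           offsetof(struct layout_pack1, b), offsetof(struct layout_pack1, c),
           offsetof(struct layout_pack1, d), sizeof(struct layout_pack1));
    printf("pack(2): b=%zu c=%zu d=%zu size=%zu\n",
           offsetof(struct layout_pack2, b), offsetof(struct layout_pack2, c),
           offsetof(struct layout_pack2, d), sizeof(struct layout_pack2));
    printf("default: b=%zu c=%zu d=%zu size=%zu\n",
           offsetof(struct layout_default, b), offsetof(struct layout_default, c),
           offsetof(struct layout_default, d), sizeof(struct layout_default));
}
```

On a target with a 4-byte int, I would expect b at offset 1 under pack(1) and at offset 2 under pack(2), with c and d adjacent in both cases since a char never needs padding before it. The default line is the one that varies between compilers and targets.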

The performance cost is that processors are designed to access memory on word boundaries. With a four-byte word, you access 0x00 through 0x03 all at once, but not 0x01 through 0x04. So if you have an int that starts on 0x01, then it requires two reads and some shuffling to get all the data. With #pragma pack(1) in place this happens if you have a char followed by an int. So it slows down the program, but decreases the memory requirements [because without it a few bytes might be added between the two to ensure the int is easy to read].

It also helps with portability in some cases, but you didn't ask about that.

As to your code, that is what the compiler may or may not do by default [except for the closing pad, that's unnecessary in all circumstances], or if you specified #pragma pack(4). If you do pack(1), there would be no padding.

CM

Quote:
Original post by Conner McCloud
The performance cost is that processors are designed to access memory on word boundaries. With a four byte word, you access 0x00 through 0x03 all at once, but not 0x01 through 0x04. So if you have an int that starts on 0x01, then it requires two reads and some shuffling to get all the data. With #pragma pack(1) in place this happens if you have a char followed by an int. So it slows down the program, but decreases the memory requirements [because without it a few bytes might be added between the two to ensure the int is easy to read].


So a packed structure of say..
struct strRGB
{
unsigned char bytBlue,bytGreen,bytRed;
};



In the memory it looks like this.. (Packed)

[b][g][r][b] | [g][r][b][g] | [r][b][g][r] |

... reading the first RGB is going to be fast since all the data is already word aligned, however, in reading the second and third RGB's it is going to have to read from the first and second words, right?

And in the case of padding...
[b][g][r][_] | [b][g][r][_] | [b][g][r][_] |

Reading each structure takes reading only one word?

Edit: Or in padding, is it aligned like this..?
[b][_][_][_] | [g][_][_][_] | [r][_][_][_] |

When padding, it adds all the unused bytes at the end of the sequence. (I don't know how to express this properly.)

struct rgb
{
char r;
char g;
char b;
};

would be padded as

struct rgb
{
char r;
char g;
char b;
char _not_used_;
};

and it is read as b g r _ | b g r _ | b g r _ etc...

Quote:
Original post by Thevenin
So a packed structure of say..
*** Source Snippet Removed ***

In the memory it looks like this.. (Packed)

[b][g][r][b] | [g][r][b][g] | [r][b][g][r] |

... reading the first RGB is going to be fast since all the data is already word aligned, however, in reading the second and third RGB's it is going to have to read from the first and second words, right?

Well, first of all, let me take back something I said before. I suggested padding at the end would never be necessary, my logic being that when you declare two structures, padding can be added after the first if needed, rather than making it a part of the first. But in the case of an array I don't think they can add padding between the elements of the array. So it may be necessary in that sense. I just don't know, so ignore that comment.

With that out of the way, it really depends. I believe single bytes can be read just as efficiently from anywhere, so there is probably no reason to pad them. But I might be wrong...that's really up to the processor. Which is why we have optimizing compilers. Microsoft pays engineers a lot of money to figure out how the various processors like to read memory, specifically so that we don't have to.

If you really want to know what your compiler is doing, then just take the address of the members and see how far apart they are.
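
Something along these lines would do it. This is only a sketch: byte_distance and print_member_offsets are hypothetical helper names, and the struct is the one from the original question with the hand-written padding members left out so that the compiler's own choices show up.

```c
#include <stddef.h>
#include <stdio.h>

/* The struct from the question, minus the manual padding members. */
struct strObject
{
    char chrHi;
    int intTest;
    short intTesting;
};

/* Distance in bytes from the start of an object to one of its members. */
static size_t byte_distance(const void *base, const void *member)
{
    return (size_t)((const char *)member - (const char *)base);
}

/* Print each member's distance from the start of the struct, and its size. */
static void print_member_offsets(void)
{
    struct strObject obj; /* only addresses are taken, values are never read */
    printf("chrHi      at +%zu\n", byte_distance(&obj, &obj.chrHi));
    printf("intTest    at +%zu\n", byte_distance(&obj, &obj.intTest));
    printf("intTesting at +%zu\n", byte_distance(&obj, &obj.intTesting));
    printf("sizeof     =  %zu\n", sizeof obj);
}
```

Any gap between one member's end and the next member's start is padding the compiler inserted, and sizeof minus the end of the last member is the tail padding.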

CM

An array of your strRGB will always be laid out as

[b][g][r][b] | [g][r][b][g] | [r][b][g][r]


regardless of #pragma pack (again, this is MSVC). You can write a really trivial test program to verify that for yourself. The whole structure has a natural alignment of 1 because all of its members have a natural alignment of 1.
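
A sketch of that trivial test, with rgb_size and rgb_stride as hypothetical helper names. It reports the struct size and the byte stride between two neighbouring array elements, with no pack pragma in effect.

```c
#include <stddef.h>

struct strRGB
{
    unsigned char bytBlue, bytGreen, bytRed;
};

/* Size of one element as the compiler sees it. */
static size_t rgb_size(void)
{
    return sizeof(struct strRGB);
}

/* Byte stride between consecutive elements of an array of strRGB. */
static size_t rgb_stride(void)
{
    static struct strRGB pixels[4];
    return (size_t)((unsigned char *)&pixels[1] - (unsigned char *)&pixels[0]);
}
```

If both come back as 3, the elements are packed back to back with no filler byte, which is what I would expect from MSVC, GCC and Clang on mainstream targets. The C standard itself does not promise it, so it is worth running on the compiler you actually ship with.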

Quote:
... reading the first RGB is going to be fast since all the data is already word aligned, however, in reading the second and third RGB's it is going to have to read from the first and second words, right?

Maybe. It depends whether the compiler is smart enough to know for sure that the first read would always be aligned properly or not. If it can't deduce that then it has to assume the worst and do unaligned reads all the time.

Quote:
The performance cost is that processors are designed to access memory on word boundaries. With a four byte word, you access 0x00 through 0x03 all at once, but not 0x01 through 0x04. So if you have an int that starts on 0x01, then it requires two reads and some shuffling to get all the data. With #pragma pack(1) in place this happens if you have a char followed by an int. So it slows down the program

Or even more fun, on some RISC processors your app will crash if you try to do an unaligned read.
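
One common way to sidestep that when you have to pull a value out of a byte buffer yourself is to copy the bytes with memcpy instead of dereferencing a misaligned pointer. A minimal sketch follows; load_u32 is a hypothetical helper, not something from the posts above.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from any address, aligned or not.
   memcpy has no alignment requirement on its arguments, so this does
   not depend on the hardware tolerating a misaligned load. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The value comes back in the machine's native byte order. Compilers commonly turn the memcpy into a single load where the target allows unaligned access, though that is worth confirming in the generated code rather than assuming.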

Quote:
Original post by Emmanuel Deloget
No. Use it like this:


#pragma pack(show)
#pragma pack(push)
#pragma pack(1)
#pragma pack(show)
struct first
{
char a;
int b;
};
#pragma pack(pop)
#pragma pack(push)
#pragma pack(8)
#pragma pack(show)
struct second
{
char a;
int b;
};
#pragma pack(pop)


Regards,


I pragma the definitions, not the instances right? (I could swear MSDN said otherwise).

eg..

I don't do this...

struct first
{
char a;
int b;
};
struct second
{
char a;
int b;
};
#pragma pack(show)
#pragma pack(push)
#pragma pack(1)
#pragma pack(show)
struct first Test1;
#pragma pack(pop)
#pragma pack(push)
#pragma pack(8)
#pragma pack(show)
struct second Test2;



Quote:
Original post by Anon Mike
Or even more fun, on some RISC processors your app will crash if you try to do an unaligned read.

Always good to know. [smile]

Quote:
Original post by Thevenin
I pragma the definitions, not the instances right? (I could swear MSDN said otherwise).

Yes. The layout of a structure is established when the structure is defined. Otherwise, you could use the pack command to create two instances of a structure with different sizes. sizeof(x) would be useless in this case.
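
A quick sketch to check this, assuming a compiler that accepts #pragma pack(push, n). The struct names follow the earlier example, while test2 and the two size helpers are hypothetical.

```c
#include <stddef.h>

#pragma pack(push, 1)
struct first { char a; int b; };  /* pack(1) in effect at the definition */
#pragma pack(pop)

struct second { char a; int b; }; /* default packing at the definition */

/* Wrapping an instance in the pragma leaves the existing layout alone. */
#pragma pack(push, 1)
static struct second test2;
#pragma pack(pop)

/* Size of the struct that was defined under pack(1). */
static size_t size_of_first(void)
{
    return sizeof(struct first);
}

/* Size of an instance declared under pack(1) but defined without it. */
static size_t size_of_test2(void)
{
    return sizeof test2;
}
```

size_of_first() should come out as 1 + sizeof(int), while size_of_test2() should match sizeof(struct second) exactly, pragma or no pragma around the instance.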

CM
