TylerShao

Memory alignment question



I am reading the book "Game Coding Complete" and am confused by what it says about memory alignment. It says:

The CPU reads and writes memory-aligned data much faster than other data. Any N-byte data type is memory aligned if the starting address is evenly divisible by N. For example, a 32-bit integer is memory aligned on a 32-bit architecture if the starting address is 0x04000000. The same 32-bit integer is unaligned if the starting address is 0x04000002, since the memory address is not evenly divisible by 4 bytes.

I am then confused by the example code below. The book says the first struct is the slowest, the second is slow, and the third is fast; only the last one is memory aligned. How is that conclusion reached? I know the definition that any N-byte data type is memory aligned if its starting address is evenly divisible by N, but how do I determine the starting address of a structure?



#pragma pack(push, 1)
struct ReallySlowStruct
{
    char c : 6;
    __int64 d : 64;
    int b : 32;
    char a : 8;
};
struct SlowStruct
{
    char c;
    __int64 d;
    int b;
    char a;
};
struct FastStruct
{
    __int64 d;
    int b;
    char a;
    char c;
    char unused[2];
};
#pragma pack(pop)
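
The divisibility rule quoted from the book can be checked directly. The following is only a minimal sketch: `is_aligned` is a hypothetical helper name, not something from the book, and the two addresses are the ones used in the quoted passage.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t

// Hypothetical helper: true when `address` is evenly divisible by `n`,
// which is the book's definition of "memory aligned" for an N-byte type.
constexpr bool is_aligned(std::uintptr_t address, std::size_t n) {
    return address % n == 0;
}

// The two addresses from the quoted passage, tested for a 4-byte integer.
static_assert(is_aligned(0x04000000u, 4), "0x04000000 is a multiple of 4");
static_assert(!is_aligned(0x04000002u, 4), "0x04000002 leaves a remainder of 2");
```

The same test applies to every member of a struct: a member's address is the struct's starting address plus the member's offset inside the struct.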

In general, his main point is that you can sometimes lay out a structure better than the compiler does, saving space and lowering access time. To do so, add up the bytes as you go down the structure, minimize the places where the running bit count is not a multiple of 8, and make the size of the structure as a whole a multiple of 4 bytes if you are writing 32-bit code.
In any case, I'd dare say that you might just want to avoid #pragma pack(push, 1) and let the compiler pad the struct for you unless your project is going to be pushing the limits of modern computers. Even then, it might be better to write it without that kind of optimization first and optimize later, once you have identified the memory usage of the struct as a real performance bottleneck. As Knuth said, "...premature optimization is the root of all evil."
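
The "add up the bytes as you go down the structure" step can be sketched with `offsetof`. This is an illustration under a few assumptions: `std::int64_t` and `std::int32_t` stand in for the book's `__int64` and `int`, `naturally_aligned` is a hypothetical helper, and `#pragma pack` is a compiler extension (accepted by MSVC, GCC and Clang). `ReallySlowStruct` is left out because bit-field layout is implementation-defined.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::int64_t, std::int32_t

#pragma pack(push, 1)   // no padding: each offset is the sum of the sizes before it
struct SlowStruct {
    char         c;          // offset 0
    std::int64_t d;          // offset 1:  not a multiple of 8
    std::int32_t b;          // offset 9:  not a multiple of 4
    char         a;          // offset 13
};
struct FastStruct {
    std::int64_t d;          // offset 0
    std::int32_t b;          // offset 8
    char         a;          // offset 12
    char         c;          // offset 13
    char         unused[2];  // offsets 14-15, rounds the size up to 16
};
#pragma pack(pop)

// Hypothetical helper: is a member of `size` bytes at `offset` aligned
// relative to the start of its struct?
constexpr bool naturally_aligned(std::size_t offset, std::size_t size) {
    return offset % size == 0;
}

static_assert(offsetof(SlowStruct, d) == 1, "d follows the 1-byte c directly");
static_assert(!naturally_aligned(offsetof(SlowStruct, d), sizeof(std::int64_t)),
              "SlowStruct::d is misaligned");
static_assert(naturally_aligned(offsetof(FastStruct, d), sizeof(std::int64_t)) &&
              naturally_aligned(offsetof(FastStruct, b), sizeof(std::int32_t)),
              "FastStruct members sit on multiples of their size");
```

Note that these offsets are relative to the start of the struct. Under pack(1) the compiler typically requires only 1-byte alignment for the struct itself, so the members of `FastStruct` end up aligned in memory only when the struct starts at a suitably aligned address, for example when it comes from a general-purpose allocator.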


Quoting TylerShao: I am reading the book "Game Coding Complete" and am confused about memory alignment.
Memory alignment techniques like that are an intermediate-to-advanced topic.

You will need to know a little bit about it for serialization (saving/loading blocks of memory to storage or across a network).

His point can be summarised as: lay out your classes and structures with the biggest members first and the smallest members last, or at least keep the small members next to each other.
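
As a rough illustration of the "biggest first" rule without any #pragma pack, the sketch below lets the compiler insert its own padding and compares two orderings of the same members. The struct names are made up for this example, and the exact sizes depend on the platform ABI; on a typical 64-bit target the interleaved version is 24 bytes and the sorted one 16.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::int64_t, std::int32_t

// Same members as the book's SlowStruct, default compiler padding.
struct Interleaved {
    char         c;   // usually followed by padding so d can be aligned
    std::int64_t d;
    std::int32_t b;
    char         a;   // usually followed by tail padding
};

// Same members, largest first: the small members share one slot.
struct LargestFirst {
    std::int64_t d;
    std::int32_t b;
    char         a;
    char         c;
};

// Bytes of real data in either struct: 8 + 4 + 1 + 1.
constexpr std::size_t kPayloadBytes =
    sizeof(std::int64_t) + sizeof(std::int32_t) + 2 * sizeof(char);

// Padding the compiler added to each layout.
constexpr std::size_t kInterleavedPadding  = sizeof(Interleaved)  - kPayloadBytes;
constexpr std::size_t kLargestFirstPadding = sizeof(LargestFirst) - kPayloadBytes;

static_assert(kLargestFirstPadding < kInterleavedPadding,
              "sorting members by size wastes less padding");
```

Here every member stays aligned in both versions because the compiler pads as needed; the ordering only changes how much space that padding costs.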

The rest of it is a micro-optimization that is POTENTIALLY useful inside very tight loops, such as particle systems that must iterate over and process a collection of objects potentially many tens of millions of times per second. (A few tens of thousands of particles running at high speed really can consume CPU power and memory bandwidth, and do need that type of optimization.)

Do it wrong and there is a very high performance cost: accessing misaligned data can either crash the program (on architectures that fault on misaligned access) or cause a stall as the CPU must re-align it internally, potentially crossing cache lines and doing other nasty, expensive stuff.
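
One common way to sidestep the misaligned access itself, for instance when pulling a value out of a serialized byte buffer, is to assemble it from single-byte reads. This is a sketch of that general technique, not something taken from the book; `load_le32` and `kBuffer` are hypothetical names, and the buffer is assumed to hold the value in little-endian order.

```cpp
#include <cstdint>   // std::uint32_t

// Reads a 32-bit little-endian value starting at p, one byte at a time.
// Every access is a 1-byte read, so p may have any alignment.
constexpr std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical buffer: one leading byte, then 0x12345678 in little-endian
// order, so the 32-bit value starts at an odd offset.
constexpr unsigned char kBuffer[] = {0xFF, 0x78, 0x56, 0x34, 0x12};

static_assert(load_le32(kBuffer + 1) == 0x12345678u,
              "value is recovered from an odd offset");
```

The trade-off is a few extra shifts and ORs per read, which is usually acceptable for loading and saving but not what you want inside the tight loops mentioned above.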
