polar

How do I get 16-byte alignment when calling malloc()?

I checked the pointer returned from malloc(), and it seems to align data only to 4 bytes. But I need my data to begin at an address that is a multiple of 16. How can I do it? Do I have to override malloc()? If so, how is that implemented? Thanks.

If you want the address of the block to be a multiple of 16, you can always allocate a block 16 bytes bigger than what you need. Then shift the pointer over so that it is properly aligned. You'll have to save the original pointer so that you can free the memory.

Domini

> i *think* there is a precompiler directive to change the default alignment.

#pragma pack ?

If I understand it correctly, it applies to structures and unions.
Quoting the MSDN docs:


Specifies packing alignment for structure and union members. Whereas the packing alignment of structures and unions is set for an entire translation unit by the /Zp option, the packing alignment is set at the data-declaration level by the pack pragma. The pragma takes effect at the first structure or union declaration after the pragma is seen; the pragma has no effect on definitions.
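
To make the quoted description concrete, here is a minimal sketch of what the pack pragma changes: the offsets of members inside a structure, and nothing about where malloc() places a block. It assumes a compiler that accepts #pragma pack (MSVC, GCC and Clang do); the struct names and the two helper functions are made up for illustration.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Default layout: the compiler normally pads after `tag` so that
   `value` starts on its natural boundary. */
struct natural_layout {
    char tag;
    int  value;
};

/* Packed layout: with a packing alignment of 1, no padding is inserted. */
#pragma pack(push, 1)
struct packed_layout {
    char tag;
    int  value;
};
#pragma pack(pop)

/* Hypothetical helpers that report where `value` ends up in each struct. */
size_t natural_value_offset(void)
{
    return offsetof(struct natural_layout, value);
}

size_t packed_value_offset(void)
{
    return offsetof(struct packed_layout, value);
}
```

Only the member offsets and the struct size differ between the two declarations. A pointer obtained from malloc(sizeof(struct packed_layout)) gets whatever alignment the allocator provides, regardless of the pragma.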



"Paranoia is the belief in a hidden order behind the visible." - Anonymous

thanx guys.
i tried this:

#pragma pack(push,16)

typedef struct{
}whatever;

#pragma pack(pop)

guess what? it doesn't work.
the address returned from malloc() is still not aligned to 16. i guess #pragma pack only controls how the members are laid out inside the structure, not the address that malloc() returns.
i've heard that i can reset the hook in malloc.h to make the address a multiple of 16, but i have no idea how to do it. is anybody familiar with this?

btw, i want to do this because SIMD instructions use 128-bit registers, so they take extra clock cycles when reading from an address that is not aligned to 16 bytes.
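
For the SIMD case, one possible sketch is below. It assumes a C11 standard library, where aligned_alloc() is available and its result is released with plain free(). MSVC's runtime does not provide aligned_alloc() and offers _aligned_malloc()/_aligned_free() instead. The helper names is_aligned16 and alloc_simd_floats are hypothetical.

```c
#include <stdint.h>  /* uintptr_t, SIZE_MAX */
#include <stdlib.h>  /* aligned_alloc, free (C11) */

/* Hypothetical helper: nonzero if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Hypothetical helper: room for `count` floats starting on a 16-byte
   boundary, or NULL on failure. Release the result with free(). */
float *alloc_simd_floats(size_t count)
{
    size_t bytes;

    /* refuse requests whose byte count would overflow size_t */
    if (count > (SIZE_MAX - 15) / sizeof(float))
        return NULL;

    /* C11 originally required the size to be a multiple of the alignment,
       so round the byte count up to the next multiple of 16 */
    bytes = (count * sizeof(float) + 15) & ~(size_t)15;
    if (bytes == 0)
        bytes = 16;

    return (float *)aligned_alloc(16, bytes);
}
```

is_aligned16() is also handy as a debug check on any pointer before handing it to an aligned SIMD load.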

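/* needs <stdint.h> and <stdlib.h>; byte is assumed to be a typedef for unsigned char */
byte *original_ptr;
byte *aligned_ptr;

original_ptr = (byte *)malloc(size + 15);
/* & needs an integer operand, so cast the pointer, round up, and cast back */
aligned_ptr = (byte *)(((uintptr_t)original_ptr + 15) & ~(uintptr_t)15); // align to 16 bytes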

now, use aligned_ptr in your code, because it's aligned to 16 bytes. when you free it, you have to free(original_ptr), not aligned_ptr.
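
If carrying two pointers around is awkward, the same idea can be wrapped in a pair of helpers that stash the original pointer just below the aligned block. This is only a sketch: the names aligned_malloc16 and aligned_free16 are invented here, and it assumes <stdint.h> provides uintptr_t.

```c
#include <stdint.h>  /* uintptr_t, SIZE_MAX */
#include <stdlib.h>  /* malloc, free */

/* Hypothetical helper: returns a block of at least `size` bytes whose
   address is a multiple of 16, or NULL on failure. */
void *aligned_malloc16(size_t size)
{
    void *original;
    void *aligned;
    uintptr_t start;

    /* refuse sizes that would overflow once the overhead is added */
    if (size > SIZE_MAX - 15 - sizeof(void *))
        return NULL;

    /* 15 bytes of slack for alignment, plus a slot for the original pointer */
    original = malloc(size + 15 + sizeof(void *));
    if (original == NULL)
        return NULL;

    /* skip past the slot, then round up to the next multiple of 16 */
    start = (uintptr_t)original + sizeof(void *);
    aligned = (void *)((start + 15) & ~(uintptr_t)15);

    /* remember what malloc() returned, right below the aligned block */
    ((void **)aligned)[-1] = original;
    return aligned;
}

/* Frees a block obtained from aligned_malloc16(); NULL is ignored. */
void aligned_free16(void *aligned)
{
    if (aligned != NULL)
        free(((void **)aligned)[-1]);
}
```

A block from aligned_malloc16() must go back through aligned_free16(), never straight to free(), because the aligned address is usually not the one malloc() handed out.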
