How do I get 16-byte alignment when calling malloc()?

5 comments, last by polar 24 years ago
I checked the pointer returned from malloc(), and it seems to align data only to 4 bytes. But I need my data to begin at an address that is a multiple of 16. How can I do that? Do I have to override malloc()? If so, how is that implemented? Thanks.
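One way to run the check described above is to convert the pointer to an integer and test the remainder. This is only a sketch: the helper name is_aligned is made up for illustration, and what malloc() actually returns depends on the C library in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment` bytes,
   0 otherwise. `alignment` must be non-zero. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

Calling is_aligned(ptr, 16) on a few malloc() results shows whether the allocator already gives the alignment you need.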
I *think* there is a precompiler directive to change the default alignment,
but I am not at all sure that this is possible on a Win32 platform.
I'll look it up, though.
If you want the address of the block to be a multiple of 16, you can always allocate a block 16 bytes bigger than what you need. Then move the pointer forward to the next multiple of 16 so that it is properly aligned. You'll have to save the original pointer so that you can free the memory.
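A minimal sketch of this idea, assuming you are happy to stash the original pointer in the bytes just below the aligned block so that you don't have to carry it around separately. The names aligned_malloc16 and aligned_free16 are hypothetical, not part of any library, and the sketch does not guard against size overflowing.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes at a 16-byte-aligned address.
   The pointer returned by malloc() is stored just before the aligned block. */
static void *aligned_malloc16(size_t size)
{
    /* room for the saved pointer plus up to 15 bytes of padding */
    unsigned char *raw = malloc(size + sizeof(void *) + 15);
    if (raw == NULL)
        return NULL;

    /* leave space for the saved pointer, then round up to a multiple of 16 */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + 15) & ~(uintptr_t)15;

    /* save the original pointer in the slot right below the aligned address */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Hypothetical helper: release a block obtained from aligned_malloc16(). */
static void aligned_free16(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

Memory from aligned_malloc16() must go back through aligned_free16(), never straight to free(), because the aligned address is not the one malloc() handed out.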

Domini
> I *think* there is a precompiler directive to change the default alignment.

#pragma pack ?

If I understand it correctly, it applies to structures and unions.
Quoting the MSDN docs:

Specifies packing alignment for structure and union members. Whereas the packing alignment of structures and unions is set for an entire translation unit by the /Zp option, the packing alignment is set at the data-declaration level by the pack pragma. The pragma takes effect at the first structure or union declaration after the pragma is seen; the pragma has no effect on definitions.
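To make the quoted description concrete, here is a small sketch of what the pack pragma does and does not change. It assumes a compiler that supports #pragma pack(push, n) / pack(pop) (MSVC, GCC and Clang do); the struct and function names are invented for the example. The pragma only caps the alignment of members inside the structure, so it can remove padding but has no influence on where malloc() places a block.

```c
#include <stddef.h>
#include <stdio.h>

struct normal_layout {
    char tag;
    int  value;   /* typically preceded by padding so it is int-aligned */
};

#pragma pack(push, 1)
struct packed_layout {
    char tag;
    int  value;   /* placed at offset 1: member alignment is capped at 1 */
};
#pragma pack(pop)

/* Hypothetical helper: print how the pragma changed the member layout. */
static void print_layouts(void)
{
    printf("normal: size %zu, value at offset %zu\n",
           sizeof(struct normal_layout),
           offsetof(struct normal_layout, value));
    printf("packed: size %zu, value at offset %zu\n",
           sizeof(struct packed_layout),
           offsetof(struct packed_layout, value));
}
```

Only the offsets and sizes differ between the two structs; a pointer obtained from malloc(sizeof(struct packed_layout)) is aligned exactly as any other malloc() result would be.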


"Paranoia is the belief in a hidden order behind the visible." - Anonymous
Yes, #pragma pack, that's what I meant.
But I don't quite understand why you need 16-byte-aligned pointers if a pointer is only 32 bits wide...
Thanks, guys.
I tried this:

#pragma pack(push,16)

typedef struct{
}whatever;

#pragma pack(pop)

Guess what? It doesn't work.
The address returned from malloc() is still not aligned to 16. I guess #pragma pack only affects the layout of the structure itself, not the address it is allocated at.
I've heard that I can reset the hook in malloc.h to make the address a multiple of 16, but I have no idea how to do it. Is anybody familiar with this?

By the way, I want to do this because SIMD instructions use 128-bit registers, so they take extra clock cycles when reading from an address that is not aligned to 16 bytes.
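#include <stdint.h>  // uintptr_t
#include <stdlib.h>  // malloc, free

typedef unsigned char byte;

byte *original_ptr;
byte *aligned_ptr;

original_ptr = (byte *)malloc(size + 15);
// round the address up to the next multiple of 16 (bitwise & needs an integer, not a pointer)
aligned_ptr = (byte *)(((uintptr_t)original_ptr + 15) & ~(uintptr_t)15);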

Now use aligned_ptr in your code, because it's aligned to 16 bytes. When you free it, you have to call free(original_ptr), not free(aligned_ptr).

