Custom C++ allocator for alligned data

Started by
21 comments, last by Zipster 6 years, 6 months ago

I am a bit confused about writing a custom allocator for aligned data. My aligned structs/classes typically derive AlignedData (passing the struct/class name as class template argument). The latter provides a custom operator new/new[]/delete/delete[]. I now want to write a custom allocator for these aligned structs, but I am confused about allocate/deallocate vs construct/destroy.

General


#include <malloc.h>

inline void *AllocAligned(size_t size, size_t alignment = 16) noexcept {
    return _aligned_malloc(size, alignment);
}

template < typename DataT >
inline DataT *AllocAligned(size_t count) noexcept {
    return static_cast< DataT * >(AllocAligned(count * sizeof(DataT)));
}

inline void FreeAligned(void *ptr) noexcept {
    if (!ptr) {
        return;
    }

    _aligned_free(ptr);
}

AlignedData


template< typename DataT >
struct AlignedData {

public:

    static void *operator new(size_t size) {
        const size_t alignment = __alignof(DataT);
        
        // __declspec(align) on DataT is required
        static_assert(alignment > 8, 
            "AlignedData is only useful for types with > 8 byte alignment.");
        
        void * const ptr = AllocAligned(size, alignment);
        if (!ptr) {
            throw std::bad_alloc();
        }

        return ptr;
    }

    static void operator delete(void *ptr) noexcept {
        FreeAligned(ptr);
    }

    static void *operator new[](size_t size) {
        return operator new(size);
    }

    static void operator delete[](void *ptr) noexcept {
        operator delete(ptr);
    }
};

AlignedAllocator:


template< typename DataT, size_t AlignmentS = __alignof(DataT) >
struct AlignedAllocator {
    
public

    using value_type = DataT;
    using pointer =  DataT *;
    using reference = DataT &;
    using const_pointer = const DataT *;
    using const_reference = const DataT &;
    using size_type = size_t;
    using difference_type = ptrdiff_t;

    template< typename DataU >
    struct rebind {

    public:

        using other = AlignedAllocator< DataU, AlignmentS >;

        rebind &operator=(const rebind &r) = delete;
        rebind &operator=(rebind &&r) = delete;

    private:

        rebind() = delete;
        rebind(const rebind &r) = delete;
        rebind(rebind &&r) = delete;
        ~rebind() = delete;
    };

    AlignedAllocator() noexcept = default;
    
    AlignedAllocator(
        const AlignedAllocator< DataT, AlignmentS > &allocator) noexcept = default;
    
    AlignedAllocator(
        AlignedAllocator< DataT, AlignmentS > &&allocator) noexcept = default;
    
    template< typename DataU >
    AlignedAllocator(
        const AlignedAllocator< DataU, AlignmentS > &allocator) noexcept {}
    
    ~AlignedAllocator() = default;

    AlignedAllocator< DataT, AlignmentS > &operator=(
        const AlignedAllocator< DataT, AlignmentS > &allocator) noexcept = delete;

    AlignedAllocator< DataT, AlignmentS > &operator=(
        AlignedAllocator< DataT, AlignmentS > &&allocator) noexcept = delete;

    DataT *address(DataT &data) const noexcept {
        return &data;
    }
    
    const DataT *address(const DataT &data) const noexcept {
        return &data;
    }

    size_t max_size() const noexcept {
        return (static_cast< size_t >(0) - static_cast< size_t >(1)) 
            / sizeof(DataT); // independent of definition of size_t
    }

    DataT *allocate(size_t count) const {
        return new DataT[count];
    }

    template< typename DataU >
    DataT *allocate(size_t count, const DataU *) const {
        return allocate(count);
    }

    void deallocate(DataT *data, size_t count) const {
        if (count == 0) {
            return;
         }
         else if (count == 1) {
            delete data;
         }
         else {
            delete[] data;
          }
    }

    template< typename DataU, typename... ConstructorArgsT >
    void construct(DataU *data, ConstructorArgsT&&... args) const {
        new ((void *)data) DataU(std::forward< ConstructorArgsT >(args)...); // this looks like an allocation after all?
    }

    template< typename DataU >
    void destroy(DataU *data) const {
        data->~DataU();
    }

    bool operator==(
        const AlignedAllocator< DataT, AlignmentS > &rhs) const noexcept {
        
        return true; // stateless allocator
    }

    bool operator!=(
        const AlignedAllocator< DataT, AlignmentS > &rhs) const noexcept {
        
        return false; // stateless allocator
    }
};

I am not sure about the AlignmentS template parameter yet. I think of supporting non-AlignedData values as well. In this case, I would duplicate/replicate the content of my custom operator new/new[]/delete/delete[] in allocate and deallocate themselves.

🧙

Advertisement

Why not using aligned_alloc ?

I linked to the C++ documentation since it is more easy to point to it than the C version, but this is the same for pure C.

3 hours ago, _Silence_ said:

Why not using aligned_alloc ?

Instead of _aligned_malloc? I seem to be more familiar with Windows specific code :D . But if they are the same, I consider using the standard one, though it seems strange that the associated free (versus _aligned_free) is used for malloc/calloc as well.

 

Without relying on AlignedData:


DataT *allocate(size_t count) const {
  void * const ptr = AllocAligned(sizeof(DataT) * count, AlignmentS);
  if (!ptr) {
    throw std::bad_alloc();
  }
  
  return static_cast< DataT * >(ptr);
}

template< typename DataU >
DataT *allocate(size_t count, const DataU *hint) const {
  return allocate(count);
}

void deallocate(DataT *data, size_t count) const {
  FreeAligned((void *)data);
}

 

🧙

19 hours ago, matt77hias said:

Instead of _aligned_malloc? I seem to be more familiar with Windows specific code  . But if they are the same, I consider using the standard one, though it seems strange that the associated free (versus _aligned_free) is used for malloc/calloc as well.

Well I don't know at all about the Microsoft version of this function. If you use C++ 11 or C 11, then aligned_alloc will be present in the standard library. If you use C++, then you'll have even more functionalities. So why using a non-standard and non-portable version of a function, even if you don't plan to develop on another platform.

I like this way, so i know exactly what my memory is and where everything does. Also you can include a guard memory page after to detect overflow:


void *AllocateMemory(const size_t) {
  assert(size > 0);
  void *memory = VirtualAlloc(0, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
  return(memory);
}

void FreeMemory(void *ptr) {
	assert(ptr != nullptr);
	VirtualFree(ptr, 0, MEM_FREE);
}

void *AllocateAlignedMemory(const size_t size, const size_t alignment) {
    assert(size > 0);
    assert((alignment > 0) && !(alignment & (alignment - 1)));

    // Allocate empty memory to hold a size of a pointer + the actual size + alignment padding 
    size_t newSize = sizeof(void *) + size + (alignment << 1);
    void *basePtr = AllocateMemory(newSize);

    // The resulting address starts after the stored base pointer
    void *alignedPtr = (void *)((uintptr_t)basePtr + sizeof(void *));

    // Move the resulting address to a aligned one when address is not aligned to the given alignment
    uintptr_t mask = alignment - 1;
    if ((alignment > 1) && (((uintptr_t)alignedPtr & mask) != 0)) {
        *(uintptr_t *)alignedPtr += ((uintptr_t)alignment - ((uintptr_t)alignedPtr & mask));
    }

    // Write the base pointer before the alignment pointer
    *(void **)((void *)((uintptr_t)alignedPtr - sizeof(void *))) = basePtr;

    return(alignedPtr);
}

void FreeAlignedMemory(void *ptr) {
    assert(ptr != nullptr);

    // Free the base pointer which is stored to the left from the given pointer
    void *basePtr = (void *)((void **)((uintptr_t)ptr - sizeof(void *)));
    FreeMemory(basePtr);
}

aligned_alloc() is not the same, as it does thread syncronisation as well.

54 minutes ago, Finalspace said:

assert((alignment > 0) && !(alignment & (alignment - 1)));


assert(size);
assert(alignment && !(alignment & (alignment - 1)));

size_t will always be some unsigned whatever sized integer.

54 minutes ago, Finalspace said:

size_t newSize = sizeof(void *) + size + (alignment << 1);

How does the (alignment << 1) padding work? Assume: alignment = 16,  32 bytes will be added by default independent of size?

🧙

1 hour ago, matt77hias said:


assert(size);
assert(alignment && !(alignment & (alignment - 1)));

size_t will always be some unsigned whatever sized integer.

How does the (alignment << 1) padding work? Assume: alignment = 16,  32 bytes will be added by default independent of size?

Well the size can be zero. Allocating zero bytes of memory makes no sense - so that assert has its place.

And the padding is always appended, with a little bit of memory waste in mind. If you want you can improve it by calculating the right amount of padding.

2 minutes ago, Finalspace said:

Well the size can be zero. Allocating zero bytes of memory makes no sense - so that assert has its place.

I don't mean the is-not-zero test (I still have these two tests in my small refactor), I mean the greater-than-test feels less to-the-point since we are dealing with unsigneds anyway. :)

3 minutes ago, Finalspace said:

If you want you can improve it by calculating the right amount of padding.

Idd. this sounds more appropriate, especially if you have many small chuncks (versus using your own memory arenas).

🧙

5 minutes ago, matt77hias said:

I don't mean the is-not-zero test (I still have these two tests in my small refactor), I mean the greater-than-test feels less to-the-point since we are dealing with unsigneds anyway.

True, for me its a non-brainer. I personally dont think about this kind of things, the compiler will catch it as a warning eventually and it wont hurt. Its just a _mm_cmpge_ps or _mm_cmpgt_ps in the end and done with one or two cycles.

@OP that's a lot of code! Maybe it would be simpler to use malloc/free and perform your own alignment? Here was my strategy for 16 byte aligned allocations: https://github.com/RandyGaul/tinyheaders/blob/master/tinysound.h#L475-L490


static void* malloc16( size_t size )
{
	void* p = malloc( size + 16 );
	if ( !p ) return 0;
	unsigned char offset = (size_t)p & 15;
	p = (void*)TS_ALIGN( p + 1, 16 );
	*((char*)p - 1) = 16 - offset;
	TS_ASSERT( !((size_t)p & 15) );
	return p;
}

static void free16( void* p )
{
	if ( !p ) return;
	free( (char*)p - (size_t)*((char*)p - 1) );
}

 

This topic is closed to new replies.

Advertisement