Is my frustum culling slow?

Started by
44 comments, last by lipsryme 11 years ago

Class and type sizes are unrelated to instance addresses.

Excuse my stupidity, I still don't understand.

You mean instances of a _Plane in an std::array or vector might not be contiguous in memory (with their default allocator)?


But I don't understand why it isn't already 16-byte aligned, as the data is 4 floats?

It's made up of primitives, each of which only requires 4-byte alignment to work correctly, so the struct will work as long as it's 4-byte aligned. It doesn't need to be 8 or 16 byte aligned in order to function correctly, only 4-byte aligned (and this is assuming that floats actually need to be 4 byte aligned).
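A minimal sketch of that point, assuming a typical platform where float is 4 bytes. PlainPlane and AlignedPlane are hypothetical stand-ins for the thread's _Plane, and alignas(16) is the standard C++11 spelling of the MSVC-specific __declspec(align(16)):

```cpp
#include <cstddef>

// Four floats with no alignment specifier: 16 bytes in size,
// but only required to sit on a float-sized boundary.
struct PlainPlane {
    float a, b, c, d;
};

// Same members, explicitly over-aligned to 16 bytes.
struct alignas(16) AlignedPlane {
    float a, b, c, d;
};

// Size and alignment are separate properties of a type.
static_assert(sizeof(PlainPlane) == 16, "assumes 4-byte floats");
static_assert(alignof(PlainPlane) == alignof(float), "inherits the alignment of its members");
static_assert(alignof(AlignedPlane) == 16, "explicit over-alignment");
```

So a PlainPlane can legally live at an address like 0x1004, which is exactly what breaks aligned SSE loads.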


Class and type sizes are unrelated to instance addresses.

Excuse my stupidity, I still don't understand.
You mean instances of a _Plane in an std::array or vector might not be contiguous in memory (with their default allocator)?


As far as I know:
  • std::allocator<T>::allocate (which is used by the std containers) or "new T" will return a block of memory that is correctly aligned for the type "T" (i.e. the address is a multiple of alignof(T))
  • malloc( sizeof(T) ) won't.

If a class is declared as aligned on a 16-byte boundary, the class size will be a multiple of 16 too, because we can write things like MyClass array[N]; and if array is aligned on 16, then array[i+1] will be too.
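A small sketch of the array argument, using alignas(16) in place of the compiler-specific declspec. MyClass here is a made-up three-float struct and count_misaligned is a hypothetical helper, not something from the thread:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct alignas(16) MyClass {
    float x, y, z;   // 12 bytes of data; the compiler pads the size up to 16
};

// The size is always a whole multiple of the alignment, so array strides stay aligned.
static_assert(sizeof(MyClass) % alignof(MyClass) == 0, "size is a multiple of alignment");

// Counts how many of the n elements starting at first are NOT on a 16-byte boundary.
inline std::size_t count_misaligned(const MyClass* first, std::size_t n) {
    assert(first != nullptr || n == 0);
    std::size_t bad = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (reinterpret_cast<std::uintptr_t>(first + i) % 16 != 0) ++bad;
    }
    return bad;
}
```

For a stack or static array such as MyClass array[8], the compiler places the array itself on a 16-byte boundary, so count_misaligned(array, 8) should come back as zero.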

But if you write MyClass* array = new MyClass[N]; you have no guarantee on the memory address returned by new, because operator new does not receive an alignment value. You will need to overload operator new of MyClass to do some internal memalign, or globally override the memory allocator to always return addresses that are multiples of 16.
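A rough sketch of the per-class overload, not a definitive implementation. It assumes _aligned_malloc/_aligned_free on MSVC and posix_memalign/free elsewhere; the helper names alloc16, free16 and is_aligned are invented for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#if defined(_MSC_VER)
#include <malloc.h>   // _aligned_malloc / _aligned_free
#endif

// Hypothetical helper: true when p sits on an 'alignment'-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);  // power of two
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical wrappers over the platform's aligned allocation calls.
inline void* alloc16(std::size_t size) {
#if defined(_MSC_VER)
    void* p = _aligned_malloc(size, 16);
#else
    void* p = nullptr;
    if (posix_memalign(&p, 16, size) != 0) p = nullptr;
#endif
    if (!p) throw std::bad_alloc();
    return p;
}

inline void free16(void* p) noexcept {
#if defined(_MSC_VER)
    _aligned_free(p);
#else
    std::free(p);
#endif
}

struct alignas(16) MyClass {
    float v[4];

    // Class-level overloads: every new/delete of MyClass goes through the aligned path.
    static void* operator new(std::size_t size)   { return alloc16(size); }
    static void* operator new[](std::size_t size) { return alloc16(size); }
    static void  operator delete(void* p) noexcept   { free16(p); }
    static void  operator delete[](void* p) noexcept { free16(p); }
};
```

With this in place, both new MyClass and new MyClass[N] should hand back 16-byte aligned storage regardless of what the global allocator does.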

And std::array wraps a C array, so the memory is contiguous but does not rely on a memory allocator, so the compiler is able to align the members with the declspec value (as long as the object storing an std::array<_Plane,6> is not allocated with a bare new).

As far as I know:

std::allocator::allocate (which is used by the std containers) or "new T" will return a block of memory that is correctly aligned for the type "T".
malloc( sizeof(T) ) won't.

Sadly, the standard does not consider SIMD types; the largest native alignment used by the default std::allocator is the one for a long long, which is only 8 bytes.

Am I getting this right:

__declspec(align(16))
struct Foo
{
    float x;
    float y;
    float z;
};

Storage space for each element would be 16 bytes?
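One way to check this at compile time, sketched with the standard alignas(16) rather than __declspec(align(16)) and assuming 4-byte floats. UnalignedFoo is a made-up comparison type:

```cpp
#include <cstddef>

// Three floats without any alignment request: 12 bytes, no padding needed.
struct UnalignedFoo {
    float x;
    float y;
    float z;
};

// The same three floats, aligned to 16 bytes.
struct alignas(16) Foo {
    float x;
    float y;
    float z;
};

static_assert(sizeof(UnalignedFoo) == 12, "assumes 4-byte floats");
static_assert(sizeof(Foo) == 16, "12 bytes of data plus 4 bytes of tail padding");
static_assert(sizeof(Foo[6]) == 6 * 16, "each array element occupies 16 bytes");
```

So yes: the size is rounded up to the alignment, and every element in an array of Foo takes 16 bytes.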

operator new does not receive an alignment value

This is true, but new is also required to return a block of memory that is suitably aligned to represent any object of the requested size. This is pretty ridiculous, because if you're allocating an object of size 32, the operator new function has absolutely no way of knowing if it should be aligned to 32/16/8/4/2/1 byte boundary, or some other value!

I've found that in practice, most implementations of operator new simply use min(16, size) as the alignment value, and assume that no object would ever require a larger alignment value... This means that objects that have been manually set to use an alignment value of 16 do typically still work ok with new.

However, when you manually set the alignment of a class to some larger value, say, 64 bytes, and then create it with new, these implementations are actually failing to obey the spec... but as you point out, there's no real way for them to be able to comply with the spec!

As I said in a previous post, in practice new returns memory aligned on 8 (because the standard requires an alignment sufficient for long long, the largest standard type) in 32-bit builds and aligned on 16 in 64-bit builds (not because of the C++ standard, but because of the x64 ABI).
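If you want to see what your own toolchain does, a quick probe along these lines can help. This is only a sketch; address_alignment and probe_new_alignment are hypothetical names, and the result of a single allocation is an observation, not a guarantee:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Largest power of two (capped at 4096) that divides the address of p.
inline std::size_t address_alignment(const void* p) {
    assert(p != nullptr);  // a null address would divide by everything
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    std::size_t align = 1;
    while (align < 4096 && a % (align * 2) == 0) align *= 2;
    return align;
}

// Allocates 'size' bytes with the plain global operator new and reports
// how well aligned that one returned block happened to be.
inline std::size_t probe_new_alignment(std::size_t size) {
    void* p = ::operator new(size);
    const std::size_t result = address_alignment(p);
    ::operator delete(p);
    return result;
}
```

Calling probe_new_alignment(32) a few times in a 32-bit and a 64-bit build should show the 8 versus 16 difference described above, though a block can always land on a stricter boundary by luck.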

Of course, because we (programmers) do not like random behavior, we override these things ourselves, so the allocator will be consistent on PC, X360 or PS3 (+ any other target).

What if I were to create a class with nothing but an _AABB inside, which is aligned to 16, store this class inside the std::vector, and then access the _AABB inside for the frustum cull? Would that work?

Edit: probably not... so how would I have to work around this?

How about storing it inside a non-aligned container, and then creating a new _AABB on the stack inside the frustum cull from the non-aligned container? Or is that bad/too slow? Edit: also doesn't work :p

std::vector simply has problems with aligned types, especially on MSVC.

Often the solution is to not use it, e.g. the Bullet physics middleware wrote a replacement called btAlignedObjectArray.

@Hodgman

What about making a custom allocator for vector with _aligned_malloc/_aligned_free?
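For what it's worth, such an allocator could look roughly like this. It is a sketch under assumptions: AlignedAllocator, is_aligned and the AABB struct are made-up names (AABB stands in for the thread's _AABB), and non-MSVC builds fall back to posix_memalign/free instead of _aligned_malloc/_aligned_free:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>
#if defined(_MSC_VER)
#include <malloc.h>   // _aligned_malloc / _aligned_free
#endif

// Hypothetical helper: true when p sits on an 'alignment'-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);  // power of two
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

template <typename T, std::size_t Alignment = 16>
struct AlignedAllocator {
    static_assert((Alignment & (Alignment - 1)) == 0, "alignment must be a power of two");
    static_assert(Alignment >= sizeof(void*), "posix_memalign needs at least pointer alignment");

    using value_type = T;

    // Explicit rebind, needed because of the non-type template parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = nullptr;
#if defined(_MSC_VER)
        p = _aligned_malloc(n * sizeof(T), Alignment);
#else
        if (posix_memalign(&p, Alignment, n * sizeof(T)) != 0) p = nullptr;
#endif
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
#if defined(_MSC_VER)
        _aligned_free(p);
#else
        std::free(p);
#endif
    }
};

// All instances are interchangeable: memory from one can be freed by another.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return false; }

// Hypothetical stand-in for the thread's _AABB.
struct alignas(16) AABB {
    float min[4];
    float max[4];
};
```

Usage would be something like std::vector<AABB, AlignedAllocator<AABB, 16>> boxes; after which boxes.data() should always be 16-byte aligned. Note this only addresses where the storage comes from; older MSVC versions may still reject aligned types in std::vector for other reasons.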

I still don't understand this alignment thing and how it works. Know any good tutorials for idiots?

This topic is closed to new replies.
