Variable size structs

15 comments, last by Laval B 8 years, 6 months ago

Hello everyone

I'm trying to optimize the drawitem structure i use for my renderqueue. I use a struct that basically contains the same elements as the one posted by Hodgman in this post: http://www.gamedev.net/topic/666419-what-are-your-opinions-on-dx12vulkanmantle/#entry5215127. I'm trying to reduce the footprint of the struct and to get all of its members contiguous in memory to improve cache efficiency.

Replacing the pointers by 16 bits unsigned integers which are indices into arrays will considerably reduce the size of the fixed-length part of the struct (i compile for 64 bits).
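
To put a number on that saving, here is a minimal sketch comparing the two fixed-length layouts. The state types and the blend/raster members are hypothetical stand-ins for whatever the real struct holds, and the quoted sizes assume a typical 64-bit target.

```cpp
#include <cstdint>

using u16 = std::uint16_t;

// Hypothetical stand-ins for the real render-state types.
struct Shader;
struct DepthStencilState;
struct BlendState;
struct RasterState;

// Pointer-based fixed part: four pointers, 32 bytes on a typical 64-bit target.
struct DrawItemPtrs
{
    const Shader*            shader;
    const DepthStencilState* depthStencil;
    const BlendState*        blend;
    const RasterState*       raster;
};

// Index-based fixed part: each id is an index into an array owned by the renderer.
struct DrawItemIds
{
    u16 shaderPermutationId;
    u16 depthStencilStateId;
    u16 blendStateId;   // hypothetical member
    u16 rasterStateId;  // hypothetical member
};

static_assert(sizeof(DrawItemIds) == 4 * sizeof(u16), "no padding expected between u16 members");
static_assert(sizeof(DrawItemIds) < sizeof(DrawItemPtrs), "ids should be smaller than pointers");
```

The trade-off is one extra array lookup per state when the item is submitted, and a limit of 65536 entries per state array.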

I don't know how to deal with the variable part of the struct. I thought about using variable size structs like the one below:


struct DrawItem
{
   u16 shaderPermutationId;
   u16 depthStencilStateId;
   // ...
   u16 textureCount;
   u16 textureIds[];
   // Cannot put more arrays or anything else after that.
};

But like the comment says, it's not possible to have any member after a variable size array.

I'm interested to know what could be done.

We think in generalities, but we live in details.
- Alfred North Whitehead
You can write code like this, which adds a small cost to access the pointer:
struct DrawItem
{
   u16 shaderPermutationId;
   u16 depthStencilStateId;
   u16 textureCount;
   u16 cbufferCount;
   u16* TextureIds() { return (u16*)(this+1); }
   u16* CBufferIds() { return (u16*)(TextureIds()+textureCount); }
   u32 Sizeof() const { return sizeof(DrawItem) + sizeof(u16)*textureCount + sizeof(u16)*cbufferCount; }
};
Alignment issues, beware.

Yes, i will need to be careful with alignment but this is interesting.


I would highly recommend overriding new and delete for any struct that stores data past the end (or use a factory function and hide all the constructors). Not to mention disabling copy and move by deleting the copy and move constructors.

Just because a struct like this is far from a POD and is very easy to misuse.
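
As a rough sketch of what that advice could look like (the Create/Destroy names and the exact signature are assumptions, not code from this thread), the constructor and destructor are private, copying is deleted, and the only way to get an item is through a factory that allocates the header and both arrays together:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

using u16 = std::uint16_t;

class DrawItem
{
public:
    // Factory: one allocation holds the header followed by both id arrays.
    static DrawItem* Create(const u16* textureIds, u16 textureCount,
                            const u16* cbufferIds, u16 cbufferCount)
    {
        const std::size_t bytes = sizeof(DrawItem)
                                + sizeof(u16) * textureCount
                                + sizeof(u16) * cbufferCount;
        void* mem = ::operator new(bytes, std::nothrow);
        if (!mem) return nullptr;
        DrawItem* item = new (mem) DrawItem(textureCount, cbufferCount);
        std::copy_n(textureIds, textureCount, item->TextureIds());
        std::copy_n(cbufferIds, cbufferCount, item->CBufferIds());
        return item;
    }

    // Must be used instead of delete, since the block is larger than sizeof(DrawItem).
    static void Destroy(DrawItem* item)
    {
        if (!item) return;
        item->~DrawItem();
        ::operator delete(item);
    }

    // Copying would only copy the header and drop the trailing arrays.
    // Deleting the copy operations also suppresses the implicit move operations.
    DrawItem(const DrawItem&) = delete;
    DrawItem& operator=(const DrawItem&) = delete;

    u16 TextureCount() const { return textureCount; }
    u16 CBufferCount() const { return cbufferCount; }
    u16* TextureIds() { return reinterpret_cast<u16*>(this + 1); }
    u16* CBufferIds() { return TextureIds() + textureCount; }

private:
    DrawItem(u16 tc, u16 cc) : textureCount(tc), cbufferCount(cc) {}
    ~DrawItem() = default;

    // The arrays cannot be resized, so the counts never change after construction.
    const u16 textureCount;
    const u16 cbufferCount;
};
```

With this shape, `DrawItem item;`, `new DrawItem`, `delete ptr` and `DrawItem b = *a;` all fail to compile outside the class, which removes the most common ways to misuse it.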

Yes absolutely. In this case a factory method might be more appropriate since there will be arguments to pass.


After toying with this a little, i don't think it is possible to do that.

The struct i used for testing is the following


struct MultiArray
{

    u16 arr1Count;
    u16 arr2Count;

    u16* arr1()  { return reinterpret_cast<u16*>(this + 1); }
    u16* arr2()  { return reinterpret_cast<u16*>(arr1() + arr2Count); }
};

I allocate the memory using (with arr1Count_ = 2 and arr2Count_ = 3)


::operator new(sizeof(MultiArray) +  sizeof(u16)*arr1Count_ + sizeof(u16)*arr2Count_, std::nothrow) 

which i then cast to a pointer to the struct type. I can then fill the struct through the pointer and access the content without any problem. But when i call ::operator delete(ptr, std::nothrow), i get a runtime check error telling me that i have written past the end of the heap buffer (which is technically true).

The size given by sizeof(MultiArray) + sizeof(u16)*arr1Count_ + sizeof(u16)*arr2Count_ is the same as the one i get if i call sizeof on a struct containing two u16 values, a u16[2] and a u16[3] (i.e. 14), so alignment shouldn't be the problem.

It's ok, the size is 14 but the alignment (for this struct) is 2, so i need to allocate 2 more bytes and it works fine. I just need to take care of the alignment properly (and to find out how i can get the proper alignment for a given type).
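
On the question of how to get the alignment of a type: since C++11, `alignof(T)` returns it as a `std::size_t`. With only u16 members and u16 arrays, everything is 2-byte aligned and no padding is needed between the parts. Padding only matters once a trailing array has a stricter alignment than what precedes it. A sketch of that case, where `AlignUp` and `MixedItem` are hypothetical names and the numbers assume `alignof(u32)` is 4:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using u16 = std::uint16_t;
using u32 = std::uint32_t;

// Round offset up to the next multiple of align (align must be a power of two).
inline std::size_t AlignUp(std::size_t offset, std::size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

// Hypothetical header followed by a u16 array and then a u32 array.
// alignas makes the header as strictly aligned as its strictest trailing array.
struct alignas(u32) MixedItem
{
    u16 idCount;      // number of trailing u16 elements
    u16 offsetCount;  // number of trailing u32 elements

    // Byte offsets are measured from the start of the struct.
    std::size_t IdsOffset() const { return sizeof(MixedItem); }
    std::size_t OffsetsOffset() const
    {
        // The u32 array may need padding after an odd number of u16 ids.
        return AlignUp(IdsOffset() + sizeof(u16) * idCount, alignof(u32));
    }
    std::size_t Sizeof() const
    {
        // Rounded so that items packed back to back stay aligned.
        return AlignUp(OffsetsOffset() + sizeof(u32) * offsetCount, alignof(MixedItem));
    }
};
```

The memory block itself also has to start at an address that is a multiple of `alignof(MixedItem)`; anything returned by `::operator new` satisfies that for ordinary types.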

It also occurs to me you'll want your counts to be const - there is no reason they should ever be modified once set via the constructor as you can't resize your arrays.

struct MultiArray
{
    u16 arr1Count;
    u16 arr2Count;

    u16* arr1()  { return reinterpret_cast<u16*>(this + 1); }           //vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    u16* arr2()  { return reinterpret_cast<u16*>(arr1() + arr2Count); } //<--- this should be (arr1() + arr1Count).
};                                                                      //^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

arr1() + arr2Count

This is a bug (should be arr1Count to compute the end pointer of arr1); you are writing past the end of your allocation by 2 bytes!
[edit]ninja'd by sotl!
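
The arithmetic can be checked directly. With arr1Count = 2 and arr2Count = 3 the allocation is 4 + 2*2 + 3*2 = 14 bytes. The buggy accessor starts arr2 at byte 4 + 3*2 = 10, so its three elements end at byte 16, two bytes past the block. The corrected accessor starts arr2 at byte 4 + 2*2 = 8 and ends exactly at byte 14. A small sketch of the corrected struct, where `MakeMultiArray` and `EndOffset` are hypothetical helpers added only for the check:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

using u16 = std::uint16_t;

struct MultiArray
{
    u16 arr1Count;
    u16 arr2Count;

    u16* arr1() { return reinterpret_cast<u16*>(this + 1); }
    u16* arr2() { return arr1() + arr1Count; }  // skip over arr1, so use arr1Count
};

// Allocates header + both arrays in one block; the array contents are left unset.
inline MultiArray* MakeMultiArray(u16 n1, u16 n2)
{
    const std::size_t bytes = sizeof(MultiArray)
                            + sizeof(u16) * (std::size_t(n1) + n2);
    void* mem = ::operator new(bytes, std::nothrow);
    if (!mem) return nullptr;
    return new (mem) MultiArray{n1, n2};
}

// Byte offset one past the end of arr2, relative to the start of the struct.
inline std::size_t EndOffset(MultiArray& m)
{
    unsigned char* base = reinterpret_cast<unsigned char*>(&m);
    unsigned char* end  = reinterpret_cast<unsigned char*>(m.arr2() + m.arr2Count);
    return static_cast<std::size_t>(end - base);
}
```

For counts of 2 and 3, `EndOffset` matches the 14 bytes that were allocated, so no extra bytes are required.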

I would highly recommend overriding new and delete for any struct that stores data past the end (or use a factory function and hide all the constructors).

I don't override new/delete because with these kinds of packed/variable-sized structures, I usually want to put more than one of them into a single allocation. So I usually end up with a function that looks like a simple constructor, but actually just returns the size that's required. You can then loop through all the objects that you want to create, measure their requirements, perform one single big allocation, then do a 2nd pass where you actually construct them in-place.
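
A minimal sketch of that measure-then-construct approach follows. `DrawItemDesc`, `Measure`, `Construct`, `PackedItems` and `BuildPacked` are hypothetical names, and the sketch assumes every part is u16-aligned so items can sit back to back without padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <vector>

using u16 = std::uint16_t;

struct DrawItem
{
    u16 textureCount;
    u16 cbufferCount;
    u16* TextureIds() { return reinterpret_cast<u16*>(this + 1); }
    u16* CBufferIds() { return TextureIds() + textureCount; }
    std::size_t Sizeof() const
    {
        return sizeof(DrawItem) + sizeof(u16) * (std::size_t(textureCount) + cbufferCount);
    }
};

// Hypothetical description of one item to build.
struct DrawItemDesc
{
    std::vector<u16> textureIds;
    std::vector<u16> cbufferIds;
};

// Pass 1 helper: bytes this item will need, without constructing anything.
inline std::size_t Measure(const DrawItemDesc& d)
{
    return sizeof(DrawItem) + sizeof(u16) * (d.textureIds.size() + d.cbufferIds.size());
}

// Pass 2 helper: construct in place at dst and return the address just past the item.
inline unsigned char* Construct(unsigned char* dst, const DrawItemDesc& d)
{
    assert(d.textureIds.size() <= 0xFFFF && d.cbufferIds.size() <= 0xFFFF);
    DrawItem* item = new (dst) DrawItem{static_cast<u16>(d.textureIds.size()),
                                        static_cast<u16>(d.cbufferIds.size())};
    std::copy(d.textureIds.begin(), d.textureIds.end(), item->TextureIds());
    std::copy(d.cbufferIds.begin(), d.cbufferIds.end(), item->CBufferIds());
    return dst + item->Sizeof();
}

struct PackedItems
{
    std::unique_ptr<unsigned char[]> bytes;  // one allocation for every item
    std::size_t size = 0;
};

inline PackedItems BuildPacked(const std::vector<DrawItemDesc>& descs)
{
    PackedItems out;
    for (const DrawItemDesc& d : descs) out.size += Measure(d);   // pass 1: measure
    out.bytes.reset(new unsigned char[out.size]);                 // single allocation
    unsigned char* cursor = out.bytes.get();
    for (const DrawItemDesc& d : descs) cursor = Construct(cursor, d);  // pass 2: construct
    return out;
}
```

Walking the buffer afterwards is a matter of starting at `bytes.get()` and advancing by each item's `Sizeof()`.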

I also wouldn't include structures like this in any public header files -- the public API would likely just have a forward declaration of the type ("struct DrawItem;"), a factory for creating them, and functions that use/consume them. That way all this dangerous variable size stuff isn't even visible to the user of the code.
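
Sketched in a single file for brevity, that split might look like the following. The function names are hypothetical, and the struct is cut down to one array to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

using u16 = std::uint16_t;

// ---- What the public header might contain (hypothetical names) ----
struct DrawItem;  // opaque: callers never see the layout
DrawItem* CreateDrawItem(const u16* textureIds, u16 textureCount);
void      DestroyDrawItem(DrawItem* item);
u16       GetTextureId(const DrawItem* item, u16 index);

// ---- What stays private in the .cpp file ----
struct DrawItem
{
    u16 textureCount;
    const u16* TextureIds() const { return reinterpret_cast<const u16*>(this + 1); }
};

DrawItem* CreateDrawItem(const u16* textureIds, u16 textureCount)
{
    void* mem = ::operator new(sizeof(DrawItem) + sizeof(u16) * textureCount, std::nothrow);
    if (!mem) return nullptr;
    DrawItem* item = new (mem) DrawItem{textureCount};
    u16* ids = reinterpret_cast<u16*>(item + 1);  // storage right after the header
    for (u16 i = 0; i < textureCount; ++i) ids[i] = textureIds[i];
    return item;
}

void DestroyDrawItem(DrawItem* item)
{
    ::operator delete(item);  // DrawItem is trivially destructible; null is allowed
}

u16 GetTextureId(const DrawItem* item, u16 index)
{
    assert(index < item->textureCount);
    return item->TextureIds()[index];
}
```

Because user code only ever holds a `DrawItem*` to an incomplete type, it cannot copy an item, take its `sizeof`, or place one on the stack.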

Thank you, for some reason i totally missed it.

