Uniform buffer array alignment in HLSL vs SPIR-V

12 comments, last by GuyWithBeard 6 years, 6 months ago

Hi!

I have the following HLSL shader code:


struct Light
{
    float4  positionWS;
    //-------------------------- ( 16 bytes )
    float4  directionWS;
    //-------------------------- ( 16 bytes )
    float4  positionVS;
    //-------------------------- ( 16 bytes )
    float4  directionVS;
    //-------------------------- ( 16 bytes )
    float4  color;
    //-------------------------- ( 16 bytes )
    float   spotlightAngle;
    float   range;
    float   intensity;
    uint    type;
    //-------------------------- ( 16 bytes )
    bool    enabled;
    float3  padding1;
    //-------------------------- ( 16 bytes )
    //-------------------------- ( 16 * 7 = 112 bytes )
};

layout(set=0,binding=0) cbuffer cbPerRenderOperationData : register(b0)
{
    Light   gLights[MAX_LIGHT_COUNT];
};

MAX_LIGHT_COUNT is 16. On DX11/12 the code gets compiled to D3D bytecode, and on Vulkan I compile it to SPIR-V using the HLSL frontend for glslang. On DX everything works fine. Looking at the buffer in RenderDoc shows me that it occupies 112 * 16 = 1792 bytes as I would expect. However, on Vulkan only the first light is valid. The rest are garbage. RenderDoc shows that the uniform buffer occupies 2048 bytes rather than 1792, which suggests to me that the size of Light in SPIR-V is 128 and not 112.

Is there some power-of-two requirement in SPIR-V I am not aware of? So far I have just made sure every element in an array is a multiple of 16 bytes. Is that not enough?

Thanks!

Are you sure that you are using a 4-byte bool for the enabled property everywhere? I know an HLSL bool is 4 bytes, but do you use 4 bytes for it on the CPU side as well? And is it 4 bytes in SPIR-V?

Yes, it is a uint32_t on the CPU side. Also note that the struct in SPIR-V is LARGER, not smaller, which is what it would be if bool in SPIR-V were smaller than 4 bytes.
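For reference, a CPU-side mirror of the struct along the lines described above could look like the following minimal sketch. The names `Float4` and `LightCPU` are hypothetical (the thread does not show the actual CPU code). The static assertions encode the 112-byte size reported for the D3D path, where HLSL constant-buffer packing lets `padding1` share a 16-byte register with `enabled`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical CPU-side mirror of the HLSL Light struct (names are assumptions).
struct Float4 { float x, y, z, w; };

struct LightCPU {
    Float4        positionWS;
    Float4        directionWS;
    Float4        positionVS;
    Float4        directionVS;
    Float4        color;
    float         spotlightAngle;
    float         range;
    float         intensity;
    std::uint32_t type;
    std::uint32_t enabled;      // 4 bytes, matching the size of an HLSL bool
    float         padding1[3];  // stands in for the float3 padding
};

// Every member is 4-byte aligned in C++, so the members are tightly packed.
static_assert(sizeof(LightCPU) == 112, "expected 7 * 16 bytes");
static_assert(offsetof(LightCPU, enabled) == 96, "enabled opens the last 16-byte row");
static_assert(offsetof(LightCPU, padding1) == 100, "padding1 follows enabled directly");
```

Note that these checks only describe the C++ side. They say nothing about where the shader compiler places `padding1`, which is the open question in this thread.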

I guess your struct is being laid out with the std140 rules (while you want and expect std430); see the section on structs here:

 https://github.com/Microsoft/DirectXShaderCompiler/blob/master/docs/SPIR-V.rst

I never understood what std140 is (or was) good for.

Thanks JoeJ, that's interesting. However, looking at this document: https://www.khronos.org/opengl/wiki/Interface_Block_(GLSL) , under Buffer Backed - Memory Layout:

Quote

The rules for std140 layout are covered quite well in the OpenGL specification (OpenGL 4.5, Section 7.6.2.2, page 137). Among the most important is the fact that arrays of types are not necessarily tightly packed. An array of floats in such a block will not be the equivalent to an array of floats in C/C++. The array stride (the bytes between array elements) is always rounded up to the size of a vec4 (ie: 16-bytes). So arrays will only match their C/C++ definitions if the type is a multiple of 16 bytes.

Using that struct I should be within the requirements of std140, right? The size of Light is a multiple of 16 bytes.
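The array-stride rule quoted above can be sketched in a few lines of C++. This is a simplified illustration, not the full std140 rule set: the helper names are made up for this note, and it only covers elements whose size has already been computed (it ignores matrices and double-precision types).

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `multiple`.
constexpr std::size_t round_up(std::size_t value, std::size_t multiple) {
    return (value + multiple - 1) / multiple * multiple;
}

// Simplified std140 array stride: the element size rounded up to the
// size of a vec4 (16 bytes). Hypothetical helper, for illustration only.
constexpr std::size_t std140_array_stride(std::size_t element_size) {
    return round_up(element_size, 16);
}

static_assert(std140_array_stride(4) == 16, "a float array uses 16 bytes per element");
static_assert(std140_array_stride(112) == 112, "112 is already a multiple of 16");
static_assert(std140_array_stride(128) == 128, "128 is already a multiple of 16");
```

Because 112 is already a multiple of 16, the stride rule on its own would not turn a 112-byte element into a 128-byte one. The observed 2048 / 16 = 128 bytes per element therefore points at how the members are placed inside the struct.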

I think your individual floats might become aligned to 16 bytes inside a single struct, same for int and bool... something like this.

I don't think any current hardware benefits from this, and I wonder why it has been dragged over to VK. OpenGL has functions that help to get the correct offsets. Dealing with this destroyed the friendship between me and GL, sniff. :)

Can you enforce std430 somehow? (Using GLSL we can set the standard individually for each buffer.)

Just now, JoeJ said:

Can you enforce std430 somehow?

From the same article I linked above:

Quote

Note that [std430] can only be used with shader storage blocks, not uniform blocks.

 

Oh OK, I remember now...

I'm too inexperienced with graphics to help further, but what you could do is use 7 x float4 instead of structs and do the indexing / type conversion yourself.

I can confirm that it does work if I change the struct to this:


struct Light
{
    float4  positionWS;
    //-------------------------- ( 16 bytes )
    float4  directionWS;
    //-------------------------- ( 16 bytes )
    float4  positionVS;
    //-------------------------- ( 16 bytes )
    float4  directionVS;
    //-------------------------- ( 16 bytes )
    float4  color;
    //-------------------------- ( 16 bytes )
    float4  stuff1;
    //-------------------------- ( 16 bytes )
    float4  stuff2;
    //-------------------------- ( 16 bytes )
    //-------------------------- ( 16 * 7 = 112 bytes )
};

and then pack the data into the "stuff" vectors. The size is the same, i.e. 1792 bytes on both Vulkan and DX. Curious, I will have to dig a bit deeper...
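One possible way to do that packing on the CPU side is sketched below. The thread does not say how the values were actually distributed, so the slot assignment, the `PackedLight` type and the helper names are all assumptions. The two "stuff" vectors are stored as raw 32-bit words so that integer values keep their exact bit patterns.

```cpp
#include <cstdint>
#include <cstring>

struct Float4 { float x, y, z, w; };

// Hypothetical CPU-side mirror of the reworked struct.
struct PackedLight {
    Float4        positionWS, directionWS, positionVS, directionVS, color;
    std::uint32_t stuff1[4];  // assumed: spotlightAngle, range, intensity, type
    std::uint32_t stuff2[4];  // assumed: enabled, then three unused words
};
static_assert(sizeof(PackedLight) == 112, "expected 7 * 16 bytes");

// Reinterpret a float as its 32-bit pattern (memcpy avoids aliasing issues).
inline std::uint32_t float_to_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Inverse of float_to_bits.
inline float bits_to_float(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Write the scalar members into the two "stuff" vectors.
inline void pack_scalars(PackedLight& light, float spotlightAngle, float range,
                         float intensity, std::uint32_t type, bool enabled) {
    light.stuff1[0] = float_to_bits(spotlightAngle);
    light.stuff1[1] = float_to_bits(range);
    light.stuff1[2] = float_to_bits(intensity);
    light.stuff1[3] = type;
    light.stuff2[0] = enabled ? 1u : 0u;
    light.stuff2[1] = light.stuff2[2] = light.stuff2[3] = 0u;
}
```

With this arrangement the shader would read `stuff1.xyz` as ordinary floats and recover the integer members with the HLSL `asuint` intrinsic, e.g. `asuint(stuff1.w)` for the type.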

If the Vulkan rules are anything like OpenGL's, your problem is probably in your last line! You have a bool followed by a float3, and the float3 alignment rules force 3 units (12 bytes) of padding after the bool value!

float3 followed by a float = 12 bytes + 4 bytes = 16 bytes (a float3 is aligned to 16 bytes and a float to 4), but...

float followed by float3 = 4 bytes + 12 bytes padding + 12 bytes + 4 bytes padding = 32 bytes
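The arithmetic above can be checked against the original Light struct with a small layout calculator. This is a rough sketch under simplifying assumptions: it only models scalars and vectors of up to 16 bytes, std140 is reduced to "align each member, then pad the struct to 16", and HLSL constant-buffer packing is reduced to "a member may not straddle a 16-byte boundary". All names are hypothetical.

```cpp
#include <cstddef>

struct Member { std::size_t size; std::size_t align; };

constexpr std::size_t round_up(std::size_t value, std::size_t multiple) {
    return (value + multiple - 1) / multiple * multiple;
}

constexpr Member kScalar{4, 4};    // float, uint, bool
constexpr Member kFloat3{12, 16};  // the align field is only used by std140
constexpr Member kFloat4{16, 16};

// Members of the original Light struct, in declaration order.
constexpr Member kLightMembers[] = {
    kFloat4, kFloat4, kFloat4, kFloat4, kFloat4,  // positions, directions, color
    kScalar, kScalar, kScalar, kScalar,           // spotlightAngle, range, intensity, type
    kScalar,                                      // enabled
    kFloat3,                                      // padding1
};
constexpr std::size_t kLightMemberCount =
    sizeof(kLightMembers) / sizeof(kLightMembers[0]);

// std140-style offset of member `index` (index == count gives the unpadded end).
inline std::size_t std140_offset(const Member* members, std::size_t count,
                                 std::size_t index) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < count; ++i) {
        offset = round_up(offset, members[i].align);
        if (i == index) return offset;
        offset += members[i].size;
    }
    return offset;
}

// HLSL cbuffer-style offset: start a new 16-byte row only if the member
// would otherwise cross a 16-byte boundary.
inline std::size_t hlsl_offset(const Member* members, std::size_t count,
                               std::size_t index) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (offset % 16 + members[i].size > 16) offset = round_up(offset, 16);
        if (i == index) return offset;
        offset += members[i].size;
    }
    return offset;
}

// Size of one array element: the end of the last member, padded to 16 bytes.
inline std::size_t std140_struct_size(const Member* members, std::size_t count) {
    return round_up(std140_offset(members, count, count), 16);
}

inline std::size_t hlsl_struct_size(const Member* members, std::size_t count) {
    return round_up(hlsl_offset(members, count, count), 16);
}
```

Under these assumptions `padding1` lands at offset 100 with the HLSL packing (112 bytes per light) but at offset 112 with std140 (128 bytes per light), which matches the 1792 and 2048 byte buffers seen in RenderDoc. By the same rules, declaring the float3 before the 4-byte member would put them at offsets 96 and 108 in both models.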

