constant buffer data layout mismatch between app/shader

Started by
7 comments, last by MJP 10 years, 8 months ago

I started making a basic D3D11 application for Win8 without the Effects framework. Everything was OK until I tried to pass constant buffer data to the shader: the data I'm passing to the shader doesn't match the actual data the shader is reading.

I have a self made Matrix structure, and a struct containing a float value and the Matrix structure:


struct Matrix
{
    float m00, m01, m02, m03;
    float m10, m11, m12, m13;
    float m20, m21, m22, m23;
    float m30, m31, m32, m33;
};


struct ConstantBuffer
{
    float  value1;
    Matrix value2;
} g_CBData;

After creating the constant buffer object, I update the data in my shader using:


//---------- update constant buffer ----------//
Matrix m =
{
  1.0f, 1.0f, 1.0f, 1.0f,
  0.0f, 1.0f, 0.0f, 0.0f,
  0.0f, 0.0f, 1.0f, 0.0f,
  0.0f, 0.0f, 0.0f, 1.0f
};

g_CBData.value1 = 1.0f;
g_CBData.value2 = m;

D3D11_MAPPED_SUBRESOURCE mappedResource;
g_pDeviceContext->Map(g_pConstantBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedResource);
CopyMemory(mappedResource.pData, &g_CBData, sizeof(ConstantBuffer) );
g_pDeviceContext->Unmap(g_pConstantBuffer, 0);

g_pDeviceContext->VSSetConstantBuffers( 0, 1, &g_pConstantBuffer );


However, if I try to read the data I'm sending to the vertex shader using something like this:


cbuffer cbPerFrame : register (b0)
{
    float  value1;
    float4x4 value2;
};


struct VSInput
{
    float3 Pos : POSITION;
    float4 Col : COLOR;
};


struct VSOutput
{
    float4 Pos : SV_POSITION;
    float4 Col : COLOR;
};


VSOutput main( VSInput vs_in )
{
    VSOutput vs_out;

    vs_out.Pos = float4( vs_in.Pos, 1 );
    vs_out.Col = vs_in.Col * value2._m11; // <- checking "value1" and the values in "value2"

    return vs_out;
}

the data layout simply does not match what I'm sending to it!

The weird thing is that if the constant buffer contains ONLY the "float value1" or ONLY the "Matrix value2", the shader reads the corresponding data properly (the Matrix gets transposed, which as I've read is expected). But if I include both values in the constant buffer structure, the binary layout seems to get messed up inside the vertex shader, with no apparent logical order.

so.... what am I doing wrong?... how am I supposed to pass the data to the shader so I can read it properly?

Thanks!

"lots of shoulddas, coulddas, woulddas in the air, thinking about things they shouldda couldda wouldda donne, however all those shoulddas coulddas woulddas ran away when they saw the little did to come"
The padding/packing/alignment rules are different in HLSL and C++, so your two structs aren't equivalent. The HLSL matrix will be aligned to a 16-byte boundary, so your 'float' value needs to have something like 'float padding[3];' after it in the C++ version.
The transpose for the matrix is needed because HLSL packs matrices in column-major order by default. Also, you probably have a padding mismatch between your C++ struct and the shader constant buffer. In HLSL, if a (vector) value would cross a float4 boundary, it is moved to the start of the next float4. Your C++ struct should look something like this:



struct ConstantBuffer
{
    float  value1;
    float pad1, pad2, pad3; // dummy pads
    Matrix value2;
} g_CBData;

I haven't checked this. But you can check the offsets with the C++ offsetof macro and on the shader side by looking at the HLSL assembly (or shader reflection). Use fxc.exe with the option /Fx to dump the assembly.
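
The offsetof check mentioned above can be done at compile time. Here is a minimal sketch, assuming the padded struct from this reply and a shader-side cbuffer with one float followed by one float4x4. The `pad[3]` member name and the assertion messages are made up for illustration.

```cpp
#include <cstddef>  // offsetof

// Same Matrix type as in the original post: 16 floats, 64 bytes.
struct Matrix
{
    float m00, m01, m02, m03;
    float m10, m11, m12, m13;
    float m20, m21, m22, m23;
    float m30, m31, m32, m33;
};

// C++ mirror of the cbuffer, with explicit padding after value1.
struct ConstantBuffer
{
    float  value1;
    float  pad[3];   // hypothetical name; fills the rest of the first 16 bytes
    Matrix value2;   // has to start at byte 16 to line up with the float4x4
};

// Compile-time layout checks: the build fails if the struct drifts.
static_assert(offsetof(ConstantBuffer, value1) == 0,
              "value1 must be at offset 0");
static_assert(offsetof(ConstantBuffer, value2) == 16,
              "value2 must start on a 16-byte boundary");
static_assert(sizeof(ConstantBuffer) % 16 == 0,
              "constant buffer size must be a multiple of 16 bytes");
```

The offsets asserted here still have to be compared by hand against what the shader assembly or reflection reports. The asserts only guard the C++ side against accidental changes.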

Edit: Ahhh, beaten ;)

Aahhh!... thank you Hodgman, unbird.

yeah, it was the padding in the C++ struct that was causing all the trouble.

uhmmm... so I think I will need to find a nice way to force every variable in the C++ structure to be aligned to 16 bytes.

Thanks!

uhmmm... so I think I will need to find a nice way to force every variable in the C++ structure to be aligned to 16 bytes.

While not very programmatically sophisticated:


struct ConstantBuffer
{
    Matrix value2;
    float  value1;
} g_CBData;

I mean, since the order doesn't really matter (as long as you switch it in the shader, too), why not switch it around? I always try to sort my constants so that no dummy padding or anything else is needed.

Plus, if you don't want the matrix to be transposed, add:


#pragma pack_matrix( row_major )

at the top of every shader file you want the matrix to work as passed in.
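
If you would rather leave the shader on its default packing, another option is to transpose the matrix on the C++ side before copying it into the constant buffer. This is only a sketch: `Transpose` is a hypothetical helper written for the Matrix struct from the first post, not something from D3D11 or from this thread.

```cpp
// Same Matrix type as in the original post.
struct Matrix
{
    float m00, m01, m02, m03;
    float m10, m11, m12, m13;
    float m20, m21, m22, m23;
    float m30, m31, m32, m33;
};

// Hypothetical helper: returns the transpose of m (rows become columns).
constexpr Matrix Transpose(const Matrix& m)
{
    return Matrix{
        m.m00, m.m10, m.m20, m.m30,
        m.m01, m.m11, m.m21, m.m31,
        m.m02, m.m12, m.m22, m.m32,
        m.m03, m.m13, m.m23, m.m33
    };
}
```

You would then store `Transpose(m)` in the struct before the Map/CopyMemory/Unmap sequence. The pragma is less work per matrix, while this keeps the shader files untouched.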

just to complete the post (for anyone else who might run into the same problem and finds this post).

After some experiments with the data layout, it looks like when the buffer contains more than one value smaller than 16 bytes, the shader compiler packs them into the same 16-byte register (one float4 "row").

What I mean is:

If you want to pass the data properly to a constant buffer like this in your shader:


cbuffer cbPerFrame : register (b0)
{
    float value1;
    float value2;
    float4x4 value3;
};

I first thought of giving each independent value its own padding, like this:


struct ConstantBuffer
{
    float value1;
    float value1_dummy[3];
    float value2;
    float value2_dummy[3];
    Matrix value3;
};


Expecting the GPU would read the data with a layout like this:


V1 xx xx xx
V2 xx xx xx
V3 V3 V3 V3
V3 V3 V3 V3
V3 V3 V3 V3
V3 V3 V3 V3


However, it turned out that what the GPU is expecting is something like this:


V1 V2 xx xx
V3 V3 V3 V3
V3 V3 V3 V3
V3 V3 V3 V3
V3 V3 V3 V3


So, you would have to write your C++ struct like this:


struct ConstantBuffer
{
    float value1;
    float value2;
    float value2_dummy[2];
    Matrix value3;
};

=)

just wanted to point it out if anyone runs into the same trouble =P

Cheers!
I meant to post some links to the docs about this earlier, but posting from a phone makes me lazy ;)

packing rules, packoffset

The issue is that the float4x4 needs to be aligned to 16 bytes to satisfy the constant buffer packing rules (described in the link Hodgman provided). This is why the two float variables are packed together normally, and then 8 bytes of padding are inserted. I'd suggest reading that documentation and becoming familiar with the exact rules.
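
To make the rule concrete, here is a rough sketch of how the offsets in this thread come out. It is a simplification that only covers scalars, vectors of up to 16 bytes, and float4x4. It is not a full implementation of the documented rules, and all three helper names are hypothetical.

```cpp
#include <cstddef>

// Size in bytes of one constant register (one float4).
constexpr std::size_t kRegisterSize = 16;

// Hypothetical helper: rounds offset up to the next 16-byte boundary.
constexpr std::size_t AlignToRegister(std::size_t offset)
{
    return (offset + kRegisterSize - 1) / kRegisterSize * kRegisterSize;
}

// Hypothetical helper: where a scalar or vector of `size` bytes (<= 16) lands.
// It stays at `offset` if it fits in the current register, otherwise it is
// moved to the start of the next register.
constexpr std::size_t PlaceVector(std::size_t offset, std::size_t size)
{
    return (offset % kRegisterSize + size > kRegisterSize)
               ? AlignToRegister(offset)
               : offset;
}

// Hypothetical helper: a float4x4 is made of float4s, so it always starts
// at the beginning of a register.
constexpr std::size_t PlaceFloat4x4(std::size_t offset)
{
    return AlignToRegister(offset);
}

// Layout of { float value1; float value2; float4x4 value3; } from this thread.
static_assert(PlaceVector(0, 4) == 0,   "value1 at byte 0");
static_assert(PlaceVector(4, 4) == 4,   "value2 packed right after value1");
static_assert(PlaceFloat4x4(8) == 16,   "value3 starts at byte 16");
```

The gap between byte 8 (end of value2) and byte 16 (start of value3) is the 8 bytes of padding mentioned above.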

Note that on the C++ side of things, you can also achieve the same result by having the compiler align the matrix for you using __declspec(align(16)) when defining your Matrix type, or when declaring the Matrix member of your struct.
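
As a sketch of that idea, the version below uses the standard C++11 `alignas(16)` instead of the MSVC-specific `__declspec(align(16))`. That swap is my own assumption, made for portability, and should behave the same for this struct. The struct mirrors the two-float plus matrix cbuffer discussed above.

```cpp
#include <cstddef>  // offsetof

// Matrix type from the original post, now forced onto a 16-byte boundary.
// alignas(16) is used here in place of __declspec(align(16)).
struct alignas(16) Matrix
{
    float m00, m01, m02, m03;
    float m10, m11, m12, m13;
    float m20, m21, m22, m23;
    float m30, m31, m32, m33;
};

struct ConstantBuffer
{
    float  value1;
    float  value2;
    Matrix value3;  // the compiler inserts 8 bytes of padding before this
};

static_assert(alignof(Matrix) == 16,
              "Matrix must be 16-byte aligned");
static_assert(offsetof(ConstantBuffer, value3) == 16,
              "value3 must start on a 16-byte boundary");
static_assert(sizeof(ConstantBuffer) % 16 == 0,
              "constant buffer size must be a multiple of 16 bytes");
```

This removes the need for dummy members in this particular layout, but it only aligns the matrix. It does not reproduce the full HLSL packing rules, so checking the offsets is still worthwhile.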

Plus, if you don't want the matrix to be transposed, add:

#pragma pack_matrix( row_major )

at the top of every shader file you want the matrix to work as passed in.

You can also use row_major when declaring the float4x4 in your constant buffer, or use the /Zpr command-line option for fxc.exe.

This topic is closed to new replies.
