[DX12] Constant Buffer Packing

3 comments, last by ZachBethel 8 years, 2 months ago

Hey all,

I'm trying to wrap my head around some weird behavior I'm seeing in my compute shader. It looks to be related to packing and the nature of float4 vectors on GPUs, but it's super unintuitive.

Basically, I have a constant buffer (HLSL 5.1).


static const int SampleCount = 128;

struct OffsetData
{
    float2 Samples[SampleCount];
};

ConstantBuffer<OffsetData> offsetData : register(b1);

In C++, I have a similar layout:

struct OffsetData
{
    static const size_t SampleCount = 128;

    void Compute(float angle, float width, float height)
    {
        const float CoCMultiplier = CoCSizeMax * 0.05f;
        float x = 0.5f * CoCMultiplier * cosf(angle) * (height / width);
        float y = 0.5f * CoCMultiplier * sinf(angle);

        for (size_t i = 0; i < SampleCount; ++i)
        {
            float t = static_cast<float>(i) / (SampleCount - 1);
            samples[i][0] = Lerp(-x, x, t);
            samples[i][1] = Lerp(-y, y, t);
        }
    }

    float samples[SampleCount][2];
};

I then have a shader that renders the contents of the constant buffer to the screen. Basically, I map the current uv from [0, 1] to [0, SampleCount - 1] and return the constant buffer value at that index as the output color.

I get really weird results:

If I change everything to use floats (i.e. OffsetData.Samples is an array of SampleCount floats), I get this:

[attachment=30560:float1.PNG]

This is indexing the constant buffer from 0 to SampleCount - 1. It's basically stepping through my data in increments of 4, so only every fourth float shows up.

For float2:

[attachment=30561:float2.PNG]

Float3 (this one looks really weird, I don't even understand what happened):

[attachment=30562:float3.PNG]

And finally, the "correct" one where everything uses float4's:

[attachment=30563:float4.PNG]

Naturally, it seems like there's something inherent to vec4's going on. But it doesn't make sense. I should be able to index an array of floats, right? What am I missing?


Alright, I just found out something a little more illuminating.

If I change my constant buffer in HLSL to use a float2 array, but then force my C++ code to use a float4 array (where I only put data into the x and y components), everything works fine. What packing semantics am I missing? This just seems obtuse to me.

-Z

Arrays in HLSL constant buffers are not packed: every element is stored in its own float4 (16-byte) register. This way there is no extra overhead for array indexing in the shader. If you want to pack arrays more tightly, simply declare them as float4[] and cast to float2[] in the shader.
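To make the stride concrete, here is a minimal C++ sketch of a CPU-side struct that mirrors the 16-byte element stride described above. The names PaddedFloat2, OffsetDataGpu, SetSample and TightFloatIndexSeenByShader are made up for illustration and are not from the original posts. It assumes the float2 array is the only member of the constant buffer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side mirror of `float2 Samples[128]` in a constant buffer.
// alignas(16) rounds the struct size up to 16 bytes, so each element occupies
// one float4-sized register: 8 bytes of data followed by 8 bytes of padding.
struct alignas(16) PaddedFloat2
{
    float x;
    float y;
};

struct OffsetDataGpu
{
    static constexpr std::size_t SampleCount = 128;
    PaddedFloat2 samples[SampleCount];
};

static_assert(sizeof(PaddedFloat2) == 16, "each element must span one 16-byte register");
static_assert(sizeof(OffsetDataGpu) == 128 * 16, "array stride must be 16 bytes");

// Writes sample i into the padded layout.
inline void SetSample(OffsetDataGpu& data, std::size_t i, float x, float y)
{
    assert(i < OffsetDataGpu::SampleCount);
    data.samples[i].x = x;
    data.samples[i].y = y;
}

// For a plain `float` array in the constant buffer, shader element i lives at
// byte offset 16 * i. If the CPU uploads a tightly packed float array instead,
// the shader ends up reading this CPU-side index.
constexpr std::size_t TightFloatIndexSeenByShader(std::size_t shaderIndex)
{
    return shaderIndex * 16 / sizeof(float);
}
```

The last helper is one way to read the float screenshot: with a tightly packed upload, shader index 1 lands on CPU float 4, index 2 on CPU float 8, and so on, which matches the "increments of 4" behavior. The padded struct is essentially the float4-on-the-C++-side workaround with the unused components made explicit.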

As you discovered for yourself, and as Krzysztof mentioned, each element of an array in an HLSL constant buffer takes up a full float4. A possible solution would be to halve your element count and use the full float4 for storage, so that each element holds two of your float2 samples. Your example would turn into:


static const int SampleCount = 64;

struct OffsetData
{
    float4 Samples[SampleCount];
};

Then just index into it for the value you want:


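[unroll]
for (uint i = 0; i < SampleCount; ++i)
{
    [unroll]
    for (uint j = 0; j < 4; ++j)
    {
        // Component j of packed element i, read through the
        // ConstantBuffer<OffsetData> instance (offsetData in the original post).
        float currentDataPoint = offsetData.Samples[i][j];
    }
}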

You can update that loop to extract float2 values instead if that's what you're after; the approach is essentially the same.
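On the C++ side, the matching upload code might look roughly like the sketch below. It assumes two float2 samples are packed per float4, with even samples in xy and odd samples in zw. The names Float4, PackedOffsetData, StoreSample, kSampleCount and kPackedCount are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Plain 16-byte stand-in for an HLSL float4 (hypothetical helper type).
struct Float4
{
    float v[4];
};
static_assert(sizeof(Float4) == 16, "Float4 must match one constant buffer register");

constexpr std::size_t kSampleCount = 128;              // float2 samples, as in the original post
constexpr std::size_t kPackedCount = kSampleCount / 2; // float4 elements, i.e. 64

// CPU-side mirror of `float4 Samples[64]`.
struct PackedOffsetData
{
    Float4 samples[kPackedCount];
};

// Stores float2 sample i in packed element i / 2:
// even samples go to components 0 and 1 (xy), odd samples to 2 and 3 (zw).
inline void StoreSample(PackedOffsetData& data, std::size_t i, float x, float y)
{
    assert(i < kSampleCount);
    Float4& element = data.samples[i / 2];
    const std::size_t firstComponent = (i % 2) * 2;
    element.v[firstComponent] = x;
    element.v[firstComponent + 1] = y;
}
```

Under that packing assumption, the shader would read sample i from Samples[i / 2], taking .xy when i is even and .zw when i is odd. All 128 samples then fit in 64 registers (1024 bytes) instead of 128.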

Edit: Here's some documentation on the packing rules. It's for SM4, but I'm almost positive they haven't changed since then: https://msdn.microsoft.com/en-us/library/windows/desktop/bb509632(v=vs.85).aspx

Thanks for the link, I'll read up on it.
