Wrong HLSL buffer width

Started by
12 comments, last by komilll 4 years, 8 months ago

It's more of a discussion question than an actual problem with the code, because I managed to fix it.

I had a vertex shader with two constant buffers:


cbuffer MatrixBuffer : register(cb0)
{
    matrix worldMatrix;
    matrix viewMatrix;
    matrix projectionMatrix;
};

cbuffer ScreenSizeBuffer : register(cb1)
{
    float screenSize;
    float g_padding1;
    float g_padding2;
    float g_padding3;
};

Nsight says that MatrixBuffer is 192 bytes wide (correct) and that ScreenSizeBuffer is also 192 bytes wide (wrong, it should be 16). The screenSize variable is always equal to 1. Removing MatrixBuffer fixes the problem: ScreenSizeBuffer goes back to 16 bytes and the screenSize value is passed correctly. Why is that? Why is the previous buffer's width carried over to the next buffer?


One way to be safe against arcane packing rules is to use power-of-two structure sizes, so try adding a fourth matrix to MatrixBuffer so that the padded size is consistent between CPU and GPU. Even though the size of MatrixBuffer is already aligned optimally for both float and matrix, the C++ and Direct3D standards do not guarantee that padding is handled the same way, because padding can be added at the end for optimization purposes without breaking the rules. A common hardware use of power-of-two padding is accessing elements with bit shifts instead of multiplication. If the two sides disagree about the size, the data on the GPU side may be packed incorrectly when one buffer is stored after the other. This is completely nuts, but trial and error can prove correctness for a finite number of bits in a trivial upload.

You'll have to read the packing rules closely for clues on the exact reason, because I couldn't find it.
https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-packing-rules

There's no need to start packing any buffer types up to the next power of two. Some of HLSL's packing rules could certainly be described as 'arcane', but nothing ever causes a buffer to get aligned beyond 16 bytes in size.
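As a minimal sketch of the sizes being discussed (not taken from the linked project), the two cbuffers from the first post can be mirrored on the CPU and checked at compile time. The struct names, `roundUpTo16`, and `byteWidthFor` are hypothetical, and plain float arrays stand in for DirectXMath types so the snippet builds without the Windows SDK.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side mirror of MatrixBuffer: three 4x4 float matrices.
struct MatrixBufferCpu {
    float world[16];
    float view[16];
    float projection[16];
};

// Hypothetical CPU-side mirror of ScreenSizeBuffer: one float plus padding
// that fills the rest of a single 16-byte constant register.
struct ScreenSizeBufferCpu {
    float screenSize;
    float padding[3];
};

// Rounds a byte count up to the next multiple of 16.
constexpr std::size_t roundUpTo16(std::size_t bytes) {
    return (bytes + 15u) & ~static_cast<std::size_t>(15u);
}

// Hypothetical helper: the size one might request for a constant buffer
// that holds a struct of structSize bytes.
inline std::size_t byteWidthFor(std::size_t structSize) {
    assert(structSize > 0);
    return roundUpTo16(structSize);
}

// Both sizes are already multiples of 16, so no power-of-two padding is needed.
static_assert(sizeof(MatrixBufferCpu) == 192, "3 matrices * 64 bytes");
static_assert(sizeof(ScreenSizeBufferCpu) == 16, "4 floats * 4 bytes");
static_assert(roundUpTo16(sizeof(MatrixBufferCpu)) == 192, "already aligned");
```

If either `static_assert` fired, the CPU-side struct and the HLSL declaration would disagree, which is a different problem from the one Nsight reported here.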

If Nsight thinks the constant buffer called "ScreenSizeBuffer" is 192 bytes then chances are it has a bug. I expect if you analyse the DXBC emitted by FXC then you'll find the metadata at the top of the compiled shader telling you that cb0 is 192 bytes and cb1 is 16 bytes.

Adam Miles - Principal Software Development Engineer - Microsoft Xbox Advanced Technology Group

@Adam Miles

Maybe the problem was somewhere else then, and the buffer was still 16 bytes. However, that doesn't change the fact that the value in that cbuffer was always 1 (Nsight reported it, and the output also made me 100% sure that the value was 1).

What you're describing sounds like a bug in your code that we don't have enough information on to help you try and fix. How did you fix it?

Adam Miles - Principal Software Development Engineer - Microsoft Xbox Advanced Technology Group

I stated in the first post:

Quote

Removing MatrixBuffer fixes the problem: ScreenSizeBuffer goes back to 16 bytes and the screenSize value is passed correctly.

 

This whole topic is pointless if you don't show the full code. We can't tell what was wrong without seeing the full code.

https://github.com/komilll/LEngine/blob/1a415024affbcb838e3fecf9f5e99eb66477352b/LEngine/src/BlurShaderClass.cpp

https://github.com/komilll/LEngine/blob/1a415024affbcb838e3fecf9f5e99eb66477352b/LEngine/src/BaseShaderClass.cpp

https://github.com/komilll/LEngine/blob/1a415024affbcb838e3fecf9f5e99eb66477352b/LEngine/blurHorizontal.vs

BlurShaderClass inherits from BaseShaderClass.

In BlurShaderClass::SetShaderParameters I set the matrix buffer (through BaseShaderClass::SetShaderParameters) and ScreenSizeBuffer.

Oh my god, you read the RasterTek tutorials. They are acceptable at explaining Direct3D but really bad at C++.

I can't find what's wrong in the code but I noticed some other stuff:

Save your shaders as .fx files, add them to the project, and let Visual Studio compile them. Then just load the compiled shaders in your code.

Don't name classes Class and structs Type.

Don't store XMMATRIX and XMVECTOR in structs if you don't know what their alignment is. Those types require 16-byte alignment, so you would need to use _aligned_malloc for heap allocations. I don't think the buffer returned by Map is guaranteed to be aligned, so your program might crash. Edit: It's actually guaranteed to be aligned to 16 bytes for feature level 10.0 and higher. Point is, I didn't know and you probably didn't know either.
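To make the alignment point concrete, here is a rough sketch that uses a hypothetical `alignas(16)` stand-in for a type like XMMATRIX, so it compiles without DirectXMath. `AlignedMatrix`, `MatrixBufferAligned`, and `isAligned16` are illustrative names and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a SIMD matrix type that requires 16-byte alignment.
struct alignas(16) AlignedMatrix {
    float m[16];
};

// A struct containing such members inherits the 16-byte requirement.
struct MatrixBufferAligned {
    AlignedMatrix world;
    AlignedMatrix view;
    AlignedMatrix projection;
};

// Returns true when the address is a multiple of 16.
inline bool isAligned16(const void* p) {
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

static_assert(alignof(MatrixBufferAligned) == 16, "inherits member alignment");
static_assert(sizeof(MatrixBufferAligned) == 192, "3 matrices * 64 bytes");
```

A check such as `isAligned16(ptr)` could be run on a destination pointer before casting it to the aligned struct type. Storing plain floats instead removes the requirement altogether, at the cost of explicit load and store conversions.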

I started with the RasterTek tutorials, but yeah, I began noticing many flaws in their C++ and have been rewriting the code to use C++11. But since I started building the system with that convention, I have kept naming shader classes that way.

What's the difference between .fx and .vs/.ps? Can I modify an .fx file at runtime and continue to use it?

I had no idea; I wasn't thinking about it back then. However, do you still recommend using std::array<float> or a C-style array instead of storing XMFLOAT4 in the buffer struct? Or is it OK for feature level 10.0?

This topic is closed to new replies.
