Memory alignment problem (CPU and GPU)


Started by theScore July 13, 2014 06:22 PM

8 comments, last by theScore 9 years, 9 months ago

theScore

158

Author

July 13, 2014 06:22 PM

Hi!

I work with DX11 and my memory alignment is not correct (I am on 64-bit Windows). This is what I have on the CPU side (variable declarations):


__declspec(align(16))
struct VertexInfo
{
	XMFLOAT4A positions;
	XMFLOAT4A normals ;
	XMFLOAT4A texCoords ;
	
};

And my input layout on the CPU side:


D3D11_INPUT_ELEMENT_DESC layout[6];
	layout[indexLayout].SemanticName = "POSITION";
	layout[indexLayout].SemanticIndex = 0;
	layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;//16 bytes
	layout[indexLayout].InputSlot = 0;
	layout[indexLayout].AlignedByteOffset = 0;
	layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	layout[indexLayout].InstanceDataStepRate = 0;
	indexLayout++ ;

	layout[indexLayout].SemanticName = "NORMAL";
	layout[indexLayout].SemanticIndex = 0;
	layout[indexLayout].Format = DXGI_FORMAT_R16G16B16A16_FLOAT;//8 bytes
	layout[indexLayout].InputSlot = 0;
	layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
	layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	layout[indexLayout].InstanceDataStepRate = 0;
	indexLayout++ ;

	layout[indexLayout].SemanticName = "COLOR";
	layout[indexLayout].SemanticIndex = 0;
	layout[indexLayout].Format = DXGI_FORMAT_R8G8B8A8_UNORM; // use DXGI_FORMAT_R32G32B32A32_FLOAT instead? // 4 bytes
	layout[indexLayout].InputSlot = 0;
	layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
	layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	layout[indexLayout].InstanceDataStepRate = 0;
	indexLayout++ ;

	layout[indexLayout].SemanticName = "TEXCOORD";
	layout[indexLayout].SemanticIndex = 0;
	layout[indexLayout].Format = DXGI_FORMAT_R8G8B8A8_UNORM;//4 bytes
	layout[indexLayout].InputSlot = 0;
	layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
	layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	layout[indexLayout].InstanceDataStepRate = 0;
	indexLayout++ ;

	layout[indexLayout].SemanticName = "SV_POSITION";
	layout[indexLayout].SemanticIndex = 0;
	layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT; // keep at 32 bits? // 16 bytes
	layout[indexLayout].InputSlot = 0;
	layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
	layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	layout[indexLayout].InstanceDataStepRate = 0;
	indexLayout++ ;

	layout[indexLayout].SemanticName = "POSITION";
	layout[indexLayout].SemanticIndex = 1;
	layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT; //16 bytes
	layout[indexLayout].InputSlot = 0;
	layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
	layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
	layout[indexLayout].InstanceDataStepRate = 0;
	indexLayout++ ;

And this is my vertex shader input struct in HLSL:


struct vsIn
{
	float4 position : POSITION ;
 	float4 normal   : NORMAL ;
 	float4 color    : COLOR0 ;
	float4 texCoord : TEXCOORD0 ; 
};

Can somebody help me with the data alignment? It seems to be wrong... If you need more info or code, don't hesitate to ask!

MJP

20,295

July 13, 2014 06:54 PM

You seem to have 2 problems:

1. Your input layout declares the normals with a 16-bit float format (DXGI_FORMAT_R16G16B16A16_FLOAT) and the texture coordinates with an 8-bit UNORM format, but your vertex struct contains 32-bit floats for both. You should use DXGI_FORMAT_R32G32B32A32_FLOAT, since that corresponds to the XMFLOAT4 type that you're using.

2. Your input layout has a "COLOR" element, but this element is not present in your VertexInfo struct.
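One way to see both problems at once is to add up the byte size of every format the layout declares and compare it with the size of the struct. The sketch below is plain C++ with stand-in types so that it compiles without the DirectX headers; `Float4A`, `Element`, `kLayout` and `layoutBytes` are hypothetical names, and the sizes are the per-element byte counts from the comments in the posted layout.

```cpp
#include <cstddef>

// Stand-in for DirectX::XMFLOAT4A: four 32-bit floats, 16-byte aligned.
struct alignas(16) Float4A { float x, y, z, w; };

// Mirror of the VertexInfo struct from the question.
struct VertexInfo {
    Float4A positions;
    Float4A normals;
    Float4A texCoords;
};

// One row per input-layout element: semantic name and byte size of its format.
struct Element { const char* semantic; std::size_t formatBytes; };

constexpr Element kLayout[] = {
    {"POSITION",    16},  // DXGI_FORMAT_R32G32B32A32_FLOAT
    {"NORMAL",       8},  // DXGI_FORMAT_R16G16B16A16_FLOAT
    {"COLOR",        4},  // DXGI_FORMAT_R8G8B8A8_UNORM
    {"TEXCOORD",     4},  // DXGI_FORMAT_R8G8B8A8_UNORM
    {"SV_POSITION", 16},  // DXGI_FORMAT_R32G32B32A32_FLOAT
    {"POSITION",    16},  // DXGI_FORMAT_R32G32B32A32_FLOAT (SemanticIndex 1)
};

// Bytes per vertex that the layout describes when every element is
// appended directly after the previous one.
constexpr std::size_t layoutBytes() {
    std::size_t total = 0;
    for (const Element& e : kLayout) total += e.formatBytes;
    return total;
}
```

With these numbers `layoutBytes()` is 64 while `sizeof(VertexInfo)` is 48, and the layout has six elements for a struct with three members.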

The Blog | The Book

theScore

158

Author

July 13, 2014 07:38 PM

Hi!

Thanks for your answer. When you say "your vertex struct contains 32-bit floats", are you talking about the HLSL program?

If so, I tried using half4 instead of float4 for the normals (positions are 32-bit, so I left them as float4). I also put colors back in my code, but I get the same result.

Since I'm using 16-bit normals, should I use another type instead of XMFLOAT4A? (If so, which one could I use?)

swiftcoder

18,997

July 13, 2014 08:15 PM

Since I'm using 16-bit normals, should I use another type instead of XMFLOAT4A? (If so, which one could I use?)

XMHALF4 would likely be what you need.
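As a rough sketch of what that could look like, here is a CPU-side vertex whose members have the same sizes as the first four formats in the posted layout. It uses stand-in types so it compiles without the DirectX headers: `Float4` stands in for XMFLOAT4, `Half4` for XMHALF4, and `PackedVertex` is a hypothetical name. Storing color and texture coordinates as packed 32-bit values is an assumption that follows the R8G8B8A8_UNORM formats in the layout.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for DirectX::XMFLOAT4 (unaligned, 16 bytes).
struct Float4 { float x, y, z, w; };

// Stand-in for DirectX::PackedVector::XMHALF4: four 16-bit half floats (8 bytes).
struct Half4 { std::uint16_t x, y, z, w; };

// Hypothetical vertex whose members match the first four layout elements.
struct PackedVertex {
    Float4        position;  // DXGI_FORMAT_R32G32B32A32_FLOAT, 16 bytes
    Half4         normal;    // DXGI_FORMAT_R16G16B16A16_FLOAT,  8 bytes
    std::uint32_t color;     // DXGI_FORMAT_R8G8B8A8_UNORM,      4 bytes
    std::uint32_t texCoord;  // DXGI_FORMAT_R8G8B8A8_UNORM,      4 bytes
};

// Compile-time checks that the members sit where appended elements would.
static_assert(offsetof(PackedVertex, position) == 0,  "position at byte 0");
static_assert(offsetof(PackedVertex, normal)   == 16, "normal at byte 16");
static_assert(offsetof(PackedVertex, color)    == 24, "color at byte 24");
static_assert(offsetof(PackedVertex, texCoord) == 28, "texCoord at byte 28");
static_assert(sizeof(PackedVertex) == 32, "32 bytes per vertex");
```

The point of the static_asserts is that a mismatch between the struct and the layout fails at compile time instead of showing up as garbage on screen.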

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Zaoshi Kaba

8,470

July 13, 2014 09:02 PM

Why do you have POSITION twice? Why do you even have SV_POSITION there? It might be allowed, but it is really confusing.

21st Century Moose

13,459

July 13, 2014 09:36 PM

Why do you have POSITION twice? Why do you even have SV_POSITION there? It might be allowed, but it is really confusing.

The second POSITION has a SemanticIndex of 1, so that's legal and can be used if you're doing e.g. frame interpolation (although it's unusual not to see NORMAL duplicated as well). SV_POSITION in the input layout is totally unnecessary. The CPU-side vertex struct, the input layout and the HLSL vertex struct also don't match each other, which gives the impression that this code was probably copied and pasted from multiple different sources without understanding what's actually being done in each.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

theScore

158

Author

July 13, 2014 11:15 PM

There is a second POSITION semantic because the first one is used to display the scene itself, and the second one is used to store positions (I do the same thing for normals), which I read back in my pixel shader to write them into surfaces. The purpose is to do deferred rendering, which I did before with DirectX 9 and which works perfectly.

I am new to DirectX 11 and I am porting this deferred renderer from DX9 to DX11, which does not have the same requirements as DX9, notably for memory alignment, where I find it harder to locate documentation that fully explains the subject.

So I am trying to write my own code, but I am missing information. If I understand correctly, each type of data (positions, colors, ...) must be in the same order? And, if I understand correctly, the sizes they occupy in memory have to match too?

theScore

158

Author

July 14, 2014 01:09 AM

Maybe I should show you the vertex shader output structure:


struct vsOut
{
	float4 finalPos :SV_POSITION ;
	half4 normal   : NORMAL   ;
	half4 texCoord : TEXCOORD0 ;
	float4 color : COLOR0 ;
	float4 RTposition : POSITION1 ;
};

iedoc

2,554

July 16, 2014 02:04 PM

Just so you know, SemanticName can be whatever you want it to be, so the second position you have can have a little more descriptive name.

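HLSL packs constant buffers into 16-byte chunks, and although that rule applies to constant buffers rather than vertex buffers, the same idea holds for vertex data: the CPU-side layout has to match what the GPU expects byte for byte. For example, if your structure looks like this: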


struct VertexInfo
{
    int pos; // 4 bytes  
    XMFLOAT4A color; // 16 bytes, 16-byte aligned
};

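The above would cause trouble if color were an unaligned XMFLOAT4: the C++ compiler would put color at byte 4, right after pos, while HLSL's constant-buffer packing would start it at byte 16, because a float4 cannot straddle a 16-byte boundary. Nothing throws an error; the shader just reads the wrong bytes. (XMFLOAT4A is declared 16-byte aligned, so the compiler already inserts the 12 padding bytes for you.) You can make the layout explicit by either putting color first in the structure (12 bytes would be added to the end of the structure to complete the 16-byte alignment), or adding padding between the pos and color variables: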


struct VertexInfo
{
    int pos; // 4 bytes  
    int pad1; // 4 bytes  
    int pad2; // 4 bytes  
    int pad3; // 4 bytes  
    XMFLOAT4A color; // 16 bytes, 16-byte aligned
};
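To make the padding visible, here is a small sketch that compares where the C++ compiler places color in three variants of that struct. `Float4` and `Float4A` are stand-ins for XMFLOAT4 and XMFLOAT4A so the code compiles without the DirectX headers, and the struct and constant names are hypothetical.

```cpp
#include <cstddef>

struct Float4 { float x, y, z, w; };               // stand-in for XMFLOAT4
struct alignas(16) Float4A { float x, y, z, w; };  // stand-in for XMFLOAT4A

// No alignment requirement beyond 4 bytes: color follows pos directly.
struct Unaligned { int pos; Float4 color; };

// 16-byte aligned member: the compiler inserts 12 bytes of padding after pos.
struct Aligned { int pos; Float4A color; };

// Explicit padding: the same offsets as Aligned, but visible in the source.
struct Padded { int pos; int pad1; int pad2; int pad3; Float4 color; };

constexpr std::size_t kUnalignedColorOffset = offsetof(Unaligned, color);  // 4
constexpr std::size_t kAlignedColorOffset   = offsetof(Aligned, color);    // 16
constexpr std::size_t kPaddedColorOffset    = offsetof(Padded, color);     // 16
```

Only the `Unaligned` variant puts color at byte 4; the other two agree on byte 16 and a total size of 32 bytes.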

Looking at your code, your vertex structure looks like this:


struct VertexInfo
{
    XMFLOAT4A positions; // 16 bytes
    XMFLOAT4A normals ; // 16 bytes
    XMFLOAT4A texCoords ;    // 16 bytes
};

This is a total of 48 bytes, where each variable is 16 bytes. Now look at the input layout you provided:


D3D11_INPUT_ELEMENT_DESC layout[6];
    layout[indexLayout].SemanticName = "POSITION";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;//16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = 0;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "NORMAL";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R16G16B16A16_FLOAT;//8 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "COLOR";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R8G8B8A8_UNORM; // use DXGI_FORMAT_R32G32B32A32_FLOAT instead? // 4 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "TEXCOORD";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R8G8B8A8_UNORM;//4 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "SV_POSITION";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT; // keep at 32 bits? // 16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "POSITION";
    layout[indexLayout].SemanticIndex = 1;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT; //16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

That is a total of 64 bytes per vertex (16 + 8 + 4 + 4 + 16 + 16). The problem is that the input assembler, which feeds the vertex shader, expects 64 bytes for each vertex, but you provide only 48, so it will take the first 16 bytes of the next vertex. When you pass in all your vertices, the GPU sees one chunk of data, not an array of structs, and it reads from that chunk using the offsets from the input layout and the stride you pass to IASetVertexBuffers.

So say you have 3 vertices. Because of your vertex structure, you are passing in a total of 144 bytes (3 x 48).

If the GPU steps 64 bytes per vertex, it grabs the first 64 bytes for the first vertex: your actual first vertex plus 16 bytes of the next one. That leaves 80 bytes. For the second vertex it grabs another 64 bytes, which is the rest of your second vertex and part of the last vertex, leaving 16 bytes. For the third vertex it tries for another 64 bytes, but there are only 16 bytes left.
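The same walk can be written as a few lines of C++. This is only a model of the arithmetic above, under the assumption that the reader advances by the 64 bytes the layout describes; `Fetch` and `walkVertices` are hypothetical names, not part of Direct3D.

```cpp
#include <cstddef>
#include <vector>

// Byte range read for one vertex, and whether it runs past the end of the buffer.
struct Fetch {
    std::size_t begin;
    std::size_t end;
    bool overruns;
};

// Models reading `vertexCount` vertices from a buffer that was filled with
// `bytesStoredPerVertex` each, while the reader consumes `bytesReadPerVertex`.
std::vector<Fetch> walkVertices(std::size_t vertexCount,
                                std::size_t bytesStoredPerVertex,
                                std::size_t bytesReadPerVertex) {
    const std::size_t bufferBytes = vertexCount * bytesStoredPerVertex;
    std::vector<Fetch> fetches;
    for (std::size_t i = 0; i < vertexCount; ++i) {
        const std::size_t begin = i * bytesReadPerVertex;
        const std::size_t end = begin + bytesReadPerVertex;
        fetches.push_back({begin, end, end > bufferBytes});
    }
    return fetches;
}
```

Calling `walkVertices(3, 48, 64)` gives the ranges 0 to 64, 64 to 128 and 128 to 192; only the last one runs past the 144 bytes that were actually uploaded.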

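Your vs input also has to line up with your input layout: every input the shader declares needs an element with the same semantic in the layout. Here the layout has 6 elements while the vs input uses only 4, so two of them (SV_POSITION and POSITION1) are never read: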


struct vsIn
{
    float4 position : POSITION ;
    float4 normal : NORMAL ;
    float4 color : COLOR0 ;
    float4 texCoord : TEXCOORD0 ; 
};

So, if you're looking for everything to work with the vs input structure you have above, you'll want this code:


struct VertexInfo
{
    XMFLOAT4A positions; // 16 bytes
    XMFLOAT4A normals ; // 16 bytes
    XMFLOAT4A color;    // 16 bytes
    XMFLOAT4A texCoord; // 16 bytes
};


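D3D11_INPUT_ELEMENT_DESC layout[4]; // one element per member of VertexInfo
UINT indexLayout = 0;
    layout[indexLayout].SemanticName = "POSITION"; // must match the POSITION semantic in vsIn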
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;//16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = 0;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "NORMAL";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;//16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "COLOR";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;//16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

    layout[indexLayout].SemanticName = "TEXCOORD";
    layout[indexLayout].SemanticIndex = 0;
    layout[indexLayout].Format = DXGI_FORMAT_R32G32B32A32_FLOAT;//16 bytes
    layout[indexLayout].InputSlot = 0;
    layout[indexLayout].AlignedByteOffset = D3D11_APPEND_ALIGNED_ELEMENT;
    layout[indexLayout].InputSlotClass = D3D11_INPUT_PER_VERTEX_DATA;
    layout[indexLayout].InstanceDataStepRate = 0;
    indexLayout++ ;

You can leave your vs input structure the way it is.
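If you want the compiler to guard this for you, a few static_asserts next to the struct will catch any future mismatch. This is a sketch with a stand-in type (`Float4A` in place of XMFLOAT4A) so it compiles without the DirectX headers; the expected offsets assume the four 16-byte elements of the layout above are appended one after another.

```cpp
#include <cstddef>

// Stand-in for DirectX::XMFLOAT4A: four 32-bit floats, 16-byte aligned.
struct alignas(16) Float4A { float x, y, z, w; };

struct VertexInfo {
    Float4A positions;  // POSITION, expected at byte 0
    Float4A normals;    // NORMAL,   expected at byte 16
    Float4A color;      // COLOR,    expected at byte 32
    Float4A texCoord;   // TEXCOORD, expected at byte 48
};

// Each member must start where the corresponding layout element starts.
static_assert(offsetof(VertexInfo, positions) == 0,  "POSITION offset");
static_assert(offsetof(VertexInfo, normals)   == 16, "NORMAL offset");
static_assert(offsetof(VertexInfo, color)     == 32, "COLOR offset");
static_assert(offsetof(VertexInfo, texCoord)  == 48, "TEXCOORD offset");

// The struct size is also the stride to pass when binding the vertex buffer.
static_assert(sizeof(VertexInfo) == 64, "64 bytes per vertex");
```

`sizeof(VertexInfo)` is then the value to use as the stride in IASetVertexBuffers, so the struct, the layout and the stride all agree on 64 bytes.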

Also, SemanticName can be whatever you want, but a few semantic names are reserved as system-value semantics, and SV_POSITION is one of them. You can read more about them here: http://msdn.microsoft.com/en-us/library/windows/desktop/bb509647(v=vs.85).aspx#System_Value

Braynzar Soft - DirectX Tutorials

theScore

158

Author

July 17, 2014 06:56 PM

Thanks a lot, I just read your post and I am going to try to fix it.

What I find strange is that the information you gave me here is not easy to find. It should be easier, but thank you!
