Jump to content

  • Log In with Google      Sign In   
  • Create Account


#ActualHodgman

Posted 20 November 2012 - 05:47 AM

The 16-byte aligned version of the matrix probably utilizes the SSE2 instruction set (which is what they mean by P4 optimised).
The matrices are converted into this SSE-friendly format before the two matrix multiplications take place -- that's the significance, not the copying into the buffer afterwards.

Many engines that I've worked with do make use of 16-byte aligned vectors, matrices, and even floats, throughout all math-heavy parts of the code, so that SSE (or equivalent) instruction sets can be used in those parts of the code-base.

The ability to set multiple buffers in a single call to *SSetConstantBuffers is just an optimisation for the times when you would call it 3 times in a row. i.e. You can call it many times to bind resources to many different slots, but if you find that you're making multiple calls in a row, then you can instead pass an array to a single call to reduce the number of API calls you have to make.

2) Yes, updating the contents of a buffer is completely separate to binding that buffer to a slot. If you bind it to a slot, then update it, then it's still bound to the slot.
Basically all your call does is store a pointer to your buffer resource into an array, e.g. conceptually--
device.VertexShaderConstantBuffers[slot] = input;//just storing a pointer
// or
for( int i=0; i != NumBuffers; ++i )
  device.VertexShaderConstantBuffers[i+StartSlot] = input[i];

To match up the slots correctly when binding buffers, you've got 2 choices:
a) After you compile your shader, reflect on the binary to find which slots your named cbuffers have ended up in.
b) When writing your shaders, manually specify which slot a cbuffer is designated with the register keyword
cbuffer MyData : register(b7) // "MyData" buffer should be bound to slot #7
{
};

#2Hodgman

Posted 20 November 2012 - 05:44 AM

The 16-byte aligned version of the matrix probably utilizes the SSE2 instruction set (which is what they mean by P4 optimised).
The matrices are converted into this SSE-friendly format before the two matrix multiplications take place -- that's the significance, not the copying into the buffer afterwards.

Many engines that I've worked with do make use of 16-byte aligned vectors, matrices, and even floats, throughout all math-heavy parts of the code, so that SSE (or equivalent) instruction sets can be used in those parts of the code-base.

The ability to set multiple buffers in a single call to *SSetConstantBuffers is just an optimisation for the times when you would call it 3 times in a row. i.e. You can call it many times to bind resources to many different slots, but if you find that you're making multiple calls in a row, then you can instead pass an array to a single call to reduce the number of API calls you have to make.

2) Yes, updating the contents of a buffer is completely separate to binding that buffer to a slot. If you bind it to a slot, then update it, then it's still bound to the slot.
Basically all your call does is store a pointer to your buffer resource in an array, e.g. conceptually--
device.VSConstantBuffers[0] = g_pConstantBuffer10;//just storing a pointer

To match up the slots correctly when binding buffers, you've got 2 choices:
a) After you compile your shader, reflect on the binary to find which slots your named cbuffers have ended up in.
b) When writing your shaders, manually specify which slot a cbuffer is designated with the register keyword
cbuffer MyData : register(b7) // "MyData" buffer should be bound to slot #7
{
};

#1Hodgman

Posted 20 November 2012 - 05:43 AM

The 16-byte aligned version of the matrix probably utilizes the SSE2 instruction set (which is what they mean by P4 optimised).
The matrices are converted into this SSE-friendly format before the two matrix multiplications take place -- that's the significance, not the copying into the buffer afterwards.

Many engines that I've worked with do make use of 16-byte aligned vectors, matrices, and even floats, throughout all math-heavy parts of the code, so that SSE (or equivalent) instruction sets can be used in those parts of the code-base.

The ability to set multiple buffers in a single call to *SSetConstantBuffers is just an optimisation for the times when you would call it 3 times in a row. i.e. You can call it many times to bind resources to many different slots, but if you find that you're making multiple calls in a row, then you can instead pass an array to a single call to reduce the number of API calls you have to make.

2) Yes, updating the contents of a buffer is completely separate to binding that buffer to a slot. If you bind it to a slot, then update it, then it's still bound to the slot.
Basically all your call does is store a pointer to your buffer resource in an array, e.g.
device.VSConstantBuffers[0] = g_pConstantBuffer10;//just storing a pointer
To match up the slots correctly when binding buffers, you've got 2 choices:
1) After you compile your shader, reflect on the binary to find which slots your named cbuffers have ended up in.
2) When writing your shaders, manually specify which slot a cbuffer is designated with the register keyword
cbuffer MyData : register(b7) // "MyData" buffer should be bound to slot #7
{
};

PARTNERS