I am rendering quad particles that are billboarded to always face the camera, whichever way you look. Until now I used a single rotation matrix shared by all 1000 particles, while the translation matrix that moves each particle into place was computed on the CPU.
In HLSL:
cbuffer PerInstanceBuffer : register(b10)
{
    matrix translationMatrix[1000];
};

struct VertexShaderInput
{
    float3 pos : POSITION;
    uint instanceID : SV_INSTANCEID;
};

// model and instancingOn live in another cbuffer (omitted here)
float4 main(VertexShaderInput input) : SV_POSITION
{
    // promote the float3 position to a point (w = 1) before the matrix multiply
    float4 pos = mul(model, float4(input.pos, 1.0f));
    if (instancingOn)
    {
        pos = mul(translationMatrix[input.instanceID], pos);
    }
    ...
}
Is it a better idea to just send the 1000 position offsets instead, and have a function in the shader that computes (or rather just defines, since no real computation is required in this case) the translation matrix, on the theory that the pipelined nature of the GPU will make it faster than doing it on the CPU?
Like this:
cbuffer PerInstanceBuffer: register(b10)
{
float3 instancePosition[1000]; // offsets, not full matrices (each element still pads to 16 bytes in a cbuffer)
}
struct VertexShaderInput
{
float3 pos : POSITION;
uint instanceID : SV_INSTANCEID;
};
matrix GetTranslationMatrix(float3 offset)
{
    // row-by-row constructor; with mul(M, v) the offset belongs in the last column
    return matrix(1, 0, 0, offset.x,
                  0, 1, 0, offset.y,
                  0, 0, 1, offset.z,
                  0, 0, 0, 1);
}
float4 main(VertexShaderInput input) : SV_POSITION
{
    float4 pos = mul(model, float4(input.pos, 1.0f));
    if (instancingOn)
    {
        pos = mul(GetTranslationMatrix(instancePosition[input.instanceID]), pos);
    }
    ...
}
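For context on the "no computation is required" point: multiplying a pure translation matrix by a point with w = 1 is equivalent to a plain vector add, so the matrix can be skipped entirely. A minimal sketch of that variant, assuming the same float3 instancePosition array and instancingOn flag as above:

```hlsl
float4 pos = mul(model, float4(input.pos, 1.0f));
if (instancingOn)
{
    // a translation matrix times (x, y, z, 1) just adds the offset,
    // so no matrix needs to be built at all
    pos.xyz += instancePosition[input.instanceID];
}
```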
Thanks.