### #ActualC0lumbo

Posted 23 December 2012 - 02:45 AM

I do think the software approach will beat the vertex shader approach, so you should probably go ahead and give that a shot.

But - I think you should be able to manage more than 16 cubes per batch. Firstly, you don't need a full 4x4 matrix - you can switch to 4x3 matrices (three 4-component rows per cube) and implicitly assume that the last row is 0, 0, 0, 1. You might need to transpose your matrices to achieve this.
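As a rough host-side sketch of that packing: the code below assumes your matrices are stored column-major as 16 floats (the usual OpenGL layout) and copies out the first three rows, which is where the transpose comes in. `Mat4`, `PackedRows` and `packAffineRows` are made-up names for illustration, not from any library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical types: a column-major 4x4 matrix (OpenGL-style float[16])
// and the three 4-float rows that would be uploaded per cube.
using Mat4 = std::array<float, 16>;
using PackedRows = std::array<float, 12>; // 3 rows x 4 floats

// Copies rows 0..2 of a column-major affine matrix into three consecutive rows.
// Row 3 is dropped because it is assumed to be (0, 0, 0, 1).
PackedRows packAffineRows(const Mat4& m) {
    // In column-major storage the last row lives at indices 3, 7, 11, 15.
    assert(m[3] == 0.0f && m[7] == 0.0f && m[11] == 0.0f && m[15] == 1.0f);

    PackedRows rows{};
    for (std::size_t r = 0; r < 3; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            rows[r * 4 + c] = m[c * 4 + r]; // element (row r, column c)
        }
    }
    return rows;
}
```

With this layout the translation ends up in the fourth component of each row, which is what lets the shader get away with a single `dot` per output component.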

Also, I don't think you need two arrays of matrices. I assume one of the matrices is for your normals, and one for the positions. But you can usually use the same matrix for both and simply ignore the translation for the normals (use w = 0 instead of w = 1). That holds as long as the matrix only contains rotation, translation and uniform scale.
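To make the w = 1 versus w = 0 point concrete, here is a small CPU-side sketch of the same per-row dot products. It assumes the matrix is already stored as three 4-component rows; `Vec3`, `Row`, `Rows3`, `dotRow` and `transform` are illustrative names only.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<float, 3>;
using Row = std::array<float, 4>;
using Rows3 = std::array<Row, 3>; // rows 0..2 of an affine matrix

// Dot product of one matrix row with the 4-component vector (v, w).
float dotRow(const Row& r, const Vec3& v, float w) {
    return r[0] * v[0] + r[1] * v[1] + r[2] * v[2] + r[3] * w;
}

// w = 1 treats v as a position (translation applied);
// w = 0 treats v as a direction such as a normal (translation ignored).
Vec3 transform(const Rows3& m, const Vec3& v, float w) {
    assert(w == 0.0f || w == 1.0f);
    return {dotRow(m[0], v, w), dotRow(m[1], v, w), dotRow(m[2], v, w)};
}
```

For a matrix that rotates 90 degrees about Z and translates by (10, 20, 30), the position (1, 0, 0) lands at (10, 21, 30), while the normal (1, 0, 0) becomes (0, 1, 0), so the one matrix serves both.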

So, by my reckoning, you should be able to manage 128/3 = 42 cubes per batch (rounding down). Your GLSL code might end up looking a bit like this (untested, not compiled):

// Three rows per cube; GLSL spells the type vec4 (float4 is HLSL/Cg).
uniform vec4 g_vMatrices[42*3];

...

// Positions use w = 1.0 so the translation in each row is applied.
vWorldPos.x = dot(g_vMatrices[iCubeIndexAttribute*3], vec4(vPositionAttribute, 1.0));
vWorldPos.y = dot(g_vMatrices[iCubeIndexAttribute*3+1], vec4(vPositionAttribute, 1.0));
vWorldPos.z = dot(g_vMatrices[iCubeIndexAttribute*3+2], vec4(vPositionAttribute, 1.0));
// Normals use w = 0.0 so the translation is ignored.
vWorldNormal.x = dot(g_vMatrices[iCubeIndexAttribute*3], vec4(vNormalAttribute, 0.0));
vWorldNormal.y = dot(g_vMatrices[iCubeIndexAttribute*3+1], vec4(vNormalAttribute, 0.0));
vWorldNormal.z = dot(g_vMatrices[iCubeIndexAttribute*3+2], vec4(vNormalAttribute, 0.0));

Oops - just realised you'd then need to transform the positions by the viewproj matrix, which will take up a few more uniforms, so you'll end up with only 41 cubes per batch (and the array shrinks to 41*3 entries).
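The batch arithmetic can be checked with a tiny helper. This is only a sketch: it takes the 128-vector budget used above and assumes the viewproj matrix is uploaded as a full 4x4, i.e. four 4-component uniforms, which the post does not spell out. `cubesPerBatch` is a hypothetical name.

```cpp
// How many cubes fit in one batch, given the total uniform-vector budget,
// the vectors reserved for other uniforms, and the vectors needed per cube.
constexpr int cubesPerBatch(int uniformVectors, int reservedVectors, int vectorsPerCube) {
    return (uniformVectors - reservedVectors) / vectorsPerCube; // integer division rounds down
}

// Nothing reserved: 128 / 3 = 42 cubes.
static_assert(cubesPerBatch(128, 0, 3) == 42, "42 cubes without a viewproj matrix");
// Four vectors reserved for an assumed 4x4 viewproj: (128 - 4) / 3 = 41 cubes.
static_assert(cubesPerBatch(128, 4, 3) == 41, "41 cubes with a 4x4 viewproj matrix");
```

That is 41*3 + 4 = 127 vectors in use, one short of the budget.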

### #2C0lumbo

Posted 23 December 2012 - 02:43 AM

### #1C0lumbo

Posted 23 December 2012 - 02:42 AM