Previous thread on this performance test
Highlights:
1) It's .NET/C#.
2) It's a 3D game engine.
3) The objects are, as stated, 3D models (boxes, untextured, unlit, although both are possible).
4) The boxes are placed at regular intervals all around the camera.
5) None of them are instanced, although they all share the same mesh. None of them move, although the game engine has to support moving objects.
6) Using DirectX 9 (I want my engine to support from DX 9 up).
7) The speed profiler is SlimTune.
8) One call is made per model, and about 48% of the code's time is spent in this parallel loop. (The profiler, of course, adds its own overhead, so the number may not be fully accurate.) Millisecond timing does not seem to be available.
9) The parallel code tests faster than the single-threaded code: single-threaded gets about 22 FPS, compared to about 48 FPS multithreaded, when x = 27.
10) The MultiThreadedRenderer queues commands in a ConcurrentStack when told to Render and executes them on a call to .Finish( ) (not shown).
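Since the renderer from point 10 is not shown, here is a minimal sketch of the general pattern it describes: worker threads push commands onto a ConcurrentStack, and a later call drains and executes them. This is not the actual MultiThreadedRenderer. The names DeferredRendererSketch and DeferredRendererDemo are hypothetical, commands are modelled as plain Action delegates, and it assumes Finish is called from a single thread after the parallel loop has completed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for the engine's MultiThreadedRenderer (not the real class).
public sealed class DeferredRendererSketch
{
    // Worker threads push commands here; ConcurrentStack supports concurrent Push.
    private readonly ConcurrentStack<Action> commands = new ConcurrentStack<Action>();

    // Safe to call from any thread: records the work instead of executing it.
    public void Render(Action drawCommand)
    {
        commands.Push(drawCommand);
    }

    // Assumed to run on one thread: pops and executes every queued command.
    // Returns the number of commands executed.
    public int Finish()
    {
        int executed = 0;
        Action command;
        while (commands.TryPop(out command))
        {
            command();
            executed++;
        }
        return executed;
    }
}

public static class DeferredRendererDemo
{
    // Queues `count` dummy draw calls in parallel, then executes them serially.
    public static int Run(int count)
    {
        var renderer = new DeferredRendererSketch();
        int drawCalls = 0;
        // The lambdas are only queued here; drawCalls is not touched until Finish.
        Parallel.For(0, count, i => renderer.Render(() => drawCalls++));
        renderer.Finish();
        return drawCalls;
    }
}
```

Note that a stack pops in last-in, first-out order, so commands queued this way do not execute in submission order. Every Push also contends on the same stack head, which is one place where the parallel loop could lose time.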
Code:
System.Threading.ThreadLocal<Effect> effect =
    new System.Threading.ThreadLocal<Effect>( ) ;
var drawn = models.AsParallel( ).AsUnordered( ).Where(
    model => IntersectionTests.AABBXFrustum( model.BoundingAABB, frustum )
).ToList( ) ;
// These two parallel loops are each as fast, +/- 1-2 FPS
/* drawn.AsParallel( ).AsUnordered( ).ForAll(
    model => */
System.Threading.Tasks.Parallel.For(
    0, drawn.Count, t =>
    {
        Model model = drawn[ t ] ;
        if ( effect.Value != model.UseEffect )
        {
            effect.Value = model.UseEffect ;
            foreach (var light in scene.Lights)
                Renderer.Render( effect.Value, light ) ;
        }
        model.Draw( deltaTime, Renderer );
    }
) ;
Questions:
1) Where does the slowdown come from?
2) How can I reduce or eliminate it?
Thoughts:
1) Cache coherency issue?
2) Inefficient partitioning?
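One way the partitioning thought could be checked is to hand each worker a contiguous index range instead of a single index per delegate call. The sketch below uses Partitioner.Create from System.Collections.Concurrent for that. RangePartitionSketch and ProcessInRanges are hypothetical names, and perItem stands in for the per-model body of the loop above. Whether this actually helps here is untested.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class RangePartitionSketch
{
    // Processes indices [0, count) in contiguous chunks.
    // Returns how many indices were visited in total.
    public static int ProcessInRanges(int count, Action<int> perItem)
    {
        // Partitioner.Create rejects an empty range, so handle it up front.
        if (count <= 0)
            return 0;

        int visited = 0;
        // Each range is a Tuple<int, int> covering [Item1, Item2).
        Parallel.ForEach(Partitioner.Create(0, count), range =>
        {
            // One delegate call per chunk; the inner loop is plain sequential code.
            for (int i = range.Item1; i < range.Item2; i++)
                perItem(i);
            Interlocked.Add(ref visited, range.Item2 - range.Item1);
        });
        return visited;
    }
}
```

With this shape, per-thread state such as the current effect could live in a local variable inside the range lambda rather than in a ThreadLocal, which would be a second thing to measure.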