By raw draw calls I mean the plain number of draw calls, without any optimisations such as instancing or culling.
EDIT: I also mean draw calls with a simple effect and vertex buffer, like a textured rectangle.
My engine can only handle about 1k draw calls per frame while sustaining a smooth framerate. I have been profiling the hell out of it with Intel VTune, AMD CodeAnalyst, PIX, and the default VS profiler, but I just can't seem to find the problem!
Another peculiar thing is that a single ID3D10EffectPass::Apply() call seems to take longer than most Draw() calls.
My tests suggest that ID3D10EffectPass::Apply() doesn't behave the way MSDN describes it ("Set the state contained in a pass to the device.").
If I call Apply() before I set my shader variables, the variables won't be updated for the draw. This implies that even when a technique contains only one pass, we cannot Apply() once per material but are forced to do it per mesh.
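To make the ordering issue concrete, here is a toy model in plain C++ with no Direct3D involved. FakeDevice, FakePass, and the two draw functions are hypothetical names, and the idea that Apply() copies the current variable values to the device is an assumption based on what I observed, not a description of how the effects framework is implemented.

```cpp
#include <vector>

// Toy model, NOT the real D3D10 effects API: all names are hypothetical.
// Assumption: Apply() copies the current variable values to the device,
// so anything set after Apply() is not seen by the next draw.
struct FakeDevice {
    float boundColor = 0.0f;      // value the "GPU" currently sees
    std::vector<float> drawn;     // value used by each draw call
    void Draw() { drawn.push_back(boundColor); }
};

struct FakePass {
    float color = 0.0f;           // CPU-side shader variable
    void SetColor(float c) { color = c; }
    void Apply(FakeDevice& dev) const { dev.boundColor = color; }
};

// Apply once per material, then set per-mesh variables: the draws use stale values.
std::vector<float> DrawApplyPerMaterial(const std::vector<float>& meshColors) {
    FakeDevice dev;
    FakePass pass;
    pass.Apply(dev);
    for (float c : meshColors) {
        pass.SetColor(c);
        dev.Draw();
    }
    return dev.drawn;
}

// Set per-mesh variables, then Apply per mesh: the draws use the new values.
std::vector<float> DrawApplyPerMesh(const std::vector<float>& meshColors) {
    FakeDevice dev;
    FakePass pass;
    for (float c : meshColors) {
        pass.SetColor(c);
        pass.Apply(dev);
        dev.Draw();
    }
    return dev.drawn;
}
```

In this model the per-material version draws every mesh with the value that was current when Apply() ran, which matches what I see in my engine.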
If anyone has made a reasonably performance-conscious rendering engine for PC with Direct3D 10+, can you please check how many raw draw calls it can handle, and whether the CPU spends more time in Apply() than in Draw()?
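If you want to compare the two without a profiler, something like the following rough sketch could work. CallTimes and MeasureCalls are hypothetical helpers, and the two callables stand in for your own Apply() and Draw() wrappers. It only measures CPU time spent inside each call, so any work the driver defers until later will not show up here.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper: accumulated CPU time spent in two kinds of calls.
struct CallTimes {
    double applyMs = 0.0;   // total milliseconds spent in the "apply" callable
    double drawMs = 0.0;    // total milliseconds spent in the "draw" callable
    std::size_t calls = 0;  // number of apply/draw pairs issued
};

// Runs apply() then draw() n times and times each call separately.
CallTimes MeasureCalls(std::size_t n,
                       const std::function<void()>& apply,
                       const std::function<void()>& draw) {
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;
    CallTimes t;
    for (std::size_t i = 0; i < n; ++i) {
        const auto t0 = Clock::now();
        apply();
        const auto t1 = Clock::now();
        draw();
        const auto t2 = Clock::now();
        t.applyMs += Ms(t1 - t0).count();
        t.drawMs += Ms(t2 - t1).count();
        ++t.calls;
    }
    return t;
}
```

You would pass lambdas that call your pass's Apply() and your device's Draw(), then compare applyMs with drawMs for the same number of calls.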
Can you Apply() per material instead of per mesh when a technique contains only one pass? (Not according to my tests, although a lot of people say this is an optimization I could make.)
And does anyone have any idea why my engine can only manage 1k draw calls? I have checked the computational complexity of all my algorithms, so that is probably not the cause!
Edited by Xcrypt, 17 May 2012 - 09:38 AM.