Using the ARB_multi_draw_indirect command
When it comes to performance and usage, this is worth a watch: Approaching Zero Driver Overhead.
That's an awesome presentation.
As a rule of thumb, if you're blocking on swap/flip/present, you're either waiting for a vblank if you're vsyncing, or you're waiting for the GPU to catch up.
then an additional ~8 ms is spent blocking on buffer swap (= waiting for the driver to complete the queued commands, e.g. C code (or something) in the driver)
Measurements of "GPU load" are very misleading. You can be bottlenecked by the GPU without seeing it report 100% load...
Threaded optimization off:
51 FPS
Threaded optimization on:
61 FPS
GPU load: ~71%
I did not change a single line in my program. Turning threaded optimization on simply moves the cost of the render calls to the driver's server thread (see the slides I posted above), and if the server thread lags behind, the application ends up blocking on buffer swaps.
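To see how much of a frame's CPU time goes to issuing render calls versus sitting in the swap, something like the rough sketch below might help. The names `FrameTiming` and `time_frame` are made up for illustration, and the two callbacks are assumed to wrap your own render function and whatever swap call your windowing layer provides. This only measures CPU-side wall time; it says nothing about how long the GPU itself took.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// CPU-side wall time for one frame, split into two parts.
struct FrameTiming {
    double submit_ms;  // time spent issuing render calls
    double swap_ms;    // time spent blocked inside the buffer swap
};

// `render` and `swap_buffers` are hypothetical callbacks supplied by the
// application; `swap_buffers` would wrap the platform's swap/present call.
inline FrameTiming time_frame(const std::function<void()>& render,
                              const std::function<void()>& swap_buffers) {
    assert(render && swap_buffers);  // both callbacks must be set

    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    const auto t0 = clock::now();
    render();
    const auto t1 = clock::now();
    swap_buffers();
    const auto t2 = clock::now();

    return {ms(t1 - t0).count(), ms(t2 - t1).count()};
}
```

If `swap_ms` is consistently large with vsync off, that points at the GPU or the driver thread lagging behind rather than at your own code.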
Remember to use timestamp queries for GPU timing.
tmason, I think what you want to do is put all of your material parameters together in an array, and load it into a single buffer. Then you want to make a single MultiDraw call, and use DrawID in the shader to choose the correct material. That way you won't have that loop of calls anymore.
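A minimal sketch of the CPU side of that idea, under a few assumptions: `MeshRange` and `build_commands` are hypothetical names, every mesh shares one vertex and index buffer, and materials are stored in the same order as the draws so that the draw index doubles as the material index. The command struct follows the layout that `glMultiDrawElementsIndirect` reads. The shader snippet assumes `ARB_shader_draw_parameters` is available for `gl_DrawIDARB`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One entry of the indirect buffer read by glMultiDrawElementsIndirect.
struct DrawElementsIndirectCommand {
    std::uint32_t count;          // number of indices for this draw
    std::uint32_t instanceCount;  // 1 for non-instanced draws
    std::uint32_t firstIndex;     // offset into the index buffer, in indices
    std::int32_t  baseVertex;     // value added to each fetched index
    std::uint32_t baseInstance;
};
static_assert(sizeof(DrawElementsIndirectCommand) == 20,
              "commands must be tightly packed when stride is 0");

// Hypothetical description of where one mesh lives in the shared buffers.
struct MeshRange {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
};

// Builds one command per mesh. Draw i corresponds to meshes[i], so the
// shader can use the draw index to look up material i.
std::vector<DrawElementsIndirectCommand>
build_commands(const std::vector<MeshRange>& meshes) {
    std::vector<DrawElementsIndirectCommand> cmds;
    cmds.reserve(meshes.size());
    for (const MeshRange& m : meshes) {
        assert(m.indexCount % 3 == 0);  // sketch assumes GL_TRIANGLES
        cmds.push_back({m.indexCount, 1u, m.firstIndex, m.baseVertex, 0u});
    }
    return cmds;
}

// Vertex shader sketch: forwards the draw index so the fragment shader can
// index its material array with v_material.
const char* const kVertexShaderSketch = R"glsl(
#version 430
#extension GL_ARB_shader_draw_parameters : require
layout(location = 0) in vec3 position;
flat out int v_material;
void main() {
    v_material = gl_DrawIDARB;
    gl_Position = vec4(position, 1.0);
}
)glsl";
```

With the commands uploaded to a buffer bound to `GL_DRAW_INDIRECT_BUFFER`, the loop of draw calls collapses to a single `glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, nullptr, drawcount, 0)`, where `drawcount` is the number of commands and a stride of 0 means they are tightly packed.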
Great, worth a shot. And this seems like something I can do even without using "ARB_multi_draw_indirect".
Of course, that command seems awesome but I can experiment slowly.