I'm working on geometry sorting by material, distance, and transparency. So far I have a sort key that I plan to perform a radix sort on to sort by material ID, distance, and any other factors I might want to add. What I want to know though, is once I have this sorted, do I have to manage whether or not draw calls happen on my own? Once the list of renderables is sorted, is there a draw call cost associated with setting the same vertex shader for each object in the sorted list? For example: objects 1,2,3, and 4 all use the same shader. Once I've sorted, do I need to set the vertex shader only when I reach 1, without calling VSSetShader on the next ones to benefit from the sorting? Also, if I do have to prevent duplicate calls to VSSetShader, wouldn't the conditional statement to check if the current shader is equal to the one being set give me similar overhead? I would think there's some sort of safety check in either the DirectX API or the driver that would avoid loading the shader to the GPU if it was already set.
Cost of draw call to set existing state
Members - Reputation: 1173
Posted 25 April 2014 - 07:06 PM
Moderators - Reputation: 49140
Posted 25 April 2014 - 07:39 PM
Ideally, the driver won't do a safety check because the game already has ;)
There are many ways that you can filter out redundant state. The simplest is just a big cache, where you only make the set call if the new value doesn't match the cached one.
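To make the cache idea concrete, here is a minimal sketch for a single piece of state, the vertex shader. Everything in it is an assumption for illustration: `MockContext` is a hypothetical stand-in for the real device context that only counts calls, shaders are represented by plain integer IDs, and `StateCache` and `countSetCalls` are made-up names, not part of any API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real device context; it only counts
// how many set calls actually reach the "API".
struct MockContext {
    int vsSetCalls = 0;
    void VSSetShader(std::uint32_t /*shaderId*/) { ++vsSetCalls; }
};

// Remembers the last vertex shader that was bound and skips the set
// call when the requested shader is already current.
class StateCache {
public:
    explicit StateCache(MockContext& ctx) : ctx_(ctx) {}

    void setVertexShader(std::uint32_t shaderId) {
        if (hasShader_ && shaderId == currentShader_) {
            return; // already bound: no call into the API
        }
        ctx_.VSSetShader(shaderId);
        currentShader_ = shaderId;
        hasShader_ = true;
    }

private:
    MockContext& ctx_;
    std::uint32_t currentShader_ = 0;
    bool hasShader_ = false; // false until the first set call
};

// Walks a list of shader IDs (assumed to be already sorted by the
// sort key) and returns how many set calls were actually issued.
int countSetCalls(const std::vector<std::uint32_t>& sortedShaderIds) {
    MockContext ctx;
    StateCache cache(ctx);
    for (std::uint32_t id : sortedShaderIds) {
        cache.setVertexShader(id);
        // the draw call for this renderable would go here
    }
    return ctx.vsSetCalls;
}
```

With four renderables that share one shader, only the first one triggers a set call; the other three cost a single compare each. A real cache would hold one such entry per piece of state (shaders, blend state, textures, and so on).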
In my engine, I end up with a stream of polymorphic state/command objects, which is filtered -- some of the commands are skipped from being executed using a lot of fancy bitmasking. Some of the ones that are executed will then go on to use the caching strategy as well.
However, I would recommend using an #ifdef or similar to enable/disable your state caching code so that you can actually test its value: whether the cache work is cheaper than the cost of the redundant calls.
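One way such a switch could look is sketched below. The macro name `ENABLE_STATE_CACHE`, the `CountingContext` mock, and the `Renderer` and `submit` names are all hypothetical; the point is only that the cache compiles out completely when the macro is 0, so the two builds can be profiled against each other.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical build switch: define as 0 to compile the cache out.
#ifndef ENABLE_STATE_CACHE
#define ENABLE_STATE_CACHE 1
#endif

// Hypothetical stand-in for the real device context; counts set calls.
struct CountingContext {
    int vsSetCalls = 0;
    void VSSetShader(std::uint32_t /*shaderId*/) { ++vsSetCalls; }
};

struct Renderer {
    CountingContext ctx;
#if ENABLE_STATE_CACHE
    std::uint32_t boundShader = 0;
    bool hasBoundShader = false;
#endif

    void setVertexShader(std::uint32_t id) {
#if ENABLE_STATE_CACHE
        // Cached path: skip the call if this shader is already bound.
        if (hasBoundShader && boundShader == id) {
            return;
        }
        boundShader = id;
        hasBoundShader = true;
#endif
        // Uncached path (or cache miss): always forward to the API.
        ctx.VSSetShader(id);
    }
};

// Submits a list of shader IDs and returns the number of set calls
// that reached the context.
int submit(const std::vector<std::uint32_t>& shaderIds) {
    Renderer renderer;
    for (std::uint32_t id : shaderIds) {
        renderer.setVertexShader(id);
    }
    return renderer.ctx.vsSetCalls;
}
```

Building once with the macro left at 1 and once with it defined as 0 (for example through a compiler define), then timing the same scene in both builds, shows whether the compare is actually cheaper than the redundant calls on a given driver.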
Crossbones+ - Reputation: 16451
Posted 25 April 2014 - 07:59 PM
Note that while yes, most drivers under the DirectX API (in its current form) will handle state changes as efficiently as they can (they usually don't "apply" state until you draw, and they check for redundancies), you are vastly underestimating the cost of calling into a DLL if you think it'll be as fast as your own check. Branches are best avoided as much as possible, but not if you're trading them in for something even slower.
Game Developer, C++ Geek, Dragon Slayer - http://seanmiddleditch.com
Members - Reputation: 1173
Posted 26 April 2014 - 07:15 AM
Thanks for the help. I guess relying on DirectX to do things right would be a bad idea anyway once I want to port to Linux or Mac OS X.