Cost of draw call to set existing state

2 comments, last by 3DModelerMan 9 years, 12 months ago

I'm working on sorting geometry by material, distance, and transparency. So far I have a sort key that I plan to radix sort on, so that renderables are ordered by material ID, distance, and any other factors I might want to add.

What I want to know is: once the list of renderables is sorted, do I have to manage redundant state-setting calls on my own? Is there a cost associated with setting the same vertex shader for each object in the sorted list? For example, objects 1, 2, 3, and 4 all use the same shader. To benefit from the sorting, do I need to call VSSetShader only when I reach object 1 and skip it for the next three?

Also, if I do have to prevent duplicate calls to VSSetShader, wouldn't the conditional that checks whether the current shader equals the one being set give me similar overhead? I would think there's some sort of safety check in either the DirectX API or the driver that avoids loading the shader to the GPU if it is already set.
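For reference, the kind of sort key described at the start of the question might be packed as below. This is only a sketch: the bit layout, the name makeSortKey, and the choice to order opaque geometry by material first and transparent geometry back to front are all assumptions, not something stated in the post.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical 64-bit key layout (an assumption for illustration):
//   opaque:      [63]=0 | material ID (31 bits) | depth bits (32 bits)
//   transparent: [63]=1 | inverted depth bits (32 bits) | material ID (31 bits)
// Sorting keys in ascending order then draws opaque objects first, grouped by
// material and front to back within a material, followed by transparent
// objects back to front.
inline std::uint64_t makeSortKey(bool transparent, std::uint32_t materialId, float viewDepth)
{
    assert(materialId < (1u << 31));   // must fit in 31 bits
    assert(!std::signbit(viewDepth));  // non-negative IEEE 754 floats order like their bit patterns

    std::uint32_t depthBits;
    std::memcpy(&depthBits, &viewDepth, sizeof depthBits);

    if (transparent) {
        // Depth dominates for blending correctness; inverting it puts far objects first.
        return (std::uint64_t(1) << 63) | (std::uint64_t(~depthBits) << 31) | materialId;
    }
    return (std::uint64_t(materialId) << 32) | depthBits;
}
```

Because the result is a plain unsigned integer, it can be fed to a radix sort or to std::sort unchanged.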

Yes, you should avoid duplicated/redundant state setting.
Ideally, the driver won't do a safety check because the game already has ;)

There are many ways to filter out redundant state changes. The simplest is just a big cache, where you only make the set call if the new value doesn't match the cached one.
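A minimal sketch of that caching idea for a single piece of state (the vertex shader) could look like this. MockDevice, ShaderHandle, and VertexShaderCache are hypothetical names used for illustration; in a Direct3D 11 renderer the handle would be an ID3D11VertexShader* and the forwarded call would be VSSetShader.

```cpp
#include <cstdint>

// Stand-in for a real shader handle (assumption for illustration).
using ShaderHandle = std::uint32_t;

// Hypothetical device wrapper that just counts how many set calls reach the API.
struct MockDevice {
    int setCalls = 0;
    void setVertexShader(ShaderHandle) { ++setCalls; }
};

class VertexShaderCache {
public:
    explicit VertexShaderCache(MockDevice& device) : device_(device) {}

    // Forwards to the device only when the requested shader differs from the cached one.
    void set(ShaderHandle shader) {
        if (valid_ && shader == current_) {
            return;  // redundant: skip the API call
        }
        device_.setVertexShader(shader);
        current_ = shader;
        valid_ = true;
    }

    // Call this if code outside the cache may have changed the device state.
    void invalidate() { valid_ = false; }

private:
    MockDevice& device_;
    ShaderHandle current_ = 0;
    bool valid_ = false;  // false until the first set, so the first call always goes through
};
```

With a sorted list where objects 1 to 4 share a shader, four calls to set() result in one call on the device.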

In my engine, I end up with a stream of polymorphic state/command objects, which is filtered -- some of the commands are skipped from being executed using a lot of fancy bitmasking. Some of the ones that are executed will then go on to use the caching strategy as well.

However, I would recommend putting your state caching code behind an #ifdef (or similar) so that you can enable and disable it and actually test its value: whether the cache work is cheaper than the cost of the redundant calls.
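One way to set up such a switch is sketched below. The macro name ENABLE_STATE_CACHE and the CountingDevice and RenderState types are made up for this example; the point is only that the same call site can be built with and without the check and then profiled.

```cpp
#include <cstdint>

// Build with -DENABLE_STATE_CACHE=0 to measure the cost of the redundant calls.
#ifndef ENABLE_STATE_CACHE
#define ENABLE_STATE_CACHE 1
#endif

using ShaderHandle = std::uint32_t;  // stand-in for a real shader handle

// Hypothetical device that counts the set calls it receives.
struct CountingDevice {
    int setCalls = 0;
    void setVertexShader(ShaderHandle) { ++setCalls; }
};

class RenderState {
public:
    explicit RenderState(CountingDevice& device) : device_(device) {}

    void setVertexShader(ShaderHandle shader) {
#if ENABLE_STATE_CACHE
        // Cached path: drop the call when the shader is already bound.
        if (valid_ && shader == current_) {
            return;
        }
        current_ = shader;
        valid_ = true;
#endif
        // Uncached path falls straight through to the device every time.
        device_.setVertexShader(shader);
    }

private:
    CountingDevice& device_;
    ShaderHandle current_ = 0;
    bool valid_ = false;
};
```

Timing a representative frame in both configurations shows whether the comparison pays for itself on the target platform.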
There is overhead not only in setting state but, of course, in the draw calls themselves. It is usually beneficial to group runs of identical objects (same material, mesh, etc.) and use instanced rendering to draw each run with a single call.
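As a rough sketch of the grouping step, the function below collapses adjacent renderables that share a material and mesh into batches. Renderable, InstancedBatch, and buildBatches are hypothetical names, and the code assumes the list has already been sorted so that identical objects sit next to each other.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Renderable {
    std::uint32_t materialId;
    std::uint32_t meshId;
};

// One batch corresponds to one instanced draw of `count` objects,
// starting at index `first` in the sorted list.
struct InstancedBatch {
    std::uint32_t materialId;
    std::uint32_t meshId;
    std::size_t first;
    std::size_t count;
};

// Expects `sorted` to be ordered so that identical (material, mesh) pairs are adjacent.
std::vector<InstancedBatch> buildBatches(const std::vector<Renderable>& sorted)
{
    std::vector<InstancedBatch> batches;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        const Renderable& r = sorted[i];
        if (!batches.empty() &&
            batches.back().materialId == r.materialId &&
            batches.back().meshId == r.meshId) {
            ++batches.back().count;  // extend the current run
        } else {
            batches.push_back({r.materialId, r.meshId, i, 1});  // start a new run
        }
    }
    return batches;
}
```

In Direct3D 11 each batch would then typically become one DrawIndexedInstanced call, with per-instance data such as world matrices supplied through an instance buffer.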

Note that while, yes, most drivers under the DirectX API (in its current form) will handle state changes as efficiently as they can (they usually don't "apply" state until you draw, and they check for redundancies), you are vastly underestimating the cost of calling into a DLL if you think it'll be as fast as your own check. Branches are best avoided as much as possible, but not if you're trading them in for something even slower.

Sean Middleditch – Game Systems Engineer – Join my team!

Thanks for the help. I guess relying on DirectX to do things right would be a bad idea anyway once I want to port to Linux or Mac OS X.

