Optimizing number of Draw Calls for Hardware Skinning

This topic is 3848 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Hey there, I've been trying to figure out how to generate the fewest draw calls for a hardware-skinned mesh. This is vital for models that have many more bones than will fit into the vertex shader constant registers. So far I've found two methods: the first is bad at reducing the number of draw calls, and the second, brute force, is too slow.

Method 1: for each face, check the bones that influence it and push those bones onto a stack. When the stack grows larger than MAXBONESPERDRAWCALL, take all the faces you've visited so far and make them a draw call. This works poorly because faces can arrive in any order, forcing you to batch feet, hand, and head faces together, effectively drawing little pieces of everything. In my experience this method has generated 30 draw calls for a model with only 40 bones...

Method 2: brute force for the optimal solution. You would take every combination of MAXBONESPERDRAWCALL bones out of the bones in the model, calculate for each combination how many faces it could draw, and treat the combination with the most faces as the ideal candidate. But we're not done: we have to look at what's left over and repeat the process for the remaining faces. In fact we have to search in a tree-like pattern to choose the correct one, because we can't rely on simply taking the first combination with the largest number of faces.

An analogy: say you have a bag that holds ten pounds and you want to fill it with the items of greatest value. You have 3 items: one is 8 lbs and worth $9, and the other two are 5 lbs and worth $5 each. The 8 lb item looks like the best first choice since its value/weight ratio is higher. But that is not the optimal solution, because you'd be left with 2 lbs of space and nothing to put in it, making your bag worth $9 instead of the $10 you'd get from both 5 lb items.

So this algorithm just takes too long to look at all the combinations and paths that lead to the optimal solution. Is there any other method you guys are using? I think DirectX does this for meshes, but I'm not sure how optimal its solution is. Regardless, I'm looking for a non-API-dependent solution. If you can point me to any papers or articles of interest, please do. I'll be glad to share implementation details if I get this working. Thanks! -Marv
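For reference, method 1 might look roughly like the sketch below. It is only an illustration under assumptions: faces are given as sets of bone indices, the function and variable names are made up for the example, and the batch is cut just before the bone budget would be exceeded rather than just after. The tiny budget of 4 bones is chosen only to keep the demo readable.

```python
MAX_BONES_PER_DRAW_CALL = 4  # hypothetical limit, kept small for the demo


def batch_faces_in_order(face_bones, max_bones=MAX_BONES_PER_DRAW_CALL):
    """Greedy 'method 1': walk faces in mesh order and start a new batch
    whenever adding the next face would exceed the bone budget.

    face_bones: list of sets, one set of bone indices per face.
    Returns a list of (bone_set, face_index_list) batches.
    """
    batches = []
    current_bones, current_faces = set(), []
    for face_index, bones in enumerate(face_bones):
        if len(bones) > max_bones:
            raise ValueError(f"face {face_index} needs more than {max_bones} bones")
        if len(current_bones | bones) > max_bones:
            # Budget would be exceeded: everything visited so far is one draw call.
            batches.append((current_bones, current_faces))
            current_bones, current_faces = set(), []
        current_bones |= bones
        current_faces.append(face_index)
    if current_faces:
        batches.append((current_bones, current_faces))
    return batches


# The same six faces in two different orders: an "arm" uses bones 0-3,
# a "leg" uses bones 4-7.
interleaved = [{0, 1}, {4, 5}, {2, 3}, {6, 7}, {0, 1}, {4, 5}]
grouped = [{0, 1}, {2, 3}, {0, 1}, {4, 5}, {6, 7}, {4, 5}]

interleaved_batches = batch_faces_in_order(interleaved)
grouped_batches = batch_faces_in_order(grouped)
print(len(interleaved_batches), len(grouped_batches))
```

With the faces interleaved, the sketch needs more batches than with the faces grouped by limb, which is the order sensitivity described above.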

It sounds like you could achieve much better locality if you flip your first method on its head. Instead of traversing the faces and putting the bones in the list, traverse the bones and put affected vertices in a list. This way you never need to process a particular bone's influence more than once, and your batches should be larger (and thus you'll have fewer overall batches). This operation can be done as a preprocessing step. You can either do it at startup / mesh load time, or you can do it once at startup and cache the results (until the user changes their GPU or something else that might possibly affect the number of shader constants).
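The inversion suggested here could be sketched as follows. This is a minimal illustration, not anyone's actual code: it assumes each face is given as a set of bone indices, the name `build_bone_to_faces` is made up, and it maps bones to faces rather than vertices, since all three vertices of a triangle have to be drawn in the same batch.

```python
from collections import defaultdict


def build_bone_to_faces(face_bones):
    """Invert a face -> bones list into a bone -> faces mapping.

    face_bones: list of sets, one set of bone indices per face.
    Returns a dict mapping each bone index to the set of face indices it influences.
    """
    bone_to_faces = defaultdict(set)
    for face_index, bones in enumerate(face_bones):
        for bone in bones:
            bone_to_faces[bone].add(face_index)
    return dict(bone_to_faces)


# Three faces: face 0 uses bones 0 and 1, face 1 uses bones 1 and 2, face 2 uses bone 2.
example_map = build_bone_to_faces([{0, 1}, {1, 2}, {2}])
print(example_map)
```

Because the mapping depends only on the mesh, it can be built once at load time, and only the batching step needs to be redone if the bone budget changes.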

Hey Jpetrie, you're right, it wouldn't be as random if I just went by bones first. I tried that quickly earlier and got roughly a 3x reduction in draw calls. Here's what I did:

Create a mapping from each bone to the faces it influences.

-Select 29 bones from the bone-to-face mapping.
-Loop through all faces linked to each selected bone.
--For each face, check whether it can be rendered using the currently selected list of 29 bones (all of its influences must be in that list; I keep the list as a set so the lookup is quick). If it can, unlink it from the bone and add it to the current submesh face list. The face may already have been added through another bone, so look for it in the submesh list first, before comparing against the bones.
-Now we have a submesh face list; push this submesh onto the list of submeshes to render.
-Loop through all bones in the bone-to-face map and find any with no linked faces left (all of their faces have been unlinked and added to a submesh). Remove these bones from the map.
-Repeat until the map is empty.
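The steps above could be sketched like this. It is a rough illustration under assumptions, not the code from the post: faces are sets of bone indices, all names are hypothetical, and bones are picked in index order. One deliberate addition is that each selection is seeded with the bones of one pending face; without that, a pass could select bones that complete no face and the loop would never finish.

```python
from collections import defaultdict

MAX_BONES_PER_DRAW_CALL = 29  # value used in the post


def split_into_submeshes(face_bones, max_bones=MAX_BONES_PER_DRAW_CALL):
    """face_bones: list of sets of bone indices, one per face.
    Returns a list of (selected_bones, sorted_face_indices) submeshes."""
    for face_index, bones in enumerate(face_bones):
        if not 0 < len(bones) <= max_bones:
            raise ValueError(f"face {face_index} cannot fit in one draw call")

    # Map each bone to the faces it influences.
    bone_to_faces = defaultdict(set)
    for face_index, bones in enumerate(face_bones):
        for bone in bones:
            bone_to_faces[bone].add(face_index)

    submeshes = []
    while bone_to_faces:
        # Select up to max_bones bones. Seeding with one pending face's bones
        # is an assumption added here so every pass emits at least that face.
        first_bone = min(bone_to_faces)
        seed_face = min(bone_to_faces[first_bone])
        selected = set(face_bones[seed_face])
        for bone in sorted(bone_to_faces):
            if len(selected) >= max_bones:
                break
            selected.add(bone)

        # Collect every pending face whose influences are all in the selection.
        pending = set().union(*(bone_to_faces[b] for b in selected))
        faces = sorted(f for f in pending if face_bones[f] <= selected)

        # Unlink the batched faces from every bone that referenced them.
        for face_index in faces:
            for bone in face_bones[face_index]:
                bone_to_faces[bone].discard(face_index)
        submeshes.append((selected, faces))

        # Remove bones that have no faces left.
        bone_to_faces = {b: fs for b, fs in bone_to_faces.items() if fs}
    return submeshes


# Tiny demo with a budget of 2 bones so the split is easy to follow by hand.
demo_faces = [{0, 2}, {1, 2}, {0}, {1}]
demo_submeshes = split_into_submeshes(demo_faces, max_bones=2)
for bones, faces in demo_submeshes:
    print(sorted(bones), faces)
```

Picking bones in index order is the simplest choice. If bone indices follow the skeleton hierarchy, neighbouring bones tend to land in the same selection, but that depends on how the model was exported.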

I'm now getting around 5 draw calls for a 70-bone model instead of 15, using a maximum of 29 bones per draw call. I think selecting the bones in hierarchical order might pick better bone sets and maybe save one more draw call.

Thanks, that should be cool for now.

