mrmurder

gpu skinning +4 bones

This topic is 2336 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hello,
I would like to hear from anyone who has ideas on how to implement GPU skinning with more than 4 bones.
I've been working with 4 bones per vertex, but now I would like to know whether I could remove this limitation in some way.

thanks in advance

Just pass more than 4 indices/weights per vertex to the shader?

As a side note, 4 bones per vertex is enough in most games. Edited by Waaayoff

Hey Waaayoff,
After posting, it occurred to me that I could send another pack of 4 components. I will give this a try and see how it works out!
Thanks for the reply!

You should be able to pack as many bones + weights as you want in your vertex. If you're on relatively modern hardware (DX10-class or higher) you may want to branch on the weight being > 0 before fetching the bone matrix as an optimization.
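As a rough CPU-side sketch of that idea (not shader code, and not anyone's actual implementation in this thread), the loop below blends an arbitrary number of influences and skips any slot whose weight is zero before touching the bone matrix. The names `transform_point`, `skin_position` and `translation` are made up for illustration, and the 3x4 row-major matrix layout is an assumption.

```python
def transform_point(m, p):
    """Apply a 3x4 row-major affine matrix (three rows of four floats) to a 3D point."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)


def skin_position(position, indices, weights, bone_matrices):
    """Linear blend skinning for any number of influences per vertex.

    Slots with zero weight are skipped before the matrix is fetched,
    mirroring the "branch on weight > 0" suggestion.
    """
    out = [0.0, 0.0, 0.0]
    for bone, w in zip(indices, weights):
        if w <= 0.0:
            continue  # unused slot: no matrix fetch, no math
        t = transform_point(bone_matrices[bone], position)
        for i in range(3):
            out[i] += w * t[i]
    return tuple(out)


def translation(tx, ty, tz):
    """Hypothetical helper: a 3x4 matrix that only translates."""
    return ((1.0, 0.0, 0.0, tx), (0.0, 1.0, 0.0, ty), (0.0, 0.0, 1.0, tz))


# Eight slots per vertex, of which only four are actually used.
bones = [translation(float(i), 0.0, 0.0) for i in range(8)]
indices = (0, 2, 4, 6, 0, 0, 0, 0)
weights = (0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0)
skinned = skin_position((1.0, 2.0, 3.0), indices, weights, bones)
```

In a real vertex shader the same loop would run over the packed index and weight attributes; whether the branch pays off depends on how many vertices in a group share the same number of unused slots.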

Hey MJP,
The hardware is DX10+. Currently I'm sending 2 vec4s along with the vertex layout (e.g. TEXTURE2, TEXTURE3), which gives me 8 bone influences, plus 2 more for the indices.

Would there be a better way to do this, one that would save me those 4 texture units?
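One possible answer, offered here only as a sketch and not something proposed in the thread: if the four attributes are full float vec4s they take 64 bytes per vertex, whereas 8 byte-sized indices plus 8 byte-sized weights take 16 bytes, which would fit in a single 4x32-bit unsigned integer attribute and could be unpacked with shifts and masks in a DX10-class shader. The function names below are hypothetical, and the 8-bit precision and 256-bone limit are assumptions that may not suit every asset.

```python
import struct


def quantize_weights(weights):
    """Quantize float weights (summing to about 1) to bytes that sum to exactly 255."""
    q = [int(round(w * 255.0)) for w in weights]
    # Push the accumulated rounding error into the largest weight.
    q[q.index(max(q))] += 255 - sum(q)
    return q


def pack_influences(indices, weights):
    """Pack 8 bone indices and 8 weights into 16 bytes, one byte each."""
    if len(indices) != 8 or len(weights) != 8:
        raise ValueError("expected exactly 8 influences")
    if any(not 0 <= i <= 255 for i in indices):
        raise ValueError("byte indices limit the skeleton to 256 bones")
    return struct.pack("<8B8B", *indices, *quantize_weights(weights))


def unpack_influences(blob):
    """Inverse of pack_influences; weights come back as floats in [0, 1]."""
    values = struct.unpack("<8B8B", blob)
    return list(values[:8]), [v / 255.0 for v in values[8:]]
```

The trade-off is precision: each weight is only accurate to about 1/255, which is usually fine for skinning but worth checking on faces or other finely weighted areas.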

Maybe you should send the bone matrices and other data to the GPU as a texture and then blend them in a shader (fragment shader).
As you know, there is a lot of extra work to do with vertex data.

And it's true that there's usually no need for more than 4 bones.
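To make the bone-texture suggestion concrete, here is a minimal sketch of one common layout, assuming each bone is a 3x4 affine matrix stored as three consecutive RGBA float texels. The layout and the names `pack_bones_to_texels` and `fetch_bone` are assumptions for illustration; on DX10-class hardware the fetch could happen in the vertex shader as well as the fragment shader.

```python
def pack_bones_to_texels(bone_matrices):
    """Flatten 3x4 bone matrices into RGBA texels, three texels (rows) per bone."""
    texels = []
    for m in bone_matrices:
        for row in m:
            texels.append(tuple(row))  # one matrix row = one RGBA texel
    return texels


def fetch_bone(texels, bone_index):
    """Rebuild one 3x4 matrix from its three consecutive texels."""
    base = bone_index * 3
    return tuple(texels[base + r] for r in range(3))
```

With this layout the bone count is limited by texture size, not by the number of shader constants, at the cost of three texel fetches per influence.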


You should be able to pack as many bones + weights as you want in your vertex. If you're on relatively modern hardware (DX10-class or higher) you may want to branch on the weight being > 0 before fetching the bone matrix as an optimization.


Eh? I was under the impression that overall it's cheaper to do the mathematics than to do such branching. Granted, modern hardware is more capable of dealing with such branching; I was simply under the impression that it's cheaper overall for the GPU to do the matrix math than to branch on dynamic data sets. Edited by slicer4ever


Eh? I was under the impression that overall it's cheaper to do the mathematics than to do such branching. Granted, modern hardware is more capable of dealing with such branching; I was simply under the impression that it's cheaper overall for the GPU to do the matrix math than to branch on dynamic data sets.


This depends on the granularity of the work.

In this case the choices are 'do work' or 'skip'; if all the threads in an execution group can skip the work then you get a net win as doing no work is faster than looping over 4 zero influence bones. If only some skip the work then you are, more or less, no worse off as the 'else' branch is skipping work.

The problems can happen when you have an 'if' and 'else' block which both require work to be done; if all your threads can take one path or the other then you'll get a win as you'll only do the work you need to do. However if you have a situation where say 50% of your threads go down the 'if' path and 50% down the 'else' path you'll end up doing both chunks of work.

So when dealing with branching you have to look at how the branches are going to be used and the chance of having to take both paths. Pixel processing tends to be the biggest problem here, as vertices tend to all take the same path across a model.
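The reasoning above can be put into a toy cost model. This is a simplification under stated assumptions: threads in a group run in lockstep, each path has a fixed cost, and a path is paid for once if at least one thread in the group takes it. The function name and the cost numbers are invented for illustration and do not describe any particular GPU.

```python
def group_cost(takes_if, cost_if, cost_else):
    """Cost of one lockstep thread group.

    takes_if holds one boolean per thread. Each path is charged once
    if at least one thread in the group needs it.
    """
    cost = 0
    if any(takes_if):
        cost += cost_if      # somebody needs the 'if' block
    if not all(takes_if):
        cost += cost_else    # somebody needs the 'else' block
    return cost


uniform = [True] * 32
split = [True] * 16 + [False] * 16

# 'if' and 'else' both do real work: a split group pays for both.
both_paths = group_cost(split, cost_if=10, cost_else=8)
# 'do work or skip' (empty else): a split group is no worse than no branch.
skip_case = group_cost(split, cost_if=10, cost_else=0)
```

Under this model the skinning branch is the cheap kind: the 'else' side is empty, so the worst case is roughly the cost of not branching at all, ignoring the overhead of the branch instruction itself.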

For example, on a console game we had roads being rendered with a texture which had an alpha'd edge. This could lead to large segments of polygons not contributing to the final output but, originally, having to do all the work. By doing a simple 'if' check on the alpha early in the shader we could eliminate most of that workload, as the pixels broke down into three groups:
- thread groups which did all the maths
- thread groups which could early out
- thread groups which had some threads doing all the work

In this case most of the thread groups were in groups 1 and 2, so the border case didn't affect the overall runtime of the shader, granting us a net win for performance.
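A small sketch of that three-way breakdown, assuming a flat list of per-pixel alpha values chopped into fixed-size groups. Real hardware groups pixels in 2D tiles, and the group size, the zero-alpha threshold and the name `classify_groups` are all assumptions made here for illustration.

```python
def classify_groups(alphas, group_size):
    """Count thread groups that do all the work, early out, or are mixed."""
    counts = {"all_work": 0, "early_out": 0, "mixed": 0}
    for start in range(0, len(alphas), group_size):
        visible = [a > 0.0 for a in alphas[start:start + group_size]]
        if all(visible):
            counts["all_work"] += 1    # every thread runs the full shader
        elif not any(visible):
            counts["early_out"] += 1   # the whole group skips the work
        else:
            counts["mixed"] += 1       # group still pays the full cost
    return counts


# Opaque road interior, fully transparent margin, and one edge group.
alphas = [1.0] * 8 + [0.0] * 8 + [1.0, 0.0, 1.0, 1.0]
breakdown = classify_groups(alphas, group_size=4)
```

Running a count like this over a representative frame is one way to estimate whether an early-out branch is likely to help before measuring it on the target hardware.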
