general shader advice - shaders generated per-model?

7 comments, last by Hodgman 11 years, 8 months ago
So I finally got my FBX model renderer working, complete with VBO's and shaders, running blazing fast! :]

At the moment, however, I have my shader custom written for the model I've been testing. The maximum number of bone influences on a single vertex is 3 for this model, so I have 3 bone indices (referencing matrices in a uniform mat4 array) and 3 bone weights as vertex attributes.

To generalize my model renderer, I figure I have two options.
1. Write a fixed shader, assuming a reasonably high number of influences per vertex (e.g. 8?). The advantage is that I have a single shader program for all models. The disadvantage is extra attributes floating around that won't be used, which may waste memory and speed, and some models may still exceed the presumed max.
Or:
2. Have each model generate its own shader. The advantage is that each shader has exactly enough attributes for its model. The disadvantage is that I need to maintain consistency among the model shaders and the shaders for the world, terrain, etc.

Which method would you recommend?

One more question I had.

Would it be smarter to make the bone indices and weights into ivec4, and vec4 attributes instead, so they can be passed in packets of 4? I feel like there might be a speed advantage here.

[quote]
So I finally got my FBX model renderer working, complete with VBO's and shaders, running blazing fast!
[/quote]

Congratulations, I know how much effort it is! And the joy is enormous the first time you get something animated to move around.

As always, it depends on how different your models are going to be. If you are using Assimp to import models, you can have it cap the number of joints per vertex automatically. But then you of course need to renormalize the remaining weights so that they total 1.

My advice would be to start with the same shader. Start easy, and consider how to expand when required. Using multiple shaders like that requires a lot of extra control logic.

Isn't it enough using three joints per vertex?
Current project: Ephenation.
Sharing OpenGL experiences: http://ephenationopengl.blogspot.com/
I mean for this model, sure! But who knows how many bones per vertex I might end up needing for a model! Especially since I'm not drawing them myself. (I get them from Turbosquid, or from local artists.)

Yeah "extra control logic" is exactly what I'm worried about. Nothing I can't handle but perhaps best avoided.

(edit)

[quote]
Congratulations, I know how much effort it is! And the joy is enormous the first time you get something animated to move around.
[/quote]
Thanks for the encouragement, Larspensjo! :) It truly is a tremendous amount of effort, and an equally tremendous feeling to get it working!
"Who knows". I think that's the key. You're worrying about something that may never happen. Deal with it if/when it happens. Or write a single shader with a little headroom (e.g. 5 bones per vertex) if you're worried. You can always cut it down later if you don't need the extra 2.
What would you guys say is a reasonable upper bound for the number of bone influences per vertex? Let's say, under the assumption that the models are of typical game creatures, such as humans, quadrupeds, mythic beasts, etc. Does 4 sound reasonable?
Why don’t you consider option #3?
http://www.gamedev.n...36#entry4974836

I won’t repeat what I said in that post about the benefits of maintainability and scalability.

What you need to know next is that each model maintains several sets of flags for different types of renders (one for normal renders, one for creating shadow maps, etc.).
For each type of render, some kind of model shader manager accepts these flags as input, translates them into macros to be passed to the shaders, and returns the compiled shaders.
If the flags have not changed since the last render, no requests for a shader are issued.
Once a request for a shader is issued, the manager first does a binary search to see if the given set of flags has already been translated into a shader, in which case a shared pointer to that shader is returned.
And if this is the first time the manager has seen that set of flags, the shader is created and again a shared pointer is returned.
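
The lookup described above could be sketched roughly as follows. This is only an illustration in Python, not the engine's actual code: `ShaderFlags`, `ShaderManager`, the flag names, and the macro names are all hypothetical, and `fake_compile` is a stand-in for the real compile and link step.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ShaderFlags:
    """Hypothetical per-render feature set; field names are illustrative."""
    skinned: bool = False
    normal_mapped: bool = False
    textured: bool = False

    def to_macros(self):
        """Translate the flags into #define lines for the shader source."""
        macros = []
        if self.skinned:
            macros.append("#define SKINNED 1")
        if self.normal_mapped:
            macros.append("#define NORMAL_MAPPED 1")
        if self.textured:
            macros.append("#define TEXTURED 1")
        return macros


class ShaderManager:
    """Caches one compiled shader per distinct flag set."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # stand-in for the real GL compile/link
        self._keys = []             # flag sets, kept sorted for binary search
        self._shaders = []          # compiled shaders, parallel to _keys

    def get(self, flags):
        i = bisect.bisect_left(self._keys, flags)
        if i < len(self._keys) and self._keys[i] == flags:
            return self._shaders[i]  # seen before: hand back the shared shader
        shader = self._compile(flags.to_macros())
        self._keys.insert(i, flags)
        self._shaders.insert(i, shader)
        return shader


compiled = []  # records every compile request, to show the cache at work


def fake_compile(macros):
    compiled.append(macros)
    return object()  # placeholder for a compiled program handle


manager = ShaderManager(fake_compile)
a = manager.get(ShaderFlags(skinned=True))
b = manager.get(ShaderFlags(skinned=True))                 # cache hit
c = manager.get(ShaderFlags(skinned=True, textured=True))  # new permutation
```

Here `a` and `b` are the same object and only two compiles are ever requested. In Python a plain dictionary would do the same job; the sorted list is used only to mirror the binary search mentioned above.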

The structure used to hold the flags is not limited to bitwise flags. It could also hold values such as the number of bones, although there is never a reason to use anything but 0 or 4 bones, so this does not need to be a value on the structure and you don't need to worry about this issue at all.
Values that might be in the structure are, for example, the slot into which the cubemap texture is put, or the slot of the specular map.
The slots for textures can be computed at load-time.
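
One possible way to compute slots at load time, as a minimal sketch: hand out consecutive indices to whichever maps a material actually has, then expose them to the shader as macros. The map names, the fixed ordering, and the `*_SLOT` macro names are assumptions made for this example only.

```python
def assign_texture_slots(present_maps):
    """Give each texture the material actually has a consecutive slot index."""
    order = ["diffuse", "normal", "specular", "cubemap"]  # hypothetical names
    slots = {}
    for name in order:
        if name in present_maps:
            slots[name] = len(slots)  # next free slot
    return slots


# A material that has only a diffuse map and a cubemap.
slots = assign_texture_slots({"diffuse", "cubemap"})
macros = [f"#define {name.upper()}_SLOT {slot}" for name, slot in slots.items()]
```

Because absent maps take no slot, the cubemap lands in slot 1 here rather than in a fixed slot 3.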


This means you are maintaining only a small set of actual shader files which can be preprocessed into the smallest set of features, and any combination of said features, necessary for any models you are loading.
This is the system I am using for all the results you will find on my site: http://lspiroengine.com/?p=485

OpenGL's shading language (GLSL) has no core #include support, so you would be doing yourself a major favor by writing your own preprocessor that can handle at minimum #include (remember to output #line directives where necessary).
This system still works without #include; it just forces you to copy/paste all your lighting models into each new shader file you create.
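
A bare-bones #include expander might look something like the sketch below. It is an illustration, not a full preprocessor: `expand_includes` and the in-memory `sources` dictionary are hypothetical, only the `#include "file"` form is handled, and files are identified in `#line` by an integer index. The convention assumed is that `#line N` makes the following line number N; check the GLSL specification for the version you target, since the meaning of the number is not the same in every version.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"\s*$')


def expand_includes(name, sources, file_ids=None, stack=()):
    """Recursively inline #include "file" lines, emitting #line directives.

    `sources` maps a file name to its text (a stand-in for reading from disk).
    """
    if name in stack:
        raise ValueError("circular #include: " + " -> ".join(stack + (name,)))
    if file_ids is None:
        file_ids = {}
    file_id = file_ids.setdefault(name, len(file_ids))
    out = []
    for number, line in enumerate(sources[name].splitlines(), start=1):
        match = INCLUDE_RE.match(line)
        if match is None:
            out.append(line)
            continue
        # Switch to the included file, starting at its first line.
        inc = match.group(1)
        inc_id = file_ids.setdefault(inc, len(file_ids))
        out.append(f"#line 1 {inc_id}")
        out.extend(expand_includes(inc, sources, file_ids, stack + (name,)))
        # Resume numbering in the including file after the #include line.
        out.append(f"#line {number + 1} {file_id}")
    return out


sources = {
    "lighting.glsl": "float lambert(vec3 n, vec3 l) {\n"
                     "    return max(dot(n, l), 0.0);\n"
                     "}",
    "model.frag": '#version 330 core\n#include "lighting.glsl"\nvoid main() {}',
}
expanded = expand_includes("model.frag", sources)
```

With the `#line` directives in place, compiler error messages point back at the original file and line instead of at an offset in the flattened source.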


While you may feel that this system can lead to an explosion of shader permutations, rendering all of the images in that link above took a total of about 15 different shaders, including all permutations for reflections vs. non-reflective surfaces, normal-mapped and non-normal-mapped surfaces, textured and non-textured surfaces, solid and transparent surfaces, and combinations of all of these, in addition to shaders for creating the shadow maps.
Plus the ground shader, which uses a different lighting model from the cars’.



[quote]
Would it be smarter to make the bone indices and weights into ivec4, and vec4 attributes instead, so they can be passed in packets of 4? I feel like there might be a speed advantage here.
[/quote]


That is how it’s done. And you never need more than one “packet”.
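
As a rough illustration of one such "packet", the sketch below pads a vertex's influences out to four entries and packs them into a byte string suitable for a vertex buffer. The layout (four unsigned bytes for indices followed by four floats for weights) and the `pack_influences` name are assumptions for this example; byte-sized indices also limit the skeleton to 256 bones.

```python
import struct

MAX_INFLUENCES = 4  # one 4-wide set of indices plus one 4-wide set of weights


def pack_influences(influences):
    """Pack up to four (bone_index, weight) pairs into 20 bytes.

    Unused entries get index 0 and weight 0.0, so they contribute nothing.
    Layout (an assumption for this sketch): 4 unsigned bytes, then 4 floats.
    """
    if len(influences) > MAX_INFLUENCES:
        raise ValueError("reduce to 4 influences before packing")
    padded = list(influences) + [(0, 0.0)] * (MAX_INFLUENCES - len(influences))
    indices = [index for index, _ in padded]
    weights = [weight for _, weight in padded]
    return struct.pack("<4B4f", *indices, *weights)


# A vertex influenced by only two bones still fills the whole packet.
blob = pack_influences([(7, 0.75), (2, 0.25)])
```

A vertex with fewer than four influences costs the same 20 bytes as one with four, which is the price of keeping a single vertex format for every model.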


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Thanks for the detailed response, L. Spiro! You have some good ideas!

I'm curious though. You say I will never need more than 4 bones per vertex. Is that rule imposed by model editors (e.g. Maya, 3ds Max, etc.)?
It is not so much a rule as a practical guideline that is virtually always strictly observed.
Since 4-element vectors can be processed in parallel you will always send 4 at a time. This means you can send 0, 4, 8, etc.
Because 8 is excessive and 4 (with renormalization) virtually never produces noticeably incorrect results, 4 is all you will ever need.

In my entire professional career I have never encountered a case where 4 was simply not enough. By that point, the artists need to rethink their skinning.

That being said, the only reason to impose restrictions inside frameworks or engines is if you simply can’t make a better/dynamic/fully flexible system.
I don’t believe an engine should really impose the 0 or 4 options I mentioned before, but you don’t have enough experience yet to make a truly flexible system.
You should go forward with a few restrictions here-and-there until you have more experience and then you can think about what could have been better and make a better design for your next framework/engine.


L. Spiro


For regular character animation purposes, I've also never seen more than 4 needed. If the artists output a file with more weights than that, we throw away the smallest ones and renormalize the remaining 4 weights (so they add up to 1.0).
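
A minimal sketch of that trimming step, assuming each vertex's influences arrive as a list of (bone index, weight) pairs. The `limit_influences` name and the sample data are made up for illustration.

```python
def limit_influences(influences, max_count=4):
    """Keep the `max_count` largest weights and rescale them to sum to 1.0."""
    kept = sorted(influences, key=lambda pair: pair[1], reverse=True)[:max_count]
    total = sum(weight for _, weight in kept)
    if total == 0.0:
        return kept  # degenerate vertex with no usable weights; leave as-is
    return [(bone, weight / total) for bone, weight in kept]


# Five influences from a hypothetical export; the weights sum to 1.0.
vertex = [(3, 0.5), (8, 0.25), (1, 0.125), (4, 0.0625), (9, 0.0625)]
trimmed = limit_influences(vertex)
```

Dividing by the sum of the kept weights is what keeps the mesh from shrinking toward the origin once the smallest influences are dropped.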

It may be that for some advanced animation techniques for special objects, you may actually need some excessive amount, but I'd cross that bridge when you come to it.
