
Theory of Shader Use


There is clearly a lot of power in shaders and effects... a lot of options and ways of doing things... but the manuals and tutorials tend to skimp on explaining good usage practice. I think I am starting to understand the proper way to use these in a large project, so let me summarize. Let me know if there is anything I am overlooking, anything that will come around to bite me in the ass later, because I don't want to discover that I've been going about things in a dead-end way that will force a redesign after several months.

1) It seems that for every unique type of "material", I should have a single effect that does the vertex and pixel processing. This should be an "uber" effect that does all the vertex and pixel processing I want at once.

2) All uber effects should be in an effect pool, and each one should take as inputs the world/view/projection matrices as well as the maximum number of scene lights, and they should be written so that they can deal with all those lights at once. If an effect doesn't need some of those matrices it wouldn't have to include them, but the basic idea is that they all share those things.

3) All of this should be rendered to an off-screen render target, and then I have one "uber" post-effects shader that applies all my post effects, such as HDR, bloom, lens flare, and brightness/contrast, and finally puts the result into the back buffer.

Is that about right?
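To make points 1 and 2 a bit more concrete, here is roughly what I imagine one of these uber effects looking like as an .fx file. This is just a sketch; the parameter names, MAX_LIGHTS, and the trivial lighting model are all made up for illustration.

// Hypothetical uber effect: shared pool parameters plus a per-pixel light loop.
#define MAX_LIGHTS 8

// Parameters shared across the effect pool.
shared float4x4 g_World;
shared float4x4 g_View;
shared float4x4 g_Projection;

shared int    g_NumLights;
shared float3 g_LightPos[MAX_LIGHTS];
shared float3 g_LightColor[MAX_LIGHTS];

texture g_DiffuseMap;
sampler DiffuseSampler = sampler_state { Texture = <g_DiffuseMap>; };

struct VS_OUTPUT
{
    float4 Position : POSITION;
    float2 TexCoord : TEXCOORD0;
    float3 WorldPos : TEXCOORD1;
    float3 Normal   : TEXCOORD2;
};

VS_OUTPUT UberVS(float4 pos : POSITION, float3 normal : NORMAL, float2 uv : TEXCOORD0)
{
    VS_OUTPUT o;
    float4 worldPos = mul(pos, g_World);
    o.Position = mul(mul(worldPos, g_View), g_Projection);
    o.WorldPos = worldPos.xyz;
    o.Normal   = mul(normal, (float3x3)g_World);
    o.TexCoord = uv;
    return o;
}

float4 UberPS(VS_OUTPUT i) : COLOR
{
    float3 albedo = tex2D(DiffuseSampler, i.TexCoord).rgb;
    float3 n   = normalize(i.Normal);
    float3 lit = 0;
    for (int k = 0; k < g_NumLights; ++k)   // deal with all the scene lights at once
    {
        float3 l = normalize(g_LightPos[k] - i.WorldPos);
        lit += g_LightColor[k] * saturate(dot(n, l));
    }
    return float4(albedo * lit, 1.0f);
}

technique UberMaterial
{
    pass P0
    {
        VertexShader = compile vs_3_0 UberVS();
        PixelShader  = compile ps_3_0 UberPS();
    }
}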

Quote:
Original post by yahastu
There is clearly a lot of power in shaders and effects... a lot of options and ways of doing things... but the manuals and tutorials tend to skimp on explaining good usage practice. I think I am starting to understand the proper way to use these in a large project, so let me summarize. Let me know if there is anything I am overlooking, anything that will come around to bite me in the ass later, because I don't want to discover that I've been going about things in a dead-end way that will force a redesign after several months.
Don't be too scared by this. It's normal for systems to evolve to match changing needs.
There's no general "theory" of shaders beyond their syntax and such: you grasp them mainly by practicing.
It isn't a coincidence that shaders look more powerful in complex applications: one could say they were designed with exactly this goal. The complexity of managing the fixed-function pipeline (FFP) in large projects grows frighteningly fast.
Quote:
Original post by yahastu
1) It seems that for every unique type of "material", I should have a single effect that does the vertex and pixel processing. This should be an "uber" effect that does all the vertex and pixel processing I want at once.
Good point, and it opens a can of worms. As far as I've seen, people on the content side will call anything that looks different a "shader".
From a programmer's standpoint, however, one shader may have multiple appearances: if you change just a texture it's still the same shader code, yet somebody will consider it two different shaders.
The first problem here is to establish a commonly understood vocabulary for these concepts. For example, I use shader (code and settings), shader chain (code and settings for all required stages), kernel (code only), kernel chain, and so on.

Now, coming to the point: you absolutely need one shader chain (in my terminology above) for each effect, but it is possible to build these from a common set of kernels... take care: I am not saying this is the best approach, as most .fx files in actual use will show.
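To give a rough idea of what I mean (the names below are invented for this post, not taken from any real codebase): a "kernel" could be as simple as a reusable HLSL function, and a "shader chain" the code that pulls a few of them together for one material.

// Hypothetical kernels: reusable pieces shared by many shader chains.
float3 LambertKernel(float3 n, float3 l, float3 lightColor)
{
    return lightColor * saturate(dot(n, l));
}

float3 BlinnSpecularKernel(float3 n, float3 h, float3 lightColor, float exponent)
{
    return lightColor * pow(saturate(dot(n, h)), exponent);
}

// One shader chain combines the kernels it needs into a pixel shader.
float4 ShinyMetalPS(float3 n : TEXCOORD0, float3 l : TEXCOORD1, float3 h : TEXCOORD2) : COLOR
{
    float3 c = LambertKernel(normalize(n), normalize(l), float3(1, 1, 1))
             + BlinnSpecularKernel(normalize(n), normalize(h), float3(1, 1, 1), 32.0f);
    return float4(c, 1.0f);
}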
Quote:
Original post by yahastu
2) All uber effects should be in an effect pool, and each one should take as inputs the world/view/projection matrices as well as the maximum number of scene lights, and they should be written so that they can deal with all those lights at once. If an effect doesn't need some of those matrices it wouldn't have to include them, but the basic idea is that they all share those things.
Yes, the engine will have to track some uniform settings "more closely" than others. A parser, or a few queries against the API, will help you work out what you need and what you don't. In D3D9, ID3DXConstantTable will be your best friend. If memory serves, however, it only holds information about the "active" uniforms, which may not always be what you want. A not-so-nice issue with API-driven systems: both in GL2.x and D3D9 (not speaking about D3D10 here) they need compilable shader code to reflect on, which can sometimes be a problem.

Lights are definitely a problem with today's shading systems: in theory one may want a "light shader" (how the light falls on a surface; this exists in RenderMan), various "light-surface interaction shaders" (how the light is reflected/refracted by the surface), a "light-surface-appearance compositing shader" (how the previous two are merged together), and so on.
In practice those details are usually considered irrelevant, and all of the above is reduced to summing or multiplying a few terms. :)

The pooling problem depends on the platform. If constant buffers are available it's a different line of thinking: changing individual uniform settings is slow (exception: samplers, which are only "sort of" uniforms), so it is faster to change the buffer offset or the buffer itself. Take this with a grain of salt, since I'm just repeating what the constant-buffer docs say. If constant buffers are not supported there's not much to pool, and the worst part is that you cannot change settings so easily; a "dirty" flag may suffice.
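As a sketch of the constant-buffer way of thinking (D3D10-style HLSL, with invented names): group the uniforms by how often they change, so the application swaps whole buffers instead of setting individual constants.

// Uniforms grouped by update frequency.
cbuffer PerFrame : register(b0)
{
    float4x4 g_View;
    float4x4 g_Projection;
    float3   g_CameraPos;
};

cbuffer PerObject : register(b1)
{
    float4x4 g_World;
    float4   g_MaterialColor;
};
// Per draw call the application binds a different PerObject buffer
// rather than re-uploading each constant one by one.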
Quote:
Original post by yahastu
3) All of this should be rendered to an off-screen render target, and then I have one "uber" post-effects shader that applies all my post effects, such as HDR, bloom, lens flare, and brightness/contrast, and finally puts the result into the back buffer.
Yes and no... If you think about it, this is not strictly required by the shaders themselves (or it is, depending on the context).
In a really complete shader system, there should be a way for this HDR mangling to happen through the same shading system; call these "engine shaders". I am obviously stretching things here: I'm not even sure this degree of programmability pays off.
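As a trivial example of what such an "engine shader" could be (invented names, and obviously simplified): the tone-mapping step is itself just a full-screen pixel shader reading the off-screen target, so there is no technical reason it couldn't go through the same shading system as the scene materials.

// Hypothetical full-screen engine shader: simple exposure-based tone mapping.
texture g_SceneHDR;
sampler SceneSampler = sampler_state { Texture = <g_SceneHDR>; };

float g_Exposure = 1.0f;

float4 ToneMapPS(float2 uv : TEXCOORD0) : COLOR
{
    float3 hdr = tex2D(SceneSampler, uv).rgb;
    float3 ldr = 1.0f - exp(-hdr * g_Exposure);   // map HDR values into [0, 1]
    return float4(ldr, 1.0f);
}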
Quote:
Original post by yahastu
Is that about right?
I don't feel confident enough to say it's "right" (an often-abused concept), but it looks like a well-thought-out, promising approach.

Software design patterns and architectures around a programmable pipeline are still relatively new. It's only really since SM2/SM3 that shaders have been powerful enough to start doing complex and interesting things.

Ultimately you'll find conflicting camps - there is no "one ring to rule them all" yet.

You have to consider your target hardware when designing this - are you going for an SM2 target, using SM3 as an optimization where available, or are you going to target SM3 and scale back to SM2 where possible? Or are you going all the way to SM4?

It also depends greatly on your tool chain and content pipeline - do you want artists to have control over this sort of thing without programmer interaction, or is the programmer going to be the artist and therefore not mind using slightly clumsy methods?

You should be able to architect your code so that you don't have to use uber shaders - they're not always good for performance, and you'll be doing yourself a favour if you architect things so that you can break them down into multi-pass equivalents.

hth
Jack

Thanks for the feedback Krohm. I'm not sure what you mean about the different stages of light shaders...I understand your point conceptually, but how would you chain together shaders for those separate aspects?

Jollyjeffers, my question is about the same... you say that uber shaders can be bad for performance and that multi-pass equivalents may be better. First of all, it seems that uber shaders would be a great performance boon... if you can reduce passes and shader swaps on the GPU. I understand how you could swap the shader when rendering different sub-meshes of an object, and how you could achieve special effects by rendering to different targets and then combining them, but it seems impossible to me to break apart aspects of a shader into independent components such as lighting, normal mapping, opacity, etc... which is why I suggested the uber shader.

I mulled it over some more last night and I am pretty confident in what I want my "standard pipeline" to be. This would be the pipeline that most of my objects are rendered via:

* Tangent normal bump map
* Specular intensity map
* Specular exponent (scalar)
* Opacity map
* Diffuse color map (burned with AO)
* Ambient light level (scalar)
* HDR with adapting light level (post)
* Distortion map (post)

This will be just three shaders: one VS, one PS, and another PS for the post-processing. Actually the post PS might turn out to be several, for the different passes required by the HDR, but that's the idea.

Since I have to support a fixed number of lights, I'll probably send in the 10 lights nearest to the player.

I'm still trying to break down and understand the example normal bump map shader... they set a lot of variables in the VS that are used in the bump map PS, which confuses me because it seems that for bump mapping, all the tangent calculations need to be done per pixel...

Basically, I'm thinking that my VS should just multiply the verts by viewProjWorld and that's it. Then in my PS I would check for opacity, then cycle through all 10 lights, and for each one that's enabled I would sum up its lighting contribution using the normal/specular/diffuse/ambient information.

I would render this to an off-screen target, then run the distortion PS, then do the HDR.
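Roughly, I'm picturing the post pixel shader looking something like this (just a sketch with invented names; the HDR part will probably end up split across several passes for the adaptation):

// Sketch of the planned post-processing pixel shader.
texture g_SceneRT;        // off-screen render target holding the lit scene
texture g_DistortionMap;  // screen-space offsets
sampler SceneSampler      = sampler_state { Texture = <g_SceneRT>; };
sampler DistortionSampler = sampler_state { Texture = <g_DistortionMap>; };

float g_DistortionScale  = 0.02f;
float g_AdaptedLuminance = 1.0f;   // would come from the HDR adaptation pass

float4 PostPS(float2 uv : TEXCOORD0) : COLOR
{
    // Offset the scene lookup by the distortion map, then tone map the result.
    float2 offset = (tex2D(DistortionSampler, uv).rg * 2.0f - 1.0f) * g_DistortionScale;
    float3 hdr = tex2D(SceneSampler, uv + offset).rgb;
    float3 ldr = hdr / (hdr + g_AdaptedLuminance);   // simple Reinhard-style curve
    return float4(ldr, 1.0f);
}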

Quote:
Original post by yahastu
Jollyjeffers, my question is about the same... you say that uber shaders can be bad for performance and that multi-pass equivalents may be better. First of all, it seems that uber shaders would be a great performance boon... if you can reduce passes and shader swaps on the GPU.
GPU performance is never black-and-white [smile]

Yes, an uber shader has some favourable characteristics, but it also has some subtleties that might hurt you. For example, highly complex arithmetic or very long shaders can greatly increase register pressure, which may force the compiler to go to great lengths (= extra instructions) to stay within the limits without breaking your code. The other likelihood is that you're going to make heavy use of constants - for example, you mention sending 10 lights at a time. Depending on the architecture this might severely limit the number of threads in flight: a GPU with 200 register slots and a shader that requires 100 of them per invocation can only run 2 threads at a time, whereas a specialised shader needing 20 could run 10 threads at a time...

So you end up balancing two approaches: highly optimized, bare-minimum shaders that require more application involvement (which typically generates more overhead) but whose individual processing steps might be lightning fast; or just a couple of highly generic uber shaders that are easy to work with and have low app/API overhead but are potentially slower to execute.

Chances are that only rigorous performance testing will reveal the right answer. I personally prefer compile-time scalability here, as I can write one fragment of code and easily scale it across a variable number of passes.
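For example (sketch only, invented names): compile the same fragment of code several times with a different NUM_LIGHTS define, via D3DXMACRO or your own preprocessor, and pick the variant at runtime.

// One fragment of code, compiled into several specialised variants.
#ifndef NUM_LIGHTS
#define NUM_LIGHTS 4
#endif

float3 g_LightDir[NUM_LIGHTS];     // directional lights, world space
float3 g_LightColor[NUM_LIGHTS];

float4 LitPS(float3 worldNormal : TEXCOORD0) : COLOR
{
    float3 n = normalize(worldNormal);
    float3 c = 0;
    for (int i = 0; i < NUM_LIGHTS; ++i)    // loop count is known at compile time
        c += g_LightColor[i] * saturate(dot(n, -g_LightDir[i]));
    return float4(c, 1.0f);
}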

Quote:
Original post by yahastu
it seems impossible to me to break apart aspects of a shader into independent components such as lighting, normal mapping, opacity, etc... which is why I suggested the uber shader.
The key here is that lighting is additive, so you can easily break that apart. Per-pixel techniques like normal mapping are just ways of setting up inputs to the lighting model, so it's only meaningful to split that part out when you're looking at deferred shading.
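As a sketch of what that looks like in practice (invented names): a one-light pass with additive blending lets the application draw the geometry once per light and accumulate the results in the framebuffer.

// Hypothetical single-light pass; draw once per light, framebuffer sums the results.
float4x4 g_WorldViewProj;
float4x4 g_World;
float3   g_LightDir;     // the single directional light handled by this pass
float3   g_LightColor;

struct VSOUT { float4 Pos : POSITION; float3 Normal : TEXCOORD0; };

VSOUT OneLightVS(float4 pos : POSITION, float3 normal : NORMAL)
{
    VSOUT o;
    o.Pos    = mul(pos, g_WorldViewProj);
    o.Normal = mul(normal, (float3x3)g_World);
    return o;
}

float4 OneLightPS(VSOUT i) : COLOR
{
    float3 n = normalize(i.Normal);
    return float4(g_LightColor * saturate(dot(n, -g_LightDir)), 1.0f);
}

technique OneLightAdditive
{
    pass P0
    {
        AlphaBlendEnable = TRUE;
        SrcBlend         = ONE;
        DestBlend        = ONE;    // framebuffer = previous passes + this light
        VertexShader     = compile vs_2_0 OneLightVS();
        PixelShader      = compile ps_2_0 OneLightPS();
    }
}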

Quote:
Original post by yahastu
Since I have to support a fixed number of lights, I'll probably send in the 10 lights nearest to the player.
Why a fixed number of lights? I seem to remember Tom Forsyth writing in his blog about how most games only really used 3-5 proper lights and all others could be approximated in some way...

Quote:
Original post by yahastu
I'm still trying to break down and understand the example normal bump map shader... they set a lot of variables in the VS that are used in the bump map PS, which confuses me because it seems that for bump mapping, all the tangent calculations need to be done per pixel...
All vectors must be in the same coordinate space or comparisons such as the inner product are simply wrong. If you have 50 pixels of a normal map covering a rasterized triangle, you could convert them all to world space, where your light/camera vectors are; or you could convert the light/camera vectors to tangent space every pixel; or you could convert the light/camera vectors once in the VS and interpolate the results for the PS. It's simply a performance question - why do more work than you have to?
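In other words (rough sketch, invented names): rotate the light direction into tangent space once per vertex, let the interpolators carry it, and the pixel shader only has to sample the normal map and do the dot product.

// "Convert once in the VS" option for tangent-space normal mapping.
float4x4 g_WorldViewProj;
float4x4 g_World;
float3   g_LightDirWorld;   // directional light, world space

texture g_NormalMap;
sampler NormalSampler = sampler_state { Texture = <g_NormalMap>; };

struct VSOUT
{
    float4 Pos          : POSITION;
    float2 UV           : TEXCOORD0;
    float3 LightTangent : TEXCOORD1;   // light direction in tangent space
};

VSOUT BumpVS(float4 pos : POSITION, float2 uv : TEXCOORD0,
             float3 normal : NORMAL, float3 tangent : TANGENT, float3 binormal : BINORMAL)
{
    VSOUT o;
    o.Pos = mul(pos, g_WorldViewProj);
    o.UV  = uv;

    // Rows of the matrix are the tangent-space basis vectors in world space,
    // so multiplying rotates a world-space vector into tangent space.
    float3x3 tbn;
    tbn[0] = mul(tangent,  (float3x3)g_World);
    tbn[1] = mul(binormal, (float3x3)g_World);
    tbn[2] = mul(normal,   (float3x3)g_World);
    o.LightTangent = mul(tbn, -g_LightDirWorld);
    return o;
}

float4 BumpPS(VSOUT i) : COLOR
{
    float3 n = normalize(tex2D(NormalSampler, i.UV).rgb * 2.0f - 1.0f);  // unpack [0,1] -> [-1,1]
    float3 l = normalize(i.LightTangent);                                // re-normalize after interpolation
    return float4(saturate(dot(n, l)).xxx, 1.0f);
}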


hth
Jack
