Shader System Questions

Like many, I've been reading and rereading (and re-rereading) the famous "Materials and Shaders Implementation" thread, where the great Yann L. describes his engine's pluggable effect/shader system. I think I'm starting to understand how the system works. But in the process of designing my own shader system, I came up with many questions that were either never addressed in the original thread, need more insight and ideas, or that I've simply overlooked because of the sheer amount of discussion on this topic.

1) In Yann's implementation, the actual shader class does not contain any render method. He says that the rendering occurs outside of the shader. I thought that the purpose of the shader is to implement all or part of an effect. Why keep the API-specific rendering calls (i.e. glDrawElements) outside of the shader? What are the advantages/disadvantages of having the shader render the GC versus having the GC render itself?

2) For those who have implemented a shader system (or are in the process of implementing one), do you have every possible renderable object consist of GCs that are piped into the renderer? Such objects include the HUD, the in-game GUI, the command console, and particle systems. What are the advantages and disadvantages of having *everything* renderable go through the renderer system?

3) If you were supporting multiple APIs/hardware (i.e. OpenGL, DX9, PS2, GameCube, etc.), I would assume you would create a different set of shaders for each API, while carefully loading the correct set of shaders for the target API/hardware. This comes back to the first question: why keep the API specifics for rendering outside the shader?

4) When resolving the best shader for an effect based on the hardware capabilities and the user's quality preferences, I assume that when a GC is loaded from disk, it can ignore loading vertex stream data it does not need. For example, if bump mapping is not supported, then the tangents do not need to be in the vertex stream.

5) For those who have implemented effects, what kind of effects did you come up with and how did the implementation work out? What absolute minimum effects should a system provide? Here are the very basics others have mentioned in other threads:

a) Diffuse Texture
.vertexstreamdata XYZ | UV(diffuse)
.gcdata TextureID(diffuse)

b) Lightmapped Diffuse Texture
.vertexstreamdata XYZ | UV(diffuse) | UV(lightmap)
.gcdata TextureID(diffuse) | TextureID(lightmap)

c) Gouraud Shaded
.vertexstreamdata XYZ | RGBA | NxNyNz
.gcdata LightID

6) How liberal are you about making a whole new effect versus adding an additional parameter to an existing effect? What's your criterion for deciding? For example, say you want to render some geometry in wireframe mode. Would you create a new effect for this or add an additional parameter to an existing effect?

7) How do you provide parameters and vertex stream data to the shader via shader_params? Do you pass an object with methods that return pointers to the param data in the GC? Or do you store a single buffer in the GC, with the params and vertex stream data in a predefined order based on the required format for the effect? What are the advantages and disadvantages of each method?

8) How do you store your vertex data stream and the parameters in the GC? Do you use a single buffer as discussed in question 7, or do you use a specific GC class, inherited from a base GC, to be used by a specific effect?

9) If you store your vertex data stream outside of the GC, in a buffer (i.e. OpenGL's VAR/VBO/vertex array or DX9's VB), do you have one buffer per GC (seems very inefficient), one per shader, one for the entire scene, or multiple buffers for optimized performance (I believe Yann once said that 1000 to 4000 triangles per buffer is the most optimal)? If you do use multiple buffers, how do you minimize switching buffers?

10) For level-of-detail purposes, imposters may need to be created. How would you design an effect to render a bunch of GCs to a texture in a preprocess pass, then render the generated texture on a quad in the main pass? Would this be as simple as passing a list of GCs as a parameter for the effect?

11) How do you resolve using multiple shaders for multi-pass rendering? More specifically, how does the shader tell the GC it can only do part of the effect? And more importantly, how does the shader know it can only do part of the effect? Yann didn't go into details on how the whole messaging system would work. Any ideas on how to implement this in a clean and efficient manner?

12) Besides the obvious SPGC sorting criteria such as render target, render pass, shader, texture(s), and depth from camera (transparent pass only), what other factors/state changes should be looked at and sorted by?

13) How can a shader be implemented to take advantage of OpenGL's display lists? For obvious performance/memory usage reasons, display lists are perfect for static geometry that is repeated dozens or more times per frame. Compiling and executing a display list cannot happen every frame, since the compilation cost would outweigh any speed benefit.

Wow! That was a lot of questions. I look forward to reading what others have done and what suggestions they have for implementing such a complex system.

Bob
I can provide answers to some of your questions, although not all; some of these are still issues in my own design that I need to resolve.

1) The advantage of having the actual render call outside the shader is that you are guaranteed to render something; it would be all too easy to forget to put in the render call. The purpose of the shader is to set up all the necessary states that are required to make the geometry look a certain way (an effect).

2) Yes, I put everything through the same pipeline. My reasoning behind this is that I didn't want any special-purpose code; the 2D elements are the same as the 3D elements but with a constant depth, which is easily achieved with a matrix. The disadvantage is that it becomes harder to use SDK examples directly in your code, and you need to adapt some of the techniques to work with the system.

3) Remember that the shaders aren't the only part of the render system - there's also stuff like texture/vertex buffer management and window management that is specific to an API and isn't covered by the shaders. What a different API boils down to is a different set of concrete classes implementing the abstract interfaces that you have defined, be that for the shaders, the texture manager or the vertex buffer system. Basically, the shaders are probably the only place where you have the freedom to make calls to (almost) any API function. Everywhere else outside of the shader you will have a predefined function that needs to be implemented, e.g. loading a texture, and you will be restricted as to the API calls that can be made, meaning you are bound by the interface provided by the abstract renderer. This is why the shader system is so powerful.
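To make that split concrete, here is a minimal sketch of what the abstract interfaces versus the concrete per-API classes might look like (the class and method names are my own invention, not from Yann's thread):

// Hypothetical illustration of the abstract/concrete split described above.
// The engine only ever sees the abstract interfaces; only the concrete
// OpenGL (or DX9, PS2, ...) classes and the shaders touch the API directly.
class ITextureManager {
public:
    virtual ~ITextureManager() {}
    virtual int  loadTexture( const char* filename ) = 0;  // returns an engine-side handle
    virtual void bindTexture( int handle, int unit ) = 0;
};

class IShader {
public:
    virtual ~IShader() {}
    virtual void enterShader() = 0;   // set the API states for the effect
    virtual void exitShader()  = 0;   // restore/clean up the states
};

// one concrete set per API, selected at startup
class GLTextureManager : public ITextureManager {
public:
    int  loadTexture( const char* filename ) { /* glGenTextures, glTexImage2D, ... */ return 0; }
    void bindTexture( int handle, int unit ) { /* glActiveTexture, glBindTexture */ }
};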

4) That's correct, although you only need to ignore the streams when copying the GCs into vertex buffers. Trying to drop streams in the GCs (which should be loaded independently of the renderer) would require the engine to have knowledge of which shaders are linked to each effect and how they work; in other words, you lose the layer between the renderer and the engine. So the answer is that the GCs provide the maximum data that could be required by a shader, and the shader then selectively copies what it needs into the vertex buffer for rendering.
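A tiny sketch of what that "copy only what you need" step might look like (the struct layout and function name are made up for illustration):

// The GC keeps every stream it was exported with; the shader picks only what
// its effect actually needs when filling the vertex buffer.
struct GeometryChunk {
    int          numVertices;
    const float* positions;  // xyz, always present
    const float* normals;    // may be unused by a given shader
    const float* tangents;   // only needed by e.g. bump-map shaders
    const float* texCoords;  // uv
};

void BumpMapShader_fillCache( const GeometryChunk& gc, float* vram )
{
    // interleave only the streams this shader needs: xyz + uv + tangent
    for( int i = 0; i < gc.numVertices; ++i ) {
        *vram++ = gc.positions[i*3+0];
        *vram++ = gc.positions[i*3+1];
        *vram++ = gc.positions[i*3+2];
        *vram++ = gc.texCoords[i*2+0];
        *vram++ = gc.texCoords[i*2+1];
        *vram++ = gc.tangents[i*3+0];
        *vram++ = gc.tangents[i*3+1];
        *vram++ = gc.tangents[i*3+2];
    }
    // a simpler shader would skip the tangent copy entirely
}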

5) What are you aiming to achieve? There are many different possible effects; you could implement one for each and every example program in the many SDKs for OpenGL/DirectX. Ultimately, if you need a new effect in your engine and you can't produce it with your current feature set, then add it. This is the point where you can do the learning. Use the initial simple shaders (diffuse, Gouraud, etc.) as practice in implementing shaders. Then pick something a little more challenging, say a dynamic per-pixel specular effect, and try to implement that. Every once in a while, review your effects to make sure you aren't duplicating anything that could be achieved by simplifying or generalising one of the other shaders.

6) I would say that if the effect gives a different overall outward appearance (e.g. a wireframe vs. a diffuse shader) then create a new effect. Use the meta-bouncing idea as a criterion for deciding, even if you haven't implemented it in your engine. If the overall appearance is different, or it takes a different set of input streams (that can't be derived from another), then you will probably need a new effect/shader combination.

9) The GCs are not at all aware of API-specific implementations such as VAR, VBO, etc. The GC is simply an array of data that the renderer can access. It is the shaders that then access this data and copy it into VRAM, i.e. VAR, VBO, VAO, etc. The way to go about this is to have a manager that is responsible for allocating and sharing buffers between VRAM structures, so you can have multiple VRAM entries in a single buffer. The VRAM structure controls access to the range of data it owns within that buffer.
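Something along these lines, as a rough sketch (the names VRAMSlot and VRAMManager are mine, not Yann's terminology):

// One large API buffer (VBO/VAR block), many slots. Each slot only ever
// touches its own [offset, offset+size) range of the shared buffer.
struct VRAMSlot {
    unsigned buffer;   // the shared buffer object this slot lives in
    unsigned offset;   // byte offset of this slot within the buffer
    unsigned size;     // byte size reserved for this slot
};

class VRAMManager {
public:
    // find (or create) a buffer with 'bytes' free and carve a slot out of it
    VRAMSlot allocate( unsigned bytes );
    void     release( const VRAMSlot& slot );
};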

Wow, that's quite long already, and I'm pretty tired; I will finish answering the questions when I have some more time.

James
I've been working on a system like this in my own engine for a while, though its design differs quite radically from Yann L's. Where you came to these forums to ask your questions, I sat and thought about it... Anyway, if you would like some potential ideas, check this page out.

7 + 8) I answered these questions with some sample code of mine that handles geometry chunks a while back. Here's the link.

10) Well, what I did was create a 'render list' class that can manage the rendering (and shader management) of a whole lot of geometry chunks (it will also optimize their rendering, but that's not important here). To keep the solution clean and conform to our shader model, I built a shader that creates/destroys a render surface when the shader is first created. Then you can pass it a geometry object or whatever; it will set the render target in its 'enterShader' function (and clear it in 'exitShader'), and the engine will handle the rendering of the polygonal data. Then you create another geometry chunk and set its diffuse texture to the output of the first geometry chunk (the rendered 'imposter' map that the shader stores in the geometry chunk). Therefore, the input to this billboarded quad is the imposter texture created by the imposter shader, but the billboarded quad can use an entirely different (and generic) shader.
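A hypothetical sketch of that wiring (all the types and names here are invented for illustration, not actual engine code):

// 1) an 'imposter' shader owns a render target and renders into it as a pre-pass
// 2) a plain billboard GC uses that target as its ordinary diffuse texture
struct TextureHandle { unsigned id; };

struct GeometryChunk {
    TextureHandle diffuse;
    void setDiffuseTexture( TextureHandle t ) { diffuse = t; }
};

struct ImposterShader {
    TextureHandle target;              // created once, when the shader is built
    void enterShader() { /* bind 'target' as the render target, clear it */ }
    void exitShader()  { /* restore the main framebuffer */ }
};

void setupImposter( ImposterShader& imposter, GeometryChunk& billboardQuad )
{
    // the billboard is drawn by a completely ordinary textured shader;
    // its diffuse texture just happens to be the imposter shader's output
    billboardQuad.setDiffuseTexture( imposter.target );
}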

11) The setup phase in my engine is REALLY complex. I wrote it blindly the first time through (just trying to get it to work) and it's really hard to get it to choose the right shader combinations. Be sure to clearly define what you want your system to do; for example, if you want your system to use as few passes as possible (duh...), make sure to use a heuristic to determine how many passes a shader combination will take. What I ended up doing was building a 'potential shader run' for each combination of shaders that would not break the effect (so the effect might not be 100% valid, but it wouldn't have texturing if it's a plain diffuse effect, for example), then looping through each of these 'potential' shader runs and building a heuristic out of information from the run (number of shaders, number of invalidities, number of local/global ops [I'll explain in a second], etc.). This heuristic is then used to sort the runs, and I use the best one as the final effect shader list.
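As a rough illustration, the scoring could look something like this (the weights and field names are invented):

#include <algorithm>
#include <vector>

// lower score = better run
struct ShaderRun {
    int numShaders;       // roughly, number of passes
    int numInvalidities;  // parts of the effect this run cannot reproduce
    int numGlobalOps;     // multi-pass blend ops done in the framebuffer
};

int scoreRun( const ShaderRun& run )
{
    // penalize missing features most, then pass count, then global ops
    return run.numInvalidities * 100 + run.numShaders * 10 + run.numGlobalOps;
}

bool betterRun( const ShaderRun& a, const ShaderRun& b )
{
    return scoreRun( a ) < scoreRun( b );
}

ShaderRun pickBestRun( std::vector<ShaderRun> runs )
{
    std::sort( runs.begin(), runs.end(), betterRun );
    return runs.front();   // assumes at least one candidate run exists
}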

Local/Global Blending Operations
When I was designing my system, I realized that I had no control over fallback multi-pass techniques, and that my control over shaders that handled multiple visual qualifiers was very limited; i.e. I couldn't say "multiply the diffuse color by the first texture, then add the second texture to that" and expect the effect to work whether I used a single-pass or a multi-pass implementation. So I applied the concept of blending ops to this effect/shader system. When declaring an effect, you specify what blending ops you want for the color (and, if supported, the alpha). A blending script looks something like this:

effect tex.diffuse[0]/tex.diffuse[1]/color.diffuse
{
.description "two diffuse textures and diffuse vertex color"

.visual
{
.color.diffuse
.texture.diffuse(unit0)
.texture.diffuse(unit1)
}

// this is the color blending script
// the effective result is:
// (tex.unit0 * tex.unit1) + color.diffuse
//
.out.color = add{ mul{ .texture.diffuse(unit0), .texture.diffuse(unit1) }, .color.diffuse }
}

These are parsed using standard script parsing (I wrote it from scratch and it wasn't too hard) and the ops are classified as local ops or global ops. Local ops are performed by a single shader; I have a shader that can handle two diffuse textures, so the term in the script that says (tex.unit0 * tex.unit1) would be passed to that shader as a local op and the shader would be told to interpret the op on its own. Global ops are multi-pass operations, and can be easily translated into the engine setting alpha-blending renderstates. NOTE: if you use this system, even just for ideas, you must remember: you cannot allow a single shader to be stretched (if it's 'instanced', that's ok) across a global operation, as that is not possible with a single render target (the math just doesn't allow it most of the time, so I just called it invalid and said DON'T DO IT). It's not a big deal and it's easy to get around, but DON'T FORGET or you'll hit a brick wall and won't know what to do!
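For the global case, the translation into renderstates can be as simple as this rough sketch (the OpenGL blend calls are real; the surrounding structure is invented for illustration): a global 'add' between two passes just means the second pass is drawn with additive framebuffer blending, a global 'mul' with modulating blending.

#include <GL/gl.h>

enum OpType { OP_MUL, OP_ADD };

void setGlobalOpBlending( OpType op )
{
    glEnable( GL_BLEND );
    if( op == OP_ADD )
        glBlendFunc( GL_ONE, GL_ONE );          // dst = src + dst
    else // OP_MUL
        glBlendFunc( GL_DST_COLOR, GL_ZERO );   // dst = src * dst
}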

Hope this helps answer a few questions for ya and gives you some good ideas. Ciao,

EDIT: spelled 'ciao' wrong...

Chris Pergrossi
My Realm | "Good Morning, Dave"

[edited by - c t o a n on April 4, 2004 10:16:48 PM]
My render system uses a somewhat different setup. I discarded the effect files and the effect stuff; instead I use only a shader and an ID of different material properties:

// different material properties so far
enum eMaterialFormat {
    MF_AMBIENT   = 1<<0,
    MF_DIFFUSE   = 1<<1,
    MF_SPECULAR  = 1<<2,
    MF_SHININESS = 1<<3,
    MF_TEXTURE   = 1<<4
};

// material class
class cMaterial {
    int m_iFormat; // the above format of this material

    cColor4f Ambient;
    cColor4f Diffuse;
    cColor4f Specular;
    float    Shininess;

    // an array of texture handles...
    cTextureHandle TextureHandles[MAX_TEXTURES];

public:
    void SetAmbient( cColor4f a );
    void SetDiffuse( cColor4f d );
    void SetSpecular( cColor4f s );
    void SetShininess( float s );

    cColor4f GetAmbient( void );
    cColor4f GetDiffuse( void );
    cColor4f GetSpecular( void );
    float    GetShininess( void );
    // and so on...
};


Well, if the material has the ambient color set, then the material format is MF_AMBIENT; if it also has the diffuse color, then it has the format MF_AMBIENT | MF_DIFFUSE. This is how I sort different shaders: the shader knows which formats it can render, and that's how each GeometryChunk gets its shaders.

Choosing a shader works roughly like this:

gc = geometrychunk
do {
    for( each shader s )
    {
        mf = gc->GetMaterial()->GetFormat();

        if( s->Supports( mf ) )
        {
            // if it does, check if the priority is higher than the last
            // found shader; if it is, check whether this shader covers more
            // of the material than the last one, and if so set this as
            // the best shader
        }
    }

    // remove the best shader's supported format from mf
    // and continue searching for a shader for the rest of the format

    // this will continue to loop so it can choose
    // several shaders to display one material
    // (it has a count so it won't get stuck in the loop)
} while( mf != 0 );

Explanation: the way I code a shader is that I first have a shader that sets the color, then a shader that sets a texture (or several). If the material needs both shaders, then both are chosen, because the material format has the correct flags for them to be chosen.

For example, if I have these shaders with these formats:
1) MF_AMBIENT | MF_DIFFUSE
2) MF_SPECULAR | MF_SHININESS
3) MF_AMBIENT | MF_DIFFUSE | MF_SPECULAR | MF_SHININESS
4) MF_TEXTURE

and I have a GC with the format MF_AMBIENT | MF_DIFFUSE | MF_TEXTURE, then it will choose shaders 1 and 4. If I add MF_SPECULAR | MF_SHININESS to the GC's material format, then it will choose the third and fourth shaders, because shader 3 covers more of the material than the first two shaders do, so it will be picked unless it has a lower priority level.
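A compact, compilable version of the same idea might look like this (the helper names are made up): each shader advertises the material-format bits it supports, and the loop greedily picks the shader covering the most remaining bits.

#include <vector>

int countBits( int v ) { int n = 0; while( v ) { n += v & 1; v >>= 1; } return n; }

struct Shader { int supportedFormat; };

std::vector<Shader*> chooseShaders( int materialFormat, std::vector<Shader>& shaders )
{
    std::vector<Shader*> chosen;
    int remaining = materialFormat;
    int guard = 0;                               // the 'count so it won't get stuck'
    while( remaining != 0 && guard++ < 16 ) {
        Shader* best = 0;
        int bestCover = 0;
        for( size_t i = 0; i < shaders.size(); ++i ) {
            int cover = countBits( shaders[i].supportedFormat & remaining );
            if( cover > bestCover ) { bestCover = cover; best = &shaders[i]; }
        }
        if( !best ) break;                       // nothing can render the rest
        chosen.push_back( best );
        remaining &= ~best->supportedFormat;     // strip the bits now covered
    }
    return chosen;
}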
I'll just answer a few.

9) I use a single VBO for each GC. The VBO bind is not much of a speed loss with the latest drivers.

11) The shader doesn't have to know it isn't rendering the whole effect. And message passing between two shaders is not such a problem (at least for now); I just bind the same shared object to both shaders. For example: the water effect. It's made from two shaders, one to render the scene to a texture and one to render the water. I just bind the same "reflection" texture to both shaders and that's it.
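As a rough sketch of that sharing (the types and names are invented for illustration):

// the reflection texture is created once and handed to both shaders, so
// neither needs to know it is only doing half of the water effect
struct Texture { unsigned glId; };

struct ReflectionPassShader {
    Texture* reflection;             // render target for the mirrored scene
};

struct WaterSurfaceShader {
    Texture* reflection;             // sampled when drawing the water surface
};

void setupWater( ReflectionPassShader& pass1, WaterSurfaceShader& pass2 )
{
    static Texture reflectionMap;    // would be allocated via glGenTextures etc.
    pass1.reflection = &reflectionMap;
    pass2.reflection = &reflectionMap;   // same object bound to both shaders
}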

12) Those are pretty much all you need. If you want to take it another step further, you could use some kind of dynamic sorting criteria - something to dynamically balance the influence of the individual parts depending on the current scene.

13) Display lists. The shader doesn't have to know anything about how the GC data is submitted to the GPU; DLs are just another way of doing it, like VA, VBO, etc. Where can they come in useful in shaders? You might gain something by putting a batch of state changes (like a very long NV_RC setup) in a DL. Other than that you don't have much use for them in the shaders themselves.
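For example, a long block of static state setup could be compiled once and replayed with a single call - a minimal sketch using the standard display-list calls (the register-combiner setup itself is omitted):

#include <GL/gl.h>

GLuint compileStateList()
{
    GLuint list = glGenLists( 1 );
    glNewList( list, GL_COMPILE );
        // a long, static block of state setup, e.g. texture-combiner config
        glEnable( GL_TEXTURE_2D );
        glTexEnvi( GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE );
        // ...more static state...
    glEndList();
    return list;
}

// later, inside the shader's enterShader(), replay it in one call:
//   glCallList( list );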

You should never let your fears become the boundaries of your dreams.

[edited by - _DarkWIng_ on April 5, 2004 12:06:51 PM]
quote:Original post by jamessharpe
The purpose of the shader is to set up all the necessary states that are required to make the geometry look a certain way (an effect).


So a shader doesn't render an effect, but rather acts as a state-change micromanager for an effect.

quote:Original post by jamessharpe
The GCs provide the maximum data that could be required by a shader, and the shader then selectively copies what it needs into the vertex buffer for rendering.


Since the shader copies the data it needs from the GC to the hardware, I could store the GC's data in any way I want, regardless of the underlying API the shader uses. Doing this, I could load the GCs right off the disk with no conversion needed. I could even use a compressed format, as long as my shader knows how to decompress it.
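For instance, a toy "compressed" format might just quantize positions to 16-bit integers on disk and let the shader expand them while filling its vertex buffer (the format and names here are invented for illustration):

#include <vector>

struct PackedChunk {
    float              scale;      // dequantization factor
    std::vector<short> positions;  // x,y,z triplets, quantized
};

void decompressPositions( const PackedChunk& gc, std::vector<float>& out )
{
    out.resize( gc.positions.size() );
    for( size_t i = 0; i < gc.positions.size(); ++i )
        out[i] = gc.positions[i] * gc.scale;   // shader-side expansion
}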

quote:Original post by jamessharpe
It is the shaders that then access this data and copy it into VRAM, i.e. VAR, VBO, VAO, etc. The way to go about this is to have a manager that is responsible for allocating and sharing buffers between VRAM structures.


This is where I start to worry. To support the multitude of vertex drawing methods that OpenGL provides (VAR, VBO, regular vertex arrays, compiled vertex arrays, etc.), it seems I would have to create an additional shader per method and an additional "vertex drawing manager" to implement each method.
quote:Original post by c t o a n
7 + 8) I answered these questions with some sample code of mine that handles geometry chunks a while back. Here's the link.


I like the idea of using a property manager for the GC's data instead of using an array. It would make adding new data formats a lot easier.

quote:Original post by c t o a n
10) Then you create another geometry chunk and set its diffuse texture to the output of the first geometry chunk (the rendered 'imposter' map that the shader stores in the geometry chunk). Therefore, the input to this billboarded quad is the imposter texture created by the imposter shader, but the billboarded quad can use an entirely different (and generic) shader.


Again, a property manager like you mentioned would make this fairly simple.
quote:Original post by McZ
Well, if the material has the ambient color set, then the material format is MF_AMBIENT; if it also has the diffuse color, then it has the format MF_AMBIENT | MF_DIFFUSE. This is how I sort different shaders: the shader knows which formats it can render, and that's how each GeometryChunk gets its shaders.


Interesting idea. So in your system, when parsing the effects, you go beyond just assigning a priority that indicates whether the effect is fully doable. By using bit flags, the effects say what they want, and each shader says what it supports. With this, the priority doesn't really need to indicate whether the effect is complete, but rather the visual quality of the effect.

quote:Original post by cyansoft

Since the shader copies the data it needs from the GC to the hardware, I could store the GC's data in any way I want, regardless of the underlying API the shader uses. Doing this, I could load the GCs right off the disk with no conversion needed. I could even use a compressed format, as long as my shader knows how to decompress it.



Exactly - that's where you gain flexibility.

quote:
This is where I start to worry. To support the multitude of vertex drawing methods that OpenGL provides (VAR, VBO, regular vertex arrays, compiled vertex arrays, etc.), it seems I would have to create an additional shader per method and an additional "vertex drawing manager" to implement each method.


No, you don't need any extra shaders - what you have is a structure, a VRAMSlot, that simply has a method returning a pointer to a stream you can write to in order to fill in the vertex data. Using a Lock/Unlock pair of calls you can do this quite easily, almost directly with most of the extensions. You may eventually need some kind of vertex management system, since the VRAM available is finite and varies across different cards; you're going to hit the limit sooner or later and will need some way of paging data in and out. Search for 'sliding slot' to see Yann explain his vertex cache management system.
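A rough sketch of such a slot on top of a VBO (the structure is invented; the GL calls are the standard GL 1.5 / ARB_vertex_buffer_object ones and assume the entry points are available through glext.h or an extension loader):

#include <GL/gl.h>
#include <GL/glext.h>

struct VRAMSlot {
    GLuint   buffer;    // the shared buffer object this slot lives in
    unsigned offset;    // byte offset of this slot's range
    unsigned size;      // byte size reserved for this slot

    void* lock()
    {
        glBindBuffer( GL_ARRAY_BUFFER, buffer );
        // map the buffer and hand back a pointer into this slot's range only
        char* base = (char*)glMapBuffer( GL_ARRAY_BUFFER, GL_WRITE_ONLY );
        return base ? base + offset : 0;
    }

    void unlock()
    {
        glUnmapBuffer( GL_ARRAY_BUFFER );
    }
};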

James
quote:Original post by _DarkWIng_
12) Those are pretty much all you need. If you want to take it another step further, you could use some kind of dynamic sorting criteria - something to dynamically balance the influence of the individual parts depending on the current scene.


This can be an interesting optimization.

quote:Original post by _DarkWIng_
Shader doesn''t have to do anything how the GC data is submited to GPU.


If so, then why did Yann have a fill_cache method in his shader system? Jamessharpe also mentions the shader being used to fill VRAM.

