Material, Shaders, Shader variants and Parameters


Hi,

I've seen some topics about this theme, but I'm not sure I've found what I'm looking for.

I'm sorry, it will be a bit long; I hope someone will read this. And sorry for my broken English! :)

Let me write about my current design/implementation. I've tried to mix some ideas from UE4 and Unity.

First, here is a list of the classes with small descriptions.

Shader:

The lowest-level class in this group. It can be a vertex or fragment/pixel shader, nothing more, nothing less. In the OpenGL implementation, it contains the shader object id and the shader compilation function.

Effect:

Slightly higher level; this class is a group of Shaders, and it can be used to set parameters (uniforms). In the OpenGL implementation, it contains the program object id and the linking function (using the compiled shaders). So this is a single, valid effect which can be used to render objects. This class also interprets the full source code and extracts things like the attributes and uniforms in use. This way a Mesh can be checked to see whether it has enough vertex data (attributes) to be rendered by this effect.
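(Side note: the attribute/uniform information doesn't have to come from parsing the source text; a linked GL program can be asked directly via standard GL 2.0 introspection. A minimal sketch -- the function name is made up:)

// Requires the GL headers (e.g. <GL/glew.h>).
void DumpActiveAttributes(GLuint program)
{
    // Ask the linked program how many attributes it actually uses.
    GLint count = 0;
    glGetProgramiv(program, GL_ACTIVE_ATTRIBUTES, &count);

    for (GLint i = 0; i < count; ++i)
    {
        char   name[256];
        GLint  size = 0;
        GLenum type = 0;
        glGetActiveAttrib(program, i, sizeof(name), nullptr, &size, &type, name);
        // 'type' is e.g. GL_FLOAT_VEC3; storing (name, type) pairs is enough
        // to validate a Mesh's vertex layout against this effect.
    }
}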

Material:

This is a mid/high-level class. It could just as well be called EffectVariantCollection, because that's what it is: the shaders can be written with defines, ifdefs and so on, so a single Effect can be compiled into multiple variants. A good example is Unity's Standard shader, which is a big über-shader (or Effect in my case) compiled into multiple variants based on keywords. This is what my Material class does: it can compile an effect multiple times with different defines added to the beginning of the source code.
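Compiling a variant then boils down to prepending one #define per keyword to the über-shader source before compiling it; a minimal sketch (the helper name is made up):

#include <string>
#include <vector>

std::string BuildVariantSource(const std::string& shaderSource,
                               const std::vector<std::string>& keywords)
{
    // One #define per active keyword, e.g. "NORMAL_MAPPING".
    std::string result;
    for (const std::string& keyword : keywords)
        result += "#define " + keyword + "\n";

    // Note: if the source starts with a #version directive, the defines
    // would have to be inserted after that line instead.
    result += shaderSource;
    return result;
}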

MaterialInstance:

This is the highest-level class; it simply holds a reference to a valid Material and can be attached to Meshes. A MaterialInstance contains a list of keywords which are concatenated with the keywords provided by the renderer. The combined keywords are used to select the proper Effect variant from the referenced Material.

So, comparing with Unity's terminology:

Unity Shader = Material

Unity Material = MaterialInstance

I've written a simple text parser, and this is how I define Materials. The text file is similar to Unity's shader files.

I've just started implementing parameter handling and I'm not sure about some things. Here is the cropped code:


class MaterialParameter
{
public:
    enum Type
    {
        Float = 0,
        Vec2,
        ...
    }; // enum Type

    MaterialParameter(const std::string& name, const Type type);

    const std::string& GetName() const;
    Type GetType() const;

    template <typename T>
    static Type TypeOf(); // specialized for float32, Vector2, etc.

protected:
    const std::string   name;
    const Type          type;
}; // class MaterialParameter

typedef std::shared_ptr<MaterialParameter> MaterialParameterPtr;

template <typename T>
class MaterialParameterValue final : public MaterialParameter
{
public:
    T   value;

    MaterialParameterValue(const std::string& name, const T& defaultValue);

    const T& GetDefaultValue() const;

private:
    const T defaultValue;
}; // class MaterialParameterValue

typedef MaterialParameterValue<float32> MatParamFloat;
typedef MaterialParameterValue<Vector2> MatParamVec2;
// ...

typedef std::shared_ptr<MatParamFloat> MatParamFloatPtr;
typedef std::shared_ptr<MatParamVec2> MatParamVec2Ptr;
// ...

I also have a MaterialParameterCollection class which contains a map<string, parameter> and some functions:


class MaterialParameterCollection
{
public:
    template <typename T>
    std::shared_ptr<T> GetParameter(const std::string& name) const;

    template <typename T>
    bool AddParameter(const std::string& name, const T& defaultValue);

    void ApplyToEffect(const IEffectPtr& effect) const;

private:
    typedef std::map<std::string, MaterialParameterPtr> Parameters;

    Parameters  parameters;

    MaterialParameterPtr GetParameter(const std::string& name, const MaterialParameter::Type type) const;
}; // class MaterialParameterCollection
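For clarity, here's how I imagine using it (assuming GetParameter takes the concrete parameter class as its template argument):

MaterialParameterCollection params;
params.AddParameter("shininess", 32.0f);             // stored as a MatParamFloat
params.AddParameter("uvScale", Vector2(1.0f, 1.0f)); // stored as a MatParamVec2

// Typed lookup; returns null if the name or the type doesn't match.
MatParamFloatPtr shininess = params.GetParameter<MatParamFloat>("shininess");
if (shininess)
    shininess->value = 64.0f; // per-instance override of the default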

And the biggest problem is with the ApplyToEffect function. Of course I can iterate over the map, switch on the type, cast and call the set uniform function, like in the current implementation:


void MaterialParameterCollection::ApplyToEffect(const IEffectPtr& effect) const
{
    for (auto it = parameters.begin(); it != parameters.end(); ++it)
    {
        const MaterialParameterPtr& param = it->second;

        IShaderUniformPtr uniform = effect->GetUniform(param->GetName());
        if (!uniform)
            continue;

        switch (param->GetType())
        {
        case MaterialParameter::Float:
            effect->SetUniform(uniform, static_cast<MatParamFloat*>(param.get())->value);
            break;

        case MaterialParameter::Vec2:
            effect->SetUniform(uniform, static_cast<MatParamVec2*>(param.get())->value);
            break;

        // ... one case per MaterialParameter::Type ...

        default:
            break;
        }
    }
}
The first solution I used was a virtual function in the base MaterialParameter class; the template class just overrode it and called the effect's SetUniform function. However, the reason I'm now using the switch-based approach is textures. I cannot set a texture directly (in OpenGL at least), only a "sampler id". But a parameter does not know anything about the other parameters, so a TextureParameter does not know which id it should use.
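To make the problem concrete: the virtual-function version could still work if a small context were threaded through the Apply calls, so each texture parameter claims the next free unit. This is just a sketch; Apply, ApplyContext and ITexturePtr are hypothetical names, not part of my code:

// Hypothetical context owning the "next free texture unit" counter,
// passed to every parameter's Apply call.
struct ApplyContext
{
    IEffectPtr effect;
    int        nextTextureUnit = 0; // consumed by texture parameters only
};

class MaterialParameterTexture final : public MaterialParameter
{
public:
    ITexturePtr value;

    // Overrides a hypothetical 'virtual void Apply(ApplyContext&) const'
    // declared in MaterialParameter.
    void Apply(ApplyContext& ctx) const
    {
        IShaderUniformPtr uniform = ctx.effect->GetUniform(GetName());
        if (!uniform)
            return;

        const int unit = ctx.nextTextureUnit++; // claim the next sampler unit
        value->Bind(unit);                      // glActiveTexture + glBindTexture
        ctx.effect->SetUniform(uniform, unit);  // sampler uniforms take the unit index
    }
};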

- The renderable objects can now only be sorted by effect first, then by textures (but I think that's okay; I would do this anyway).

- Is this an acceptable approach?

- Any tips?

Let me know if anyone is interested in more details.

Thanks


Don't follow Unity's and UE4's exact approach because they're overengineered techs born out of DX9-style rendering which had to evolve and adapt over time.

If you design your material system that way, you're going to inherit the same slowness that plagues those engines.

There's no need for so many classes.

All you have is:

  1. Shaders. Make a representation that simply encapsulates the file and compiles it according to input parameters.
  2. Materials. A collection of shaders with per-material parameters that affect how the shader will be compiled, what parameters will be passed during draw instead of compile time, and what textures will be bound.
  3. MaterialManager. Aside from creating materials, it's responsible for keeping shared per-pass parameters (such as view matrices or fog parameters) in a different place (i.e. a different const buffer). It is also aware of Materials and Renderable objects, so that it can match per-object inputs during rendering (such as the world matrix, or bone matrices in the case of skinning).

That's all you need. Also, stop thinking in terms of parameters; that's a DX9-style thing that nowadays only works well for postprocessing effects and some compute shaders. Start thinking in terms of memory layouts (buffers) and frequency of updates. There are generally going to be three buffers: one updated per pass; one per material, updated when a material stored in that buffer changes; and one updated per object.
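To illustrate that split (member names are made up, and real layouts must respect std140/cbuffer packing rules):

// 1. Updated once per pass.
struct PerPassBuffer
{
    Matrix4 viewMatrix;
    Matrix4 projMatrix;
    Vector4 fogParams;
};

// 2. Stored per material; only re-uploaded when the material is edited.
struct PerMaterialBuffer
{
    Vector4 baseColor;
    float   specularIntensity;
    float   shininess;
    float   padding[2]; // keep 16-byte alignment
};

// 3. Updated for every object drawn.
struct PerObjectBuffer
{
    Matrix4 worldMatrix;
    // plus bone matrices when skinning, etc.
};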

Yeah, the model of shader-instances/effects/etc. actually being able to hold parameters (uniform values) is an abstraction that makes sense for 2006's GeForce 7900 GT... but not for anything newer. The reason it makes sense for that old era of GPUs is that many didn't actually have hardware support for uniform values at all, but they did support literal / hard-coded values... so uniforms were implemented by patching the shader code itself, meaning the uniform value was stored inside the program object.

The way the hardware actually works after that point in time is that you bind shader programs to the context, which are just code (no uniform values), and you bind resources to the context, such as textures, vertex buffers, uniform buffers, etc.

So, I would definitely recommend making a system based around UBOs, not individual uniforms. A UBO-based design will also port easily to D3D/Vulkan/etc., while a uniform-based design will require emulation on those other APIs.

What I've got, roughly:

Shader (pixel/vertex/etc) -- not exposed as part of the API / this is an internal implementation detail.

ProgramBinding -- a set of vertex + pixel/hull/domain/geo shaders. This is low-level and not really visible to users -- it's the "shader object" that the low-level renderer works with.

Technique -- this is the "shader object" that the user actually works with. It has a collection of passes that the user can refer to by name or index, but there's no class associated with them.

Pass -- this is one "aspect" or "stage" of a shader -- e.g. ForwardLightingPass, GBufferAttributePass, TranslucentLightingPass, DepthOnlyPass. A technique can have multiple passes, because in a modern engine, the same object will be drawn in multiple stages -- e.g. in a Z-pre-pass followed by a forward-lighting pass. A pass also has a list of permutations. Each permutation is a combination of option values (key) mapped to a ProgramBinding (value).

Options -- When authoring shaders, you've got a 32bit field that you can allocate "shader options" within -- e.g. bit #0 could be normal-mapping on/off, bits #1/2 could be "light count [0,3]". The shader compiler iterates through every permutation of options for each pass and compiles all of them into the appropriate ProgramBindings. At runtime, the user can (indirectly) select shader permutations by supplying option values alongside their technique.

i.e. If the user has a technique, a pass index, and a set of shader-option values, then they can look up a ProgramBinding.
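For example, the option allocation and key for the normal-mapping/light-count case above could look like this (values illustrative):

enum ShaderOptionBits : u32
{
    Option_NormalMapping = 1u << 0, // bit #0: normal mapping on/off
    Option_LightCountLo  = 1u << 1, // bits #1-2: light count [0,3]
    Option_LightCountHi  = 1u << 2,
};

u32 MakeOptionKey(bool normalMapping, u32 lightCount)
{
    u32 key = normalMapping ? Option_NormalMapping : 0u;
    key |= (lightCount & 0x3u) << 1; // pack the 2-bit light count
    return key;
}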

Material - not necessarily a single class - different systems can implement their own material solutions. A material will be a technique, a set of shader options, fixed-function state overrides (e.g. blend modes) and resource bindings (textures, UBO's, etc). Techniques provide reflection tables, where a material system can find uniforms by name (i.e. UBO-slot/location and offset into that UBO), texture-slots/locations by name, etc... Material classes can use this data to allow the user to set values by name, while they're actually maintaining UBO's of values.

RenderPass - a set of destination textures (FBO's) and a shader Pass index. The pass index is used to find the appropriate ProgramBinding's from each Technique.

Thanks for the replies! I forgot to mention that I'm currently targeting OGL 2.0 and DX9, where UBOs are not available; that's why I've chosen this ugly, old-school design. :) However, I'm thinking about "upgrading" to OGL >= 3.0 and DX >= 11.0, since I'm using deferred shading in the engine, and if the hardware can handle deferred shading it probably supports at least OGL 3.0 as well.

@Matias Goldberg:

I don't know how slow their system is, but I like the flexibility this approach gives me. :)

@Hodgman:

It seems our designs are similar; I'm just using different names for the classes, and a slower pass and technique selection.

My Material is similar to your Technique. Your Pass is a collection of my Effects. My Effect is a single permutation of your Pass.

However, I'm currently using an ugly <string, Effect> map, where the string is generated from the pass name and the given keywords (like "NORMAL_MAPPING_ON"). This isn't a permanent solution; I will change it to something similar to yours.


// ... when looking for a specific variant ...
const CacheId cacheId = GenCacheId(allKeywords, pass);

// this is ugly, but whatever... it works for now :)
Material::CacheId Material::GenCacheId(const Keywords& keywords, const std::string& pass) const
{
    std::string result = "PASS=" + pass + ";";

    for (const Keyword& keyword : keywords)
        result += keyword + ";";

    return result;
}

I support D3D9 under my API by emulating CBuffers on top of it :)

My permutation selection function is shown below.

The algorithm is not very intuitive at first, but it does guarantee that you never deliver a permutation that has options that were not requested, and it also guarantees you deliver the permutation that most closely matches the request, which is exactly what we want. E.g. imagine someone asks for normal mapping to be enabled, but this technique/material does not support such a feature -- your string lookup will fail in this case, whereas this algorithm simply ignores that invalid request.

It does require you to pre-sort your list of permutations/bindings/effects by the number of bits set in their options mask, from highest to lowest, and to always end the list with a permutation with no bits set (an 'all options disabled' permutation), which will always be a valid fallback result (meaning the return -1; at the bottom should never be reached).

Unfortunately it's an O(N) search, but N is the number of permutations in the pass, which is usually very small, so that's not really an issue. Your dictionary lookup could in theory be O(1) or O(log N), yet it's likely way slower than this -- e.g. 8 of these Permutation objects fit in a single cache line, which is a single RAM transfer, so if you've only got a handful of permutations the linear search may as well be O(1) :wink: You should also do this search once (ahead of time) and reuse the result every frame until your material is edited -- if you follow that advice, it doesn't really matter how expensive the permutation lookup is! :D


int internal::ShaderPackBlob::SelectProgramsIndex( u32 techniqueId, u32 passId, u32 featuresRequested )
{
	eiASSERT( techniqueId < numTechniques );
	Technique&         technique    = techniques[techniqueId];
	List<Pass>&        passes       = *technique.passes;
	if( passId >= passes.count )
		return -1;
	Pass&              pass         = passes[passId];
	List<Permutation>& permutations = *pass.permutations;
	u32 prevPermutationFeatureCount = 64;
	for( u32 k = 0, end = permutations.count; k != end; ++k )
	{
		Permutation& permutation = permutations[k];
		eiASSERT( prevPermutationFeatureCount >= CountBitsSet(permutation.features) );
		prevPermutationFeatureCount = CountBitsSet(permutation.features);//debug code only - check the array is sorted correctly

		if( (featuresRequested & permutation.features) == permutation.features )//if this does not provide features that weren't asked for, we'll take it
		{
			return permutation.bindingIdx;
		}
	}
	return -1;
}


I'll check out that CBuffer emulation, thanks! :)


It's true that your algorithm simply ignores a request for an unsupported feature where my string lookup would fail, but that behavior was actually a design choice on my side. This is the reason:

Let's talk about the deferred renderer. I have a "big" (it's actually not that big :)) shader for rendering objects to the gbuffer. This shader contains many defines; here is the full list:


// Available defines:
// ANIMATED
// REQUIRE_VIEW_DIR
// REQUIRE_SCREEN_POS
// REQUIRE_WORLD_POS
// FRAGMENT_WRITES_NORMAL
// CUSTOM_VERTEX_PROCESSING

Based on these defines, a different variant is compiled.

When a user (actually it's just me :)) defines a new Material (or Technique in your terminology), their code is concatenated with this base source code. So when defining a new Material, the user must provide valid defines if they want to use a feature.

Here is an example: I've created a simple normal mapping material which, in this case, must define e.g. FRAGMENT_WRITES_NORMAL. If it's not defined, the shader won't compile, because the function signature will be slightly different (just an "in" instead of "inout" for the normal):


Parameters
{
    Parameter
    {
        Name { "baseColor" }
        Type { "Color" }
        Default { "1.0, 1.0, 1.0, 1.0" }
    }
    Parameter
    {
        Name { "baseTexture" }
        Type { "Texture2D" }
        Default { "Engine/default_diffuse.png" }
    }
    Parameter
    {
        Name { "normalTexture" }
        Type { "Texture2D" }
        Default { "Engine/default_normal.png" }
    }
}
Shader
{
    Queue { Opaque }

    Pragma
    {
        CODE
        #define CUSTOM_VERTEX_PROCESSING
        #define CUSTOM_FRAGMENT_PROCESSING
        #define FRAGMENT_WRITES_NORMAL
        ENDCODE
    }
    
    Pass
    {
        Name { "Deferred" }
        
        Vertex
        {
            CODE
            varying vec2 Texcoord;

            void CustomVertex()
            {
                Texcoord = inTexcoord_0;
            }
            ENDCODE
        }
        
        Fragment
        {
            CODE
            varying vec2 Texcoord;

            uniform vec4 baseColor;
            uniform sampler2D baseTexture;
            uniform sampler2D normalTexture;

            void CustomFragment(inout vec3 oBaseColor, inout vec3 oNormal, inout float oSpecularIntensity, inout float oShininess, inout vec3 oEmission)
            {
                oBaseColor = texture2D(baseTexture, Texcoord).rgb * baseColor.xyz;
                oNormal = UnpackNormal(texture2D(normalTexture, Texcoord).rgb);
            }
            ENDCODE
        }
    }
}

This could be automated, and I think UE4 does something similar under the hood, and Unity as well: if we attach a texture to the "Normal" input of the material, a different variant is compiled. Well, this ugly define-solution is the same, just... ugly. :)

But of course I agree with you: string concatenation and string-based lookups are not the best solution, and I will probably switch to something similar to your code. Thanks for sharing it!

EDIT:

The MaterialInstance can also have defines. This way I can create a big Material (like the Standard shader in Unity) and use the keywords of the MaterialInstance to select a different permutation. So the previous normal mapping material would become:


Parameters
{
    [...]
}
Shader
{
    Queue { Opaque }

    Pragma
    {
        CODE
        #define CUSTOM_VERTEX_PROCESSING
        #define CUSTOM_FRAGMENT_PROCESSING
        #ifdef NORMAL_MAPPING
            #define FRAGMENT_WRITES_NORMAL
        #endif
        ENDCODE
    }
    
    Pass
    {
        [...]
        
        Fragment
        {
            CODE
            varying vec2 Texcoord;

            uniform vec4 baseColor;
            uniform sampler2D baseTexture;
            uniform sampler2D normalTexture;

            void CustomFragment(inout vec3 oBaseColor, inout vec3 oNormal, inout float oSpecularIntensity, inout float oShininess, inout vec3 oEmission)
            {
                oBaseColor = texture2D(baseTexture, Texcoord).rgb * baseColor.xyz;
            #ifdef NORMAL_MAPPING
                oNormal = UnpackNormal(texture2D(normalTexture, Texcoord).rgb);
            #endif
            }
            ENDCODE
        }
    }
}

Then a MaterialInstance is free to add (or not add) a "NORMAL_MAPPING" keyword.

So the defines in the Material control the underlying (big shader) code, and the defines in the MaterialInstance control the Material's code.

Using defines for these things was an optimization choice, to produce a faster shader when e.g. the view direction is not needed in any computation.


Okay, let's turn back to UBOs a bit. Let me use the word "uniform" for data passed to the shader.

Let's suppose I change to OpenGL 3.1, so UBOs become available. I guess I've got the basics:

The GPU side:

In the shader, define some blocks, e.g. based on update frequency: a "global block" for camera and ambient uniforms, an "object block" for everything common to objects (like the world matrix), and a "material block" which can be custom for each material.

The CPU side:

First of all, I have to create the buffers (like any other buffer, e.g. VBO, IBO, etc.) with the required size. This size can be queried from the program.

Then use global binding indices for each buffer, e.g. 1 will be the global block's binding index.

Bind the buffers to the binding point/id.

Also bind the program's block to the same binding point.

This way, when I update the data in the buffer, it is passed to the shader automatically and no extra work is required.
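To make sure I understand, the raw GL 3.1 calls for those steps would be roughly the following (the block name and binding index are just examples):

// Create the buffer with the size reported by the program.
const GLuint bindingPoint = 1; // the global block's binding index
GLuint blockIndex = glGetUniformBlockIndex(program, "GlobalBlock");
GLint  blockSize  = 0; // (check blockIndex against GL_INVALID_INDEX in real code)
glGetActiveUniformBlockiv(program, blockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &blockSize);

GLuint ubo = 0;
glGenBuffers(1, &ubo);
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferData(GL_UNIFORM_BUFFER, blockSize, nullptr, GL_DYNAMIC_DRAW);

// Bind the buffer and the program's block to the same binding point.
glBindBufferBase(GL_UNIFORM_BUFFER, bindingPoint, ubo);
glUniformBlockBinding(program, blockIndex, bindingPoint);

// Later: updating the buffer is all that's needed; every program whose
// block is bound to this point sees the new data.
glBindBuffer(GL_UNIFORM_BUFFER, ubo);
glBufferSubData(GL_UNIFORM_BUFFER, 0, sizeof(viewProj), &viewProj); // viewProj: app-side matrix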

If I've got the basics right, here come some questions from me :)

- So the uniform data is transmitted to the GPU only when I update the buffer (like it would be if I used a VBO)?

- If a shader uses only the viewProj matrix of the "global block", the other data in that block is unnecessary -- but that's not a big deal if the transfer only happens when the buffer is updated.

- For different light types, should I use different UBOs or pack them together? E.g. a point light has an extra parameter like "range" or "invSqrRadius"; a directional light has no position but has a direction, and vice versa. I can pack all of these values into a single UBO, or I can split them into a common "CommonLight" block plus one specific to the given light type. (A possible packed layout is sketched after these questions.)

- Is there any performance hit when binding a program's block to a binding point if the block is not used in the shader?

- Is it the shader developer's responsibility to use the same block definition in every shader? I guess I would check for "equality" anyway when a new variant is compiled.
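(Regarding the light question above, the packed version I have in mind would be a single "superset" block where each light type just ignores the members it doesn't need. Since std140 rounds vec3 up to 16 bytes anyway, everything is packed into vec4-sized members; names are illustrative:)

// Superset light block covering directional/point/spot lights.
struct LightBlock
{
    Vector4 positionAndRange; // xyz = position (point/spot), w = range
    Vector4 directionAndType; // xyz = direction (dir/spot), w = light type
    Vector4 color;            // rgb = color, a = intensity
};

The split version (CommonLight block plus a per-type block) would save a little memory but needs an extra binding per light type; which one wins probably comes down to how the light passes are batched.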
