csisy

OpenGL Material, Shaders, Shader variants and Parameters


Hi,

 

I've seen some topics about this theme, but I'm not sure I've found what I'm looking for.

Sorry, this will be a bit long; I hope someone will read it. And sorry for my broken English! :)

 

Let me write about my current design/implementation. I've tried to mix some ideas from UE4 and Unity.

First, here is a list of the classes with small descriptions.

 

Shader:

The lowest-level class in this group. It can be a vertex or fragment/pixel shader, nothing more and nothing less. In the OpenGL implementation, it contains the shader object id and the shader compilation function.
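For reference, a rough sketch of the OpenGL-side compile function (simplified, error reporting trimmed; it assumes the GL headers/loader are already set up):

GLuint CompileShader(GLenum type, const std::string& source)
{
    const GLuint id = glCreateShader(type);   // GL_VERTEX_SHADER or GL_FRAGMENT_SHADER
    const char* src = source.c_str();
    glShaderSource(id, 1, &src, nullptr);
    glCompileShader(id);

    GLint status = GL_FALSE;
    glGetShaderiv(id, GL_COMPILE_STATUS, &status);
    if (status != GL_TRUE)
    {
        char log[1024];
        glGetShaderInfoLog(id, sizeof(log), nullptr, log);
        // report "log" through the engine's logging, then clean up
        glDeleteShader(id);
        return 0;
    }
    return id;
}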

 

Effect:

Slightly higher level; this class is a group of Shaders and can be used to set parameters (uniforms). In the OpenGL implementation, it contains the program object id and the linking function (using the compiled shaders). So this is a single, valid effect which can be used to render objects. This class also parses the full source code and extracts information such as the attributes and uniforms in use; this way a Mesh can be checked to see whether it has enough vertex data (attributes) to be rendered by this effect.
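(As a side note, the used attributes and uniforms could also be queried back from the linked program instead of parsing the text; a rough sketch of that alternative, not what I currently do:)

void EnumerateActiveUniforms(GLuint program)
{
    GLint count = 0;
    glGetProgramiv(program, GL_ACTIVE_UNIFORMS, &count);

    for (GLint i = 0; i < count; ++i)
    {
        char name[256];
        GLsizei length = 0;
        GLint size = 0;
        GLenum type = 0;   // e.g. GL_FLOAT_VEC4, GL_SAMPLER_2D, ...
        glGetActiveUniform(program, static_cast<GLuint>(i), sizeof(name), &length, &size, &type, name);

        const GLint location = glGetUniformLocation(program, name);
        // store { name, type, location } in the Effect's uniform table
        // (glGetActiveAttrib / glGetAttribLocation work the same way for attributes)
    }
}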

 

Material:

This is a mid/high-level class. It could be called EffectVariantCollection, because that is what it is: the shaders can be written with defines, ifdefs and so on, so a single Effect can be compiled into multiple variants. A good example is Unity's Standard shader, which is a big über-shader (an Effect in my case) compiled into multiple variants based on keywords. This is what my Material class does: it can compile an Effect multiple times with different defines added to the beginning of the source code.
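To make it concrete, the variant compilation is basically just prepending the keyword defines to the source before compiling; a rough sketch (the container types are simplified):

std::string BuildVariantSource(const std::string& baseSource, const std::vector<std::string>& keywords)
{
    std::string result;
    for (const std::string& keyword : keywords)
        result += "#define " + keyword + "\n";

    // note: if the source starts with a #version directive, that line must stay first,
    // so the defines would have to be inserted right after it instead
    result += baseSource;
    return result;
}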

 

MaterialInstance:

This is the highest level class; it simply contains a reference to a valid Material and can be attached to Meshes. A MaterialInstance contains a list of keywords which are concatenated with the keywords provided by the renderer. The combined keywords are used to "select" the proper Effect variant, using the referenced Material.

 

So, compared with Unity's terminology:

Unity Shader = Material

Unity Material = MaterialInstance

 

I've written a simple text parser, and this is how I define Materials. The text file is similar to Unity's shader file format.

 

I've just started implementing parameter handling and I'm not sure about some things. Here is the cropped code:

class MaterialParameter
{
public:
    enum Type
    {
        Float = 0,
        Vec2,
        ...
    }; // enum Type

    MaterialParameter(const std::string& name, const Type type);

    const std::string& GetName() const;
    Type GetType() const;

    template <typename T>
    static Type TypeOf(); // specialized for float32, Vector2, etc.

protected:
    const std::string   name;
    const Type          type;
}; // class MaterialParameter

typedef std::shared_ptr<MaterialParameter> MaterialParameterPtr;

template <typename T>
class MaterialParameterValue final : public MaterialParameter
{
public:
    T   value;

    MaterialParameterValue(const std::string& name, const T& defaultValue);

    const T& GetDefaultValue() const;

private:
    const T defaultValue;
}; // class MaterialParameterValue

typedef MaterialParameterValue<float32> MatParamFloat;
typedef MaterialParameterValue<Vector2> MatParamVec2;
// ...

typedef std::shared_ptr<MatParamFloat> MatParamFloatPtr;
typedef std::shared_ptr<MatParamVec2> MatParamVec2Ptr;
// ...

I also have a MaterialParameterCollection class which contains a map<string, parameter> and some functions:

class MaterialParameterCollection
{
public:
    template <typename T>
    std::shared_ptr<T> GetParameter(const std::string& name) const;

    template <typename T>
    bool AddParameter(const std::string& name, const T& defaultValue);

    void ApplyToEffect(const IEffectPtr& effect) const;

private:
    typedef std::map<std::string, MaterialParameterPtr> Parameters;

    Parameters  parameters;

    MaterialParameterPtr GetParameter(const std::string& name, const MaterialParameter::Type type) const;
}; // class MaterialParameterCollection

And the biggest problem is with the ApplyToEffect function. Of course I can iterate over the map, switch on the type, cast and call the set uniform function, like in the current implementation:

void MaterialParameterCollection::ApplyToEffect(const IEffectPtr& effect) const
{
    for (auto it = parameters.begin(); it != parameters.end(); ++it)
    {
        const MaterialParameterPtr& param = it->second;

        IShaderUniformPtr uniform = effect->GetUniform(param->GetName());
        if (!uniform)
            continue;

        switch (param->GetType())
        {
        case MaterialParameter::Float:
            effect->SetUniform(uniform, static_cast<MatParamFloat*>(param.get())->value);
            break;

        case MaterialParameter::Vec2:
            effect->SetUniform(uniform, static_cast<MatParamVec2*>(param.get())->value);
            break;

        // ... and so on for the other types ...
        }
    }
}

The first solution I used was a virtual function in the base MaterialParameter class; the template class just overrode it and called the effect's SetUniform function. However, the reason why I'm using the switch-based approach instead is textures: I cannot set a texture directly (in OpenGL, at least), only a "sampler id" (texture unit index). But a parameter does not know anything about the other parameters, so a TextureParameter does not know which id it should use.
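This is what I mean; a rough sketch of how the collection could hand out the texture units while it iterates, since it can see all the parameters (the Texture2D enum value, MatParamTexture and the exact SetUniform overload are placeholders):

// inside ApplyToEffect
int nextTextureUnit = 0;
for (auto it = parameters.begin(); it != parameters.end(); ++it)
{
    const MaterialParameterPtr& param = it->second;

    IShaderUniformPtr uniform = effect->GetUniform(param->GetName());
    if (!uniform)
        continue;

    if (param->GetType() == MaterialParameter::Texture2D)
    {
        // bind the texture to the next free unit and pass the unit index as the sampler uniform
        glActiveTexture(GL_TEXTURE0 + nextTextureUnit);
        glBindTexture(GL_TEXTURE_2D, static_cast<MatParamTexture*>(param.get())->value->GetGLId());
        effect->SetUniform(uniform, nextTextureUnit);
        ++nextTextureUnit;
    }
    else
    {
        // non-texture types go through the switch shown above
    }
}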

 

- The renderable objects can now only be sorted by effect first, then by textures (but I think that's okay, I would do this anyway).

- Is this an acceptable approach?

- Any tips?

 

Let me know if anyone is interested in more details.

 

Thanks

Edited by csisy


Don't follow Unity's and UE4's exact approach because they're overengineered techs born out of DX9-style rendering which had to evolve and adapt over time.

 

If you design your material system that way, you're going to inherit the same slowness that plagues those engines.

 

There's no need for so many classes.

All you have is:

  1. Shaders. Make a representation that simply encapsulates the file and compiles it according to input parameters.
  2. Materials. A collection of shaders with per-material parameters that affect how the shader will be compiled, what parameters will be passed during draw instead of compile time, and what textures will be bound.
  3. MaterialManager. Aside from creating materials, it's responsible for keeping shared per-pass parameters (such as view matrices, fog parameters) in a different place (i.e. different const buffer). It also is aware of Materials and Renderable objects so that it can match inputs that are per-object during rendering (such as the world matrix, bone matrices in the case of skinning)

That's all you need. Also, stop thinking in terms of parameters; that's a DX9-style thing that nowadays only works well for postprocessing effects and some compute shaders. Start thinking in terms of memory layouts (buffers) and frequency of updates: there are generally going to be three buffers, one updated per pass, one per material (updated when a material stored in that buffer changes), and one updated per object.
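To illustrate the split, the three buffers might look roughly like this (just a sketch; Matrix4/Vector4 are placeholder math types, and the real layouts must respect std140/cbuffer packing rules):

struct PerPassBuffer        // updated once per pass
{
    Matrix4 viewProj;
    Vector4 cameraPosition;
    Vector4 fogParams;
};

struct PerMaterialBuffer    // updated only when a material stored in it changes
{
    Vector4 baseColor;
    float   shininess;
    float   pad[3];         // keep 16-byte alignment
};

struct PerObjectBuffer      // updated per object/draw
{
    Matrix4 world;
    // bone matrices for skinning would go here or in their own buffer
};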


Thanks for the replies! I forgot to mention that I'm currently targeting OGL 2.0 and DX9, where UBOs are not available, which is why I've chosen this ugly, old-school design. :) However, I'm thinking about "upgrading" to OGL >= 3.0 and DX >= 11.0, since I'm using deferred shading in the engine. And if the hardware can handle deferred shading, it probably supports at least OGL 3.0 as well.

 

@Matias Goldberg:

I don't know how slow their system is, but I like the flexibility this approach gives to me. :)

 

@Hodgman:

It seems our designs are similar, I'm just using different names for the classes. And a slower pass and technique selection.

My Material is similar to your Technique. Your Pass is a collection of my Effects. My Effect is a single permutation of your Pass.

 

However, I'm currently using an ugly <string, Effect> map, where the string is generated from the pass name and the given keywords (like "NORMAL_MAPPING_ON"). This isn't a permanent solution; I will change it to something similar to your solution.

// ... when looking for a specific variant ...
const CacheId cacheId = GenCacheId(allKeywords, pass);

// this is ugly, but whatever... it works for now :)
Material::CacheId Material::GenCacheId(const Keywords& keywords, const std::string& pass) const
{
    std::string result = "PASS=" + pass + ";";

    for (const Keyword& keyword : keywords)
        result += keyword + ";";

    return result;
}
Edited by csisy



I support D3D9 under my API by emulating CBuffers on top of it :)

My permutation selection function looks like below.
The algorithm is not very intuitive at first, but it does guarantee that you never deliver a permutation that has options that were not requested, and also guarantees you do deliver the permutation that most closely matches the request, which is exactly what we want.
e.g. imagine someone asks for normal mapping to be enabled, but this technique/material does not support such a feature -- your string lookup will fail in this case, whereas this algorithm simply ignores that invalid request.
It does require you to pre-sort your list of permutations/bindings/effects by the number of bits set in their options mask, from highest to lowest, and to always end the list with a permutation with no bits set (an 'all options disabled' permutation), which will always be a valid fallback result (meaning that the return -1; at the bottom should never be reached).

Unfortunately it's an O(N) search, but N is the number of permutations in the pass, which is usually very small, so that's not really an issue. Your dictionary lookup could in theory be O(1) or O(log N), yet it's likely waay slower than this -- e.g. 8 of these Permutation objects will fit in a single cache line, which is a single RAM transfer, so if you've got a handful of permutations the linear search may as well be O(1) :wink:

You should also do this search once (ahead of time) and reuse the result every frame until your material is edited -- if you follow that advice, it doesn't really matter how expensive the permutation lookup is! :D

int internal::ShaderPackBlob::SelectProgramsIndex( u32 techniqueId, u32 passId, u32 featuresRequested )
{
	eiASSERT( techniqueId < numTechniques );
	Technique&         technique    = techniques[techniqueId];
	List<Pass>&        passes       = *technique.passes;
	if( passId >= passes.count )
		return -1;
	Pass&              pass         = passes[passId];
	List<Permutation>& permutations = *pass.permutations;
	u32 prevPermutationFeatureCount = 64;
	for( u32 k = 0, end = permutations.count; k != end; ++k )
	{
		Permutation& permutation = permutations[k];
		eiASSERT( prevPermutationFeatureCount >= CountBitsSet(permutation.features) );
		prevPermutationFeatureCount = CountBitsSet(permutation.features);//debug code only - check the array is sorted correctly

		if( (featuresRequested & permutation.features) == permutation.features )//if this does not provide features that weren't asked for, we'll take it
		{
			return permutation.bindingIdx;
		}
	}
	return -1;
}
Edited by Hodgman



 

I'll check out how you emulate CBuffers on top of D3D9, thanks! :)

 

It's true that my string lookup fails if someone asks for a feature (like normal mapping) that the technique/material does not support, while your algorithm simply ignores the invalid request. However, for me this was actually a deliberate design choice. Here is the reason:

Let's talk about the deferred renderer. I have a "big" (it's actually not that big :)) shader for rendering objects to the gbuffer. This shader contains many defines; here is the full list:

// Available defines:
// ANIMATED
// REQUIRE_VIEW_DIR
// REQUIRE_SCREEN_POS
// REQUIRE_WORLD_POS
// FRAGMENT_WRITES_NORMAL
// CUSTOM_VERTEX_PROCESSING

Based on these defines, a different variant is compiled.

When a user (actually it's just me :)) defines a new Material (or Technique in your terminology), the code is concatenated with this base source code. So when defining a new Material, the user must provide the right defines if they want to use a feature.

 

Here is an example: I've created a simple normal mapping material which, in this case, must define e.g. FRAGMENT_WRITES_NORMAL. If it's not defined, the shader won't compile, because the function signature will be slightly different (just an "in" instead of "inout" for the normal).

Parameters
{
    Parameter
    {
        Name { "baseColor" }
        Type { "Color" }
        Default { "1.0, 1.0, 1.0, 1.0" }
    }
    Parameter
    {
        Name { "baseTexture" }
        Type { "Texture2D" }
        Default { "Engine/default_diffuse.png" }
    }
    Parameter
    {
        Name { "normalTexture" }
        Type { "Texture2D" }
        Default { "Engine/default_normal.png" }
    }
}
Shader
{
    Queue { Opaque }

    Pragma
    {
        CODE
        #define CUSTOM_VERTEX_PROCESSING
        #define CUSTOM_FRAGMENT_PROCESSING
        #define FRAGMENT_WRITES_NORMAL
        ENDCODE
    }
    
    Pass
    {
        Name { "Deferred" }
        
        Vertex
        {
            CODE
            varying vec2 Texcoord;

            void CustomVertex()
            {
                Texcoord = inTexcoord_0;
            }
            ENDCODE
        }
        
        Fragment
        {
            CODE
            varying vec2 Texcoord;

            uniform vec4 baseColor;
            uniform sampler2D baseTexture;
            uniform sampler2D normalTexture;

            void CustomFragment(inout vec3 oBaseColor, inout vec3 oNormal, inout float oSpecularIntensity, inout float oShininess, inout vec3 oEmission)
            {
                oBaseColor = texture2D(baseTexture, Texcoord).rgb * baseColor.xyz;
                oNormal = UnpackNormal(texture2D(normalTexture, Texcoord).rgb);
            }
            ENDCODE
        }
    }
}

This could be automated, and I think UE4 does something similar under the hood, and Unity as well. If we attach a texture to the "Normal" input of the material, a different variant is compiled. Well, this ugly define-based solution is the same, just... ugly. :)

 

But of course I agree with you: string concatenation and using strings for lookups is not the best solution, and I will probably switch to something similar to your code. Thanks for sharing it!

 

EDIT:

The MaterialInstance can also have defines. This way I can create a big Material (like the Standard shader in Unity) and use the keywords of the MaterialInstance to create a different permutation. So the previous normal mapping material would be:

Parameters
{
    [...]
}
Shader
{
    Queue { Opaque }

    Pragma
    {
        CODE
        #define CUSTOM_VERTEX_PROCESSING
        #define CUSTOM_FRAGMENT_PROCESSING
        #ifdef NORMAL_MAPPING
            #define FRAGMENT_WRITES_NORMAL
        #endif
        ENDCODE
    }
    
    Pass
    {
        [...]
        
        Fragment
        {
            CODE
            varying vec2 Texcoord;

            uniform vec4 baseColor;
            uniform sampler2D baseTexture;
            uniform sampler2D normalTexture;

            void CustomFragment(inout vec3 oBaseColor, inout vec3 oNormal, inout float oSpecularIntensity, inout float oShininess, inout vec3 oEmission)
            {
                oBaseColor = texture2D(baseTexture, Texcoord).rgb * baseColor.xyz;
            #ifdef NORMAL_MAPPING
                oNormal = UnpackNormal(texture2D(normalTexture, Texcoord).rgb);
            #endif
            }
            ENDCODE
        }
    }
}

Then a MaterialInstance is free to add (or not add) a "NORMAL_MAPPING" keyword.

So the defines in the Material control the underlying (big shader) code, and the defines in the MaterialInstance control the Material's code.

 

Using defines for these things was an optimization choice, to generate a faster shader if, e.g., the view direction is not needed in any computation.

Edited by csisy


Okay, let's turn back to UBOs a bit. Let me use the word "uniform" for data passed to the shader.

 

Let's suppose that I switch to OpenGL 3.1 so that UBOs become available. I think I've got the basics:

 

The GPU side:

In the shader, define some blocks, e.g. based on update frequency: a "global block" for camera and ambient uniforms, an "object block" for everything common to objects (like the world matrix), and a "material block" which can be custom for each material.

 

The CPU side:

First of all, I have to create the buffers (like any other buffers, e.g. VBO, IBO, etc.) with the required size, which can be acquired from the program.

Then use global binding indices for each block, e.g. 1 will be the global block's binding index.

Bind the buffers to the binding point/id.

Also bind the program's block to the same binding point.

 

This way, when I update the data in the buffer, it is picked up by the shader automatically and no extra work is required.
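Something like this rough sketch is what I have in mind (block and variable names are just placeholders):

// GLSL side (3.1+):  layout(std140) uniform GlobalBlock { mat4 viewProj; vec4 ambient; };

void SetUpGlobalBlock(GLuint program)
{
    const GLuint globalBindingPoint = 1;   // engine-wide index for the "global block"

    // the required size can be acquired from the program
    const GLuint blockIndex = glGetUniformBlockIndex(program, "GlobalBlock");
    GLint blockSize = 0;
    glGetActiveUniformBlockiv(program, blockIndex, GL_UNIFORM_BLOCK_DATA_SIZE, &blockSize);

    // create the buffer like any other buffer object
    GLuint ubo = 0;
    glGenBuffers(1, &ubo);
    glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    glBufferData(GL_UNIFORM_BUFFER, blockSize, nullptr, GL_DYNAMIC_DRAW);

    // bind the buffer and the program's block to the same binding point
    glBindBufferBase(GL_UNIFORM_BUFFER, globalBindingPoint, ubo);
    glUniformBlockBinding(program, blockIndex, globalBindingPoint);

    // later, whenever the global data changes:
    //   glBindBuffer(GL_UNIFORM_BUFFER, ubo);
    //   glBufferSubData(GL_UNIFORM_BUFFER, 0, blockSize, &globalData);
}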

 

If I got the basics right, here come some questions from me :)

- so the "uniform data" is transmitted for the GPU only when I update the buffer? (like it would if I use VBO)

- if a shader uses only the viewProj matrix of the "global block", the other data in that block is unnecessary - but it's not a big deal if the "transmitting" happens only when the buffer is updated.

- for different light types, should I use different UBOs or pack them together? E.g. the point light has an extra parameter like "range" or "invSqrRadius"; the directional light does not have a position but a direction, and vice versa. I can pack all of these values into a single UBO, or I can split them into a "CommonLight" block and another block specific to the given light type.

- is there any performance hit when binding a program's block to a binding point but the block is not used in the shader?

- is it the shader developer's responsibility to use the same block definition everywhere? I guess I would check for "equality" anyway when a new variant is compiled.
