# DX11 Shader Generation

Migi0027    4628

This is just a question out of curiosity, by the way.

A few weeks ago I was curious about how to render an object multiple times with different shaders to achieve different results. The only problem is that if I try to use additive blending, I get a black object. Now to the point: because of this, someone told me:

"You could also just write a single shader that performs the appropriate computations for the diffuse term, direction lighting term, and reads the color from the read and modulates it by the lighting terms."

Now this is a good idea, but I wanted to give my engine the freedom to render any shaders together (render 1, render 2, etc.), and that way it would be easier! But he is right...

So I thought, why not generate a shader, like:

String Diff  = "color = somecolor"
String Light = "...color += something..."

WholeShader = Buffers + Layout + Vertex + Pixel;

Compile WholeShader

Could something like this be done, for the sake of flexibility, and is this how real material editors work (e.g. the UDK Material Editor)?
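The idea above can be sketched on the CPU side as plain string assembly. This is only a minimal illustration, assuming every snippet modifies a local variable named `color`; the helper `BuildPixelShader` and the snippet contents are hypothetical, and the resulting string would still have to be handed to a shader compiler.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins HLSL snippets into one pixel shader source string.
// Each snippet is assumed to read or write a local variable named `color`.
std::string BuildPixelShader(const std::vector<std::string>& snippets)
{
    std::string source;
    source += "float4 PSMain() : SV_Target0\n";
    source += "{\n";
    source += "    float4 color = float4(0.0f, 0.0f, 0.0f, 1.0f);\n";

    // Snippets are emitted in order, so later ones see earlier results.
    for (const std::string& snippet : snippets)
        source += "    " + snippet + "\n";

    source += "    return color;\n";
    source += "}\n";
    return source;
}
```

Calling `BuildPixelShader` with a diffuse snippet followed by a lighting snippet yields one shader body in which the lighting line operates on the diffuse result, which is the ordering problem that tends to get complicated as more features interact.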

Thank You

MJP    19754

Yes, that is the basic idea behind node-based shader editors like the one in UDK. Once you start implementing it though I think you'll find that it can get pretty hairy, due to complex interactions between the various features that you support. Personally I think it's easier (and better overall) to go with an ubershader approach where you just turn features on and off.

Migi0027    4628

How exactly does this ubershader work (I've heard of it, but never went into it)? Is it a framework?

MJP    19754

"Ubershader" is just a name for a general pattern for authoring shader permutations. Typically you will write one large vertex and pixel shader with all possible features implemented, and then you will wrap the code for each feature in "#ifdef" preprocessor statements. This allows you to compile different permutations of your shader with various features enabled or disabled by passing different macro definitions when compiling the shader. Here's a really simple example:

#ifdef ENABLE_COLOR_TEXTURE
Texture2D ColorTexture;
SamplerState ColorTextureSampler;
#endif

struct PSInput
{
    float4 VertexColor : VTXCOLOR;

#ifdef ENABLE_COLOR_TEXTURE
    float2 UV          : UV0;
#endif
};

float4 PSMain(in PSInput input) : SV_Target0
{
    float4 color = input.VertexColor;

#ifdef ENABLE_COLOR_TEXTURE
    // Modulate the vertex color by the texture only in this permutation.
    color *= ColorTexture.Sample(ColorTextureSampler, input.UV);
#endif

    return color;
}


This is the pixel shader for an ubershader that has 2 permutations: one with a color texture enabled, and one without. So you would compile one where you'd specify that you want ENABLE_COLOR_TEXTURE defined as an extra preprocessor macro, and one where you wouldn't. Then you can switch between the two at runtime depending on what you need for a particular mesh. Alternatively, you can have your features be options when authoring a material, and then compile a shader permutation specifically for that material.

Typically you'll have lots of options that you want to be able to turn on or off, such as lighting, normal mapping, skinning, etc. Supporting all possible permutations of a bunch of on/off options then requires you to compile 2^N shaders, where N is the number of options you want to support. A common way to index into all of these shaders is with a bitfield, where the value of each bit corresponds to a feature being turned on or off. This gives you a simple hash that you can use to look up into a std::map or a similar data structure.
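Here is a rough sketch of the bitfield lookup described above, written as host-side C++. Everything in it is illustrative: the `ShaderFeature` bits, the `ShaderCache` class, and the macro names `ENABLE_LIGHTING` and `ENABLE_NORMAL_MAP` are assumptions (only `ENABLE_COLOR_TEXTURE` and `ENABLE_SKINNING` appear in this thread), and the "compile" step is a stand-in that just records which macros would be defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical feature bits; one bit per on/off option.
enum ShaderFeature : std::uint32_t
{
    Feature_ColorTexture = 1u << 0,
    Feature_Lighting     = 1u << 1,
    Feature_NormalMap    = 1u << 2,
    Feature_Skinning     = 1u << 3,
};

// Stand-in for a compiled shader; a real engine would keep the
// compiled bytecode or a shader object here instead.
struct CompiledShader
{
    std::vector<std::string> defines;
};

// Translates a permutation key into the macros to define when compiling.
std::vector<std::string> DefinesForKey(std::uint32_t key)
{
    std::vector<std::string> defines;
    if (key & Feature_ColorTexture) defines.push_back("ENABLE_COLOR_TEXTURE");
    if (key & Feature_Lighting)     defines.push_back("ENABLE_LIGHTING");
    if (key & Feature_NormalMap)    defines.push_back("ENABLE_NORMAL_MAP");
    if (key & Feature_Skinning)     defines.push_back("ENABLE_SKINNING");
    return defines;
}

class ShaderCache
{
public:
    // Returns the permutation for `key`, creating it on first use.
    const CompiledShader& Get(std::uint32_t key)
    {
        auto it = shaders_.find(key);
        if (it == shaders_.end())
        {
            // A real implementation would invoke the shader compiler here.
            it = shaders_.emplace(key, CompiledShader{DefinesForKey(key)}).first;
        }
        return it->second;
    }

    std::size_t Size() const { return shaders_.size(); }

private:
    std::map<std::uint32_t, CompiledShader> shaders_;
};
```

Compiling lazily like this avoids building all 2^N permutations up front, at the cost of a possible hitch the first time a new combination is requested.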

You can also support enabling or disabling features by wrapping them in regular if statements, and passing a bool through a constant buffer to specify whether you want them on or off. This saves you from having to compile different shaders and switch between them at runtime, which can save you build time and can also potentially save you some CPU time due to driver overhead from switching shaders. However, in general it will result in less optimal compiled shader code, since the compiler will be unable to fully optimize out the code in a branch that's not taken (or any code that produces results required for the branch not taken). This can result in extra instructions being executed, and potentially higher register usage, which reduces thread occupancy. There's also some cost to actually executing a branch instruction, although this is generally small on modern DX11-capable hardware. The one major limitation of branches is that you can't use them to disable vertex shader inputs. Here's a simple example showing what I mean:
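For the constant-buffer route, the CPU side needs a struct that matches what the shader expects. The following is a minimal sketch under assumptions: the `FeatureConstants` buffer and its two flags are hypothetical names, not something from the shaders in this thread. It relies on two layout rules: an HLSL `bool` takes 4 bytes in a constant buffer, and Direct3D 11 constant buffer sizes must be a multiple of 16 bytes.

```cpp
#include <cstdint>

// CPU-side mirror of a hypothetical HLSL constant buffer:
//
//   cbuffer FeatureConstants
//   {
//       bool EnableColorTexture;
//       bool EnableLighting;
//   }
//
// Each flag is stored as a 32-bit integer because an HLSL bool
// occupies 4 bytes, unlike a typical 1-byte C++ bool.
struct FeatureConstants
{
    std::uint32_t EnableColorTexture;
    std::uint32_t EnableLighting;
    std::uint32_t Padding[2]; // pads the struct out to 16 bytes
};

// Constant buffer sizes must be a multiple of 16 bytes.
static_assert(sizeof(FeatureConstants) % 16 == 0,
              "constant buffer size must be a multiple of 16 bytes");

// Fills the struct from plain C++ bools before it is uploaded.
FeatureConstants MakeFeatureConstants(bool colorTexture, bool lighting)
{
    FeatureConstants constants = {};
    constants.EnableColorTexture = colorTexture ? 1u : 0u;
    constants.EnableLighting = lighting ? 1u : 0u;
    return constants;
}
```

Switching a feature then only means updating this buffer, with no shader switch, which is the CPU-side saving mentioned above.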

cbuffer VSConstants
{
    float4x4 World;
    float4x4 WorldViewProj;
}

#ifdef ENABLE_SKINNING
static const uint MaxBones = 256;

cbuffer SkinningConstants
{
    float4x4 Bones[MaxBones];
}
#endif

struct VSInput
{
    float3 Position : POSITION;
    float3 Normal : NORMAL;

#ifdef ENABLE_SKINNING
    uint4 SkinIndices : SKININDICES;
    float4 SkinWeights : SKINWEIGHTS;
#endif
};

struct VSOutput
{
    float4 Position : SV_Position;
    float3 Normal : NORMAL;
};

VSOutput VSMain(in VSInput input)
{
    VSOutput output;

#ifdef ENABLE_SKINNING
    // Blend the position and normal across the four bone influences.
    float3 position = 0.0f;
    float3 normal = 0.0f;

    [unroll]
    for(uint i = 0; i < 4; ++i)
    {
        float4x4 bone = Bones[input.SkinIndices[i]];
        float weight = input.SkinWeights[i];
        position += mul(float4(input.Position, 1.0f), bone).xyz * weight;
        normal += mul(float4(input.Normal, 0.0f), bone).xyz * weight;
    }
#else
    float3 position = input.Position;
    float3 normal = input.Normal;
#endif

    output.Position = mul(float4(position, 1.0f), WorldViewProj);
    output.Normal = mul(float4(normal, 0.0f), World).xyz;

    return output;
}


If you were to try to implement this feature using runtime branching instead of preprocessor macros, you'd have the problem that you wouldn't be able to disable the SkinWeights and SkinIndices vertex inputs. This means that even when skinning is disabled the shader would still expect those elements to be provided in a vertex buffer, so you'd either need to pad out your vertex buffers or pass dummy vertex buffers as a separate stream. It also means that the shader will expect the "Bones" constant buffer to be bound, which means you'll get spammed with warnings from the debug device (although the shader will still work fine as long as you don't actually use anything from that constant buffer).
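To put a number on the padding cost, here is a small sketch that compares CPU-side vertex structs for the two VSInput permutations above. The struct names and the `PaddingBytes` helper are hypothetical, and the sizes assume 4-byte floats and 32-bit indices with no extra compiler padding.

```cpp
#include <cstddef>
#include <cstdint>

// Vertex layout for the permutation compiled without ENABLE_SKINNING.
struct StaticVertex
{
    float Position[3];
    float Normal[3];
};

// Vertex layout for the permutation compiled with ENABLE_SKINNING.
struct SkinnedVertex
{
    float Position[3];
    float Normal[3];
    std::uint32_t SkinIndices[4];
    float SkinWeights[4];
};

// Bytes of dummy skinning data a non-skinned mesh would carry if it
// were forced to use the skinned layout.
constexpr std::size_t PaddingBytes(std::size_t vertexCount)
{
    return vertexCount * (sizeof(SkinnedVertex) - sizeof(StaticVertex));
}
```

Under those assumptions the skinned layout adds 32 bytes per vertex, so padding every static mesh to match it more than doubles the vertex data for those meshes.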
