Dynamic branching in shader not working. Keeps jumping out.

9 comments, last by Zoner 11 years, 9 months ago
I'm trying to implement branching inside my uber shader to support different material types with deferred shading, but I keep running into problems with the branch.

Now this is where it seems to cause the problem:


float4 PS_Directional(VSO input) : SV_TARGET
{
    float4 output = float4(0.0f, 0.0f, 0.0f, 1.0f);

    // Get Material ID
    float materialID = tex2D(AlbedoBuffer, input.UV).a;

    // Get Normals
    half4 encodedNormal = tex2D(NormalBuffer, input.UV);
    half3 Normal = mul(normalize(decode(encodedNormal)), inverseView);

    if(materialID == 0.02f)
    {
        output = 1.0f;
    }

    return output;
}



As you can see, I'm doing an if on the materialID to set a specific pixel color for each object's material.
When I run this I just get a black screen, which is the value output is initialized to at the beginning.
Running this in PIX, I can see it going inside the if statement, setting output to the correct color, and then going back to the if statement and returning 0.0f again.
What is going on here? Why does it set the correct color and then jump back to undo it?
First of all, try to avoid doing dynamic branching like this, and look into less naive approaches for rendering multiple materials in a deferred setup. This thread has some good ideas for such approaches: http://www.gamedev.n...erred-renderer/
Have a look at Hodgman's post about using a material mask to reduce branching to an absolute minimum (I've built a variant of this technique myself which does deferred rendering with 3 different BRDFs).

Second, I believe your issue here has to do with floating-point error. Because of the way floating-point values are stored, you can never assume that a comparison like 'materialID == 0.02f' will return true: the value you actually read back will probably be slightly different from 0.02f, for example a little lower.
Always compare floating-point numbers against a small delta range instead of testing for exact equality.
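
To make the delta-range advice concrete, here is a minimal CPU-side C++ sketch (not shader code). It assumes the material ID is stored in an 8-bit normalized channel, which the thread does not confirm. The helper names `encodeUnorm8`, `decodeUnorm8` and `nearlyEqual` are made up for illustration.

```cpp
#include <cmath>
#include <cstdint>

// ASSUMPTION: the alpha channel holding the material ID is 8-bit normalized,
// so a written value is rounded to the nearest multiple of 1/255.
std::uint8_t encodeUnorm8(float value) {
    const float clamped = std::fmin(std::fmax(value, 0.0f), 1.0f);
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}

// What a sampler would hand back for a stored 8-bit value.
float decodeUnorm8(std::uint8_t stored) {
    return static_cast<float>(stored) / 255.0f;
}

// Hypothetical helper: equality within a small delta range.
bool nearlyEqual(float a, float b, float epsilon) {
    return std::fabs(a - b) < epsilon;
}
```

Under that assumption, 0.02f is stored as 5 and read back as 5/255 (about 0.0196), so an exact `==` against 0.02f fails, while `nearlyEqual(readBack, 0.02f, 0.5f / 255.0f)` succeeds. Half a quantization step is one possible choice of epsilon, not a rule.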

I gets all your texture budgets!

Radikalizm is most likely correct.

An easy way to find out would be:
float materialID = tex2D(AlbedoBuffer, input.UV).a;
double d = materialID;

Check the value of 'd'. It'll be something like 0.019999999552965164
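
The same check can be tried on the CPU. This is only a sketch of the widening idea from the snippet above, written in C++ rather than HLSL. `widen` and `printWidened` are hypothetical names.

```cpp
#include <cstdio>

// Widening a float to double keeps its exact value, so the extra digits
// reveal what the 32-bit float really holds.
double widen(float value) {
    return static_cast<double>(value);
}

// Hypothetical helper: prints the literal 0.02f with 18 decimal places.
void printWidened() {
    std::printf("%.18f\n", widen(0.02f));
}
```

The widened value is slightly below the decimal 0.02, which is the representation error being described. A value such as 0.5f, which is exactly representable, comes back unchanged.
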
I'm not so sure about the advice to avoid dynamic branching at all costs here. Not so long ago I would have been, and would have given much the same advice, but times change and more info comes in.

The key thing is that the HLSL code you write does not necessarily need to bear any relationship to the code generated by the shader compiler. It's incredibly instructive to look at the generated asm code in PIX or another shader debugging tool - you may find that the HLSL code with branching has been converted into code that has no branching whatsoever. The D3D shader compiler is extremely good at doing this, and the whole experience can be quite an eye-opener (not to mention changing your attitude towards shader branching).
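
As a rough illustration of what "converted into code that has no branching" can mean, here is a C++ sketch that writes the same logic twice: once with an if, and once as an arithmetic select. It is an analogy only. What the shader compiler really emits for a given shader can only be confirmed by reading the disassembly, and the 0.002f tolerance and both function names are assumptions made for the example.

```cpp
#include <cmath>

// Branching form, mirroring the structure of the pixel shader in the question.
float shadeBranching(float materialID) {
    float output = 0.0f;
    if (std::fabs(materialID - 0.02f) < 0.002f) {
        output = 1.0f;
    }
    return output;
}

// Flattened form: both candidate results exist up front and a 0/1 mask
// selects between them arithmetically, so the source contains no jump.
float shadeFlattened(float materialID) {
    const float whenFalse = 0.0f;
    const float whenTrue = 1.0f;
    const float mask =
        static_cast<float>(std::fabs(materialID - 0.02f) < 0.002f);
    return whenFalse + mask * (whenTrue - whenFalse);
}
```

Both functions return the same result for every input. The flattened one always pays for both sides, which is cheap here because each side is a constant.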

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

I see that is quite an interesting solution there @Radikalizm.
On topic: You are right. I just tried dividing the materialID by 10 when writing it and multiplying it by 10 again after reading it back, and it suddenly works.

I'm not so sure about the advice to avoid dynamic branching at all costs here. Not so long ago I would have been, and would have given much the same advice, but times change and more info comes in.

The key thing is that the HLSL code you write does not necessarily need to bear any relationship to the code generated by the shader compiler. It's incredibly instructive to look at the generated asm code in PIX or another shader debugging tool - you may find that the HLSL code with branching has been converted into code that has no branching whatsoever. The D3D shader compiler is extremely good at doing this, and the whole experience can be quite an eye-opener (not to mention changing your attitude towards shader branching).


In this simple case, with just a small number of branches, I agree with you that it won't really have an impact (on more modern GPUs, that is). However, in a setup that supports a lot of materials it will become a problem. I believe pixel shaders get executed in some sort of thread groups (terminology?), and as long as all the pixels in such a group take the same path through the branch there won't be much of a problem. It does become a problem when multiple paths are taken, since the pixel shader will run through all the options and pick the right result afterwards for every pixel in that group (please do correct me if I'm wrong here, my knowledge of the matter isn't completely solid).

IMO it's still best to avoid branching. Some small branches here and there won't hurt, but for things like lighting calculations with a larger number of branch options I believe it's best to find an alternative.
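
The group behaviour described above can be put into a toy cost model. This C++ sketch is not how any particular GPU schedules work. It only encodes the stated assumption that a group pays for every distinct path taken by at least one of its pixels. `groupCost` and the per-material costs are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Toy model: each lane (pixel) in a group picks a material, and the group
// pays the cost of every distinct material path that any lane needs.
int groupCost(const std::vector<int>& materialPerLane,
              const std::vector<int>& costPerMaterial) {
    const std::set<int> pathsTaken(materialPerLane.begin(),
                                   materialPerLane.end());
    int total = 0;
    for (int material : pathsTaken) {
        total += costPerMaterial.at(static_cast<std::size_t>(material));
    }
    return total;
}
```

With made-up costs of 10, 40 and 25 for three materials, a group whose pixels all use material 1 costs 40, while a group that mixes all three costs 75. That gap is why a few coherent branches are usually fine and many divergent ones are not.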

I gets all your texture budgets!

Ugh, now I'm having a new problem.

Using code like this:

if(materialID < 2.1f && materialID > 1.9f)
{
    return float4(1.0f, 0.0f, 0.0f, 1.0f);
}
else if(materialID < 1.1f && materialID > 0.9f)
{
    return 1.0f;
}


gives me the object in the right color (red) but with a white border...

I guess it comes back to the same problem with precision, right?
What value range should I use to define this more clearly for the pixel shader?
PIX tells me the pixels in the white border use a materialID of 1.0.
Could you maybe provide a screenshot of the problem? It could be a problem with texture filtering when sampling your texture for the material ID, or maybe your material ID isn't being written to your G-buffer correctly.

I gets all your texture budgets!

Looks like this: http://cl.ly/082u1i2O3s3X43391x0W

I think it's the overlapping materialID values that cause this.
Using a greater distance between one ID and the next, e.g. 0-1 (ID 1) and 2-3 (ID 2), seems to work.

edit: You might be right. I hadn't considered that linear filtering might interpolate those materialID values quite a bit.

Update: It was indeed the texture filtering that was causing this issue. Sampling the material ID with point clamp instead of linear filtering works perfectly.
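
For anyone hitting the same white border, here is a small C++ sketch of why filtering matters. It assumes the texel next to the red object holds a material ID of 0.0 (for example background), which the thread does not state. `sampleLinear`, `samplePoint` and `classify` are hypothetical helpers, and the ranges in `classify` are the ones from the shader code above.

```cpp
// Linear filtering blends the two neighbouring texels.
float sampleLinear(float texelA, float texelB, float t) {
    return texelA + (texelB - texelA) * t;
}

// Point sampling returns the nearest texel unchanged.
float samplePoint(float texelA, float texelB, float t) {
    return t < 0.5f ? texelA : texelB;
}

// Same ranges as the shader: 2 is the red material, 1 is the white
// one, and 0 means no material matched.
int classify(float materialID) {
    if (materialID > 1.9f && materialID < 2.1f) return 2;
    if (materialID > 0.9f && materialID < 1.1f) return 1;
    return 0;
}
```

Halfway across the edge, linear filtering turns IDs 2.0 and 0.0 into 1.0, which falls into the white material's range and matches the value PIX reported for the border pixels. Point sampling only ever returns 2.0 or 0.0, so no in-between ID can appear.
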
I think conditional branching in shaders only incurs a performance hit if the SIMD instances in a working set take different branches. If they do, you get a SIMD 'stall': the instances that share one branch execute first as the active working set, then the set taking the other branch executes, and eventually they 'sync' up after all branch sets have run and once again grind away efficiently as SIMD across the entire working set.

But just because you take a hit doesn't mean you can't do it. Just be aware there is a hit. Instrument and measure the hit, compare normal cases with extremes. It might not be so bad, that totally depends on the logic.

This issue becomes more front and center with OpenCL, but it also applies to shaders.

This topic is closed to new replies.
