
[HLSL] Can I improve this somehow?



I need to call this function many times (about 8 or 9) in my shader:

float GetLayer(PixelShaderInput Input)
{
    //return 1;
    switch(Color.x)
    {
        case 0:
            return 1;

        case 1:
            return Tex(TerLayers).r;

        case 2:
            return Tex(TerLayers).g;

        case 3:
            return Tex(TerLayers).b;

        case 4:
            return Tex(TerLayers).a;

        default:
            return 0;
    }
}




And it drops my frame rate from 50 FPS to 30. I already compile my shaders at the highest optimization level, so am I really doing 8-9 texture lookups? "Color" is a global variable set by my application.

Make Color a float4.


Example:

// Color is float4(0, 0, 0, 1)

return Tex(TerLayers) * Color;

// only the .a component of the result is non-zero; it holds Tex(TerLayers).a

Edit: sorry, I just noticed you want to return a float only, but maybe you can make this usable somehow.

Or this quick fix:

float4 result = Tex(TerLayers) * Color;

return result.x + result.y + result.z + result.w; // maybe there's a function for this
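
The sum of the four products in the quick fix is what the dot intrinsic computes, so it can be written as a single call. A minimal sketch, assuming Color has been changed to a float4 mask as suggested; the helper name SelectChannel and its parameters are hypothetical, and `layers` stands for the value returned by Tex(TerLayers):

```hlsl
// Hypothetical helper, not from the original shader.
// 'layers' is the result of Tex(TerLayers), sampled once by the caller.
// 'mask' is Color, set by the application to e.g. float4(0, 0, 0, 1).
float SelectChannel(float4 layers, float4 mask)
{
    // dot = layers.x*mask.x + layers.y*mask.y + layers.z*mask.z + layers.w*mask.w
    return dot(layers, mask);
}
```

With a mask of float4(0, 0, 0, 1) this returns layers.a, and the texture is sampled only once per call site.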

[Edited by - scope on July 16, 2010 6:54:38 PM]

Hm... That would restrict me to 4 layers. Currently I am able to have 6, but I have only implemented 4 of them. Good idea though, I will try that, but not yet. It's 1 AM here and I'm going to bed now. [smile]
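
If the six layers correspond to the six outcomes of the original switch (constant 1, the four channels, constant 0), the mask idea might still cover them by adding a constant term. This is only a sketch under that assumption; LayerMask, LayerBias and GetLayerMasked are hypothetical names that do not appear in the thread, and `layers` is the result of Tex(TerLayers):

```hlsl
// Assumed application-set globals (hypothetical names).
float4 LayerMask; // one-hot channel selector, or all zeros for a constant layer
float  LayerBias; // constant part of the result

// Switch outcome        LayerMask            LayerBias
// case 0  (return 1)    float4(0, 0, 0, 0)   1
// case 1  (.r)          float4(1, 0, 0, 0)   0
// case 2  (.g)          float4(0, 1, 0, 0)   0
// case 3  (.b)          float4(0, 0, 1, 0)   0
// case 4  (.a)          float4(0, 0, 0, 1)   0
// default (return 0)    float4(0, 0, 0, 0)   0
float GetLayerMasked(float4 layers)
{
    return dot(layers, LayerMask) + LayerBias;
}
```

The selection then happens on the CPU when the two constants are set, and the shader does one dot and one add.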

Maybe this snippet can help you:


//...
float retColor[4] = Tex(TerLayers);

if (Color.x) //check it's non-zero
    return retColor[Color.x - 1];
else
    return 1;
//...


If you can change Color, then how about

float4 mask = float4( 0, 1, 0, 0 );
...
float4 layer = Tex(TerLayers);
return any( mask ) ? layer.r * mask.r + layer.g * mask.g + layer.b * mask.b + layer.a * mask.a : 1;

This way you can also blend: for example, float4( 0.5, 0, 0.5, 0 ) gives 50% layer.r and 50% layer.b.

Currently I use this piece of code to blend between textures in my terrain. It allows me to use 5 layers: 1 base layer and 4 layers determined by the RGBA channels of the weights texture.


float4 weights = WeightsAt(IN.Pos3D.xz);
float totalWeight = weights.x + weights.y + weights.z + weights.w;
if(totalWeight > 1)
{
    weights /= totalWeight;
    totalWeight = 1;
}

OUT.ColourSpec.rgb = col1 * weights.x + col2 * weights.y + col3 * weights.z + col4 * weights.w + col0 * (1 - totalWeight);
OUT.NormalHard.rgb = norm1 * weights.x + norm2 * weights.y + norm3 * weights.z + norm4 * weights.w + norm0 * (1 - totalWeight);
OUT.ColourSpec.a = specAmount1 * weights.x + specAmount2 * weights.y + specAmount3 * weights.z + specAmount4 * weights.w + specAmount0 * (1 - totalWeight);
OUT.NormalHard.a = specHard1 * weights.x + specHard2 * weights.y + specHard3 * weights.z + specHard4 * weights.w + specHard0 * (1 - totalWeight);
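
The weight normalization at the top of this snippet could also be written without the if, in case that is useful. A rough sketch, where NormalizeWeights and baseWeight are hypothetical names and `weights` is the float4 read from the weights texture:

```hlsl
// Hypothetical helper, not part of the code above.
// Returns the weights scaled so they sum to at most 1, and writes the
// share left over for the base layer to 'baseWeight'.
float4 NormalizeWeights(float4 weights, out float baseWeight)
{
    float total = dot(weights, float4(1.0, 1.0, 1.0, 1.0)); // x + y + z + w
    weights /= max(total, 1.0);          // divides by 1 unless total > 1
    baseWeight = 1.0 - min(total, 1.0);  // same value as (1 - totalWeight) above
    return weights;
}
```

When the total is 1 or less the weights pass through unchanged and baseWeight is the remainder; when it is above 1 the weights are divided by the total and baseWeight is 0, which matches what the branch does.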


@scope: I wonder whether that is still fast with all those adds.
@programci_84: This is nice, but it fails at compile time:
Quote:

TerrainPixelShader.fxh(28,9): error X3017: cannot convert from 'float4' to 'float[4]'


I wonder if I can do this conversion somehow. That seems to fit my needs really well [smile]
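
On the conversion: HLSL vectors can be indexed with [] directly, so the float[4] copy may not be needed at all. A minimal sketch, assuming Color.x holds a whole number from 0 to 4; GetLayerIndexed and its parameters are hypothetical names, and `layers` is the result of Tex(TerLayers). How a variable index is compiled (true indexing or a chain of compares) depends on the target profile, so it is worth checking the generated assembly:

```hlsl
// Hypothetical rewrite of the snippet above.
// 'layers' is the result of Tex(TerLayers), 'layerIndex' is Color.x.
float GetLayerIndexed(float4 layers, float layerIndex)
{
    int i = (int)layerIndex; // vector indices must be integers
    if (i == 0)
        return 1;
    return layers[i - 1];    // valid only for i in 1..4
}
```

Unlike the original switch, this does not return 0 for values above 4, so the application has to keep Color.x in range.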

@Promethium, Darg: I don't have all my layers in that one shader. So I can't do this [sad]

What I need is indeed a multi-pass terrain renderer. That all happens in screenspace if that helps...

Generally if you're new to shaders, and you're using if or switch, then it can probably be optimized ;)
Rule 1 of shaders is don't branch unless it's really necessary, or you can prove it's an optimization.

I haven't tested this, but it should be equivalent to your original GetLayer function (assuming Color.x is a float and is never negative), with the nasty branching replaced by 2 steps, 2 multiplies, 2 dots, some swizzling and an add.
float GetLayer(PixelShaderInput Input)
{
    //generate all the possible results
    float4 result1234 = Tex(TerLayers);
    float2 result05 = float2(1.0, 0.0);

    //generate a 0/1 value for each case to say whether it can be true
    float4 case1234 = step( float4(1.0, 2.0, 3.0, 4.0), Color.x );
    float2 case05 = step( float2(0.0, 4.004), Color.x );

    //at this point more than one case may be true, because 'step' does >= instead of ==, so let's fix that.
    //1 can't be true if 2 is true
    //2 can't be true if 3 is true
    //3 can't be true if 4 is true
    //4 can't be true if 5 is true
    float4 not1234 = float4( case1234.yzw, case05.y );
    //0 can't be true if 1 is true
    //5 can always be true
    float2 not05 = float2( case1234.x, 0.0 );
    //AND the conditions with their 'nots' (component-wise multiply, so the results stay vectors)
    case1234 = case1234 * (1.0 - not1234);
    case05 = case05 * (1.0 - not05);
    //Now, case1234 and case05 should be all 0's with a single 1 somewhere in them.
    //multiply the conditions (0/1's) with their associated return values
    return dot( case1234, result1234 ) + dot( case05, result05 );
}
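
A shorter variant of the same one-hot idea, offered only as a sketch: if Color.x always holds a whole number, the distance to each case value can be turned into a 0/1 weight with abs and saturate. GetLayerOneHot and its parameters are hypothetical names, and `layers` is the result of Tex(TerLayers):

```hlsl
// Hypothetical alternative to the step-based version above.
// 'layers' is the result of Tex(TerLayers), 'index' is Color.x.
float GetLayerOneHot(float4 layers, float index)
{
    // 1 where index equals the channel number, 0 where it is 1 or more away
    float4 channelWeight = saturate(1.0 - abs(index - float4(1.0, 2.0, 3.0, 4.0)));
    // 1 only when index is 0, which is the "return 1" case
    float baseWeight = saturate(1.0 - abs(index));
    return dot(layers, channelWeight) + baseWeight;
}
```

For index 0 this gives 1, for 1 to 4 it gives the matching channel, and for 5 or more it gives 0. For fractional values it blends between neighbouring cases, which differs from the switch, so it only matches when the application sets whole numbers.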


[Edited by - Hodgman on July 17, 2010 9:39:09 AM]
