# [HLSL] Can I improve this somehow?

Here, I need to call this function many times (About 8 or 9) in my shader:

    float GetLayer(PixelShaderInput Input)
    {
        //return 1;
        switch(Color.x)
        {
            case 0:
                return 1;

            case 1:
                return Tex(TerLayers).r;

            case 2:
                return Tex(TerLayers).g;

            case 3:
                return Tex(TerLayers).b;

            case 4:
                return Tex(TerLayers).a;

            default:
                return 0;
        }
    }

And it kicks my FPS from 50 down to 30. I also compile my shaders at the highest optimization level, so do I really do 8-9 texture lookups? "Color" is a global variable set by my application.

##### Share on other sites
make color a float4

example:

    // Color is float4(0, 0, 0, 1)
    return Tex(TerLayers) * Color;
    // returns Tex(TerLayers).a;

edit: sorry, just noticed you want to return a float only. but maybe you can somehow make it usable.

Or this quick fix:

    float4 result = Tex(TerLayers) * Color;
    return result.x + result.y + result.z + result.a; // maybe there's a function for this
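On the "maybe there's a function for this" remark: the multiply-then-sum is what the `dot` intrinsic computes. A minimal sketch, assuming the texture has already been sampled; `SelectLayer`, `layers` and `mask` are hypothetical names, not from the original shader:

```hlsl
// Hypothetical helper, not part of the original shader.
// 'layers' stands for the already-sampled Tex(TerLayers) value,
// 'mask' is a one-hot float4 such as float4(0, 0, 0, 1).
float SelectLayer(float4 layers, float4 mask)
{
    // dot() multiplies component-wise and sums the products:
    // layers.x*mask.x + layers.y*mask.y + layers.z*mask.z + layers.w*mask.w
    return dot(layers, mask);
}
```

With `mask = float4(0, 0, 0, 1)` this yields `layers.a`, the same value as the four-term sum above.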

[Edited by - scope on July 16, 2010 6:54:38 PM]

##### Share on other sites
Hm... That would restrict me to 4 layers. Currently I am able to have 6, but I have only implemented 4 of them. Good idea though, I will try that, but not yet. It's 1 AM here and I'll go to bed now. [smile]

##### Share on other sites
maybe describe what you'd like to achieve. is it some sort of multipass terrain texturing thing?

##### Share on other sites

    //...
    float retColor[4] = Tex(TerLayers);
    if (Color.x) //check it's non-zero
        return retColor[Color.x - 1];
    else
        return 1;
    //...

##### Share on other sites
If you can change Color, then how about

    float4 mask = float4( 0, 1, 0, 0 );
    ...
    float4 layer = Tex(TerLayers);
    return any( mask ) ? layer.r * mask.r + layer.g * mask.g + layer.b * mask.b + layer.a * mask.a : 1;

This way you can also blend, for example float4( 0.5, 0, 0.5, 0 ) gives 50% layer.r and 50% layer.b

##### Share on other sites
Currently I use this piece of code to blend between textures in my terrain. It allows me use 5 layers, 1 base layer and 4 layers determined by the rgba channels of the weights texture.

    float4 weights = WeightsAt(IN.Pos3D.xz);
    float totalWeight = weights.x + weights.y + weights.z + weights.w;
    if(totalWeight > 1)
    {
        weights /= totalWeight;
        totalWeight = 1;
    }
    OUT.ColourSpec.rgb = col1 * weights.x + col2 * weights.y + col3 * weights.z + col4 * weights.w + col0 * (1 - totalWeight);
    OUT.NormalHard.rgb = norm1 * weights.x + norm2 * weights.y + norm3 * weights.z + norm4 * weights.w + norm0 * (1 - totalWeight);
    OUT.ColourSpec.a = specAmount1 * weights.x + specAmount2 * weights.y + specAmount3 * weights.z + specAmount4 * weights.w + specAmount0 * (1 - totalWeight);
    OUT.NormalHard.a = specHard1 * weights.x + specHard2 * weights.y + specHard3 * weights.z + specHard4 * weights.w + specHard0 * (1 - totalWeight);
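The normalization step in that snippet can also be written without the `if`. This is only a sketch under the same assumptions (four non-negative layer weights plus a base layer); `NormalizeLayerWeights` and `baseWeight` are hypothetical names:

```hlsl
// Hypothetical helper, not part of the original shader.
// Scales the four layer weights so they never sum to more than 1 and
// reports how much weight is left over for the base layer.
float4 NormalizeLayerWeights(float4 weights, out float baseWeight)
{
    // Sum of the four components.
    float total = dot(weights, float4(1.0, 1.0, 1.0, 1.0));

    // Dividing by max(total, 1) only rescales when total > 1,
    // which matches the 'if(totalWeight > 1)' branch above.
    weights /= max(total, 1.0);

    // After rescaling the weights sum to min(total, 1).
    baseWeight = 1.0 - min(total, 1.0);
    return weights;
}
```

For example, weights of (0.8, 0.6, 0, 0) sum to 1.4, so they are scaled to roughly (0.57, 0.43, 0, 0) and the base layer gets a weight of 0.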

##### Share on other sites
@scope: I wonder if that is still fast if I do all those adds
@programci_84: This is a nice thing, but it seems to fail at compilation time:
Quote:
 TerrainPixelShader.fxh(28,9): error X3017: cannot convert from 'float4' to 'float[4]'

I wonder if I can do this conversion somehow. That seems to fit my needs really well [smile]
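On the conversion question: as far as I know, a `float4` can be indexed with `[]` directly in HLSL, so the intermediate `float[4]` array may not be needed at all. A minimal, untested sketch, assuming the selector only ever holds 0 to 4; `PickChannel`, `layers` and `selector` are hypothetical names, and how well a dynamic index performs depends on the shader model and compiler:

```hlsl
// Hypothetical helper, not part of the original shader.
// 'layers' stands for the sampled Tex(TerLayers) value,
// 'selector' for Color.x (assumed to be 0, 1, 2, 3 or 4).
float PickChannel(float4 layers, float selector)
{
    // 0 means "no layer channel": full weight, like case 0 in the switch.
    if (selector < 0.5)
        return 1.0;

    // Map 1..4 to component index 0..3 (r, g, b, a).
    int index = (int)selector - 1;
    return layers[index];
}
```

Unlike the original `switch`, this sketch has no `default` case, so selectors above 4 are not handled.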

@Promethium, Darg: I don't have all my layers in that one shader. So I can't do this [sad]

What I need is indeed a multi-pass terrain renderer. That all happens in screenspace if that helps...

##### Share on other sites
Generally if you're new to shaders, and you're using if or switch, then it can probably be optimized ;)
Rule 1 of shaders is don't branch unless it's really necessary, or you can prove it's an optimization.

I haven't tested this, but it should be equivalent to your original GetLayer function (assuming Color.x is a float and is never negative), but with the nasty branching replaced with 2 steps, 2 dots, a couple of multiplies, some swizzling and an add.

    float GetLayer(PixelShaderInput Input)
    {
        //generate all the possible results
        float4 result1234 = Tex(TerLayers);
        float2 result05 = float2(1.0, 0.0);
        //generate a 0/1 value for each case to say whether it can be true
        float4 case1234 = step( float4(1.0, 2.0, 3.0, 4.0), Color.x );
        float2 case05   = step( float2(0.0, 4.004),         Color.x );
        //at this point more than one case may be true, because 'step' does >= instead of ==, so lets fix that.
        //1 can't be true if 2 is true
        //2 can't be true if 3 is true
        //3 can't be true if 4 is true
        //4 can't be true if 5 is true
        float4 not1234 = float4( case1234.yzw, case05.y );
        //0 can't be true if 1 is true
        //5 can always be true
        float2 not05   = float2( case1234.x, 0.0 );
        //AND the conditions with their 'nots' (component-wise multiply, so each stays a vector of 0/1's)
        case1234 = case1234 * (1.0-not1234);
        case05   = case05   * (1.0-not05);
        //Now, case1234 and case05 should be all 0's with a single 1 somewhere in them.
        //multiply the conditions (0/1's) with their associated return values
        return dot( case1234, result1234 ) + dot( case05, result05 );
    }

[Edited by - Hodgman on July 17, 2010 9:39:09 AM]
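The same "turn the selector into a vector with a single 1 in it" idea can be written more compactly by subtracting two `step` results. This is an untested sketch rather than the code above; `GetLayerWeight`, `layers` and `selector` are hypothetical names, with `layers` standing for the sampled `Tex(TerLayers)` value and `selector` for `Color.x`:

```hlsl
// Hypothetical helper, not part of the original shader.
float GetLayerWeight(float4 layers, float selector)
{
    const float4 lower = float4(1.0, 2.0, 3.0, 4.0);

    // step(y, x) is 1 when x >= y, else 0. The difference is 1 only
    // where lower <= selector < lower + 1, so at most one component is 1.
    // Example: selector = 2 gives (1,1,0,0) - (1,0,0,0) = (0,1,0,0).
    float4 oneHot = step(lower, selector) - step(lower + 1.0, selector);

    // 1 only when 0 <= selector < 1, i.e. "case 0" of the original switch.
    float isBase = step(0.0, selector) - step(1.0, selector);

    // Pick the selected channel, or 1 for the base case, or 0 otherwise.
    return dot(oneHot, layers) + isBase;
}
```

For a selector of 5 or more, both `step` results are all ones, so everything cancels and the function returns 0, matching the `default` case.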

##### Share on other sites
I have no idea why this works, but it works, thank you! [smile]
