Texture Blending Shaders

19 comments, last by Plerion 14 years, 4 months ago
Hello! After using blending on the device directly

m_3dMgr->SetTexture(m_textures, 0);
// Stage 0, color: output the texture color (SELECTARG1 uses COLORARG1; COLORARG2 is ignored)
dev->SetTextureStageState(0, D3DTSS_COLOROP, D3DTOP_SELECTARG1);
dev->SetTextureStageState(0, D3DTSS_COLORARG1, D3DTA_TEXTURE);
dev->SetTextureStageState(0, D3DTSS_COLORARG2, D3DTA_TEXTURE);

dev->SetTexture(1, m_alphaMaps);
// Stage 1, color: pass through the result of stage 0 (SELECTARG2 = D3DTA_CURRENT)
dev->SetTextureStageState(1, D3DTSS_COLOROP, D3DTOP_SELECTARG2);
dev->SetTextureStageState(1, D3DTSS_COLORARG1, D3DTA_TEXTURE);
dev->SetTextureStageState(1, D3DTSS_COLORARG2, D3DTA_CURRENT);

// Stage 1, alpha: output the alpha of the alpha map (SELECTARG2 = D3DTA_TEXTURE)
dev->SetTextureStageState(1, D3DTSS_ALPHAOP, D3DTOP_SELECTARG2);
dev->SetTextureStageState(1, D3DTSS_ALPHAARG1, D3DTA_CURRENT);
dev->SetTextureStageState(1, D3DTSS_ALPHAARG2, D3DTA_TEXTURE);
and seeing the resulting low performance, I decided to switch to a pixel shader, as some people suggested. The shader now looks as follows:

sampler2D g_samSrcColor;

texture alpha2Tex;
texture stage2Tex;

texture alpha3Tex;
texture stage3Tex;

texture alpha4Tex;
texture stage4Tex;


sampler Sampler1_1 = sampler_state { texture = <alpha2Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};
sampler Sampler1_2 = sampler_state { texture = <stage2Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};

sampler Sampler2_1 = sampler_state { texture = <alpha3Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};
sampler Sampler2_2 = sampler_state { texture = <stage3Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};

sampler Sampler3_1 = sampler_state { texture = <alpha4Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};
sampler Sampler3_2 = sampler_state { texture = <stage4Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};


float4 MyShader( float2 Tex : TEXCOORD0, float2 Tex2 : TEXCOORD1 ) : COLOR0
{
    float4 Color;
    float4 aColor;
    float4 stage;
    float4 aColor2;
    float4 stage2;
    float4 aColor3;
    float4 stage3;

    stage = tex2D(Sampler1_2, Tex.xy);
    aColor = tex2D(Sampler1_1, Tex2.xy);
    aColor2 = tex2D(Sampler2_1, Tex2.xy);
    stage2 = tex2D(Sampler2_2, Tex.xy);
    aColor3 = tex2D(Sampler3_1, Tex2.xy);
    stage3 = tex2D(Sampler3_2, Tex.xy);
    Color = tex2D( g_samSrcColor, Tex.xy);
    aColor.r = aColor2.r = aColor3.r = aColor.g = aColor2.g = aColor3.g = aColor.b = aColor2.b = aColor3.b = 1.0;
    
    float4 oneminus1 = 1.0 - aColor.a;
    float4 oneminus2 = 1.0 - aColor2.a;
    float4 oneminus3 = 1.0 - aColor3.a;

    float4 r1 = aColor.a * stage + oneminus1 * Color;
    float4 r2 = aColor2.a * stage2 + oneminus2 * r1;
    float4 ret = aColor3.a * stage3 + oneminus3 * r2;
    return ret;
}


technique EntryPoint
{
    pass p1
    {
        VertexShader = null;
        PixelShader = compile ps_2_0 MyShader();
    }

}
This is the very first shader I have ever written. I don't know if it's perfect, but it works! There is one problem, though: performance didn't change at all. It's still very slow when blending a lot of terrain. What could be the reason for that? Is it the shader, or do I have to look somewhere else?

Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:36:33 AM]
There's absolutely no reason for it to be faster - you've just implemented what D3D was doing internally with the fixed function pipeline anyway.

You're probably limited by texture bandwidth; try making your textures smaller and see if that speeds things up. Or, better - use a GPU profiler like NVPerfHUD.
OK, I will have a look at GPU profiling as soon as possible. What do you mean by decreasing the size of the texture?
Quote:Original post by Plerion
OK, I will have a look at GPU profiling as soon as possible. What do you mean by decreasing the size of the texture?
If you have a 1024x1024, 32-bit texture, then each row of texels is 4KB of memory apart. That means that for the shader to do bilinear filtering, it has to read data that's at least 4KB apart. That means multiple memory accesses, which could potentially be a bottleneck.
If you make your textures 256x256 instead, then rows of texels are only 1KB apart, which means less of a potential memory hit.
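The row distances quoted above can be checked with a few lines of C++. This is only a sketch of the arithmetic: the helper name rowPitchBytes is made up for illustration, and it assumes a plain row-major layout with no padding, which real GPUs often replace with tiled or swizzled layouts.

```cpp
#include <cstddef>

// Bytes between two vertically adjacent texels.
// Assumption: plain row-major storage without padding; actual GPU memory
// layouts are often tiled or swizzled, so treat this as a rough model only.
constexpr std::size_t rowPitchBytes(std::size_t widthInTexels,
                                    std::size_t bitsPerTexel)
{
    return widthInTexels * bitsPerTexel / 8;
}

// 1024 texels * 4 bytes = 4096 bytes (4 KB) per row.
static_assert(rowPitchBytes(1024, 32) == 4096, "1024x1024, 32-bit");
// 256 texels * 4 bytes = 1024 bytes (1 KB) per row.
static_assert(rowPitchBytes(256, 32) == 1024, "256x256, 32-bit");
```

Under this simplified model, a bilinear fetch touches two rows, so the two reads are one row pitch apart.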
OK, I think that's not the problem. I checked the textures, and none of them is larger than 256 in its biggest mip level.

I'll check the GPU profiler; maybe it's not a problem with the renderer but with the way I handle objects.

What kind of hardware are you working on? I don't know exactly how you blend the different textures together, but one trick you can use is to pack the blend values into a single RGBA texture. I can see in your code that you are using only the alpha channel of the alpha textures.

With one packed alpha texture you may use up to 5 different terrain textures. If you need more, then you can use another texture and so on. Of course, this system doesn't scale well.

Here is some pseudo code to present the idea of using the packed texture with multiple alpha values. With this technique, you need 6 texture reads for 5 terrain textures.

// One RGBA texel carries the four blend weights for textures 2 to 5
float4 alphatexture = tex2D(sampler_alpha, i.xy);
float3 texture1 = tex2D(sampler_1, i.xy).rgb;
float3 texture2 = tex2D(sampler_2, i.xy).rgb;
float3 texture3 = tex2D(sampler_3, i.xy).rgb;
float3 texture4 = tex2D(sampler_4, i.xy).rgb;
float3 texture5 = tex2D(sampler_5, i.xy).rgb;

// Start from the base texture and blend each further layer on top
float3 color = texture1;
color = lerp(color, texture2, alphatexture.a);
color = lerp(color, texture3, alphatexture.r);
color = lerp(color, texture4, alphatexture.g);
color = lerp(color, texture5, alphatexture.b);
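For checking the blend order on the CPU, the pseudocode above can be mirrored in plain C++. This is a minimal sketch, not shader code: the names Rgb, PackedAlpha, lerpRgb and blendTerrain are hypothetical, and lerpRgb is assumed to behave like HLSL lerp, i.e. a + t * (b - a) per channel.

```cpp
#include <array>

using Rgb = std::array<float, 3>;

// Per-channel linear interpolation, written to match HLSL lerp(a, b, t).
Rgb lerpRgb(const Rgb& a, const Rgb& b, float t)
{
    return { a[0] + t * (b[0] - a[0]),
             a[1] + t * (b[1] - a[1]),
             a[2] + t * (b[2] - a[2]) };
}

// The four blend weights as they would sit in one RGBA texel.
struct PackedAlpha { float r, g, b, a; };

// layers[0] is the base texture color; layers[1..4] are blended on top
// in the same channel order as the pseudocode: a, r, g, b.
Rgb blendTerrain(const std::array<Rgb, 5>& layers, const PackedAlpha& w)
{
    Rgb color = layers[0];
    color = lerpRgb(color, layers[1], w.a);
    color = lerpRgb(color, layers[2], w.r);
    color = lerpRgb(color, layers[3], w.g);
    color = lerpRgb(color, layers[4], w.b);
    return color;
}
```

With all weights at 0 the result is the base layer, and a weight of 1 in the b channel replaces everything below it with the fifth layer, because later layers are blended over earlier ones.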


Good luck!

Hey!

Thanks for the tip! That improved performance slightly (not by much, but at least a bit :)).

Without blending (only the base texture) I get ~50 FPS, and with the 4 textures blended over each other using the pixel shader I get ~15 FPS.

Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:30:55 AM]
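Frame rates are easier to compare as frame times. The sketch below converts the two figures from the post above (~50 FPS and ~15 FPS); the function names are made up for illustration, and the inputs are the poster's approximate numbers, not new measurements.

```cpp
// Time spent on one frame, in milliseconds, at a given frame rate.
constexpr double frameTimeMs(double fps)
{
    return 1000.0 / fps;
}

// Extra milliseconds per frame when the frame rate drops
// from fpsBefore to fpsAfter.
constexpr double addedCostMs(double fpsBefore, double fpsAfter)
{
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

Here frameTimeMs(50.0) is 20 ms and frameTimeMs(15.0) is about 66.7 ms, so addedCostMs(50.0, 15.0) puts the blending at roughly 46.7 ms of extra work per frame.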
What hardware are you running on? It might just be too much for your CPU / GPU to handle (particularly if it's an integrated card).
Hello!

Yes that may be!

I'm running on the following:
http://ati.amd.com/products/MobilityRadeonhd3400/index.html
Intel Core 2 Duo T9400, 2 * 2.53 GHz
4 GB RAM (2.46 usable)
Win7 32-bit


Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:14:21 AM]

The Radeon 3400 series integrated chip is a bit slow. I have one in my laptop and I somewhat regret buying the thing. Since the GPU is the bottleneck, the only thing you can do is ease the load on the GPU by reducing the number of texture reads in the shader. Using a smaller window or a lower screen resolution will also increase performance.
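To get a feeling for how texture reads and resolution multiply, here is a rough C++ estimate. It is a sketch under loud assumptions: the resolutions are picked only for illustration, every pixel is assumed to be covered by terrain exactly once (no overdraw), and the helper name fetchesPerFrame is hypothetical. The read counts come from the thread: the original shader calls tex2D seven times per pixel, and a base texture plus three layers with one packed alpha texture would need five.

```cpp
#include <cstdint>

// Texture fetches issued per frame if each pixel runs the shader once.
// Assumption: no overdraw and full-screen terrain coverage.
constexpr std::uint64_t fetchesPerFrame(std::uint64_t width,
                                        std::uint64_t height,
                                        std::uint64_t readsPerPixel)
{
    return width * height * readsPerPixel;
}

// Assumed 1280x800 window, seven reads per pixel (original shader).
static_assert(fetchesPerFrame(1280, 800, 7) == 7168000, "original shader");
// Same window, five reads per pixel (packed alpha, four color layers).
static_assert(fetchesPerFrame(1280, 800, 5) == 5120000, "packed alpha");
// Halving both window dimensions cuts the fetch count to a quarter.
static_assert(fetchesPerFrame(640, 400, 7) == 1792000, "smaller window");
```

In this model, shrinking the window helps far more than removing two reads, which matches the advice above to try both.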

Even with the limited GPU performance, you should be able to make a terrain system with multiple textures. You might consider using pre-created tile sets, which will increase the number of draw calls but reduce the number of texture reads.

Cheers!

