Texture Blending Shaders

Hello! After using blending on the device directly
m_3dMgr->SetTexture(m_textures, 0);
dev->SetTextureStageState(0, D3DTSS_COLOROP, D3DTOP_SELECTARG1); // color
dev->SetTextureStageState(0, D3DTSS_COLORARG1, D3DTA_TEXTURE);
dev->SetTextureStageState(0, D3DTSS_COLORARG2, D3DTA_TEXTURE);

dev->SetTexture(1, m_alphaMaps);
dev->SetTextureStageState(1, D3DTSS_COLOROP, D3DTOP_SELECTARG2); // colorstage
dev->SetTextureStageState(1, D3DTSS_COLORARG1, D3DTA_TEXTURE);
dev->SetTextureStageState(1, D3DTSS_COLORARG2, D3DTA_CURRENT);

dev->SetTextureStageState(1, D3DTSS_ALPHAOP, D3DTOP_SELECTARG2); // alphastage
dev->SetTextureStageState(1, D3DTSS_ALPHAARG1, D3DTA_CURRENT);
dev->SetTextureStageState(1, D3DTSS_ALPHAARG2, D3DTA_TEXTURE);
and seeing the resulting low performance, I decided to switch to a pixel shader, as some people suggested. The shader now looks as follows:
sampler2D g_samSrcColor;

texture alpha2Tex;
texture stage2Tex;

texture alpha3Tex;
texture stage3Tex;

texture alpha4Tex;
texture stage4Tex;


sampler Sampler1_1 = sampler_state { texture = <alpha2Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};
sampler Sampler1_2 = sampler_state { texture = <stage2Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};

sampler Sampler2_1 = sampler_state { texture = <alpha3Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};
sampler Sampler2_2 = sampler_state { texture = <stage3Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};

sampler Sampler3_1 = sampler_state { texture = <alpha4Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};
sampler Sampler3_2 = sampler_state { texture = <stage4Tex> ; magfilter = LINEAR; minfilter = LINEAR; mipfilter=LINEAR; AddressU = wrap; AddressV = wrap;};


float4 MyShader( float2 Tex : TEXCOORD0, float2 Tex2 : TEXCOORD1 ) : COLOR0
{
    float4 Color;
    float4 aColor;
    float4 stage;
    float4 aColor2;
    float4 stage2;
    float4 aColor3;
    float4 stage3;

    stage = tex2D(Sampler1_2, Tex.xy);
    aColor = tex2D(Sampler1_1, Tex2.xy);
    aColor2 = tex2D(Sampler2_1, Tex2.xy);
    stage2 = tex2D(Sampler2_2, Tex.xy);
    aColor3 = tex2D(Sampler3_1, Tex2.xy);
    stage3 = tex2D(Sampler3_2, Tex.xy);
    Color = tex2D( g_samSrcColor, Tex.xy);
    aColor.r = aColor2.r = aColor3.r = aColor.g = aColor2.g = aColor3.g = aColor.b = aColor2.b = aColor3.b = 1.0;
    
    float4 oneminus1 = 1.0 - aColor.a;
    float4 oneminus2 = 1.0 - aColor2.a;
    float4 oneminus3 = 1.0 - aColor3.a;

    float4 r1 = aColor.a * stage + oneminus1 * Color;
    float4 r2 = aColor2.a * stage2 + oneminus2 * r1;
    float4 ret = aColor3.a * stage3 + oneminus3 * r2;
    return ret;
}


technique EntryPoint
{
    pass p1
    {
        VertexShader = null;
        PixelShader = compile ps_2_0 MyShader();
    }

}
This is the very first shader I have ever written. I don't know if it's perfect, but it works! There is one problem, though: performance didn't change at all. It's still very slow when blending a lot of terrain. What could be the reason for that? Is it because of the shader, or do I have to look somewhere else?

Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:36:33 AM]

There's absolutely no reason for it to be faster - you've just implemented what D3D was doing internally with the fixed function pipeline anyway.

You're probably limited by texture bandwidth; try making your textures smaller and see if that speeds things up. Or, better - use a GPU profiler like NVPerfHUD.

OK, I will have a look at GPU profiling as soon as possible. What do you mean by decreasing the size of the texture?

Quote:
Original post by Plerion
OK, I will have a look at GPU profiling as soon as possible. What do you mean by decreasing the size of the texture?
If you have a 1024x1024, 32-bit texture, then each row of texels is 4KB of memory apart. That means that for the shader to do bilinear filtering, it has to read data that's at least 4KB apart. That means multiple memory accesses, which could potentially be a bottleneck.
If you make your textures 256x256 instead, then rows of texels are only 1KB apart, which means less of a potential memory hit.
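
To make the arithmetic above concrete, here is a minimal C++ sketch. It assumes a plain linear, unpadded row-major layout; real GPUs often tile or swizzle texture memory, so the numbers illustrate the reasoning rather than actual hardware behaviour. The function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Bytes between two vertically adjacent texels, assuming a linear,
// unpadded row-major layout (an assumption; GPUs may tile or swizzle).
constexpr std::size_t rowPitchBytes(std::size_t widthTexels, std::size_t bitsPerTexel)
{
    return widthTexels * (bitsPerTexel / 8);
}

// Size of the top mip level under the same assumption.
constexpr std::size_t topLevelBytes(std::size_t width, std::size_t height,
                                    std::size_t bitsPerTexel)
{
    return rowPitchBytes(width, bitsPerTexel) * height;
}

// The two cases from the post: 1024 wide -> 4 KB per row, 256 wide -> 1 KB per row.
static_assert(rowPitchBytes(1024, 32) == 4096, "1024 texels * 4 bytes");
static_assert(rowPitchBytes(256, 32) == 1024, "256 texels * 4 bytes");
```

Under that assumption a 1024x1024 texture is also 16 times larger in total than a 256x256 one, so it competes harder for texture cache space.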

OK, I think that's not the problem. I checked the textures, and the largest mip level of each one is 256 or smaller.

I'll check the GPU profiler; maybe it's not a problem with the renderer but with the way I handle objects.

What kind of hardware are you working on? I don't know exactly how you blend the different textures together, but one trick you may use is packing the blending values into a single RGBA texture. I can see in your code that you are using only the alpha channel of the alpha textures.

With one packed alpha texture you may use up to 5 different terrain textures. If you need more, then you can use another texture and so on. Of course, this system doesn't scale well.

Here is some pseudo code to present the idea of using the packed texture with multiple alpha values. With this technique, you need 6 texture reads for 5 terrain textures.

float4 alpha = tex2D(sampler_alpha, i.xy);      // four blend weights in one read
float3 texture1 = tex2D(sampler_1, i.xy).rgb;
float3 texture2 = tex2D(sampler_2, i.xy).rgb;
float3 texture3 = tex2D(sampler_3, i.xy).rgb;
float3 texture4 = tex2D(sampler_4, i.xy).rgb;
float3 texture5 = tex2D(sampler_5, i.xy).rgb;

float3 color = texture1;                        // base layer
color = lerp(color, texture2, alpha.a);
color = lerp(color, texture3, alpha.r);
color = lerp(color, texture4, alpha.g);
color = lerp(color, texture5, alpha.b);
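
For checking the blend chain above outside of a shader, here is a small CPU-side reference in C++. It is only a sketch: the type and function names (Rgb, Rgba, blendTerrain) are invented for this example, and lerp is written out with the same definition HLSL uses, a + t * (b - a).

```cpp
#include <array>
#include <cassert>

using Rgb = std::array<float, 3>;

// Packed blend weights, one channel per extra layer (hypothetical layout
// matching the pseudo code: a, r, g, b select layers 2 to 5).
struct Rgba { float r, g, b, a; };

// Same definition as the HLSL intrinsic: a + t * (b - a).
inline Rgb lerp(const Rgb& a, const Rgb& b, float t)
{
    return { a[0] + t * (b[0] - a[0]),
             a[1] + t * (b[1] - a[1]),
             a[2] + t * (b[2] - a[2]) };
}

// layers[0] is the base texture colour; each later layer is blended
// over the running result with its own weight.
inline Rgb blendTerrain(const std::array<Rgb, 5>& layers, const Rgba& w)
{
    Rgb color = layers[0];
    color = lerp(color, layers[1], w.a);
    color = lerp(color, layers[2], w.r);
    color = lerp(color, layers[3], w.g);
    color = lerp(color, layers[4], w.b);
    return color;
}
```

With all weights at 0 the result is the base layer, and a weight of 1 replaces everything blended so far with that layer, which is the same behaviour as the `a * stage + (1 - a) * Color` lines in the original shader.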


Good luck!

Hey!

Thanks for the tip! That improved the performance slightly (not really much, but at least a bit :)).

Without blending the textures (only the base texture) I get ~50 FPS, and with the 4 textures blended over each other using the pixel shader I get ~15 FPS.

Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:30:55 AM]

What hardware are you running on? It might just be too much for your CPU / GPU to handle (Particularly if it's an integrated card).

Hello!

Yes that may be!

I'm running on the following:
http://ati.amd.com/products/MobilityRadeonhd3400/index.html
Intel Core 2 Duo T9400, 2 x 2.53 GHz
4 GB RAM (2.46 GB usable)
Win7 32-bit


Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:14:21 AM]

The Radeon 3400 series integrated chip is a bit slow. I have one in my laptop and I somewhat regret buying the thing. Since the GPU is the bottleneck, the only thing you can do is ease the load on the GPU by reducing the number of texture reads in the shader. Using a smaller window or a lower screen resolution will also increase performance.

Even with the limited GPU performance, you should be able to make a terrain system with multiple textures. You might consider using precreated tile sets, which will increase the number of draw calls but reduce the number of texture reads.

Cheers!

Hey!

Yes, the card isn't one of the best, but I don't actually need more, as it's only a work computer ;).

What I forgot to mention: it cannot be because of my system or because of the material I render. Every model and every texture I render comes from World of Warcraft. And since WoW runs at 40+ FPS with much more terrain at once, dozens of particle effects and hundreds of models, I guess there is something wrong in my code when I get only about 16 FPS with just the terrain. I tried to get NVPerfHUD working, but it doesn't work. Is there any alternative?

Greetings
Plerion

[Edited by - Plerion on December 3, 2009 5:00:46 AM]

You can try Pix for profiling.
Your performance sounds a bit low. Can you please show a bit of the code which is used for drawing the terrain?

Hey!

That's the code I use to render a terrain chunk:
http://pastebin.com/m13949e91

At the moment I render about 2000 such chunks at a framerate of ~15 FPS.

I will have a look at PIX.

Greetings
Plerion

If you disable texture blending, or make the shader simpler, does it execute faster? If not, then no amount of shader optimisation will help.

I suspect the several thousand DrawIndexedPrimitive() calls are much more of a problem than blending multiple textures - you really don't want more than a few hundred DrawIndexedPrimitive() calls per frame...

Again, a GPU profiler will actually tell you where the bottleneck is.

Reducing draw calls is important here.

Some points of your code:

You may use handles instead of strings to set shader parameters,
i.e. look the handle up once with the string and then use the handle to set the texture:

m_blendEffect[2]->SetTexture("alpha2Tex", m_alphaTex[0]);

Does each of the terrain blocks have its own vertex buffer?
Reducing the number of vertex buffers and vertex buffer switches should give you more performance.

pDev->SetStreamSource(0, m_vbuffer, 0, 28);
pDev->SetIndices(m_ibuffer);

As mentioned before, use a profiler to locate the real bottleneck. My suggestions may well be just minor things.

Cheers!

Hello!

I have now tried PIX and it shows me some information. I don't really know what I have to look for, though.

Here is what I've found:
Number of DrawIndexedPrimitive calls: 2308

I guess that's a problem, right? But I don't really know how to solve it differently. Every chunk may have its own 4 texture layers. How can I combine chunks if they don't use the same textures?

Greetings
Plerion

Hi,

First thing you could try is to draw just half of the chunks and see if your performance scales accordingly.

I can see that having this kind of terrain system is actually pretty complicated to optimize. Combining the patches is complicated because of the alphamaps.

However, are you sure that your terrain may actually use an almost unlimited number of textures (as long as there are at most 4 per terrain patch)?

The following things may help you a bit:
- reduce state changes
- reduce texture changes (ie. sort your terrain blocks according to their material)
- reduce vertex/index buffer changes, create one big vertex buffer/index buffer instead of many small ones.

- create a different shader for blocks with fewer than 4 textures.
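
As a rough illustration of the "sort your terrain blocks according to their material" point, here is a self-contained C++ sketch. The Chunk struct and its integer texture ids are hypothetical stand-ins for whatever identifies the four layer textures in the real code; no Direct3D calls are involved, it only counts how often the bound texture set would have to change.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical terrain chunk: four layer texture ids act as the "material".
struct Chunk {
    std::array<int, 4> textureIds;
    int id;
};

// Order chunks so that those sharing the same texture set are adjacent.
inline void sortByMaterial(std::vector<Chunk>& chunks)
{
    std::stable_sort(chunks.begin(), chunks.end(),
                     [](const Chunk& a, const Chunk& b) {
                         return a.textureIds < b.textureIds;
                     });
}

// Number of times the texture set has to be (re)bound when drawing the
// chunks in their current order, counting the very first bind.
inline std::size_t countMaterialSwitches(const std::vector<Chunk>& chunks)
{
    std::size_t switches = 0;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        if (i == 0 || chunks[i].textureIds != chunks[i - 1].textureIds) {
            ++switches;
        }
    }
    return switches;
}
```

In a render loop the same comparison with the previously drawn chunk can decide whether the SetTexture calls are needed at all. Sorting does not reduce the number of draw calls by itself, but once chunks with identical materials are adjacent they become candidates for sharing one vertex/index buffer and one call.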

Do you use frustum culling for your blocks?

[Edited by - kauna on December 3, 2009 8:34:29 AM]

Hi. I will collect some ideas on how to decrease the number of draw calls! Here are some values from changing the number of chunks:
1536 DIP -> 20 FPS
768 DIP -> 35 FPS
256 DIP -> 61 FPS
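
Converting these measurements to frame times makes the trend easier to read. The C++ sketch below does only that arithmetic; the Sample struct and function names are invented for this example, and the 61 FPS value may be capped by vsync, so treat the second slope with some caution.

```cpp
#include <cassert>

// One measurement: number of DrawIndexedPrimitive calls and resulting FPS.
struct Sample {
    int drawCalls;
    double fps;
};

// Frame time in milliseconds for a given frame rate.
constexpr double frameTimeMs(double fps)
{
    return 1000.0 / fps;
}

// Average extra milliseconds per additional draw call between two samples.
constexpr double msPerDrawCall(Sample more, Sample fewer)
{
    return (frameTimeMs(more.fps) - frameTimeMs(fewer.fps))
           / (more.drawCalls - fewer.drawCalls);
}
```

For the numbers above, 20 FPS is 50 ms per frame and 35 FPS is about 28.6 ms, so going from 768 to 1536 calls costs roughly 0.028 ms per call; from 256 to 768 calls it is roughly 0.024 ms per call. The frame time growing almost linearly with the call count fits the suspicion that the number of draw calls, not the shader, is the limiting factor.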

Profiling WoW gave me the final proof that I'm doing it wrong :D

Number of DIP calls: 503

So I need to find a way to reduce the calls, though I have no idea at all how to do that :P

This may be a stupid question, but are you doing frustum culling?

In order to combine draw calls, the chunks need to use common resources (i.e. textures). Perhaps a texture atlas for the alpha maps would help in that sense. If I were facing that kind of situation, I would look into the data and see whether the system actually has an artificial limit on the number of textures.

Good luck!

I love stupid questions :D. I had commented out the frustum culling code because it didn't work. Now I had another look at it and saw that I just had the coordinates in the wrong order; that's why it didn't work. Now it works and is active -> 50+ FPS.
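
For readers who have not implemented it yet, here is a minimal C++ sketch of one common way to cull chunks: testing each chunk's axis-aligned bounding box against the six frustum planes. This is not the code from this thread; all type and function names are made up, and it assumes plane normals point into the frustum.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Plane in the form dot(n, p) + d = 0; points with dot(n, p) + d >= 0
// are considered inside (assumption: normals point into the frustum).
struct Plane { Vec3 n; float d; };

// Axis-aligned bounding box of one terrain chunk.
struct Aabb { Vec3 lo, hi; };

// True if the whole box lies on the negative side of the plane.
inline bool isOutside(const Plane& p, const Aabb& b)
{
    // Pick the box corner farthest along the plane normal; if even that
    // corner is behind the plane, the entire box is.
    const Vec3 v{ p.n.x >= 0.0f ? b.hi.x : b.lo.x,
                  p.n.y >= 0.0f ? b.hi.y : b.lo.y,
                  p.n.z >= 0.0f ? b.hi.z : b.lo.z };
    return p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f;
}

// Conservative test: a box is culled only if it is fully outside at
// least one plane; boxes that straddle a plane are kept.
inline bool isVisible(const std::array<Plane, 6>& frustum, const Aabb& box)
{
    for (const Plane& p : frustum) {
        if (isOutside(p, box)) {
            return false;
        }
    }
    return true;
}
```

Chunks for which isVisible returns false are simply skipped before any state is set or DrawIndexedPrimitive is called. Mixing up the lo and hi corners or the component order gives wrong results, which is an easy mistake to make.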
