Alpha Blending Slow

Started by
3 comments, last by feal87 14 years, 4 months ago
I was testing a game I made with my engine (a 3D vertical shooter) on my laptop (with an Intel X3100 card) and I found these discrepancies between DX9 and DX10:

1024x768 fullscreen, DX10: 140 FPS without shield, 58 FPS with shield (7 ms vs 17 ms)
1024x768 fullscreen, DX9: 200 FPS without shield, 160 FPS with shield (5 ms vs 6.25 ms)

The "shield" is a sphere drawn over the player (the player covers about 1/9 of the screen) with an almost fully transparent texture (except for a mark), tinted with a color as diffuse light. The texture is 64x64 and the sphere has a very low polygon count. (The scene is generally 10000-15000 triangles; the sphere is about 200. :D) Besides the player, the blended area contains about 10-15 particles from the space behind it. Every mesh except the shield is drawn with blending disabled.

Why is blending on DX10 causing such a major slowdown? (An increase of 10 ms is a big hit for blending 1/9 of the screen once. :|)

Here's the code for the PS and blend state in DX9 and DX10.

DX10

	AlphaToCoverageEnable = FALSE;
	BlendEnable[0] = TRUE;
	SrcBlend = SRC_ALPHA;
	DestBlend = INV_SRC_ALPHA;
	BlendOp = ADD;
	SrcBlendAlpha = SRC_ALPHA;
	DestBlendAlpha = INV_SRC_ALPHA;
	BlendOpAlpha = ADD;
	RenderTargetWriteMask[0] = 0x0F;

/////////////////////////////////////////////////////////////

	float4 outputColor = g_texture.Sample( samLinear, input.Tex ) * input.color;
	return outputColor;

DX9

	AlphaBlendEnable = true;
	SrcBlend = 5;			// D3DBLEND_SRCALPHA
	DestBlend = 6;			// D3DBLEND_INVSRCALPHA
	BlendOp = ADD;
	SrcBlendAlpha = 5;		// D3DBLEND_SRCALPHA
	DestBlendAlpha = 6;		// D3DBLEND_INVSRCALPHA
	BlendOpAlpha = ADD;

/////////////////////////////////////////////////////////////

	float4 outputColor = g_texture.Sample( samLinear, input.Tex ) * input.color;
	return outputColor;

Thanks for the help.
Here are some possibilities:

1. DX10 doesn't do alpha testing via a render state, whereas DX9 does. That can have a big influence on performance with mostly transparent textures. One solution is to replace the quad with more complex geometry that leaves holes where the texture is completely transparent. You could also try using the HLSL clip() intrinsic to do alpha testing in the shader.

2. The DX10 driver for that card is slow.

3. Something else is different. Use profiling tools to find out what.
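
For the clip() suggestion in point 1, a minimal sketch of what the DX10 pixel shader might look like is below. It reuses g_texture, samLinear, input.Tex and input.color from the original post; the PS_INPUT layout, the PS_Shield name and the kAlphaCutoff threshold are assumptions made for illustration, not the poster's actual code. Whether this helps on the X3100 is untested.

```hlsl
// Sketch only. Assumed for illustration: the PS_INPUT layout, the
// PS_Shield entry point name and the kAlphaCutoff value.
Texture2D    g_texture;
SamplerState samLinear;

struct PS_INPUT
{
    float4 Pos   : SV_POSITION;
    float4 color : COLOR0;
    float2 Tex   : TEXCOORD0;
};

// Hypothetical threshold: anything below one 8-bit step counts as transparent.
static const float kAlphaCutoff = 1.0f / 255.0f;

float4 PS_Shield(PS_INPUT input) : SV_Target
{
    float4 outputColor = g_texture.Sample(samLinear, input.Tex) * input.color;

    // clip(x) discards the current pixel when x < 0, so fully transparent
    // texels are dropped before they reach the blend stage.
    clip(outputColor.a - kAlphaCutoff);

    return outputColor;
}
```

The blend state stays as posted; clip() only removes pixels that would have contributed nothing to the blend anyway.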
I think there is a problem with my code, even though I can't find it...

If I use two shields instead of one (regardless of position), everything works perfectly fine, with the same speed loss as DX9... The shield is drawn with instancing... With only one shield, I get the speed loss described above...
The strange thing is that there is no difference in CPU load while it's slowed down...

I have to ask a friend to test on another DX10 machine to be sure it's not a driver bug or something. XD

EDIT: Tested with the WARP rasterizer and got the same speed (42 FPS) with one or two shields... (45 FPS without any shield)...

Use a GPU perf tool to profile the scene and see what it says about what's taking up that extra 10 ms.
Tested with Intel GPA, lol...

With two shields, each frame takes

3.89 ms on the GPU side (the biggest draw is the particle system at 1.222 ms)

With only one shield, it goes up to

13.914 ms per frame on the GPU side (the biggest draw is still the particle system, with the same values)

But I don't know why everything, from Shapes to Sprite (my own quad class), goes up to 500-600 from 50-60...

I'll try to investigate further in the profiler. (Wow, really nice. I didn't know Intel had such a fine tool. XD)
