Inefficient Manual Alpha Blending

4 comments, last by lauris71 14 years, 8 months ago
Hey everyone. I'm doing some splat mapping. Basically I have 3 common diffuse textures, and each section of my terrain has 3 corresponding alpha maps. Instead of blending at runtime, I'm precomputing the result for speed reasons; the splatting is there for my sake so I can interchange base textures at will. The issue is that it takes about 0.3-0.5 s for each 1000x1000 pixel output texture, which quickly adds up. Of course I realise that there are 1 million pixels involved in each texture, but considering splatting is often done at runtime, how does OpenGL blend it so fast? I allocate memory for each texture and then multiply them according to the formula
output = tex1 * alpha + tex2 * (1 - alpha)
I can't think how to make it any simpler, though I was never good at writing efficient code. Any suggestions would be greatly appreciated!
- Haptic
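For reference, the formula in the post can be written as a single pass over the pixel data. This is only a minimal sketch, assuming the textures and the alpha map are stored as flat float arrays of equal length; the name blend_f32 and its parameters are made up for illustration and are not from the original post. It uses the algebraically equivalent form tex2 + (tex1 - tex2) * alpha, which needs one multiply per element instead of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: blends two float buffers element by element.
 * Computes out[i] = tex1[i] * alpha[i] + tex2[i] * (1 - alpha[i]),
 * rearranged as tex2 + (tex1 - tex2) * alpha to save one multiply. */
void blend_f32(const float *tex1, const float *tex2, const float *alpha,
               float *out, size_t count)
{
    assert(tex1 && tex2 && alpha && out);
    for (size_t i = 0; i < count; ++i) {
        out[i] = tex2[i] + (tex1[i] - tex2[i]) * alpha[i];
    }
}
```

A loop of this shape touches each element once and allocates nothing, so if a 1000x1000 blend takes hundreds of milliseconds, the time is probably going somewhere else, such as per-pixel function calls, type conversions, or allocation inside the loop.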
Quote:Original post by Haptic
how does OpenGL blend it so fast?



It uses dedicated hardware (the GPU) to do it.

You might be able to do it slightly faster using SSE. (I'm nowhere near good enough at that, so someone else will have to help you there.)
I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!
Thanks for the reply Simon. I should have known that :).

I don't really want to put too much effort into optimising this (especially if it won't go very far), so I might just build an external utility that I can use to generate new in-game textures when I want to swap a base texture.
- Haptic
SSE will definitely help, but don't expect a massive performance improvement:

    #include <xmmintrin.h>

    /* alpha, tex1, tex2 and output must each refer to 4 consecutive floats
       on a 16-byte boundary, as _mm_load_ps and _mm_store_ps require. */
    __m128 v_one = _mm_set1_ps(1.0f);
    __m128 v_alpha = _mm_load_ps(&alpha);
    __m128 v_tex1 = _mm_load_ps(&tex1);
    __m128 v_tex2 = _mm_load_ps(&tex2);
    __m128 v_output = _mm_add_ps(_mm_mul_ps(v_tex1, v_alpha),
                                 _mm_mul_ps(v_tex2, _mm_sub_ps(v_one, v_alpha)));
    _mm_store_ps(&output, v_output);


The fastest way to blend textures is to take advantage of 3D hardware acceleration provided by OpenGL. You can blend textures using the GPU and save the result in another texture (using Framebuffer Objects or glReadPixels).
deathkrush
PS3/Xbox360 Graphics Programmer, Mass Media.
Completed Projects: Stuntman Ignition (PS3), Saints Row 2 (PS3), Darksiders (PS3, 360)
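The SSE snippet in this post blends one group of four floats. A sketch of how it might be extended to a whole buffer follows, assuming flat float arrays; the function name blend_f32_sse and the HAVE_SSE macro are hypothetical. It uses the unaligned load/store intrinsics (_mm_loadu_ps, _mm_storeu_ps) so the buffers need not be 16-byte aligned, and falls back to a scalar loop for the leftover elements or when SSE is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1 /* hypothetical macro, used only in this sketch */
#endif

/* Hypothetical helper: out[i] = tex1[i] * alpha[i] + tex2[i] * (1 - alpha[i]),
 * four floats per iteration where SSE is available. */
void blend_f32_sse(const float *tex1, const float *tex2, const float *alpha,
                   float *out, size_t count)
{
    assert(tex1 && tex2 && alpha && out);
    size_t i = 0;
#ifdef HAVE_SSE
    const __m128 v_one = _mm_set1_ps(1.0f);
    for (; i + 4 <= count; i += 4) {
        /* Unaligned loads: no alignment requirement on the buffers. */
        __m128 v_alpha = _mm_loadu_ps(alpha + i);
        __m128 v_tex1 = _mm_loadu_ps(tex1 + i);
        __m128 v_tex2 = _mm_loadu_ps(tex2 + i);
        __m128 v_out = _mm_add_ps(_mm_mul_ps(v_tex1, v_alpha),
                                  _mm_mul_ps(v_tex2, _mm_sub_ps(v_one, v_alpha)));
        _mm_storeu_ps(out + i, v_out);
    }
#endif
    /* Scalar tail (or the whole buffer when SSE is unavailable). */
    for (; i < count; ++i) {
        out[i] = tex1[i] * alpha[i] + tex2[i] * (1.0f - alpha[i]);
    }
}
```

If the buffers are known to be 16-byte aligned, the aligned variants from the original snippet can be used instead.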
Quote:Original post by deathkrush
SSE will definitely help, but don't expect a massive performance improvement:

*** Source Snippet Removed ***

The fastest way to blend textures is to take advantage of 3D hardware acceleration provided by OpenGL. You can blend textures using the GPU and save the result in another texture (using Framebuffer Objects or glReadPixels).


glReadPixels is awfully slow though and forces a flush (which means the CPU will be forced to wait for the GPU to finish). If he needs the data in RAM, it might even be slower than just doing it on the CPU.

Since he is blending 2 equally sized textures, he could probably take advantage of multiple threads as well: splitting the images into 2 or 4 stripes and blending those in parallel should provide a rather massive performance boost.
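The stripe idea from this post could look roughly like the sketch below, which uses POSIX threads and assumes flat float buffers. The names stripe_job, blend_stripe, blend_f32_threaded, and MAX_STRIPES are all hypothetical. Each thread blends one contiguous range of elements, and since the ranges do not overlap, no locking is needed. How much this actually helps depends on the number of cores and on memory bandwidth, so it is worth measuring.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_STRIPES 16 /* arbitrary upper bound for this sketch */

typedef struct {
    const float *tex1, *tex2, *alpha;
    float *out;
    size_t begin, end; /* half-open element range [begin, end) */
} stripe_job;

/* Thread entry point: blends the elements of one stripe. */
static void *blend_stripe(void *arg)
{
    const stripe_job *job = arg;
    for (size_t i = job->begin; i < job->end; ++i) {
        job->out[i] = job->tex1[i] * job->alpha[i]
                    + job->tex2[i] * (1.0f - job->alpha[i]);
    }
    return NULL;
}

/* Splits the buffer into `stripes` ranges and blends each on its own thread.
 * Returns 0 on success, -1 if a thread could not be created (in which case
 * the remaining stripes are left unblended). */
int blend_f32_threaded(const float *tex1, const float *tex2, const float *alpha,
                       float *out, size_t count, int stripes)
{
    assert(stripes >= 1 && stripes <= MAX_STRIPES);
    pthread_t threads[MAX_STRIPES];
    stripe_job jobs[MAX_STRIPES];
    int started = 0;
    int status = 0;

    for (int s = 0; s < stripes; ++s) {
        jobs[s] = (stripe_job){ tex1, tex2, alpha, out,
                                count * (size_t)s / (size_t)stripes,
                                count * (size_t)(s + 1) / (size_t)stripes };
        if (pthread_create(&threads[s], NULL, blend_stripe, &jobs[s]) != 0) {
            status = -1;
            break;
        }
        ++started;
    }
    /* Wait for every thread that was actually started. */
    for (int s = 0; s < started; ++s) {
        pthread_join(threads[s], NULL);
    }
    return status;
}
```

On most toolchains this needs to be compiled with the -pthread option.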
Quote:Original post by Haptic
I allocate memory for each texture and then multiply them according to the formula
output = tex1 * alpha + tex2 * (1 - alpha)

0.3-0.5s is awfully slow. I get about 13ms per 1024*1024 blit on my 1.7GHz laptop (using 10 textures so cache is definitely trashed as well).

What are the variable types of your pixels? Are you sure you are not doing unnecessary conversions etc.?

The formula for unsigned char is:
color = ((255 - (fa)) * (bc) + (fc) * (fa) + 127) / 255
Using SSE or MMX you can do these in parallel, which helps too.
Lauris Kaplinski

First technology demo of my game Shinya is out: http://lauris.kaplinski.com/shinya
Khayyam 3D - a freeware poser and scene builder application: http://khayyam.kaplinski.com/
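The unsigned char formula from this post can be sketched in C as follows. The post does not define fc, bc, and fa; this sketch assumes they mean foreground colour, background colour, and foreground alpha, each in 0..255. The function names and the buffer layout (interleaved 8-bit RGB with one alpha byte per pixel) are also assumptions made for illustration. Everything stays in integer arithmetic, so no float conversions are involved, and the + 127 term rounds to nearest instead of truncating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed meanings: fc = foreground colour, bc = background colour,
 * fa = foreground alpha, all in 0..255. The operands promote to int,
 * and the largest intermediate value (65152) fits comfortably. */
static inline uint8_t blend_u8(uint8_t fc, uint8_t bc, uint8_t fa)
{
    return (uint8_t)(((255 - fa) * bc + fc * fa + 127) / 255);
}

/* Hypothetical layout: interleaved RGB, 3 bytes per pixel, plus a separate
 * alpha map with 1 byte per pixel. */
void blend_rgb8(const uint8_t *fg, const uint8_t *bg, const uint8_t *alpha,
                uint8_t *out, size_t pixels)
{
    assert(fg && bg && alpha && out);
    for (size_t p = 0; p < pixels; ++p) {
        const uint8_t a = alpha[p];
        for (size_t c = 0; c < 3; ++c) {
            out[3 * p + c] = blend_u8(fg[3 * p + c], bg[3 * p + c], a);
        }
    }
}
```

With fa = 255 the result is exactly the foreground, and with fa = 0 it is exactly the background.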

This topic is closed to new replies.
