Haptic

OpenGL Inefficient Manual Alpha Blending


Hey everyone. I'm doing some splat mapping. I have 3 common diffuse textures, and each section of my terrain has 3 corresponding alpha maps. Instead of blending at runtime, I'm precomputing the result for speed; the splatting is there for my own sake, so I can swap base textures at will. The issue is that it takes about 0.3-0.5 s for each 1000x1000 pixel output texture, which quickly adds up. Of course I realise that each texture involves 1 million pixels, but considering splatting is often done at runtime, how does OpenGL blend it so fast? I allocate memory for each texture and then multiply them according to the formula
output = tex1 * alpha + tex2 * (1 - alpha)
I can't think of how to make it any simpler, though I was never good at writing efficient code. Any suggestions would be greatly appreciated!

Quote:
Original post by Haptic
how does OpenGL blend it so fast?


It uses dedicated hardware (the GPU) to do it.

You might be able to do it slightly faster using SSE. (I'm nowhere near good enough at that, so someone else will have to help you there.)

Thanks for the reply Simon. I should have known that :).

I don't really want to put too much effort into optimising this (especially if it won't go very far), so I might just build an external utility that I can use to generate new in-game textures when I want to swap a base texture.

SSE will definitely help, but don't expect a massive performance improvement:


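#include <xmmintrin.h>

/* Blend four floats at once: output = tex1 * alpha + tex2 * (1 - alpha).
   All four pointers must be 16-byte aligned for _mm_load_ps/_mm_store_ps. */
void blend4(const float *tex1, const float *tex2, const float *alpha, float *output)
{
    __m128 v_one = _mm_set1_ps(1.0f);
    __m128 v_alpha = _mm_load_ps(alpha);
    __m128 v_tex1 = _mm_load_ps(tex1);
    __m128 v_tex2 = _mm_load_ps(tex2);
    __m128 v_output = _mm_add_ps(_mm_mul_ps(v_tex1, v_alpha), _mm_mul_ps(v_tex2, _mm_sub_ps(v_one, v_alpha)));
    _mm_store_ps(output, v_output);
}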
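
To apply the same idea to a whole buffer, something like the following could work. This is only a sketch: the function name blend_sse and the assumption that every pixel component is already stored as a float are mine, not from the original post. It uses unaligned loads so the buffers need no special alignment, and a scalar loop handles whatever is left when the length is not a multiple of four.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: out[i] = tex1[i] * alpha[i] + tex2[i] * (1 - alpha[i])
   for n floats. Unaligned loads/stores, so no alignment requirement. */
void blend_sse(const float *tex1, const float *tex2, const float *alpha,
               float *out, size_t n)
{
    assert(tex1 && tex2 && alpha && out);
    const __m128 v_one = _mm_set1_ps(1.0f);
    size_t i = 0;

    /* Main loop: four floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 a  = _mm_loadu_ps(alpha + i);
        __m128 t1 = _mm_loadu_ps(tex1 + i);
        __m128 t2 = _mm_loadu_ps(tex2 + i);
        __m128 r  = _mm_add_ps(_mm_mul_ps(t1, a),
                               _mm_mul_ps(t2, _mm_sub_ps(v_one, a)));
        _mm_storeu_ps(out + i, r);
    }

    /* Scalar tail for the remaining 0-3 floats. */
    for (; i < n; ++i)
        out[i] = tex1[i] * alpha[i] + tex2[i] * (1.0f - alpha[i]);
}
```

If the textures are stored as 8-bit channels, converting to float and back may eat most of the gain, so this fits best when the data is float to begin with.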



The fastest way to blend textures is to take advantage of 3D hardware acceleration provided by OpenGL. You can blend textures using the GPU and save the result in another texture (using Framebuffer Objects or glReadPixels).

Quote:
Original post by deathkrush
SSE will definitely help, but don't expect a massive performance improvement:

*** Source Snippet Removed ***

The fastest way to blend textures is to take advantage of 3D hardware acceleration provided by OpenGL. You can blend textures using the GPU and save the result in another texture (using Framebuffer Objects or glReadPixels).


glReadPixels is awfully slow though, and it forces a flush (which means the CPU has to wait for the GPU to finish). If he needs the data in RAM, it might even be slower than just doing it on the CPU.

Since he is blending 2 equally sized textures, he could probably take advantage of multiple threads as well: splitting the images into 2 or 4 stripes and blending each stripe on its own thread should provide a rather massive performance boost.
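
A rough sketch of the stripe idea using POSIX threads is below. The names stripe_job, blend_stripe and blend_threaded are made up for illustration, and it assumes one float per pixel component and a platform that provides pthreads (you may need to build with -pthread). Each thread writes a disjoint band of rows, so no locking is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_STRIPES 16

/* Hypothetical job description: one horizontal band of the image. */
typedef struct {
    const float *tex1, *tex2, *alpha;
    float *out;
    size_t begin, end;   /* half-open range of float indices */
} stripe_job;

/* Thread entry point: blends the floats in [begin, end). */
static void *blend_stripe(void *arg)
{
    const stripe_job *j = arg;
    for (size_t i = j->begin; i < j->end; ++i)
        j->out[i] = j->tex1[i] * j->alpha[i]
                  + j->tex2[i] * (1.0f - j->alpha[i]);
    return NULL;
}

/* Splits the rows into `stripes` bands and blends each band on its own
   thread. Returns 0 on success, or the pthread_create error code. */
int blend_threaded(const float *tex1, const float *tex2, const float *alpha,
                   float *out, size_t width, size_t height, int stripes)
{
    assert(stripes >= 1 && stripes <= MAX_STRIPES);
    pthread_t threads[MAX_STRIPES];
    stripe_job jobs[MAX_STRIPES];
    int started = 0, rc = 0;

    for (int s = 0; s < stripes; ++s) {
        /* Stripe s covers rows [s*height/stripes, (s+1)*height/stripes). */
        jobs[s] = (stripe_job){ tex1, tex2, alpha, out,
                                (size_t)s * height / stripes * width,
                                (size_t)(s + 1) * height / stripes * width };
        rc = pthread_create(&threads[s], NULL, blend_stripe, &jobs[s]);
        if (rc != 0)
            break;
        ++started;
    }

    /* Wait for every thread that was actually started. */
    for (int s = 0; s < started; ++s)
        pthread_join(threads[s], NULL);
    return rc;
}
```

How much this helps depends on the core count and on memory bandwidth, since the blend does very little arithmetic per byte read.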

Quote:
Original post by Haptic
I allocate memory for each texture and then multiply them according to the formula
output = tex1 * alpha + tex2 * (1 - alpha)

0.3-0.5s is awfully slow. I get about 13ms per 1024*1024 blit on my 1.7GHz laptop (using 10 textures so cache is definitely trashed as well).

What are the variable types of your pixels? Are you sure you are not doing unnecessary conversions, etc.?

The formula for unsigned char is:
color = ((255 - (fa)) * (bc) + (fc) * (fa) + 127) / 255
Using SSE or MMX you can do these in parallel, which helps too.
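
As a plain C sketch, the unsigned char formula above could look like this. I am reading fc, bc and fa as foreground color, background color and foreground alpha, which is an assumption; the function names blend_u8 and blend_buffer_u8 are also just for illustration. Everything stays in integer math, so there are no float conversions per pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blend one 8-bit channel value: fa = 0 gives bc, fa = 255 gives fc.
   The +127 rounds to nearest before the integer division by 255.
   The operands are promoted to int, so the sum (at most 65152) fits. */
static inline uint8_t blend_u8(uint8_t fc, uint8_t bc, uint8_t fa)
{
    return (uint8_t)(((255 - fa) * bc + fc * fa + 127) / 255);
}

/* Blend n channel values, with one alpha byte per channel value. */
void blend_buffer_u8(const uint8_t *fg, const uint8_t *bg,
                     const uint8_t *alpha, uint8_t *out, size_t n)
{
    assert(fg && bg && alpha && out);
    for (size_t i = 0; i < n; ++i)
        out[i] = blend_u8(fg[i], bg[i], alpha[i]);
}
```

For example, blend_u8(200, 100, 128) works out to (127 * 100 + 200 * 128 + 127) / 255 = 38427 / 255, which truncates to 150.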
