
I made some bloom, has some performance issues.

This topic is 3596 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Here is my recipe for bloom:

Ingredients:
3 FBOs, all with mip-mapping and depth rendering enabled (not exactly required for all of them, I guess).
1 darken/saturate shader (algorithm stolen from somewhere, can't remember where, though):
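uniform sampler2D tex;   // scene colour rendered into FBO[0]

// Tone-mapping style constants (no 'f' suffix: not valid before GLSL 1.20)
const float lum = 0.08;
const float grey = 0.18;
const float white = 0.9;

void main(void)
{
    vec4 col = texture2D(tex,gl_TexCoord[0].st);
    col *= grey/lum;                        // scale exposure
    col *= (1.0 + (col / (white*white)));
    col -= 5.0;                             // threshold: drop everything below 5.0
    col = max(col,0.0);
    col /= (10.0 + col);                    // compress what is left into [0,1)
    gl_FragColor = col * 2.0;
}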
1 horizontal blur shader:
uniform float width;
uniform sampler2D tex;

// 5-tap binomial kernel (1 4 6 4 1)/16 along x, read from one mip level.
// Float literals and a float lod avoid relying on implicit int conversions.
vec4 sampleMap(vec2 uv,float offset,float level) {
    vec4 sample = vec4(0.0,0.0,0.0,0.0);
    sample += texture2DLod(tex,uv - vec2(2.0*offset,0.0),level) * 0.0625;
    sample += texture2DLod(tex,uv - vec2(1.0*offset,0.0),level) * 0.2500;
    sample += texture2DLod(tex,uv                       ,level) * 0.3750;
    sample += texture2DLod(tex,uv + vec2(1.0*offset,0.0),level) * 0.2500;
    sample += texture2DLod(tex,uv + vec2(2.0*offset,0.0),level) * 0.0625;
    return sample;
}

void main(void)
{
    vec2 uv = gl_TexCoord[0].st;
    vec4 sample = vec4(0.0,0.0,0.0,0.0);
    float offset = 1.0/width;

    // tap spacing doubles with each mip level (2^level)
    sample = sampleMap(uv,offset*4.0 ,2.0)+
             sampleMap(uv,offset*8.0 ,3.0)+
             sampleMap(uv,offset*16.0,4.0)+
             sampleMap(uv,offset*32.0,5.0);

    gl_FragColor = sample/4.0;   // average of the four levels
}
1 vertical blur and mix-with-original-image shader:
uniform sampler2D tex;    // horizontally blurred image
uniform sampler2D orig;   // original scene
uniform float height;

// 5-tap binomial kernel (1 4 6 4 1)/16 along y, read from one mip level.
vec4 sampleMap(vec2 uv,float offset,float level) {
    vec4 sample = vec4(0.0,0.0,0.0,0.0);
    sample += texture2DLod(tex,uv - vec2(0.0,2.0*offset),level) * 0.0625;
    sample += texture2DLod(tex,uv - vec2(0.0,1.0*offset),level) * 0.2500;
    sample += texture2DLod(tex,uv                       ,level) * 0.3750;
    sample += texture2DLod(tex,uv + vec2(0.0,1.0*offset),level) * 0.2500;
    sample += texture2DLod(tex,uv + vec2(0.0,2.0*offset),level) * 0.0625;
    return sample;
}

void main(void)
{
    vec2 uv = gl_TexCoord[0].st;
    vec4 sample = vec4(0.0,0.0,0.0,0.0);   // was "= 0.0", which is not a valid vec4 initialiser
    float offset = 1.0/height;

    sample = sampleMap(uv,offset*4.0 ,2.0)+
             sampleMap(uv,offset*8.0 ,3.0)+
             sampleMap(uv,offset*16.0,4.0)+
             sampleMap(uv,offset*32.0,5.0);
    vec4 col = sample/4.0;

    // additive mix of the bloom with the original image
    gl_FragColor = col + texture2D(orig,uv);
}
If you don't want to read through and understand the above (who would? =P), the blurring is basically what is described here - taking samples from mipmaps (or smaller versions) of the texture to "fake" a large gaussian kernel, and splitting it into horizontal and vertical passes to take it down from n^2 samples to 2n.

Directions:
1. Render the scene to FBO[0].
2. Use the darken shader to render a darkened FBO[0] onto a quad; this is all rendered into FBO[1].
3. Use the horizontal blur to render FBO[1] onto a quad, captured by FBO[2].
4. Use the vertical blur and mix shader to blur FBO[2] and mix it with the original image in FBO[0].
5. Render onto a quad and serve with gibs and gore.

Concerns:
It's pretty fast (500fps with a simple scene) on my desktop with an 8800GTS, but on my lappy with a Go7600 it goes down to around 15fps regardless of how much geometry is on screen. I really do need to get the performance up radically, any pointers? Is there some sort of hardware limit I'm reaching with the 7600 that the 8800 doesn't hit?

I noticed the biggest hit was when I turned on mipmapping on the FBOs, is this normal? Even though the darkening shader is pretty trivial, taking that pass out increases the framerate quite a bit, leading me to believe it's the FBO's fault again.

.. Just did a test without enabling mipmaps on the FBOs and it ran at 85 instead of 15 fps on the laptop... Gah!? Can anyone shed some light onto my FBO predicament? Any other tips towards optimising the methodology/shaders? Thanks.
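For anyone checking the numbers in the blur shaders: the five weights (0.0625, 0.25, 0.375, 0.25, 0.0625) are the binomial row 1 4 6 4 1 divided by 16, and the n^2 versus 2n saving comes from applying that row once horizontally and once vertically. The short Python sketch below is only a CPU-side illustration of that arithmetic, not part of the poster's code; the helper names `binomial_kernel` and `separable_2d` are made up for this example.

```python
from math import comb

def binomial_kernel(taps):
    """Row (taps - 1) of Pascal's triangle, normalised so the weights sum to 1."""
    n = taps - 1
    return [comb(n, k) / 2 ** n for k in range(taps)]

def separable_2d(kernel):
    """Outer product: the 2D kernel that a horizontal pass followed by a
    vertical pass with the same 1D weights is equivalent to."""
    return [[a * b for b in kernel] for a in kernel]

weights = binomial_kernel(5)          # the five taps used in sampleMap
kernel_2d = separable_2d(weights)     # equivalent 5x5 kernel

samples_2d = len(weights) ** 2        # fetches per pixel for a single 2D pass
samples_separable = 2 * len(weights)  # fetches per pixel for two 1D passes
```

With 5 taps the separable form needs 10 fetches per pixel per mip level instead of 25, and the gap widens as the kernel grows.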

Are your temporary/blur render-targets the same resolution as the back-buffer?

AFAIK, most bloom implementations at least do the blurring steps in a low-res render target (about 1/4 the size of the back-buffer in each dimension, which is 1/16th the number of pixels). This way you're a lot less fill-rate bound, and it still looks good.
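To make the 1/4 size versus 1/16 pixels relationship concrete, here is a tiny Python sketch. The 1280x800 back-buffer is an assumed resolution chosen only for illustration (the thread never states one), and `pixel_count` is a hypothetical helper.

```python
def pixel_count(width, height, scale=1.0):
    """Number of pixels in a render target scaled by `scale` in each dimension."""
    return int(width * scale) * int(height * scale)

# Assumed back-buffer size, for illustration only.
BACKBUFFER_W, BACKBUFFER_H = 1280, 800

full_res = pixel_count(BACKBUFFER_W, BACKBUFFER_H)           # full-size target
quarter_res = pixel_count(BACKBUFFER_W, BACKBUFFER_H, 0.25)  # 320 x 200 target

# Scaling both dimensions by 1/4 divides the fragment count by 16.
reduction = full_res / quarter_res
```

Every fragment in a blur pass runs the whole shader, so the fragment count is a rough proxy for how much fill-rate the pass consumes.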

No, my blurring targets are screen resolution. I guess shrinking them would give the same quality, since the lowest mipmap level I use in the blur shaders is 2. Thanks for the great suggestion.
On a related note, given my "summing" shader includes a blur pass, yet requires full resolution, I guess adding a 4th pass just to do the summing at full resolution would probably be quicker than doing the second half of the blurring at full resolution?

I added some functionality so I can choose whether to include mipmapping and/or depth in the FBOs per post-effect "layer", and somehow managed to get the framerate up to 85 by disabling mipmapping in the first one and depth in the other two. I didn't think it would be anywhere near that much of an improvement, but I am glad to have been mistaken =).

EDIT: Wow! what an improvement! Thanks so much for that heads up on the smaller targets, now my laptop is running it at 160fps - 10x what I started with! =D
My current setup is:
1. pass through: depth=yes mipmap=no res=full
2. darken: depth=no mipmap=no res=1/4
3. blur 1: depth=no mipmap=yes res=1/4
4. blur 2: depth=no mipmap=yes res=1/4
5. sum: depth=no mipmap=no res=full

Separating the sum shader gave another 15-20fps, separating the full resolution input from the darken shader gave another 5 or so fps. All in all, a great success.
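As a rough sanity check on why the five-pass layout is so much cheaper, the Python sketch below counts texture fetches per frame for the original three post-process passes and for the new layout. It rests on several assumptions that are not in the thread: a 1280x800 back-buffer, "res=1/4" meaning 1/4 in each dimension, and a separate sum shader that does two fetches (blurred image plus original). The fetch counts for the darken and blur shaders are read off the shaders posted earlier (1 fetch, and 4 levels x 5 taps = 20 fetches, plus 1 for the original image in the old combined blur-and-mix pass). Mipmap generation and geometry cost are ignored.

```python
# Assumed back-buffer size, for illustration only.
FULL = 1280 * 800
QUARTER = (1280 // 4) * (800 // 4)   # "res=1/4" taken as 1/4 per dimension

def fetches(passes):
    """Total texture fetches per frame for a list of (pixels, fetches_per_pixel)."""
    return sum(pixels * per_pixel for pixels, per_pixel in passes)

# Original layout: every post-process pass at full resolution.
original = fetches([
    (FULL, 1),    # darken
    (FULL, 20),   # horizontal blur: 4 mip levels x 5 taps
    (FULL, 21),   # vertical blur + mix: 20 taps + original image
])

# Layout from the list above, excluding the scene pass-through.
reworked = fetches([
    (QUARTER, 1),   # darken
    (QUARTER, 20),  # blur 1
    (QUARTER, 20),  # blur 2
    (FULL, 2),      # sum (assumed: blurred + original)
])
```

Fetch counts are only a crude proxy for frame time, so this says nothing precise about the measured framerates; it just shows that the reworked layout does far less per-pixel work.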

[Edited by - DeathCarrot on February 7, 2008 8:46:17 PM]

