# Efficient Bloom algorithm?


OK I have a project and I'm already doing post-processing for glow and motion blur effects, so I thought that I'd be able to add bloom in pretty simply. And I can, and it works. The problem is that it's pretty slow. My basic algorithm is this:

* Sample the source image. I sample in a cross shape, so for N steps (I usually use 2-4) I sample 4*N + 1 times. This is the same way I sample the glow map to do glow effects.
* Get the intensity of the pixel, and calculate a bloom factor. I've tried:
  a. Bloom factor = intensity (highest performance, but there's no thresholding at all; each pixel contributes bloom proportional to its apparent brightness)
  b. Bloom factor = (intensity > threshold) ? 1 : 0 (better results, lower performance)
  c. Bloom factor = a sigmoidal threshold such as an arctan (interesting results, quite slow)
* Multiply the sample pixel by its bloom factor and by the global bloom weight, and add it to the glow calculated for that point.

Relevant HLSL follows. For brevity's sake I cut out the parts relating to the glow map (for incandescent objects) and parts relating to the other post-processing effects. So you can rest assured that variables are properly set up and such; the effect works, it's just pretty slow.

    float4 Threshold(float4 pixel)
    {
        float intensity = dot(pixel, float4(0.30, 0.59, 0.11, 0));  // Linear intensity

        //intensity = (atan((intensity-bloomThreshold)*1024) / 3.141592) + 0.5;  // Sigmoidal threshold
        //intensity = (intensity > bloomThreshold)? 1 : 0;                       // Simple threshold
        return pixel * intensity;
    }

    float4 PSGlow(float2 TexCoord:TEXCOORD0) : COLOR0 {

        // Some setting up variables here

        if (bloomWeight > 0)
            glow += Threshold(pic) * bloomWeight;

        for (int i=0;i < glowSamples;i++) {
            offset = i * (glowDisplacement / glowSamples);
            float2 samp1, samp2, samp3, samp4;

            samp1 = float2(saturate(TexCoord.x+offset),TexCoord.y);
            samp2 = float2(saturate(TexCoord.x-offset),TexCoord.y);
            samp3 = float2(TexCoord.x,saturate(TexCoord.y + offset));
            samp4 = float2(TexCoord.x,saturate(TexCoord.y - offset));

            // Some glow calculations here

            if (bloomWeight > 0)
            {
                glow += Threshold(tex2D(inTexture1, samp1)) * bloomWeight;
                glow += Threshold(tex2D(inTexture1, samp2)) * bloomWeight;
                glow += Threshold(tex2D(inTexture1, samp3)) * bloomWeight;
                glow += Threshold(tex2D(inTexture1, samp4)) * bloomWeight;
            }

        }

        // Other processing effects here
    }

Is there a better/faster way to do bloom? I think one way I could potentially improve it would be to generate bloom only from certain textures (such as a sky texture) and render it straight into the glow map, so that I don't have to sample both the glow map and the source frame (the glow map is sampled by the same kind of loop as the source frame). [Edited by - Goil on July 18, 2008 7:04:48 PM]

##### Share on other sites
You can also render only what blooms into a small texture, or use the alpha component of your target to determine what will bloom (only materials that should bloom will write into the alpha of your target).
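As a rough illustration of the alpha-mask idea, the bloom source pass could look something like the sketch below. It assumes the scene was rendered so that alpha holds a per-material bloom amount (0 for ordinary surfaces); `sceneSampler` and `PSBloomMask` are made-up names, and `bloomWeight` is the global from the original post.

```hlsl
// Sketch only: assumes materials wrote a bloom mask into the alpha channel.
sampler2D sceneSampler : register(s0);  // hypothetical: rendered scene, alpha = bloom mask
float bloomWeight;                      // global bloom weight, as in the original post

float4 PSBloomMask(float2 texCoord : TEXCOORD0) : COLOR0
{
    float4 scene = tex2D(sceneSampler, texCoord);

    // Alpha is 0 for surfaces that should not bloom, so no per-pixel
    // intensity threshold is evaluated here at all.
    return float4(scene.rgb * scene.a * bloomWeight, 1.0);
}
```

The trade-off is that the alpha channel is then no longer available for other uses in that render target.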

##### Share on other sites
It's very common to use a downsampled texture for bloom. It can save a lot of math and texture lookups. The basic algorithm is this:

-Render scene to full-size render target
-Downscale to a lower size (1/4x, 1/16x, whatever)
-Threshold
-Blur with separable Gaussian blur (horizontal pass, then vertical pass)
-Upsample to full size
-Combine with original render target
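The threshold and combine steps of the list above might be sketched as two small pixel shaders. This is only a sketch under assumptions: the bright pass is drawn into the small render target, the combine pass into the full-size one, both samplers use bilinear filtering, and `sceneSampler`, `bloomSampler`, `PSBrightPass` and `PSCombine` are made-up names. `bloomThreshold` and `bloomWeight` are the globals from the original post.

```hlsl
sampler2D sceneSampler : register(s0);  // hypothetical: full-size scene
sampler2D bloomSampler : register(s1);  // hypothetical: blurred low-res bloom
float bloomThreshold;
float bloomWeight;

// Drawn into the SMALL render target: downscale and threshold in one pass.
float4 PSBrightPass(float2 texCoord : TEXCOORD0) : COLOR0
{
    float4 scene = tex2D(sceneSampler, texCoord);
    // Keep only the part of each channel above the threshold.
    return float4(saturate(scene.rgb - bloomThreshold), 1.0);
}

// Drawn into the FULL-SIZE target after the blur passes have run.
float4 PSCombine(float2 texCoord : TEXCOORD0) : COLOR0
{
    float4 scene = tex2D(sceneSampler, texCoord);
    // Bilinear filtering on the small texture performs the upsample.
    float3 bloom = tex2D(bloomSampler, texCoord).rgb;
    return float4(scene.rgb + bloom * bloomWeight, scene.a);
}
```

Because the threshold and blur run at the reduced size, a 1/4 x 1/4 target shades 16 times fewer pixels in those passes than the full-resolution loop does.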

##### Share on other sites
Quote:
Original post by Goil
Is there a better/faster way to do bloom?

You can also use a separable filter for bloom. A separable filter lets you filter first along X and then along Y. Of course, that limits you to filters that are truly separable, and if your original filter could be done in place, that is no longer the case with a separable filter (which takes at least two passes).
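A minimal sketch of one direction of such a filter is shown below. The same shader would be run twice, once with a horizontal step and once with a vertical step, rendering into an intermediate target in between. `bloomSampler`, `blurStep` and `PSBlur` are made-up names, and the weights are a 9-tap binomial approximation of a Gaussian (1, 8, 28, 56, 70, 56, 28, 8, 1 over 256), chosen here only for illustration.

```hlsl
sampler2D bloomSampler : register(s0);  // hypothetical: thresholded low-res image
float2 blurStep;  // hypothetical: (1/width, 0) for the X pass, (0, 1/height) for the Y pass

// Centre weight followed by the four weights on each side; they sum to 1.
static const float weights[5] = { 0.2734375, 0.21875, 0.109375, 0.03125, 0.00390625 };

float4 PSBlur(float2 texCoord : TEXCOORD0) : COLOR0
{
    float4 sum = tex2D(bloomSampler, texCoord) * weights[0];

    // Constant loop bounds, so the compiler can unroll this.
    for (int i = 1; i < 5; i++)
    {
        float2 offset = blurStep * i;
        sum += tex2D(bloomSampler, texCoord + offset) * weights[i];
        sum += tex2D(bloomSampler, texCoord - offset) * weights[i];
    }
    return sum;
}
```

Two passes of 9 taps cost 18 texture fetches per pixel, whereas the equivalent 9x9 kernel in a single pass would need 81.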

You can also use anisotropic filtering to decimate samples on the graphics hardware (decimating so that you can work at a lower resolution later, while avoiding some of the bad aliasing artifacts caused by simple point sampling). See for example this paper:
Accelerated decimation using anisotropic filtering.

Anything that uses the texturing capabilities of your hardware (from bilinear sampling to anisotropic filtering as above) is a performance win.
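For the bilinear case, one common trick is to place each tap on the corner shared by four texels, so that a single fetch returns their average. The sketch below averages a 4x4 block of source texels with four fetches. It assumes linear min/mag filtering on the sampler, a destination a quarter of the source size in each dimension, and texture coordinates that are already correctly aligned to the block centre (Direct3D 9 needs a half-pixel adjustment for that). `sceneSampler`, `srcTexelSize` and `PSDownsample4x4` are made-up names.

```hlsl
sampler2D sceneSampler : register(s0);  // hypothetical: source image, LINEAR filtering assumed
float2 srcTexelSize;                    // hypothetical: (1/srcWidth, 1/srcHeight)

float4 PSDownsample4x4(float2 texCoord : TEXCOORD0) : COLOR0
{
    // texCoord is assumed to sit at the centre of a 4x4 block of source texels.
    // One texel away diagonally is the corner shared by a 2x2 quad, where the
    // bilinear filter weights all four texels equally.
    float4 sum = tex2D(sceneSampler, texCoord + srcTexelSize * float2(-1, -1));
    sum += tex2D(sceneSampler, texCoord + srcTexelSize * float2( 1, -1));
    sum += tex2D(sceneSampler, texCoord + srcTexelSize * float2(-1,  1));
    sum += tex2D(sceneSampler, texCoord + srcTexelSize * float2( 1,  1));
    return sum * 0.25;
}
```

That is 16 texels averaged for the price of 4 fetches, which also avoids the aliasing that a single point sample per destination pixel would give.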

LeGreg

##### Share on other sites
    float4 Threshold(float4 pixel)
    {
        float intensity = dot(pixel, float4(0.30, 0.59, 0.11, 0));  // Linear intensity

        //intensity = (atan((intensity-bloomThreshold)*1024) / 3.141592) + 0.5;  // Sigmoidal threshold
        //intensity = (intensity > bloomThreshold)? 1 : 0;                       // Simple threshold
        return pixel * intensity;
    }

One thing you're doing very wrong is that this code has a branch. In HLSL I've found that you should never use the ? operator.

You should just do this:

    return pixel * (intensity - bloomThreshold);

And what is this, some colors contribute more to intensity than others?
I've never seen code that did it this way, and yet it always worked fine.

So you could just do this:

    return saturate(pixel - bloomThreshold);

You can scale it beforehand to extend your range; this will limit it to a glow value of 1.

##### Share on other sites
Quote:
One thing you're doing very wrong is that this code has a branch. In HLSL I've found that you should never use the ? operator.
You should just do this:
return pixel * (intensity - bloomThreshold);

Yeah, I also tried things like:

    return pixel * saturate((intensity - bloomThreshold) * 100);

but that was slow too; internally saturate would use conditionals anyhow. Just doing pixel * (intensity - threshold) could actually end up with a "negative bloom" on areas below the threshold, because eventually this value gets added to the rendered frame.

Trying to avoid any branching is also what led to trying a sigmoidal threshold function, but that didn't help.
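For comparison, a soft threshold can also be written with the `smoothstep` intrinsic instead of `atan`. This is a swapped-in alternative, not something from the thread, and `bloomKnee` and `SoftThreshold` are made-up names. It is only a sketch, and it may not change the timings much if most of the cost is in the number of texture samples per pixel.

```hlsl
float bloomThreshold;
float bloomKnee;  // hypothetical: width of the soft transition, e.g. 0.1 (must be > 0)

float4 SoftThreshold(float4 pixel)
{
    // Same perceptual intensity weights as the original Threshold().
    float intensity = dot(pixel.rgb, float3(0.30, 0.59, 0.11));

    // 0 below the threshold, 1 above threshold + knee, smooth ramp in between.
    // The result never goes negative, so no "negative bloom" is added.
    float factor = smoothstep(bloomThreshold, bloomThreshold + bloomKnee, intensity);
    return pixel * factor;
}
```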

Quote:
And what is this, some colors contribute more to intensity than others?
I've never seen code that did it this way, and yet it always worked fine.

It's not necessary, but I pulled the coefficients out of the NTSC/PAL conversions from RGB to YIQ/YUV (I just grabbed the first row of those conversion matrices, as all I want to calculate is the Y channel). It's just to account for the physiology of the eye and how we perceive brightness. That is, we perceive (0.0, 1.0, 0.0) green as brighter than (1.0, 0.0, 0.0) red, which itself is perceived as brighter than (0.0, 0.0, 1.0) blue.

I've tried just using intensity as the simple average of the RGB channels and it made no difference in performance.
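Both variants reduce to a single dot product, which fits with there being no measurable difference. A small sketch, with made-up helper names and the unrounded Rec. 601 luma coefficients (0.299, 0.587, 0.114) that the rounded values in the shader come from:

```hlsl
// Perceptual weighting: first row of the RGB -> YIQ/YUV matrix (Rec. 601 luma).
float LumaWeighted(float3 rgb)
{
    return dot(rgb, float3(0.299, 0.587, 0.114));
}

// Plain average of the channels: the same instruction with different constants.
float LumaAverage(float3 rgb)
{
    return dot(rgb, float3(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0));
}
```

With the weighted version, pure green evaluates to 0.587, pure red to 0.299 and pure blue to 0.114, matching the ordering described above.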

Thanks to everyone who responded for their ideas and suggestions; I'll try them out :)
