understanding theory of bloom effect

1 comment, last by Tasche 12 years, 6 months ago
hey all.

i want to implement a bloom effect similar to the one described by this guy.
i do understand the basic principle of what he is doing, since it seems fairly easy to understand and to implement, but i'm not sure about the technical details.
it seems to me that, unless you have a card with an awesome fill rate, this method is rather expensive. but maybe it isn't, or i'm just not doing it right.

so this is how i would generate the bloom texture:

  • i do a regular render of my scene to a texture 'SCENE'. to keep this example easy, let's say i have a square screen with 1024x1024 resolution. *without* the bloom effect, i would now simply render a screen-filling quad with this texture to the backbuffer and advance the swapchain, displaying the scene on screen. at the moment, this is also the status quo for my code.
  • however, now i want to generate a bloom texture based on the SCENE texture. so i go and render the SCENE screen quad to another texture BRIGHTPASS (by 'screen quad' i mean a quad in normalized device coordinates, obviously, just something that fills the rendertarget). BRIGHTPASS has only 1/64 of the area of SCENE: 128x128, i.e. 1/8 in each dimension. the pixel shader for this pass has to look something like this (pseudocode):

for each texel from (x-blursize, y) to (x+blursize, y) {
    if (luminance is above a certain value) // probably something like (texel.r + texel.g + texel.b > 2.4)
        accumulate the contribution in color, multiplied by the gauss blur constant for that texel
}
normalize color with respect to the samples taken
return color


this would yield a brightpass and a horizontal blur
  • now render to yet another texture BLUR1 applying a vertical blur (size of BLUR1 128x128)
  • at this point BLUR1 should contain a downscaled, brightpass filtered and blurred texture.
  • repeat this process for 2 other rendertargets: BLUR2 (256x256) and BLUR3 (512x512)
  • render SCENE, then blend in BLUR1 to BLUR3 additively (which are all now mapped to a screen filling quad), using linear filtering or such to magnify
  • drink a beer
it's probably quicker to do the brightpass only once for a 512x512 unblurred texture, then use this texture as the base for the BLUR1~3 textures. that should save a lot of conditionals, and the extra mul and add of scalar values should be faster, esp. since one of the factors is constant. however, this whole thing looks *awfully* slow and expensive, and i'm not sure the author in the link i posted was talking about doing it this way.
would be nice if anyone could comment on this. i don't care too much about the numbers used, they are more or less arbitrary, i'm more interested in whether the m.o. is correct. if you have got a completely different method of doing a nice bloom/glare effect that is even way more awesome than this, feel free to post :) realism is not necessary, i'm more interested in speed and, to a degree, in flexibility regarding the parameters (like the strength of the glare, scalability of the speed/quality tradeoff).
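
To make the brightpass-plus-blur idea above concrete, here is a minimal CPU sketch in plain Python. It is not shader code and not necessarily what the linked author does. The function names, the default radius and sigma, and the clamp-at-the-edge behaviour are all assumptions made for illustration. It thresholds once as a separate step (which gives the same result as testing every sample inside the blur loop) and skips the downscaling entirely.

```python
import math

def gaussian_weights(radius, sigma):
    """Normalized 1D Gaussian weights for offsets -radius..+radius."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma))
           for i in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def bright_pass(image, threshold=2.4):
    """Keep texels whose r+g+b exceeds the threshold, black out the rest."""
    black = (0.0, 0.0, 0.0)
    return [[px if sum(px) > threshold else black for px in row]
            for row in image]

def blur_1d(image, weights, horizontal):
    """One blur pass along a single axis, clamping samples at the edges."""
    height, width = len(image), len(image[0])
    radius = len(weights) // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = [0.0, 0.0, 0.0]
            for k, w in enumerate(weights):
                off = k - radius
                sx = min(max(x + off, 0), width - 1) if horizontal else x
                sy = y if horizontal else min(max(y + off, 0), height - 1)
                px = image[sy][sx]
                for c in range(3):
                    acc[c] += w * px[c]
            row.append(tuple(acc))
        out.append(row)
    return out

def bloom_texture(image, radius=2, sigma=1.0, threshold=2.4):
    """Brightpass, then a horizontal and a vertical blur (assumed defaults)."""
    weights = gaussian_weights(radius, sigma)
    bright = bright_pass(image, threshold)
    return blur_1d(blur_1d(bright, weights, True), weights, False)
```

Because the weights are normalized up front, no extra "normalize with respect to samples taken" step is needed at the end. Splitting the blur into a horizontal and a vertical pass costs 2 * (2 * radius + 1) samples per texel instead of (2 * radius + 1)^2 for a full 2D kernel.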

and yes, i do intend to drink a beer EVERY FRAME.

cheers,
tasche
A modern GPU will tear through a few low-res Gaussian blurs in no time at all, so don't worry about that. I've seen similar implementations to what's described in that article, and they usually work something like this:

1. Render your scene to a 1024x1024 render target
2. Run your threshold pass, rendering to a 512x512 render target
3. Downsample to 256x256
4. Downsample to 128x128
5. Blur the 128x128
6. Blur the 256x256
7. Sum the blurred 256x256 with the blurred 128x128 (can do this during the last blur step if you want)
8. Blur the 512x512
9. Sum the blurred 512x512 with the blurred 256x256
10. Composite the result with your original scene render target
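
The ten steps above can be sketched on the CPU roughly as follows. This is only an illustration under assumptions, not code from the article: it uses single-channel images as nested lists, a 2x2 box average for downsampling, texel repetition for upsampling (a GPU would normally use linear filtering), a 3-tap blur, and an arbitrary cutoff of 0.8. All function names are made up for the example, and step 2 is read here as "shrink to half size, then threshold".

```python
def threshold(img, cutoff):
    """Zero out everything at or below the cutoff."""
    return [[v if v > cutoff else 0.0 for v in row] for row in img]

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1] +
              img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def upsample(img):
    """Double each dimension by repeating texels."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def blur(img):
    """Separable 3-tap blur (weights 1/4, 1/2, 1/4), clamped at the edges."""
    def pass_1d(src, horizontal):
        h, w = len(src), len(src[0])
        def at(y, x):
            return src[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
        dy, dx = (0, 1) if horizontal else (1, 0)
        return [[0.25 * at(y - dy, x - dx) + 0.5 * at(y, x) +
                 0.25 * at(y + dy, x + dx)
                 for x in range(w)] for y in range(h)]
    return pass_1d(pass_1d(img, True), False)

def add(a, b):
    """Element-wise sum of two equally sized images."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bloom(scene, cutoff=0.8):
    half = threshold(downsample(scene), cutoff)  # step 2
    quarter = downsample(half)                   # step 3
    eighth = downsample(quarter)                 # step 4
    acc = blur(eighth)                           # step 5
    acc = add(blur(quarter), upsample(acc))      # steps 6-7
    acc = add(blur(half), upsample(acc))         # steps 8-9
    return add(scene, upsample(acc))             # step 10
```

The scene dimensions need to be divisible by 8 for the three halvings to line up. Summing from the smallest level upward means each level is magnified only one step at a time, so the widest blur comes from the cheapest target.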


and yes, i do intend to drink a beer EVERY FRAME.


That's the real secret to good bloom. :P
so, when you say 'summing' or 'blurring' or 'downsample', do you actually mean render passes?

so i first render to several different-sized rendertargets (512^2 to 128^2), using the initial texture obtained in the threshold pass (the 512x512 one).
then i do horizontal blur passes on all these textures, and vertical ones too.
in a final render pass i sum up all the textures obtained in these render passes.
so i end up with (in this example) 1 threshold pass + 3 passes for the different sizes + 3 horizontal + 3 vertical + 1 final summing pass = 11 render passes

i'm just making sure i understand this correctly, since this seems like a lot of render passes... i do realize they should be rather fast since you just have a single quad as geometry every pass. i also realize that this can be optimized by doing two steps in a single pass (vertical blur + summing can always be done in a single step, as you mentioned in your post, reducing the count by another pass). i'm not sure if this works how i think it does, but binding multiple rendertargets (i'm using dx11) should also cut the generation of the unblurred textures from 3 passes to 0, as they can be filled when the threshold texture is being rendered. this would still leave 1 threshold pass + 3 horizontal + 3 vertical+summing + 1 composite with original = 8 passes.
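
The two pass counts above can be tallied with a small helper. This is just the arithmetic from this post written out in Python. The function name and its parameters are made up for the example, and it does not check whether the resize passes can really be dropped.

```python
def pass_breakdown(levels=3, separate_resize_passes=True):
    """Render passes per stage for a bloom chain with `levels` blur targets."""
    return {
        "threshold": 1,
        "resize": levels if separate_resize_passes else 0,
        "horizontal blur": levels,
        "vertical blur": levels,
        "final sum/composite": 1,
    }

naive = sum(pass_breakdown(3, True).values())    # 1 + 3 + 3 + 3 + 1
merged = sum(pass_breakdown(3, False).values())  # 1 + 0 + 3 + 3 + 1
print(naive, merged)
```

Each extra blur level adds two passes (one horizontal, one vertical), plus one more if the resize needs its own pass.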

now, i've got no idea about this at all, but one could probably use compute shaders for the blurring, since all of it happens in video memory anyway, and you don't have the overhead of running the whole render pipeline. but like i said, i have got no idea what compute shaders actually do and if they are useful in this context, so it might be a good exercise to learn about them. but i want to get a simple, unoptimized bloom shader working first, of course :)

thanks for the help (again), MJP


That's the real secret to good bloom. :P


hehe, then i must be an awesome game programmer :)

