Real-time bokeh (high quality DOF)

I was going to just post this on my blog (aeolusengine.blogspot), but since I hoped to have some meaningful discussion about it, I decided to post here instead. :)

Recently I found myself staring at a random photograph. What stood out was how nicely the out-of-focus areas blurred into discs of light. It reminded me how much I dislike DOF effects in games these days, where they're usually based on Gaussian blurs. It looks cheap to me. So I decided to try to implement a good bokeh effect in real-time.

The way I looked at it was that the rendered image represents perfectly in-focus bundles of light at each pixel. What I wanted to do was take the bundles of light that should be out of focus and render each one as a disc (preferably, though different lenses produce different shapes) over a larger area of the render target (the size of the circle of confusion).

So, what are some of the ways you can do this? Rendering ~1280x720 point sprites is out of the question. My first idea was to look at it as rendering the accumulation of n discs at each pixel when rendering a full screen quad. Given a filter:

float weights[] = {
    0.000000, 0.015703, 0.022473, 0.023882, 0.022473, 0.015703, 0.000000,
    0.015703, 0.023882, 0.023978, 0.023978, 0.023978, 0.023882, 0.015703,
    0.022473, 0.023978, 0.023978, 0.023978, 0.023978, 0.023978, 0.022473,
    0.023882, 0.023978, 0.023978, 0.023978, 0.023978, 0.023978, 0.023882,
    0.022473, 0.023978, 0.023978, 0.023978, 0.023978, 0.023978, 0.022473,
    0.015703, 0.023882, 0.023978, 0.023978, 0.023978, 0.023882, 0.015703,
    0.000000, 0.015703, 0.022473, 0.023882, 0.022473, 0.015703, 0.000000
};

Take 7x7 samples at each pixel (minus the 4 zero-weight corners). For each sample, look at its circle of confusion radius (stored in the alpha channel in a previous pass). Then generate an offset into the filter based on the sample offset and that radius. If the offset puts you outside the filter, you're outside that sample's circle of confusion, so it shouldn't contribute to this pixel. Otherwise, add sample * weight[offset] to your result (there's a rough HLSL sketch of this loop at the bottom of this post).

I got this working, and it looked pretty amazing. It made me see that this effect is well worth the trouble. Unfortunately, there were several problems with this technique. For one, the calculations for building the weight offset were causing the shader compiler to choke with certain optimization settings, and those calculations were in a 7x7 loop (i.e. very slow). Not to mention that taking 7x7 samples per pixel at full resolution is a bit insane.

Options? Taking fewer samples? But then you get a smaller max circle of confusion radius. I decided instead to generate two fixed disc sizes at 1/4 and 1/8 screen res, and interpolate between these and the unprocessed scene. This removes the need for offset calculations, since the offset is used pretty much directly:

for( int x = -3; x <= 3; ++x )
{
    for( int y = -3; y <= 3; ++y )
    {
        .. weights[x + 3 + (y + 3) * 7] ..
    }
}

This is much faster. I haven't done GPU timings yet, but for the scene in the screenshot below I was getting 100fps. Unfortunately, it doesn't look nearly as good as the more accurate method. It's not so much because of the downsized buffers, though - 7x7 bokeh at 320x200 actually looks very good if you use filtering when sampling the result. The problem is that blending continuously between two bokeh levels doesn't look very natural, and it adds a lot of unwanted blurring and haloing.

This is about where I'm stuck. Any ideas? Things I would like:

- Continuous defocus instead of blending between discrete focus planes.
- Large circle of confusion radius with as few samples as possible.
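For the curious, here's roughly what that first full-res loop looks like (a simplified sketch with made-up names like g_scene and g_pixel_size; the offset math here is the part that was choking the compiler):

sampler2D g_scene;   // rgb = scene color, a = circle of confusion radius in [0,1]
float2 g_pixel_size; // 1.0 / render target size
// 'weights' is the 7x7 disc filter from the top of the post.

float4 bokeh_ps( float2 uv : TEXCOORD0 ) : COLOR0
{
    float4 result = 0.f;

    for( int y = -3; y <= 3; ++y )
    {
        for( int x = -3; x <= 3; ++x )
        {
            float4 s = tex2D( g_scene, uv + float2( x, y ) * g_pixel_size );
            float radius = abs( s.a ) * 3.f; // this sample's CoC radius in pixels

            // Rescale this pixel's offset into the sample's own disc: the sample
            // only contributes if we sit inside its circle of confusion.
            float2 filter_pos = float2( x, y ) * ( 3.f / max( radius, 0.001f ) );

            if( max( abs( filter_pos.x ), abs( filter_pos.y ) ) <= 3.f )
            {
                int index = ( (int)round( filter_pos.x ) + 3 )
                          + ( (int)round( filter_pos.y ) + 3 ) * 7;
                // The weight also goes into alpha, accumulating coverage
                // for the composite pass later.
                result += float4( s.rgb, 1.f ) * weights[index];
            }
        }
    }

    return result;
}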
Random thoughts that might be useful:

When storing the circle of confusion radius in the alpha channel, you lose the ability to use texture filtering when doing the bokeh passes (since you're basically blurring transformed depth values). If you use texture filtering, you tend to get unacceptable fluttering around depth discontinuities. If you don't use texture filtering, you get a little bit of fluttering elsewhere. D'oh! The artifacts when using texture filtering are much worse, though. It would be nice to only filter rgb and point sample alpha, but that would result in even more sampling.

Figuring out how to properly composite the results was a pain. I ended up using MRT to calculate the close (in front of the focus plane) and far passes simultaneously, but to separate render targets to ease blending. Basically:

sample.a = weights[x + 3 + (y + 3) * 7];
output.near += sample * sample_is_near;
output.far += sample * sample_is_far;

sample_is_near is calculated from the sign of the circle of confusion radius: when calculating the radius, I simply don't abs() the result when shifting the depth value by the focus plane distance. sample_is_far = 1.f - sample_is_near. The alpha (weight) is in sample.a, so you accumulate coverage into the output alpha. This is necessary so you can blend properly when you reject pixels to avoid haloing.

When compositing the results: (Continued in next post, ran out of space)...
(Continued from post above)
When compositing the results:

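// Back-to-front "over" blend: far layers first, then the clean scene,
// then the near layers, using the coverage accumulated in alpha.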
float4 final_color = bokeh_big_far;
final_color = bokeh_small_far + ((1.f-bokeh_small_far.a) * final_color);
final_color = clean_scene + ((1.f-clean_scene.a) * final_color);
final_color = bokeh_small_near + ((1.f-bokeh_small_near.a) * final_color);
final_color = bokeh_big_near + ((1.f-bokeh_big_near.a) * final_color);
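
For reference, the near/far routing inside the gather loop that feeds this composite looks roughly like this (a sketch; the sign convention is whatever falls out of not abs()ing the CoC, as described in the previous post):

// Gather pass outputs, via MRT:
struct BokehOutput
{
    float4 near : COLOR0;
    float4 far  : COLOR1;
};

// Inside the gather loop: route each weighted sample to the near or far
// target based on the sign of its un-abs()'d circle of confusion radius.
float sample_is_near = ( s.a < 0.f ) ? 1.f : 0.f; // assuming negative = near
float sample_is_far  = 1.f - sample_is_near;

float4 sample = float4( s.rgb, 1.f ) * weights[x + 3 + (y + 3) * 7];
output.near += sample * sample_is_near; // alpha accumulates coverage
output.far  += sample * sample_is_far;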

There are a few other details that needed to be dealt with as well, maybe I'll elaborate on some of these things another time. This post is already huge. :)

I'll end with a screenshot I hastily grabbed showing a defocused scene in-engine. Sorry it's such a crap example - I don't have many assets to work with at the moment! The foreground objects are mostly using the small bokeh buffer. The background is an (originally completely in-focus) photograph at the far plane, and so uses the big bokeh buffer. I don't think this downsized image really does it justice, but this felt like a post just asking for a screengrab, whether it's good or not. :) The bokeh balls (tm?) are still fairly visible even downsized, particularly on small highlights.

Thoughts on optimizations / improvements?

[screenshot]

[Edited by - FBMachine on February 23, 2010 5:54:59 PM]
The screenshot does look good, but how does it look when something is in focus as well?

Interesting idea.
Sorry, hopefully this is a better example. Excuse the weird default texture - the models aren't textured.
You can see the haloing caused from interpolating between layers in this shot as well.

[screenshot]
That's pretty cool looking.
In case anyone is silently interested in implementing something like this, here's an update. :)

I went back to doing a variable-sized circle of confusion in one pass at full screen res. It just looks so much better. I got it working relatively real-time by getting rid of the complicated weight lookup, since I realized the shape was a function of data I already had (distance to the disc, and circle of confusion radius). I.e. to calculate the weight for a sample:

sample_radius*max_radius > length(offset)

Though to have clean edges around the disc, you want to use something like this instead:

smoothstep(-1,0, sample_radius*max_radius - length(offset))

I guess that's pretty obvious, not sure why I went the table lookup route. :)
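
Concretely, the inner loop now boils down to something like this (again a sketch; g_scene and g_pixel_size are made-up names, and max_radius is the largest CoC radius in pixels):

sampler2D g_scene;               // rgb = color, a = CoC radius in [0,1]
float2 g_pixel_size;
static const int max_radius = 5; // largest CoC radius in pixels, for example

float4 result = 0.f;

for( int y = -max_radius; y <= max_radius; ++y )
{
    for( int x = -max_radius; x <= max_radius; ++x )
    {
        float2 offset = float2( x, y );
        float4 s = tex2D( g_scene, uv + offset * g_pixel_size );

        // The sample covers this pixel if we're inside its disc,
        // with a one-pixel smoothstep falloff at the rim.
        float w = smoothstep( -1.f, 0.f, abs( s.a ) * max_radius - length( offset ) );
        result += float4( s.rgb, 1.f ) * w;
    }
}
// As before, alpha carries accumulated coverage for the composite.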

Unfortunately this is only relatively real-time (6ms at 1280x720! 60fps in tech demos only ;)), and with a fairly small max radius. I have some ideas to speed it up that might help.

One idea is to limit the full res radius to a very small value, and interpolate to one or more downsized buffers with larger radius bokeh. There's the problem of blending artifacts again though.

Another idea is to create a 1/8 downsized buffer containing the max max_radius in the surrounding area. Finding the max in an NxN block is separable, so it should be relatively fast. Now, use this max radius (which is basically the distance to the furthest disc that touches the current pixel) to dynamically choose the sample count. Since it's a 1/8 downsized buffer, the branch will be coherent in 4x4 blocks, which might make it a worthwhile optimization. Maybe! It depends how much of the view is completely out of focus.
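If it helps anyone picture it, one half of that separable max pass would look something like this (the vertical pass is identical with the offset flipped; names made up):

sampler2D g_coc;        // buffer holding the per-pixel CoC radius
float2 g_pixel_size;    // texel size of the source buffer
static const int N = 4; // half-width of the neighborhood

// Horizontal half of the separable NxN max over the CoC buffer.
float4 max_coc_ps( float2 uv : TEXCOORD0 ) : COLOR0
{
    float m = 0.f;

    for( int x = -N; x <= N; ++x )
        m = max( m, abs( tex2D( g_coc, uv + float2( x, 0 ) * g_pixel_size ).a ) );

    return float4( m, m, m, m );
}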

Random note: don't bother trying this without HDR. I tried pushing it after the tone mapping pass, and it loses the hotness of the highlights, and without highlights it doesn't look much better than a gaussian blur.
Before seeing your screenshots I thought that generating bokeh was really useless in visual quality terms. But now I understand your idea, and your images really do provide a much more beautiful and natural depth of field effect than Gaussian-blur-based ones.

Really nice work. Keep us updated with screenshots and more details about your technique.
A lot of people try doing DOF with a disc instead of a Gaussian blur... it obviously produces much better quality, but for most console games the performance hit is too much.

To optimize, you might want to play around with a pre-pass that marks off pixels that are totally in-focus and totally out-of-focus. If you mark these off in the stencil buffer and apply the DOF in multiple passes, you can make sure you're not taking lots of unneeded samples for pixels that don't need them. On newer hardware you can also try just going for dynamic branching, since it might be fast enough to get a perf boost.
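For instance, the branching version could be a simple early-out at the top of the DOF shader (just a sketch; g_coc, g_scene, and the threshold are made up):

float coc = abs( tex2D( g_coc, uv ).a ); // per-pixel CoC

[branch]
if( coc < g_in_focus_threshold )
    return tex2D( g_scene, uv ); // totally in focus: skip the expensive gather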
Yeah, wow, that looks beautiful (save the few haloing artifacts). I'm generally used to seeing the Gaussian blur implementations, where it hurts the eyes to look at and isn't as discreet. Good job on that right there.
Denzel Morris (@drdizzy) :: Software Engineer :: SkyTech Enterprises, Inc.
"When men are most sure and arrogant they are commonly most mistaken, giving views to passion without that proper deliberation which alone can secure them from the grossest absurdities." - David Hume
Thanks for the encouragement guys. :)
The haloing is thankfully fixed, it was a compositing issue that was a limitation of how I was interpolating between blur layers (which I don't do anymore).

MJP - I didn't realize I was late to the party. The closest thing to this that I've seen was a paper or two mentioning using disc shaped sampling patterns for a blur based on the circle of confusion at the current pixel.
While somewhat similar, it's not quite the same as what I'm doing. For one, you can't have out-of-focus pixels in the foreground in front of in-focus pixels using that method, since it's basically just a disc-shaped blur with a per-pixel radius. What I'm doing is block sampling and adding the contribution of neighboring discs based on their circle of confusion, which is more flexible (it allows an out-of-focus foreground, and simplifies halo reduction) and doesn't use dependent texture lookups (since I'm not scaling texture coordinates based on radius).

I've managed to get this running pretty well on the hardware I'm targeting, so I'm taking a break to work on other more pressing (and less sexy) things. But I'm definitely going to give the stencil optimization a try - great idea!

Thanks for the feedback everyone, I'll try to give an update when there's something worthwhile to add. :) I'll also post a more recent screenshot soon using the fancy-pants Sponza model Crytek used recently for showing off SSGI; it looks pretty cool in a more natural/interesting scene. I need to work out a soft shadowing bug first, though.
