Variance Shadow Maps

Ok, let me try to respond to all of the questions/comments... sorry if I miss any!

Quote:Original post by wolf
Hey this looks cool. I just checked out the demo and the results look great, but ... unfortunately the technique does not seem to be fast enough on my target hardware platforms :-)

Yeah, summed-area tables are pretty heavy, and really only feasible on G80s and maybe X1900s right now. That said, they're a forward-looking technique that should work really well for plausible soft shadows, and they're much faster than PCSS, etc. I'll get really excited about them when we have hardware doubles (even if slow) and we can drop the distributed-precision tricks. Doubles will completely clear up any precision issues.

Quote:Original post by wolf
... so I will invest some time to make VSM more bullet-proof.

Yeah, IMHO VSMs using (ideally) all of multisampling, mipmapping + trilinear filtering, a small-to-medium blur (maybe 4x4 or more) and a light bleeding reduction function are the current best solution. Specifically on G80, which supports fp32 filtering and multisampling, the results are flawless and ridiculously fast (400+ fps).
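
For the curious, the shadow test at the heart of all of this is tiny - it's just Chebyshev's inequality applied to the two filtered moments. A minimal HLSL sketch (function and parameter names are mine, not from the demo):

// Standard VSM shadow test: returns an upper bound on the fraction of light
// reaching the receiver. 'moments' = (E[x], E[x^2]) from the filtered shadow map.
float ChebyshevUpperBound(float2 moments, float receiverDepth, float minVariance)
{
    // One-tailed test: fully lit if the receiver is in front of the mean occluder depth.
    if (receiverDepth <= moments.x)
        return 1.0;

    // Variance from the first two moments: sigma^2 = E[x^2] - (E[x])^2.
    float variance = moments.y - moments.x * moments.x;
    variance = max(variance, minVariance);   // clamp to fight numeric issues

    // Chebyshev's inequality bounds P(occluder depth >= receiver depth).
    float d = receiverDepth - moments.x;
    return variance / (variance + d * d);
}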

Quote:Original post by wolf
Would you guys be interested in seeing the proposals regarding VSM for ShaderX6? I would appreciate your comments here. If you are interested in cross-referencing, ask your editor whether you can send me your paper and we will refer to it in ShaderX6 :-). We usually do things like this ...

Yeah, I'd love to see the proposals and give feedback, etc. You have my e-mail address, correct? If not, please PM me.

Quote:Original post by wolf
One interesting thing: If I click distribute precision on my GeForce 7900 GTX there is no real difference in speed .. this might have something to do with the fact that the 7900 does not have a "native" 32:32 format.

At one point there was something broken with the GeForce 7 series... in particular, it was reporting a format as filterable that really wasn't. I'll have to look into that in more detail at some point...

Quote:Original post by wolf
How does Summed-Area VSM work then with 16:16:16:16 render targets?

Probably very badly... SAVSMs really need doubles, although they can be "made to work" decently with 4xfp32 as the demo shows.
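
To clarify why SATs are attractive at all: they let you average the moments over an arbitrary filter rectangle with just four fetches. A rough D3D10-style HLSL sketch (names are illustrative, and a real implementation also has to reconstruct the distributed-precision values when doubles aren't available):

// Average the two moments over the filter rectangle [texMin, texMax] (in
// texture coordinates) using a summed-area table. Each SAT texel holds the
// sum of all texels above and to the left of it, so any rectangle sum is
// four corner fetches.
float2 SampleMomentsSAT(Texture2D<float4> satTex, SamplerState pointSamp,
                        float2 texMin, float2 texMax, float2 texSize)
{
    float2 sum = satTex.SampleLevel(pointSamp, texMax, 0).xy
               - satTex.SampleLevel(pointSamp, float2(texMin.x, texMax.y), 0).xy
               - satTex.SampleLevel(pointSamp, float2(texMax.x, texMin.y), 0).xy
               + satTex.SampleLevel(pointSamp, texMin, 0).xy;

    // Divide by the rectangle area in texels to get the average moments.
    float2 areaInTexels = (texMax - texMin) * texSize;
    return sum / (areaInTexels.x * areaInTexels.y);
}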

Quote:Original post by wolf
What does light bleeding reduction do?

It uses the function that I sent you earlier and described a bit in this thread... it just lops off one tail of the distribution, with artist-editable aggressiveness (see the sketch below).
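
A minimal HLSL sketch - linstep isn't a built-in, so it's defined here, and the names are mine rather than from the demo:

float linstep(float lo, float hi, float v)
{
    return saturate((v - lo) / (hi - lo));
}

// Light bleeding reduction: remap p_max so that everything below an
// artist-chosen threshold is clipped to fully shadowed, cutting off the
// tail of the distribution. Larger 'amount' is more aggressive (darker
// penumbrae) but removes more bleeding.
float ReduceLightBleeding(float pMax, float amount)
{
    return linstep(amount, 1.0, pMax);
}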

Quote:Original post by wolf
What does softness do?

It varies the minimum filter width. For PCF, this means taking more samples which is O(n^2). For standard VSM it means increasing the size of the separable blur (although this isn't implemented in that version of the demo - it just does a LOD bias which doesn't look very good ;)). In the D3D10 demo that I have, this is implemented correctly and costs O(n) due to the separable blur. For SAVSM this just literally means clamping the minimum size of the filter rectangle, which is O(1).
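
To illustrate the O(n) point: a separable blur does n taps in a horizontal pass and n in a vertical pass instead of n^2 for a full 2D kernel. A sketch of the horizontal pass in D3D10-style HLSL (BLUR_RADIUS and the resource names are illustrative):

// One pass of a separable box blur over the moments (horizontal shown; the
// vertical pass is identical with the offset in y).
#define BLUR_RADIUS 4

Texture2D<float4> srcTex;
SamplerState samp;
float2 texelSize;   // 1 / shadow map resolution

float4 BlurMomentsH(float2 uv : TEXCOORD0) : SV_Target
{
    float2 sum = 0.0;
    // 2n+1 taps per pass instead of (2n+1)^2 for a full 2D kernel - this is
    // what makes widening the minimum filter width O(n) rather than O(n^2).
    [unroll]
    for (int i = -BLUR_RADIUS; i <= BLUR_RADIUS; ++i)
        sum += srcTex.SampleLevel(samp, uv + float2(i, 0) * texelSize, 0).xy;
    return float4(sum / (2 * BLUR_RADIUS + 1), 0.0, 0.0);
}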

Quote:Original post by stanlo
Is there any way for VSMs to look nice on older machines? I tried using them with some older hardware, and I couldn't get them to look decent without 32f color channels.

Yeah, VSMs do need high precision. The best I can suggest is to use the tips from my GDC presentation last year (slides available on the NVIDIA developer site). 16-bit precision is acceptable, and distributing it over two components works fairly well for normal VSMs. 8-bit - even if distributed - may be pushing it, but it depends on the depth range of your light. Make sure to clamp that depth range as aggressively as possible if precision is a problem! Also make sure that you're using a linear depth metric, such as distance to the light point (for spot lights) or distance to the light plane (for directional lights).
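
To illustrate the linear depth metric, here's a sketch of the depth-pass pixel shader for a spot light (the names and normalization are mine; clamp minDepth/maxDepth as tightly as your scene allows):

// Depth pass for a spot light: store a linear, normalized distance metric
// plus its square. 'LightInfo' and the variable names are illustrative.
cbuffer LightInfo
{
    float3 lightPos;   // light position in world space
    float  minDepth;   // near end of the clamped light depth range
    float  maxDepth;   // far end - clamp this as tightly as possible
};

float4 StoreMomentsPS(float3 worldPos : TEXCOORD0) : SV_Target
{
    // Linear distance to the light point, rescaled into [0,1].
    float depth = saturate((distance(worldPos, lightPos) - minDepth) /
                           (maxDepth - minDepth));

    // First two moments: E[x] and E[x^2] are all the Chebyshev test needs.
    return float4(depth, depth * depth, 0.0, 0.0);
}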
Hey Andy,
thanks for the extensive description. I just looked closer today at your new approach. Maybe it is my graphics card, but it seems like the shadow is moving slightly ... it is a kind of shimmering. You can see this near the back bumper of the car. It happens at both 512x512 and 1024x1024, independently of where the light bleeding slider is and also independently of the softness slider ...
Maybe it only happens on my 7900 GTX with 93.71 drivers ... slightly outdated.

- Wolf
Quote:Original post by wolf
thanks for the extensive description. I just looked closer today at your new approach. Maybe it is my graphics card but it seems like the shadow is moving slightly ... it is a kind of shimmering. You can see this near the back bumper of the car.

Almost certainly numeric problems, although they should be mitigated somewhat by increasing the softness, or decreasing the shadow map resolution. Doubles will solve this in the (near) future.

As I mentioned, SAVSMs will become more useful when people really want to do plausible soft shadows, which is still a little ways off, and by then doubles should be supported.

For now, standard hardware-filtered VSMs work really well, especially on the 8000 series! I'll release the D3D10 demo soonish (it'll be in Gems 3 at the very least)...
Actually on further reflection it may not be a precision problem. I remember having some trouble on both ATI cards and G70/NV40 with respect to dynamic flow control depth... IIRC that was causing random blinking blocks and other weirdness, which may be what you're seeing. At one point, merely loading the shader on ATI would instantly reboot the computer!

In any case the control flow depth problems are easy to work around if you're targeting these series of cards; I just didn't bother since I've been concerned mostly with G80 these days :)
Are there any articles on how to reduce light bleeding, or can anyone describe his (or her) working approach?

How severely is performance affected by VSM? GL_RGB32F, a full-screen blur and linear interpolation in the shader run at 30 FPS on my laptop (NVidia 7900 Go), while standard shadow maps with a 24-bit fixed-point depth buffer and the free PCF run at > 200 FPS.

Another thing I noticed is that when using GL_RGBA16F and storing the moments in two components to enhance precision, I see way more artefacts with GL_LINEAR filtering (like "random" white pixels) than when doing linear interpolation in the shader with GL_NEAREST. Sounds like a bug, or are these expected precision problems?

Thanks!
Quote:Original post by krausest

Another thing I noticed is that when using GL_RGBA16F and storing the moments in two components to enhance precision, I see way more artefacts with GL_LINEAR filtering (like "random" white pixels) than when doing linear interpolation in the shader with GL_NEAREST. Sounds like a bug, or are these expected precision problems?

If you're using NVIDIA cards (7 series or lower - I can't speak for the 8 series), then this is "normal" behaviour. AFAIK, NVIDIA cards interpolate using the bit depth of the texture, so an fp16 texture is filtered at fp16 precision, causing the artifacts. I sure hope that's changed for the 8 series. I don't believe recent ATI cards do this, but don't quote me on that.
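
For reference, the shader-side workaround krausest mentions - point sampling and doing the bilinear weights yourself at full shader precision - looks something like this (HLSL-flavoured sketch, names illustrative; the same idea works in GLSL with GL_NEAREST):

// Manual bilinear filtering of the moments at fp32 shader precision,
// sidestepping the low-precision fixed-function fp16 filtering.
float2 BilinearMoments(Texture2D<float4> tex, SamplerState pointSamp,
                       float2 uv, float2 texSize)
{
    float2 st   = uv * texSize - 0.5;
    float2 base = floor(st);
    float2 f    = st - base;   // fractional bilinear weights, computed at fp32

    float2 uv00 = (base + 0.5) / texSize;
    float2 uv10 = uv00 + float2(1, 0) / texSize;
    float2 uv01 = uv00 + float2(0, 1) / texSize;
    float2 uv11 = uv00 + float2(1, 1) / texSize;

    float2 m00 = tex.SampleLevel(pointSamp, uv00, 0).xy;
    float2 m10 = tex.SampleLevel(pointSamp, uv10, 0).xy;
    float2 m01 = tex.SampleLevel(pointSamp, uv01, 0).xy;
    float2 m11 = tex.SampleLevel(pointSamp, uv11, 0).xy;

    return lerp(lerp(m00, m10, f.x), lerp(m01, m11, f.x), f.y);
}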

This issue is the main reason I had to dump VSM as a general solution (at least for now). I really want to use VSM, but it has serious problems with fp16-based formats, and that's all most (NVIDIA) cards can do respectably these days. Add a large light range to the equation (for city levels, etc.) and it gets worse. If someone solves this it would be much appreciated (free beer? :) ).

Yeah, as noted, VSM is only usable with fp16 for fairly small lights. The GeForce 8 series supports full filtering of fp32 textures and thus has *no* precision problems. Quite a joy to use :)

Regarding speed, on the 8800 VSM is *way* faster than an equivalent PCF implementation. It runs on the order of 600+fps in D3D10 even with gigantic blurs.

Regarding light bleeding, check out the Beyond3D thread linked earlier in this thread by Pragma. It discusses a simple and pretty-much free way of reducing light bleeding. Of course degenerate cases can be constructed, but I find that it produces quite good and acceptable results in practice.
Quote:Original post by AndyTX
Regarding light bleeding, check out the Beyond3D thread linked earlier in this thread by Pragma. It discusses a simple and pretty-much free way of reducing light bleeding. Of course degenerate cases can be constructed, but I find that it produces quite good and acceptable results in practice.

Thanks for your replies so far!
Just to leave no doubt: You propose
p = smoothstep(threshold, 1.0, p), where p is the upper bound from Chebyshev's inequality?


Actually I've been using "linstep" lately (as in the snippet earlier in the thread), to try to maintain the original shape of the falloff function, but smoothstep will work fine too. You can change the falloff function as desired: the point is to clip off the tail to avoid light bleeding.
