Deferred shading without MRTs and with MSAA

14 comments, last by jerm 17 years ago
Quote:Original post by Yann L
As AndyTX mentioned, current MSAA is just not usable with any form of deferred shading, unless there somehow was an efficient way to access the pre-resolved samples for all involved components, with full MRT support. We're not there yet, I'm afraid.

I looked into the D3D10 support some more, and you can indeed access the pre-resolved data, but unfortunately I'm not sure that deferred shading can make use of it efficiently. The problem is that you can't get access to any sort of compression flag indicating whether all of the samples in a pixel are the same, which means you are stuck evaluating the BRDF for every sample anyway... so why not just super-sample?

Quote:Original post by Yann L
However, the visual impact of having completely dynamic area soft shadows everywhere is definitely worth it :)

Sounds really nice - any chance we can see some screenshots or get some info on the techniques that you're using?

Quote:Original post by Yann L
Oh, and could ATI *PLEASE* fix the two trillion depth buffer related bugs in their current OpenGL driver already? Working around their bugs is a real PITA.

Haha, you ran into this too? The other major annoyance on ATI is the poor performance of MRT in general. In a deferred shading demo that I wrote a while back even with huge resolutions and lots of geometry it was *still* faster on ATI to render the buffers one by one, whereas on NVIDIA the crossover is legitimately at something like ~1000 polygons.

Quote:Original post by Yann L
Anyway, cheers guys, and thanks for your suggestions !

Sorry they couldn't be any more useful :( Good luck with your project though!
Ditto on what AndyTX said. I've been playing around with deferred rendering for the past month or two, and I have yet to come across an effective way of implementing MSAA (MRT or not). Having only recently got my 8800, I haven't yet played around with retrieving the pre-resolved textures, but I'd suspect the performance would be closer to SSAA than to MSAA.
Quote:Original post by Yann L
...
So I ended up rendering the geometry twice, once to a non-MSAA buffer (with depth and normal data only, using MRT), and a second time to the HDR colour buffer with MSAA. This works fine, but eats up a lot of performance.


It's faster on ATI cards, though. Currently the driver recompiles ALL shaders when you switch from, e.g., one draw buffer to two. Weird, especially since the draw buffers extension was introduced by ATI in the first place. (There's been a thread on opengl.org about this issue.)

Quote:Original post by Yann L
However, the visual impact of having completely dynamic area soft shadows everywhere is definitely worth it :)
...


Care to explain in a little more detail? :)

[Edited by - ze moo on April 29, 2007 5:31:46 AM]
Quote:why not just super-sample


You get the benefits of rotated grid (et al) antialiasing patterns without having to rerender the scene multiple times, e.g. in the case of jittered sub-pixel positions. With just regular super-sampling (i.e. render once to a mega-sized buffer), you get the same quality as square grid AA, and anyone who has seen both knows that rotated grid just knocks square all over the parking lot [grin]
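
To make the rotated-grid point concrete, here is a minimal sketch that counts how many distinct coverage values a vertical edge can produce as it sweeps across one pixel. The sample offsets are illustrative assumptions, not the positions any particular GPU uses, and `coverage_levels` is a hypothetical helper.

```python
# Sample positions inside one pixel, in [0, 1) x [0, 1).
# Both patterns use 4 samples; the exact offsets are illustrative assumptions.
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
ROTATED_GRID = [(0.125, 0.625), (0.375, 0.125), (0.625, 0.875), (0.875, 0.375)]


def coverage_levels(pattern, steps=64):
    """Distinct coverage values seen as a vertical edge sweeps across the pixel.

    Every sample strictly left of the edge position counts as covered.
    """
    levels = set()
    for i in range(steps + 1):
        edge_x = i / steps
        covered = sum(1 for x, _ in pattern if x < edge_x)
        levels.add(covered / len(pattern))
    return sorted(levels)


ordered_levels = coverage_levels(ORDERED_GRID)
rotated_levels = coverage_levels(ROTATED_GRID)
print(ordered_levels)  # [0.0, 0.5, 1.0]
print(rotated_levels)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

With the same four samples, the ordered grid only has two distinct x positions, so a near-vertical edge jumps between three coverage values, while the rotated pattern gives five. The same holds for near-horizontal edges, which is where aliasing tends to be most visible.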

Besides, the driver/GPU can probably handle the AA'd buffer in a smarter manner, in terms of performance, than just falling over and having to deal with a really large buffer.
Quote:Original post by Cypher19
Besides, the driver/GPU can probably handle the AA'd buffer in a smarter manner, in terms of performance, than just falling over and having to deal with a really large buffer.

Yes, of course, your comments are correct. However, Andy's point was about the interaction between user-supplied pixel shaders and the AA buffer. Since there seems to be no (easy?) way of knowing whether some property within an AA sample cell is constant, you cannot optimize your shaders the same way MSAA optimizes over SSAA. In other words, you have to execute your shader once for every subpixel in the AA buffer, which amounts to SSAA as far as the cost of shader invocations is concerned.
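
As a rough CPU-side illustration of that cost argument, the sketch below counts shading evaluations for a tiny hypothetical "frame" of 4-sample pixels. All names (`shade`, `resolve_per_sample`, `resolve_with_uniform_check`) and the G-buffer layout are assumptions made up for the example; the uniformity test emulates by hand what a hardware compression flag would provide, and still has to read every sample to do so.

```python
def shade(sample):
    """Stand-in for an expensive BRDF evaluation (hypothetical)."""
    normal_z, albedo = sample
    return max(0.0, normal_z) * albedo


def resolve_per_sample(pixel_samples):
    """Shade every stored sample, then average: SSAA-like shading cost."""
    colours = [shade(s) for s in pixel_samples]
    return sum(colours) / len(colours), len(colours)


def resolve_with_uniform_check(pixel_samples):
    """Shade once when all samples in the pixel are identical.

    This emulates what a compression flag would tell us for free;
    here the check itself still reads every sample.
    """
    first = pixel_samples[0]
    if all(s == first for s in pixel_samples[1:]):
        return shade(first), 1
    return resolve_per_sample(pixel_samples)


# Hypothetical G-buffer samples stored as (normal_z, albedo).
interior = [(1.0, 0.8)] * 4                               # one surface covers the pixel
edge = [(1.0, 0.8), (1.0, 0.8), (0.5, 0.2), (0.5, 0.2)]   # two surfaces meet

frame = [interior] * 9 + [edge]                           # 10 pixels, 1 on an edge
naive_cost = sum(resolve_per_sample(p)[1] for p in frame)
checked_cost = sum(resolve_with_uniform_check(p)[1] for p in frame)
print(naive_cost, checked_cost)  # 40 13
```

Without any way to detect uniform pixels, the shader runs 40 times for 10 pixels, exactly as super-sampling would. With the check, only the edge pixel pays the per-sample price, which is the saving MSAA normally gets in the forward case.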

Quote:Original post by AndyTX
Quote:Original post by Yann L
Oh, and could ATI *PLEASE* fix the two trillion depth buffer related bugs in their current OpenGL driver already? Working around their bugs is a real PITA.

Haha, you ran into this too?

Oh God, yes... Please don't remind me... I just ranted about this over in the OpenGL forum.

Quote:Original post by AndyTX
Sounds really nice - any chance we can see some screenshots or get some info on the techniques that you're using?

Oh, it's not that complex actually.

It's basically a form of advanced ambient occlusion field technique mixed with PRT. In ShaderX4, there is an article by Kontkanen/Laine about cubemap compressed ambient occlusion fields for contact shadows. Our system uses a similar principle, but encodes the data in a different way.

For each reference object, we're using one set of 3D textures and one cubemap set. The 3D textures span the object's bounding box, and the cubemaps close the box like an exterior skin. Each voxel stores not the spherical cap size and direction (as in the ShaderX4 approach), but an entire SH coefficient set. This way, we can describe the transfer function at each inner point of the object towards the outside, from every possible direction, taking into account all forms of self-shadowing and self-reflection.

The outer cubemap encodes compressed polynomials, much like Kontkanen/Laine did, except that we're again storing outward-going SH coefficients instead of a simple spherical cap.

When rendering, we project the 3D texture/cubemap set from every object onto the environment and compute the light transfer function at each pixel. Since we store an entire SH set at the inner and outer regions of each object, we can easily extend the ambient shadows to include direct and indirect lighting effects. This means that we get not only near-surface contact shadows through the 3D texture, but also full soft shadows through the PRT polynomials on the cubemap set. See it as a kind of "inverted shadowmap" approach, but instead of storing depth values, we store light transfer polynomials.
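
For readers unfamiliar with PRT, the per-pixel evaluation of an SH-encoded transfer function boils down to a dot product between the stored transfer coefficients and the SH projection of the lighting. The sketch below shows this in generic textbook form with only two SH bands (4 coefficients). It is not the encoding described in this post: the band count, the function names, and the unoccluded diffuse transfer are assumptions for illustration; a precomputed field would store vectors with occlusion baked in.

```python
import math

# Real spherical-harmonic basis constants for bands 0 and 1.
Y0 = math.sqrt(1.0 / (4.0 * math.pi))
Y1 = math.sqrt(3.0 / (4.0 * math.pi))


def sh_basis(direction):
    """Evaluate the 4 basis functions for a unit direction (x, y, z)."""
    x, y, z = direction
    return [Y0, Y1 * y, Y1 * z, Y1 * x]


def project_directional_light(direction, intensity=1.0):
    """SH coefficients of a single directional light."""
    return [intensity * b for b in sh_basis(direction)]


def unoccluded_diffuse_transfer(normal):
    """Transfer vector of an unshadowed diffuse surface (clamped cosine lobe).

    Assumption for illustration: a precomputed field would store vectors
    like this per voxel or texel, with shadowing already folded in.
    """
    a0 = math.pi              # cosine-lobe convolution weight, band 0
    a1 = 2.0 * math.pi / 3.0  # cosine-lobe convolution weight, band 1
    return [a * b for a, b in zip([a0, a1, a1, a1], sh_basis(normal))]


def shade(transfer, light):
    """Per-pixel work at render time: one dot product of two SH vectors."""
    return sum(t * l for t, l in zip(transfer, light))


up = (0.0, 0.0, 1.0)
light = project_directional_light(up)
facing = shade(unoccluded_diffuse_transfer(up), light)
sideways = shade(unoccluded_diffuse_transfer((1.0, 0.0, 0.0)), light)
print(facing, sideways)
```

With two bands, the result for a unit directional light is the familiar approximation 0.25 + 0.5 cos θ, so the surface facing the light evaluates to about 0.75 and the perpendicular one to about 0.25. Adding bands sharpens the approximation at the price of more coefficients per stored sample.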

The main idea is to get realistic shadows from many large dynamic objects in a complex GI environment (with area lights, sun and skylight), used in photorealistic architectural visualization.
I've always thought of them as inverted deep shadow maps. Close enough! Sounds interesting.

I'd be curious to know storage requirements for your technique. Kontkanen's technique was memory heavy on its own, without adding a 3D texture of SH coeffs. I always liked the ambient occlusion fields paper, but it always seemed like it was just a bit too impractical for real-time apps.
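
To get a feel for the numbers being asked about, here is a back-of-the-envelope sketch. Every parameter (grid resolution, coefficient count, channel count, 16-bit storage) is a hypothetical choice for illustration; none of these figures come from the posts above.

```python
def sh_volume_bytes(resolution, sh_coefficients, colour_channels, bytes_per_value):
    """Bytes for a cubic 3D texture storing one SH set per voxel."""
    voxels = resolution ** 3
    return voxels * sh_coefficients * colour_channels * bytes_per_value


# Hypothetical configuration: 32^3 voxels, 9 coefficients (3 SH bands),
# 16-bit floats, monochrome versus RGB transfer.
mono = sh_volume_bytes(32, 9, 1, 2)
rgb = sh_volume_bytes(32, 9, 3, 2)
print(mono / 1024, "KiB per object (monochrome)")          # 576.0
print(rgb / (1024 * 1024), "MiB per object (RGB)")         # 1.6875
```

Under these assumptions the volume alone costs roughly half a megabyte per reference object before any compression, and the cost grows with the cube of the resolution, so the grid size dominates the budget.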

This topic is closed to new replies.
