Some thoughts about renderers and deferred shading

Hello, right now I have a typical forward renderer with multipass lighting. When my current project is finished (it won't take long, the deadline is in a couple of weeks), I can start rethinking the whole thing. I'm looking at all the options now, including deferred shading. Reading about it, I noticed three real disadvantages: alpha blending, fillrate, and restricted shaders. I can live with the first two (the fillrate can be reduced a little by using scissor tests for lights, as in the sketch below), but the last one keeps me thinking. After all, shouldn't artists be able to create custom shaders? I thought about some kind of visual shader fragment linking, like U3 does (given that artists won't touch the actual shader code unless they really have to :) ).

This would be a great feature, shaders as assets, but as far as I can see it just doesn't fit with deferred shading, since at the 2D stage one cannot distinguish the source geometry. So now I wonder which way to go. Deferred shading really shines in terms of lighting complexity, but custom shaders are nice too. I'm not sure if there is a way to add flexibility to the deferred shading uber-shader pass. I would really miss this feature, because then, if I want to insert some new effect, I have to modify the renderer, no matter which effect. For example, with the traditional forward renderer I could easily insert some refracting/reflecting water with a Fresnel term, and I wouldn't have to touch the renderer. With deferred shading that's an entirely different story. Your thoughts?
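To be concrete about the scissor idea: I mean projecting each light's bounding volume to a screen rectangle and clipping the lighting pass to it. A minimal OpenGL sketch, where Light, computeScreenBounds(), and drawLightPass() are hypothetical helpers:

    #include <GL/gl.h>

    // A minimal sketch: clip a light's additive pass to its screen rectangle.
    void renderLightScissored(const Light& light) {
        GLint x, y;
        GLsizei w, h;
        computeScreenBounds(light, &x, &y, &w, &h); // project bounding sphere
        glEnable(GL_SCISSOR_TEST);
        glScissor(x, y, w, h);  // fragments outside the rect are rejected early
        drawLightPass(light);   // additively blend this light's contribution
        glDisable(GL_SCISSOR_TEST);
    }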
~dv();
My thinking about deferred shading: it's a hack based on current hardware limitations. It will likely never be used in real applications, or only on a very limited basis, and will eventually die out.

My reasons are that, first, there are many ways to deal with lights other than deferred shading, such as per-vertex lights, environment lighting, global illumination, etc. In most real cases, a single vertex of an object won't be affected by that many lights... 3 or 4? Maybe 8...

Also, since each light should cast a shadow, deferred shading can't really help matters here, and in fact makes it a lot harder. What's the point of having a lot of lights if they can't cast shadows?

Not having a robust alpha blending solution is a huge disadvantage, one many developers are not going to want to deal with.

Also, as hardware and pixel processing speeds improve, more and more lights can be done in a single pass, and faster... eventually a per-pixel light will be as cheap as a vertex light.
As far as I can see, deferred shading is the most incredible technique for rendering in general. It simplifies the rendering process a lot, and most of its limitations can be avoided. There is an article in Game Programming Gems 6 that presents tips for reducing the fillrate and for cutting the memory footprint to around 13% of the memory used by the naive "store everything" solution.
I can't see why you say that shaders are restricted under deferred shading. In fact, I think it's the other way around: they are enhanced. Deferred shading (DS) and forward rendering (FR) use the same principle for shader management.
In a forward renderer, visualization goes like this:

- Set VS1
- Set PS1
- Render objects using those shaders

With deferred shading:
- Set VS1
- Set newPS (only stores data into the G-buffer)
- Render objects
- Set newVS (a trivial pass-through)
- Set PS1
- Render a full-screen pass

As you can see, VS1 and PS1 are still used; only newPS and newVS are inserted, and those two are very simple. The only change needed for PS1 to work is that the geometric data is fetched from the G-buffer instead of coming from the vertex shader.
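In engine pseudocode (setVertexShader, setPixelShader, draw, and drawFullScreenQuad are hypothetical stand-ins for whatever your API provides), the two schemes look like this:

    // Forward, multipass: geometry is resubmitted once per light.
    for (Object& obj : scene)
        for (Light& light : lights) {
            setVertexShader(VS1);
            setPixelShader(PS1);
            draw(obj, light);
        }

    // Deferred: geometry is submitted once; each light is a 2D pass.
    setVertexShader(VS1);
    setPixelShader(newPS);       // writes position/normal/material to the G-buffer
    for (Object& obj : scene)
        draw(obj);
    setVertexShader(newVS);      // trivial pass-through for a full-screen quad
    setPixelShader(PS1);         // now reads its inputs from the G-buffer
    for (Light& light : lights)
        drawFullScreenQuad(light);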
And the good news is that it avoids lighting pixels that are going to be discarded, and it handles a huge number of passes without overwhelming the graphics pipeline with the scene geometry over and over. Remember that current-generation games don't have very complex scenes, but next-generation ones are going to be very, very complex, and will need realistic effects that take several passes to render correctly (HDR, depth of field, heat distortion, volumetric fog, etc.)

Quote:
Also, as hardware and pixel processing speeds improve, more and more lights can be done in a single pass, and faster... eventually a per-pixel light will be as cheap as a vertex light.


As hardware improves, so does the demand for more complicated scenes with higher numbers of materials and polygons, more lights, and so on.

My thought is that Deferred Shading is going to be the standard in the next generation.

Quote:
This would be a great feature, shaders as assets, but as far as I can see it just doesn't fit with deferred shading, since at the 2D stage one cannot distinguish the source geometry.


In DS, the pixel shader of the illumination phase runs as a 2D post-process, but it has all the geometry data of the pixel, since that is stored in the G-buffer. If you need to change the geometry, just do it in the vertex shader of the geometry phase, the one that stores the geometry data into the G-buffer.
DS doesn't restrict anything; rather, it lets you use more complex shaders. For example, parallax mapping can be implemented just once in the geometry phase, while storing data into the G-buffer, and every light executed in the illumination phase will use the parallax-corrected data without redoing the ray-marching displacement (in the steep parallax mapping case).
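To make "all the geometry data of the pixel" concrete, here is one plausible G-buffer layout, written as a plain struct for illustration (real layouts vary; each member would map to channels of an MRT render target):

    struct GBufferTexel {
        float         depth;       // RT0: depth, or a full view-space position
        float         normal[3];   // RT1: surface normal (already parallax-corrected)
        unsigned char albedo[4];   // RT2: diffuse color + alpha
        unsigned char material[4]; // RT3: specular power/intensity, material ID, ...
    };
    // The illumination pass reads these per pixel instead of receiving
    // interpolated vertex shader outputs, so per-pixel work done once in
    // the geometry phase (e.g. parallax) is reused by every light.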
(Pardon me for being a bit of a downer in this post...)

As great as deferred shading sounds, e.g. minimal vertex usage, hardly any batching overhead, with some optimizations low fillrate and memory usage, etc., there is a big, big, big, BIG catch behind it:

No.

Anti.

Aliasing.

Right now, all the techniques for deferred shading + AA completely suck in one way or another, and the AA that the hardware does beats the pants off all of them. While it may seem like something that can be trivially thrown away, antialiasing vastly improves the image quality of a scene, even at high resolutions (I play WoW at, effectively, 1600x1200, and still notice shimmering on the aliased tree leaves due to alpha testing), and no solution to DS+AA that I've seen can match that kind of image quality, even with a nosedive in performance.

Not to mention that the small vertex/batch advantage quickly goes out the window if you include something like shadowing (either volumes or maps): when making, say, shadow maps, you have to re-render all of the verts and also reset all of the matrices for each object, ping-pong between render targets, reset shaders for generating SMs and the main scene, and so on. That's even ignoring the fact that the low-vertex argument is just plain unappealing considering how much vertex throughput modern GPUs have.

The only good solution that I've seen so far is semi-deferred shading, which SimmerD is using in his game-in-development, Ancient Galaxy. It works fairly well with AA from what I've seen, and has some nice little perks, such as one-time calculation of Fresnel values for specular/reflectance coefficients, and one-time normal calculation when using normal maps (both of which aren't super trivial to do, and otherwise have to be done per light). On top of that, since it involves rendering the scene geometry 'on top of' the G-buffers, you can get the desired material coherence. The downside is that you lose the vertex/batch benefits, but as I mentioned above re: shadows, that's something you have to live with already anyway, so the extra batches on top of that aren't quite as significant. I still haven't worked with it myself, but it's something that I would like to try out at some point after ironing out the forward renderer that I'm working on.
I think I see one problem here.
It is possible to have object-specific shaders in the geometry phase (like the parallax mapping you mentioned), but not in the illumination phase, because by then it is no longer possible to distinguish the geometry. So if you want some Phong over here, a Fresnel term over there, etc., you run into problems.

I don't see missing AA as fatal. AA gets less and less important at higher resolutions anyway, so I wouldn't really miss it.
I don't see AA as a big issue; there would be other ways to deal with that. But the difficulties of shadowing, the problems of alpha blending, and the fact that deferred shading is not really needed for anything are what's important. Do you need 100 lights affecting one vertex? No.

It's a hack that sounds neat but in the end will likely not be used.
1) It's not a "hack"... it's simply a different way of organizing a nested for loop to reduce redundant computation. Think about it, and read some of the original papers.

2) Alpha blending is a "hack" that fails in several instances. Other, more correct translucency techniques work fine with deferred shading. Furthermore, one can still render translucent surfaces with a forward pass after doing the majority of the scene with deferred shading. STALKER does this, IIRC.

3) Memory bandwidth can be a potential issue, but it's not as crippling as one might think. Framebuffers up to and over 1600x1200 work fine on modern hardware, even with complex scenes. (For a rough sense of scale: a fat G-buffer of four RGBA8 targets at 1600x1200 is 1600 x 1200 x 4 bytes x 4 targets, or about 29 MB.) Memory consumption can be an issue as well, but with 512MB cards here now and larger ones coming, that'll go away soon. The key is that the memory traffic is extremely predictable (non-random) and so can be handled very efficiently.

4) By "no-anti-aliasing", I think you mean no hardware multisampling. While true, supersampling with methods like jittered grids and quincunx-like stuff work great (of course at the cost of more memory). The other advantage of these methods is that they apply to the whole scene, shaders and all (not just depth discontinuities). Adaptive methods are also possible, but those probably won't be necessary in the long term.

5) I don't know what you mean by shadows causing trouble. They cause *less* trouble than in a forward renderer, since a shadow map can be generated and thrown out after each light has been accumulated. The problems here with changing shaders, etc. are no different than for a forward renderer. Shadow maps are no more expensive with deferred shading.

6) Different surface shaders are also not a problem on modern hardware. This is a perfect case for dynamic branching, as it is extremely coherent. Using something like libsh (or Cg, to a lesser extent), the "jumbo shader" can even be generated from the smaller shaders, which avoids having to change shaders per object/light.

Quote:
Also, as hardware and pixel processing speeds improve, more and more lights can be done in a single pass, and faster... eventually a per-pixel light will be as cheap as a vertex light.

That's not the point. The point is that deferred shading has a complexity of O(G+L) while forward rendering is O(G*L). The more complex scenes get, the *worse* that makes forward rendering look. Sure, you can bound L by only rendering a few contributing lights per object, but in a dynamic scene that forces a recomputation of light contributions every frame... not something we want for a "complex" scene containing hundreds or even thousands of lights and even more geometry.
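To put made-up but plausible numbers on that asymptotic argument:

    #include <cstdio>

    int main() {
        const long G = 1000;  // hypothetical: visible geometry batches
        const long L = 100;   // hypothetical: visible lights
        std::printf("forward  (G*L): %ld lit passes\n", G * L);  // 100000
        std::printf("deferred (G+L): %ld passes\n",     G + L);  // 1100
        return 0;
    }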

I as well suspect that deferred shading will become more popular in the future, not less. Most of the initial problems with it are already gone, and the remaining few are more due to hardware being designed around forward rendering than anything.
As for the anti-aliasing, can't you render the scene into a large render target and just scale the final image down to screen size?
You'd only have to preserve the width/height ratio to avoid distortion.

I haven't done this myself yet, but I will use it in my upcoming engine.
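The scale-down step itself is just a box-filter resolve. A minimal CPU-side sketch for a 2x ordered-grid supersample, on a single float channel for brevity (on the GPU, a bilinear-filtered blit of the large target gives you much the same thing):

    // Average each 2x2 block of a (2W x 2H) buffer down to W x H.
    void downsample2x(const float* src, float* dst, int W, int H) {
        for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x) {
                const float* p = src + (2 * y) * (2 * W) + (2 * x);
                dst[y * W + x] = 0.25f * (p[0] + p[1] + p[2 * W] + p[2 * W + 1]);
            }
    }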
http://www.8ung.at/basiror/theironcross.html
Quote:Original post by Ardor
I don't see missing AA as fatal. AA gets less and less important at higher resolutions anyway, so I wouldn't really miss it.


It doesn't, imo. AA at higher resolutions still helps eliminate a lot of shimmering, and I find it quite noticeable, as I mentioned in my post above.

Quote:I don't see AA as a big issue; there would be other ways to deal with that.


There are, but the ones that I've seen just don't work well enough to justify DS. Here are the techniques that I know of:

- Weighted post-process blur (GPU Gems 2, DS in STALKER)
- Supersampling the entire scene
- Deferred AA

Weighted blur I haven't been able to see in action, but the result that I imagine is, basically, a blurry scene, which doesn't antialias worth crap. It'd be like playing on a TV, where aliasing is still quite visible (anyone remember the "PS2 has jaggies" hoopla a while back?). Oh, and it's not a cheap post-process anyway. So, scratch that off the list due to pure ineffectiveness. (FWIW, I haven't been able to find one screenshot of STALKER that has their antialiasing on, but my point still stands.)

Supersampling the entire scene has no reason not to work, but what happens when you turn AA on at resolutions greater than 1024x768? You're suddenly limited to cards that support 4096x4096 textures, and on top of that, at high resolutions like 1600x1200 you're dealing with absolutely horrendous amounts of fillrate and memory bandwidth, since you're rendering to (and storing in memory... and accessing multiple times...) a 3200x2400 render target. In the end, 800x600 with supersampling on would run slower than 1600x1200 with supersampling off, and it creates lower image quality due to the downsizing that occurs, as opposed to hardware AA, which can be enabled at a very meager cost nowadays (e.g. the X1900 cards lose less than 20% performance when going from 0x to 4x AA at 1600x1200). And due to the nature of supersampling, you don't even get jittered grids or such, so 4x hardware AA at polygon edges still looks better than what supersampling could do.

And finally Deferred AA, which is an attempt that I made to duplicate hardware AA. It only works for large objects (i.e. ones that cover more than about 2x2 pixels), which addresses only half the reason why AA should be used, the other half being sub-pixel objects that blink in and out as they or the camera move. On top of that, the performance and memory cost, like supersampling's, absolutely stink. In addition to the main G-buffers involved, the backbuffer needs to store an AA'd version of the scene, so you have to have another render target to render the main scene into (more memory...) as well as an extra depth buffer (more memory...). The performance isn't very good either (extra rendering of the scene to hardware-AA'd buffers plus a not-very-cheap post-process effect), so even that can be scratched off the list.
Antialiasing is indeed one of the greatest drawbacks of DS, but it is not as hard to approximate as you suggest. I have found that using a simple edge-detection filter and then just blurring those few edge pixels with a 3x3 kernel produces very convincing results without sacrificing performance.
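Roughly like this, shown as single-channel CPU code for clarity (the depth-difference test and the threshold value are illustrative choices; a real implementation would run as a post-process shader and could also compare normals):

    #include <cmath>  // std::fabs

    // Blur only pixels whose depth differs sharply from a neighbor's
    // (i.e. geometry edges), leaving interior pixels perfectly crisp.
    void edgeBlurAA(const float* depth, const float* src, float* dst,
                    int W, int H, float threshold) {
        for (int y = 1; y < H - 1; ++y)
            for (int x = 1; x < W - 1; ++x) {
                const int i = y * W + x;
                const bool edge =
                    std::fabs(depth[i] - depth[i - 1]) > threshold ||
                    std::fabs(depth[i] - depth[i + 1]) > threshold ||
                    std::fabs(depth[i] - depth[i - W]) > threshold ||
                    std::fabs(depth[i] - depth[i + W]) > threshold;
                if (!edge) { dst[i] = src[i]; continue; }
                float sum = 0.0f;  // 3x3 box blur on edge pixels only
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx)
                        sum += src[i + dy * W + dx];
                dst[i] = sum / 9.0f;
            }
    }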

