
Some thoughts about renderers and deferred shading



#1 dv   Members   -  Reputation: 101


Posted 04 June 2006 - 11:53 AM

Hello, right now I have a typical forward renderer with multipass lighting. When my current project is finished (won't take long, the deadline is in a couple of weeks), I can start re-thinking the whole thing. I'm looking at all the options now, including deferred shading. Reading about it, I noticed three real disadvantages: alpha blending, fillrate, and restricted shaders. I can live with the first two (the fillrate can be dealt with somewhat using scissor tests for lights), but the last one keeps me thinking.

After all, shouldn't artists be able to create some custom shaders? I thought about some kind of visual shader fragment linking, like U3 does (given that artists won't touch the actual shader code unless they really have to :) ). Shaders as assets would be a great feature, but as far as I can see it just doesn't fit with deferred shading, since at the 2D stage one cannot distinguish the source geometry.

So now I wonder which way to go. Deferred shading really shines in terms of lighting complexity, but custom shaders are nice too. I'm not sure if there is a way to add flexibility to the deferred shading uber-shader pass. I would really miss this feature, because then, if I want to insert some new effect, I have to modify the renderer, no matter what the effect is. For example, with the traditional forward renderer I could easily insert some refracting/reflecting water with a Fresnel term without having to touch the renderer. With deferred shading that's an entirely different story. Your thoughts?


#2 Matt Aufderheide   Members   -  Reputation: 99


Posted 04 June 2006 - 01:05 PM

My thinking about deferred shading: it's a hack based on current hardware limitations. It will likely never be used in real applications, or only on a very limited basis, and will eventually die out.

My reasons are, first, that there are many ways to deal with lights other than deferred shading, such as vertex-specific lights, environment lighting, global illumination, etc. In most real cases, a single vertex or object won't be affected by that many lights... 3 or 4? Maybe 8...

Also, since each light should cast a shadow, deferred shading can't really help matters here, and in fact makes it a lot harder. What's the point of having a lot of lights if they can't cast shadows?

Not having a robust alpha blending solution is a huge disadvantage, one many developers are not going to want to deal with.

Also, as hardware and pixel processing speeds improve, more and more lights can be done in a single pass, and faster... eventually a per-pixel light will be as cheap as a vertex light.

#3 CaossTec   Members   -  Reputation: 143


Posted 04 June 2006 - 03:48 PM

As far as I can see, deferred shading is the most incredible technique for dealing with rendering in general. It simplifies the rendering process a lot, and most of its limitations can be avoided. There is an article in Graphic Programming Gems 6 that presents tips for reducing the fillrate and memory footprint to as little as 13% of the memory used by the "store everything" solution.
I can't see why you are saying that shaders are restricted under deferred shading. In fact I think it's the other way around: they are enhanced. Deferred shading (DS) and forward rendering (FR) use the same principle for shader management.
On a forward renderer the visualization is done like this:

- Set VS1
- Set PS1
- Render the objects using those shaders

On the deferred shading side:

- Set VS1
- Set newPS (only stores data in the G-Buffer)
- Render the objects
- Set newVS (super simple)
- Set PS1
- Render the illumination as a 2D pass over the G-Buffer

As you can see, VS1 and PS1 are still used; only newPS and newVS are inserted, and those two are very simple. The only change needed for PS1 to work is that the geometric data is obtained from the G-Buffer instead of from the vertex shader.
And the good news is that it avoids lighting pixels that are going to be discarded, and handles a huge number of passes without overwhelming the graphics pipeline with the scene geometry over and over. Remember that current-generation games don't have very complex scenes, but next-generation ones are going to be very, very complex and will need realistic effects that take several passes to render correctly (HDR, depth of field, heat distortion, volumetric fog, etc.).
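To make the above concrete, here is a rough HLSL sketch of the two pixel shaders involved: the geometry-phase "newPS" that only writes the G-Buffer, and the illumination-phase shader that reads it back in a 2D pass. The G-Buffer layout and every name here are assumptions made up for illustration, not taken from any particular engine:

// Hypothetical G-Buffer layout: RT0 = albedo, RT1 = world-space normal, RT2 = linear depth.
struct GBufferOut
{
    float4 Albedo : COLOR0;   // diffuse colour
    float4 Normal : COLOR1;   // world-space normal packed into [0,1]
    float4 Depth  : COLOR2;   // linear view-space depth
};

// "newPS": geometry phase, only stores data, does no lighting.
GBufferOut GBufferPS(float2 uv        : TEXCOORD0,
                     float3 normalWS  : TEXCOORD1,
                     float  viewDepth : TEXCOORD2,
                     uniform sampler2D diffuseMap)
{
    GBufferOut o;
    o.Albedo = tex2D(diffuseMap, uv);
    o.Normal = float4(normalize(normalWS) * 0.5f + 0.5f, 0.0f);
    o.Depth  = float4(viewDepth, 0.0f, 0.0f, 0.0f);
    return o;
}

// "PS1": illumination phase, run as a 2D pass over the G-Buffer, e.g. once per light.
// lightDirWS is the direction the (directional) light shines in.
float4 LightingPS(float2 uv : TEXCOORD0,
                  uniform sampler2D gAlbedo,
                  uniform sampler2D gNormal,
                  uniform float3 lightDirWS,
                  uniform float3 lightColor) : COLOR0
{
    float3 albedo = tex2D(gAlbedo, uv).rgb;
    float3 n      = tex2D(gNormal, uv).xyz * 2.0f - 1.0f;
    float  ndotl  = saturate(dot(n, -lightDirWS));
    return float4(albedo * lightColor * ndotl, 1.0f);
}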

Quote:

Also, as hardware and pixel processing speeds improve, more and more lights can be done in single passes and faster... eventually doing a per pixel light will be as cheap as a vertex light.


As the hardware improves, so does the need for more complicated scenes with higher numbers of materials and polygons, more lights, and so on.

My thought is that deferred shading is going to be the standard for the next generation.

Quote:

Shaders as assets would be a great feature, but as far as I can see it just doesn't fit with deferred shading, since at the 2D stage one cannot distinguish the source geometry.


In DS, the pixel shader of the illumination phase runs as a 2D post-process, but it has all the geometric data for the pixel, since that is stored in the G-Buffer. If you need to change the geometry, just do it in the vertex shader of the geometry phase, the one that stores the geometry data into the G-Buffer.
DS doesn't restrict anything; rather, it allows you to use more complex shaders, since it's possible to implement parallax mapping just once in the geometry phase while storing the data in the G-Buffer, and all the lights executed in the illumination phase will then use the parallax data without redoing the ray-marched displacement (in the steep parallax mapping case).

#4 Cypher19   Members   -  Reputation: 768


Posted 04 June 2006 - 07:20 PM

(Pardon me for being a bit of a downer in this post...)

As great as deferred shading sounds (minimal vertex usage, virtually no batching, low fillrate and memory usage with some optimizations, and so on), there is a big, big, big, BIG catch behind it:

No.

Anti.

Aliasing.

Right now, all techniques for deferred shading + AA completely suck in one way or another, and the AA that the hardware does just beats the pants off of all of them. While it may seem like something that can be trivially thrown away, antialiasing vastly improves the image quality of a scene, even at high resolutions (I play WoW at, effectively, 1600x1200, and still notice shimmering on the aliased tree leaves due to alpha testing), and no solution to DS+AA that I've seen can match that kind of image quality, even with a nosedive in performance.

Not to mention that the small vertex/batch benefit quickly goes out the window if you include something like shadowing (either volumes or maps): when making, say, shadow maps you have to re-render all of the verts and also reset all of the matrices for each object, ping-pong between render targets, reset shaders for generating the SMs and the main scene, and so on. That's even ignoring the fact that the low-vertex argument is just plain unappealing considering how much vertex throughput modern GPUs have.

The only good solution that I've seen so far is semi-deferred shading, which SimmerD is using in his game-in-development, Ancient Galaxy. It works fairly well with AA from what I've seen, and has some nice little things about it, such as one-time calculation of Fresnel values for specular/reflectance coefficients, and normal calculation when using normal maps (both of which aren't super trivial to do, and otherwise have to be done per-light). On top of that, since it involves rendering the scene geometry 'on top' of the G-buffers, you can get the desired material coherence. The downside is that you lose the vertex/batch benefits, but as I mentioned above re: shadows, it's something you have to live with already anyway, so the extra batches on top of that aren't quite as significant. I still haven't worked with it myself, but it's something that I would like to try out at some point after ironing out the forward renderer that I'm working on.

#5 Ardor   Members   -  Reputation: 122


Posted 05 June 2006 - 01:11 AM

I think I see one problem here.
It is possible to have object-specific shaders in the geometry phase (like the parallax mapping you mentioned), but not in the illumination phase, because it won't be possible to distinguish the geometry by then. So if you want some Phong here, a Fresnel term over there, etc., you run into problems.

I don't see missing AA as fatal. AA gets less and less important at higher resolutions anyway, so I wouldn't really miss it.

#6 Matt Aufderheide   Members   -  Reputation: 99


Posted 05 June 2006 - 03:27 AM

I don't see AA as a big issue; there would be other ways to deal with that. But the difficulties of shadowing, the problems with alpha blending, and the fact that deferred shading is not really needed for anything are what's important. Do you need 100 lights affecting one vertex? No.

It's a hack that sounds neat but in the end will likely not be used.

#7 AndyTX   Members   -  Reputation: 802


Posted 05 June 2006 - 05:01 AM

1) It's not a "hack"... it's simply a different way of organizing a nested for loop to reduce redundant computation. Think about it, and read some of the original papers.

2) Alpha blending is a "hack" that fails in several instances. Other, "more correct" translucency techniques work fine with deferred shading. Furthermore, one can still do translucent surfaces using a forward renderer after doing the majority of the scene with deferred shading. STALKER does this IIRC.

3) Memory bandwidth can be a potential issue, but it's not as crippling as one might think. Framebuffers of up to and over 1600x1200 work fine on modern hardware, even with complex scenes. Memory consumption can be an issue as well, but with 512MB cards here now and larger ones coming, that'll go away soon. The key is that the memory traffic is extremely predictable (non-random) and so can be implemented very efficiently.

4) By "no anti-aliasing", I think you mean no hardware multisampling. While true, supersampling with methods like jittered grids and quincunx-like patterns works great (of course at the cost of more memory). The other advantage of these methods is that they apply to the whole scene, shaders and all (not just depth discontinuities). Adaptive methods are also possible, but those probably won't be necessary in the long term.

5) I don't know what you mean by shadows causing trouble. They cause *less* trouble than in a forward renderer, since a shadow map can be generated and thrown out after a light has been accumulated. The problems here with changing shaders, etc. are no different than for a forward renderer. Shadow maps are no more expensive with deferred shading.

6) Different surface shaders are also not a problem on modern hardware. This is a perfect case for dynamic branching, as it is extremely coherent. Using something like libsh (or Cg to a lesser extent), the "jumbo shader" can even be created from the smaller shaders, avoiding the need to change shaders per object/light.
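To illustrate point 6, here is a rough ps_3_0 sketch of branching on a per-pixel material ID in the lighting pass. Packing the material ID into the alpha channel of the normal target, and all the names here, are assumptions made up for illustration:

// Assumes the geometry phase wrote a per-pixel material ID into the alpha
// channel of the normal target. The branch is cheap on ps_3_0 hardware
// because neighbouring pixels usually share the same material.
float4 LightingPS(float2 uv : TEXCOORD0,
                  uniform sampler2D gAlbedo,
                  uniform sampler2D gNormal,
                  uniform float3 lightDirWS,
                  uniform float3 lightColor) : COLOR0
{
    float4 albedo  = tex2D(gAlbedo, uv);
    float4 normalS = tex2D(gNormal, uv);
    float3 n       = normalS.xyz * 2.0f - 1.0f;
    int    matId   = (int)(normalS.w * 255.0f + 0.5f);

    float3 result = 0.0f;
    if (matId == 1)
    {
        // e.g. wrap lighting for foliage
        result = albedo.rgb * lightColor * saturate(dot(n, -lightDirWS) * 0.5f + 0.5f);
    }
    else
    {
        // default: plain Lambert
        result = albedo.rgb * lightColor * saturate(dot(n, -lightDirWS));
    }
    return float4(result, 1.0f);
}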

Quote:

Also, as hardware and pixel processing speeds improve, more and more lights can be done in single passes and faster... eventually doing a per pixel light will be as cheap as a vertex light.

That's not the point - the point is that deferred shading has a complexity of O(G+L) while forward rendering is O(G*L). The more complex scenes get, the *worse* that makes forward rendering look. Sure you can bound L by only rendering a few contributing lights per object, but in a dynamic scene that's forcing a recomputation of light contributions every frame... not something we want for a "complex" scene containing hundreds or even thousands of lights and even more geometry.

I as well suspect that deferred shading will become more popular in the future, not less. Most of the initial problems with it are already gone, and the remaining few are more due to hardware being designed around forward rendering than anything.

#8 Basiror   Members   -  Reputation: 241


Posted 05 June 2006 - 05:11 AM

As for the anti-aliasing, can't you render the scene into a large render target and just scale the final image down to screen size?
You only have to take care of the width/height ratio to avoid distortion.

I haven't done this myself yet, but I will use it in my upcoming engine.

#9 Cypher19   Members   -  Reputation: 768


Posted 05 June 2006 - 05:30 AM

Quote:
Original post by Ardor
I don't see missing AA as fatal. AA gets less and less important at higher resolutions anyway, so I wouldn't really miss it.


It doesn't, imo. AA at higher resolutions still helps eliminate a lot of shimmering, and I find it quite noticeable, as I mentioned in my post above.

Quote:
I don't see AA as a big issue; there would be other ways to deal with that.


There are, but the ones that I've seen just don't work well enough to justify DS. Here are the techniques that I know of:

-Weighted post process blur (GPUGems2, DS in STALKER)
-Supersampling the entire scene
-Deferred AA

Weighted blur I haven't been able to see in action, but the result that I imagine is, basically, a blurry scene, which doesn't antialias worth crap. It'd be like playing it on a TV, where aliasing is quite visible (anyone remember the "PS2 has jaggies" hoopla a while back?). Oh, and it's not a cheap post-process anyway. So, scratch that off the list due to pure ineffectiveness (FWIW, I haven't been able to find one screenshot of STALKER that has its antialiasing on, but my point still stands).

Supersampling the entire scene has no reason not to work, but what happens when you turn AA on at resolutions greater than 1024x768? You're suddenly limited to cards that support 4096x4096 textures, and on top of that, at high resolutions like 1600x1200 you're having to deal with absolutely horrendous amounts of fillrate and memory bandwidth, since you're rendering to (and storing in memory... and accessing multiple times...) a 3200x2400 render target. In the end, 800x600 with supersampling on would run slower than 1600x1200 with supersampling off and create lower image quality due to the downsizing that occurs, as opposed to hardware AA, which can be enabled at a very meager cost nowadays (e.g. the X1900 cards lose less than 20% performance when going from 0 to 4xAA at 1600x1200). And due to the nature of supersampling, you don't even get jittered grids or such, so 4x hardware AA at polygon edges still looks better than what supersampling could do.

And finally Deferred AA, which is an attempt that I made to duplicate hardware AA. It only works for large objects (i.e. ones that cover more than about 2x2 pixels) which is only half the reason why AA should be used, the other being sub-pixel objects that blink in and out as they or the camera move. On top of that though, the performance and memory cost, like supersampling, absolutely stinks. In addition to the main G-buffers involved, the backbuffer needs to store an AA'd version of the scene, so you have to have another render target to render the main scene (more memory...) as well as an extra depth buffer (more memory...). The performance isn't very good either (extra rendering of the scene to hardware AA'd buffers and a not-very-cheap post process effect) so even that can be scratched off of the list.

#10 CaossTec   Members   -  Reputation: 143


Posted 05 June 2006 - 06:27 AM

Antialiasing is indeed one of the greatest drawbacks of DS, but it is not as hard to simulate as you suggest. I have found that using a simple edge detection filter and then just blurring those few edges with a 3x3 kernel produces very convincing results without sacrificing performance.
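For what it's worth, a filter along those lines might look roughly like the sketch below: depth-based edge detection followed by a 3x3 box blur applied only to edge pixels. The edge measure, the threshold and all the names are arbitrary assumptions, not a description of any particular implementation:

// gScene = the lit scene, gDepth = linear depth from the G-Buffer,
// texelSize = 1.0 / render target resolution.
float4 EdgeBlurPS(float2 uv : TEXCOORD0,
                  uniform sampler2D gScene,
                  uniform sampler2D gDepth,
                  uniform float2 texelSize,
                  uniform float  edgeThreshold) : COLOR0
{
    float dC = tex2D(gDepth, uv).r;
    float dX = tex2D(gDepth, uv + float2(texelSize.x, 0)).r;
    float dY = tex2D(gDepth, uv + float2(0, texelSize.y)).r;

    // crude edge measure: depth difference against two neighbours
    float edge = abs(dX - dC) + abs(dY - dC);
    if (edge < edgeThreshold)
        return tex2D(gScene, uv);          // interior pixel: leave untouched

    // 3x3 box blur on edge pixels only
    float4 sum = 0.0f;
    for (int y = -1; y <= 1; ++y)
        for (int x = -1; x <= 1; ++x)
            sum += tex2D(gScene, uv + float2(x, y) * texelSize);
    return sum / 9.0f;
}

Note that the blur radius here is constant in screen space, which is the drawback raised in the replies below.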


#11 AndyTX   Members   -  Reputation: 802


Posted 05 June 2006 - 06:45 AM

I already mentioned jittered grids, quincunx, etc., all of which work great with deferred shading, raytracing, or whatever (as they make no assumptions about the nature of the aliasing)... they also don't require lots of extra memory storage (as rendering at large sizes and downsampling would). They do require re-rendering the scene with subtly different projection matrices, but that will probably be plenty fast in the future.

#12 wolf   Members   -  Reputation: 848


Posted 05 June 2006 - 06:47 AM

Caosstec: this does not work, because the screenspace blur will be the same size throughout the image ... so objects that are far away would have the same blur as objects that are very near.

AndyTX: I am curious: what do you mean with subtly different projection matrices?

#13 AndyTX   Members   -  Reputation: 802


Posted 05 June 2006 - 07:44 AM

Quote:
Original post by wolf
AndyTX: I am curious: what do you mean with subtly different projection matrices?

Just displacing the framebuffer pixels (jittering)... so really more of a different *post-projection* matrix, but usually it's easiest just to tack it on to projection.
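A minimal HLSL sketch of that kind of jitter applied in the vertex shader, assuming the offset is given in pixels (all names here are made up):

// jitterPixels is the sub-pixel sample offset (e.g. (0.25, -0.25)),
// viewportSize is the render target size in pixels. The offset is scaled
// by w so it survives the perspective divide.
float4 JitteredVS(float3 posOS : POSITION,
                  uniform float4x4 worldViewProj,
                  uniform float2 jitterPixels,
                  uniform float2 viewportSize) : POSITION
{
    float4 posCS = mul(float4(posOS, 1.0f), worldViewProj);

    // one pixel spans 2/viewportSize in NDC
    float2 ndcOffset = jitterPixels * (2.0f / viewportSize);
    posCS.xy += ndcOffset * posCS.w;
    return posCS;
}

Equivalently, the same translation can be folded into the projection matrix itself, which is what "tack it on to projection" amounts to.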

#14 Cypher19   Members   -  Reputation: 768


Posted 05 June 2006 - 07:47 AM

Quote:
Original post by CaossTec
Antialiasing is indeed one of the greatest drawbacks of DS, but they are not as hard to simulate as you point. I had found that using a simple edge detection filter and then just blurring those few edges with a 3x3 kernel produce very convincing results with out sacrifying performance.


That's exactly the first thing I mentioned in my post, and I'll be shocked if your solution looks as good as what hardware AA provides. I honestly think it is the #1 worst solution to AA+DS in existence, and I find it ludicrous that ANYONE in the graphics industry actually takes it seriously, considering the results that a 3x3 blur gives compared to what hardware AA does.

Quote:
AndyTX: I am curious: what do you mean with subtly different projection matrices?


I think he's referring to having two sets of G-buffers that are each screen-resolution size, where the projection matrices each have sub-pixel offsets (I don't know the math behind it, but it'd be worth checking out functions in D3DX like D3DXMatrixPerspectiveOffCenter). It'd be interesting to see how something like quincunx AA would work as a post-process. I never entertained that idea before, and it might work. By the same token though, I don't exactly recall having a superb experience with Quincunx AA back on my old GeForce4. Also, I'm curious as to what you mean, AndyTX, by a jittered grid in this context and how you'd implement it.

Quote:
Just displacing the framebuffer pixels (jittering)... so really more of a different *post-projection* matrix, but usually it's easiest just to tack it on to projection.


But where are you doing that jittering? In between frames? During a post-process?

#15 Basiror   Members   -  Reputation: 241


Posted 05 June 2006 - 08:03 AM

Thinking about the problems we have with antialiasing:

Here is an image of a sample scene; the bottom shows the edges.
The polygons with a brighter surface are nearer to the camera.
Can't you find the edges through another pass and build a map with high values for pixel-shader-driven antialiasing and low values to skip antialiasing there? Like some sort of alpha test.
The gray rectangle underneath represents such an edge map.

You don't have to use extra-large render targets this way, and you can easily skip this process at high resolutions, where aliasing effects are hardly noticeable.


#16 AndyTX   Members   -  Reputation: 802


Posted 05 June 2006 - 08:08 AM

Quote:
Original post by Cypher19
But where are you doing that jittering? In between frames? During a post-process?

It's just brute force super-sampling.

Render totally separate frames with a pixel-offset post-projection (ordered, jittered, whatever) and blend them (the weights can be fixed, gaussian, bilinear, whatever).

Of course this is rather expensive for complex scenes, but it doesn't require extra memory and effectively anti-aliases everything (depth discontinuities, high-frequency textures, shaders, etc). The results should be better than rendering a large image and downsampling (a box filter), and require significantly less memory. However, each sample requires a shaded rendering pass, which could hurt a vertex- or CPU-limited application more than just increasing the framebuffer resolution.
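As a sketch of the combine step, the weighted sum could also be written out as a resolve shader over several hypothetical jittered renders (equal weights here, i.e. a box filter). In practice the alpha-blended accumulation described above is preferable because it avoids keeping all the intermediate targets around; this is just the same sum made explicit:

// Averages four renders of the scene taken with different sub-pixel jitters.
float4 ResolvePS(float2 uv : TEXCOORD0,
                 uniform sampler2D pass0,
                 uniform sampler2D pass1,
                 uniform sampler2D pass2,
                 uniform sampler2D pass3) : COLOR0
{
    float4 c = tex2D(pass0, uv)
             + tex2D(pass1, uv)
             + tex2D(pass2, uv)
             + tex2D(pass3, uv);
    return c * 0.25f;   // unequal (e.g. gaussian-style) weights also work
}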

#17 AndyTX   Members   -  Reputation: 802


Posted 05 June 2006 - 08:11 AM

Quote:
Original post by Basiror
Can't you find the edges through another pass and build a map with high values for pixel-shader-driven antialiasing and low values to skip antialiasing there?

The problem is that eliminating rasterization aliasing requires re-rasterizing, which one can't do in the pixel shader. Ultimately, image-space post-process antialiasing isn't going to produce adequate results.

#18 okonomiyaki   Members   -  Reputation: 548


Posted 05 June 2006 - 08:13 AM

Quote:
Original post by wolf
Caosstec: this does not work, because the screenspace blur will be the same size throughout the image ... so objects that are far away would have the same blur as objects that are very near.


What do you mean by the same blur? Since he is using an edge detection filter, he's only blurring the thin edges of objects, so objects far away will have a slight blur around them, while objects up close will have more blurring because their edges are larger and wider.

I just did some googling on deferred shading (not too experienced with it myself), but this paper describes that anti-aliasing technique in section 3.5.1. I have no practical experience with it, but I don't see why it wouldn't work.

Edit: ah, a lot of people responded before I submitted this... my comment is kind of obsolete now. ignore it.

#19 Cypher19   Members   -  Reputation: 768


Posted 05 June 2006 - 08:19 AM

Quote:
Original post by AndyTX
Quote:
Original post by Cypher19
But where are you doing that jittering? In between frames? During a post-process?

It's just brute force super-sampling.

Render totally separate frames with a pixel-offset post-projection (ordered, jittered, whatever) and blend them (the weights can be fixed, gaussian, bilinear, whatever).

Of course this is rather expensive for complex scenes, but it doesn't require extra memory and effectively anti-aliases everything (depth discontinuities, high-frequency textures, shaders, etc). The results should be better than rendering a large image and downsampling (a box filter), and require significantly less memory. However, each sample requires a shaded rendering pass, which could hurt a vertex- or CPU-limited application more than just increasing the framebuffer resolution.


Am I correct in assuming that that idea is a fair bit like RTHDRIBL's motion blur/AA feature, minus the position updates between re-rendering?

#20 AndyTX   Members   -  Reputation: 802


Posted 05 June 2006 - 09:34 AM

Quote:
Original post by Cypher19
Am I correct in assuming that that idea is a fair bit like RTHDRIBL's motion blur/AA feature, minus the position updates between re-rendering?

It could be... I'm not certain how it is done in that demo (any explanation other than looking through the source?). In any case it's just super-sampling - each fragment will be the composite result of several rasterization passes. All I'm noting is that it can be done without requiring extra memory, and non-box filters can also be implemented to some extent (although probably only linear filters will work due to using alpha blending to composite).



