Oh boy, here we go again... when AndyTX finds this thread he'll be able to talk about it at greater length and possibly expand on my post, but here's a rundown of the pros and cons of deferred rendering that I know of, or at least the ones most prominent in my mind.
Pros:
-VERY easy and simple to implement, and can be made fairly fast.
-Very tight bounds on which pixels each light volume affects (i.e. few pixels, if any, are wasted).
-Fairly agnostic to geometric complexity, since the geometry is only drawn once.
-Low number of state changes.
-One-time generation of reasonably complex per-pixel data, e.g. calculation of normals for normal mapping, Fresnel specular coefficients, anisotropic filtering.
-Scales very nicely as the number of lights increases.
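On the "tight bounds" point: in practice that usually means computing a conservative screen-space scissor rect per light so the lighting pass only touches pixels the light volume can reach. A rough sketch of the idea (hypothetical helper, not anyone's actual implementation; assumes a simple pinhole camera looking down +Z and uses a conservative radius-over-nearest-depth approximation rather than an exact sphere projection):

```python
def light_scissor_rect(light_pos, radius, focal_px, screen_w, screen_h):
    """Conservative screen-space scissor rect for a point light.

    light_pos: (x, y, z) in view space, camera looking down +Z.
    focal_px:  focal length expressed in pixels.
    Returns (x0, y0, x1, y1) in pixels, a full-screen rect if the
    light volume reaches behind the camera, or None if off screen.
    """
    x, y, z = light_pos
    near = z - radius
    if near <= 0.0:
        # Light volume touches or crosses the camera plane:
        # fall back to shading the whole screen.
        return (0, 0, screen_w, screen_h)
    # Project the center, then take a conservative pixel radius
    # using the nearest depth of the volume.
    cx = screen_w * 0.5 + x * focal_px / z
    cy = screen_h * 0.5 - y * focal_px / z
    r_px = radius * focal_px / near
    x0 = max(0, int(cx - r_px))
    y0 = max(0, int(cy - r_px))
    x1 = min(screen_w, int(cx + r_px) + 1)
    y1 = min(screen_h, int(cy + r_px) + 1)
    if x0 >= x1 or y0 >= y1:
        return None  # light is entirely off screen
    return (x0, y0, x1, y1)
```

A real renderer would feed the result to the scissor test (or render the light's bounding volume with depth tests) so the per-light shader never runs on unaffected pixels.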
Cons:
-No hardware antialiasing.
-Fairly high memory cost, especially at higher resolutions.
-No hardware antialiasing.
-Difficult, but not impossible, to support a reasonably wide variety of materials.
-No hardware antialiasing.
-High initial cost of G-buffer generation due to the large amount of data that must be generated (the last pro above should be strongly noted in conjunction with this point).
-No hardware antialiasing.
-Very fillrate/memory-bandwidth intensive.
-No hardware antialiasing.
-[This is more of an anti-pro than a con...] If you're doing any kind of shadowing solution, the geometry-complexity argument flies out the window.
-I'm not sure if I mentioned this or not, but there's also no hardware antialiasing.
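To put the memory cons in rough numbers, here's a back-of-the-envelope sketch (the layout is hypothetical, just a typical-looking one: four RGBA8 render targets plus a 32-bit depth/stencil buffer):

```python
def gbuffer_bytes(width, height, targets_bpp=(4, 4, 4, 4), depth_bpp=4):
    """Total G-buffer footprint in bytes for one frame.

    targets_bpp: bytes per pixel of each render target
                 (four RGBA8 targets -> (4, 4, 4, 4)).
    depth_bpp:   bytes per pixel of the depth/stencil buffer.
    """
    return width * height * (sum(targets_bpp) + depth_bpp)

mb = 1024 * 1024
# 1024x768 with 4x RGBA8 + D24S8: 15.0 MB
print(round(gbuffer_bytes(1024, 768) / mb, 1))
# Same layout at 1600x1200: ~36.6 MB, and every lit pixel
# re-reads most of it per light, hence the bandwidth cost too.
print(round(gbuffer_bytes(1600, 1200) / mb, 1))
```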
Now, regarding the repeated con above: there are semi-decent solutions to the antialiasing issue. You can do supersampling, you can render the scene multiple times with jittered projection matrices, you can do post-process weighted blurs, and so on. However, imo at least, all of those are just absolute shit and BARELY justify the rest of the performance benefits that DR gives. Also, given that Red and Green (er... Green and Green?) have done tons and TONS of R&D improving AA performance and quality, to the point that you can lose less than 10% of your performance and have the image look extremely sharp and crisp even at very high resolutions, I don't think it should go to waste. Supersampling multiplies your fillrate, memory bandwidth, and memory usage by a factor of 4 and looks terrible compared to the high-end AA that the G70, G80, and R520 provide. Jittered projection matrices look a lot better, since you can get a rotated-grid pattern AND supersampling at the same time (aside: you can use as many or as few samples as you want with this method and have it be consistent across the whole screen), but you still get an Nx jump in memory bandwidth and, depending on your implementation, memory usage; if not the latter, then you'll have to re-render each frame a fair bit, since you'll need to regenerate G-buffers, shadow maps, and so on. Lastly, weighted blurs: they blur the image a bit, provide little or no AA where it's needed most (usually high-contrast regions), and give no sub-pixel granularity, all through the addition of a reasonably expensive post-process effect (that example in GPU Gems 2 for STALKER had, what, something like 20 texture reads?).
The pros and cons of forward rendering I'm too lazy to type out now...
Original post by CaossTec: Antialiasing is indeed one of the greatest drawbacks of DS, but it's not as hard to simulate as you make out. I found that using a simple edge detection filter and then just blurring those few edges with a 3x3 kernel produces very convincing results without sacrificing performance.
That's exactly the first thing I mentioned in my post, and I'll be shocked if your solution looks as good as what hardware AA provides. I honestly think it's the #1 worst solution to AA+DS in existence, and I find it ludicrous that ANYONE in the graphics industry actually takes it seriously, considering the results that a 3x3 blur gives compared to what hardware AA does.
AndyTX: I am curious: what do you mean with subtly different projection matrices?
I think he's referring to having two sets of G-buffers, each at screen resolution, where the projection matrices each have sub-pixel offsets (I don't know the math behind it, but it'd be worth checking out functions in D3DX like D3DXMatrixPerspectiveOffCenterLH). It'd be interesting to see how something like Quincunx AA would work as a post-process. I never entertained that idea before, and it might work. By the same token, though, I don't exactly recall having a superb experience with Quincunx AA back on my old GeForce4. Also, I'm curious as to what you mean, AndyTX, by a jittered grid in this context and how you'd implement it.
Just displacing the framebuffer pixels (jittering)... so really more of a different *post-projection* matrix, but usually it's easiest just to tack it on to projection.
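Concretely, "tacking it on to projection" just means adding a sub-pixel translation in clip space, something like this (hypothetical sketch, not anyone's actual code; row-major matrices, column-vector convention, viewport of width x height pixels):

```python
def jitter_projection(proj, jitter_x_px, jitter_y_px, width, height):
    """Offset a 4x4 projection matrix by a sub-pixel amount.

    proj: row-major 4x4 matrix in column-vector convention (proj @ v).
    A jitter of one pixel spans 2/width of NDC space in x, 2/height in y.
    Equivalent to left-multiplying by an NDC translation matrix.
    """
    ox = 2.0 * jitter_x_px / width
    oy = 2.0 * jitter_y_px / height
    out = [row[:] for row in proj]
    for col in range(4):
        out[0][col] += ox * proj[3][col]   # x' = x + ox * w
        out[1][col] += oy * proj[3][col]   # y' = y + oy * w
    return out
```

Render each pass with a different (jx, jy) from your sample pattern (e.g. a rotated-grid set of four offsets inside one pixel) and average the results; because the offset is scaled by w, the shift is a constant sub-pixel amount regardless of depth.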
But where are you making that jittering? Inbetween frames? During a post-process?
Original post by Ardor I don't see missing AA as fatal. AA gets less and less important on higher resolutions anyway, so I wouldn't really miss it.
It doesn't, imo. AA at higher resolutions still helps eliminate a lot of shimmering, and I find its absence quite noticeable, as I mentioned in my post above.
I don't see AA as a big issue; there would be other ways to deal with that.
There are, but the ones that I've seen just don't work well enough to justify DS. Here're the techniques that I know of:
-Weighted post-process blur (GPU Gems 2, DS in STALKER)
-Supersampling the entire scene
-Deferred AA
Weighted blur I haven't been able to see in action, but the result I imagine is, basically, a blurry scene, which doesn't antialias worth crap. It'd be like playing on a TV, where aliasing is quite visible (anyone remember the "PS2 has jaggies" hoopla a while back?). Oh, and it's not a cheap post-process anyway. So, scratch that off the list due to pure ineffectiveness (fwiw, I haven't been able to find one screenshot of STALKER with its antialiasing on, but my point still stands).
Supersampling the entire scene has no reason not to work, but what happens when you turn AA on at resolutions greater than 1024x768? You're suddenly limited to cards that support 4096x4096 textures, and on top of that, at high resolutions like 1600x1200, you're dealing with absolutely horrendous amounts of fillrate and memory bandwidth, since you're rendering to (and storing in memory... and accessing multiple times...) a 3200x2400 render target. In the end, 800x600 with supersampling on would run slower than 1600x1200 with supersampling off and produce lower image quality due to the downsizing that occurs, as opposed to hardware AA, which can be enabled at a very meager cost nowadays (e.g. the X1900 cards lose less than 20% performance when going from 0 to 4xAA at 1600x1200). And due to the nature of supersampling, you don't even get jittered grids or such, so 4x hardware AA at polygon edges still looks better than what supersampling could do.
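The back-of-the-envelope numbers for 2x2 ordered-grid supersampling (hypothetical sketch, counting only the color buffer at RGBA8):

```python
def supersample_cost(width, height, factor=2, bpp=4):
    """Pixels shaded and color-buffer bytes for NxN supersampling."""
    w, h = width * factor, height * factor
    return w * h, w * h * bpp

mb = 1024 * 1024
pixels, nbytes = supersample_cost(1600, 1200)  # renders at 3200x2400
# 4x the fill of plain 1600x1200, and ~29.3 MB for the color buffer
# alone, before the G-buffers, depth buffer, and the downsample pass.
print(pixels, round(nbytes / mb, 1))
```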
And finally Deferred AA, which is an attempt that I made to duplicate hardware AA. It only works for large objects (i.e. ones that cover more than about 2x2 pixels) which is only half the reason why AA should be used, the other being sub-pixel objects that blink in and out as they or the camera move. On top of that though, the performance and memory cost, like supersampling, absolutely stinks. In addition to the main G-buffers involved, the backbuffer needs to store an AA'd version of the scene, so you have to have another render target to render the main scene (more memory...) as well as an extra depth buffer (more memory...). The performance isn't very good either (extra rendering of the scene to hardware AA'd buffers and a not-very-cheap post process effect) so even that can be scratched off of the list.
(Pardon me for being a bit of a downer in this post...)
As great as deferred shading sounds (minimal vertex usage, very little batching overhead, and, with some optimizations, low fillrate and memory usage, etc.), there is a big, big, big, BIG catch behind it:
Right now, all techniques for deferred shading + AA completely suck in one way or another, and the AA that the hardware does just beats the pants off all of them. While it may seem like something that can be trivially thrown away, antialiasing vastly improves the image quality of a scene, even at high resolutions (I play WoW at, effectively, 1600x1200, and still notice shimmering on the aliased tree leaves due to alpha testing), and no solution to DS+AA that I've seen can match that kind of image quality, even with a nosedive in performance.
Not to mention that the small vertex/batch benefit quickly goes out the window if you include something like shadowing (either volumes or maps): when generating, say, shadow maps, you have to re-render all of the verts, reset all of the matrices for each object, ping-pong between render targets, swap shaders between SM generation and the main scene, and so on. That's even ignoring the fact that the low-vertex argument is just plain unappealing considering how much vertex throughput modern GPUs have.
The only good solution I've seen so far is semi-deferred shading, which SimmerD is using in his game-in-development, Ancient Galaxy. It works fairly well with AA from what I've seen, and has some nice properties, such as one-time calculation of Fresnel values for specular/reflectance coefficients and of normals when using normal maps (neither of which is super trivial, and both otherwise have to be done per light). On top of that, since it involves rendering the scene geometry 'on top' of the G-buffers, you can get the desired material variety. The downside is that you lose the vertex/batch benefits, but as I mentioned above re: shadows, that's something you have to live with already anyway, so the extra batches on top of that aren't as significant. I still haven't worked with it myself, but it's something I would like to try out at some point after ironing out the forward renderer I'm working on.
PATENT IT?! I made the algorithm to benefit everyone. I know that a lot of people have been seeking an algo that gives high-quality omni SMs at decent perf/memory, and patenting it will basically restrict anyone from using it. (how often do you see people say "You could use TSMs,....buuuut it's got a patent"?)
And I'm not too concerned about other people patenting it if I formally publish the algorithm. Considering that nobody has patented, say, PSMs or Phong lighting (the original authors never did, and nobody else has either), I don't see why it would happen here, unless those two and a slew of other algorithms have something special about them that I'm not aware of.