Forward vs. Deferred Rendering

I've been following this other thread lately on whether to use a single-pass vs. a multipass lighting architecture, here: http://www.gamedev.net/community/forums/topic.asp?topic_id=424468 There's been some debate going on as to whether a deferred renderer can really squeeze more frame rate out of the graphics card. I didn't want to pollute that thread with an off-topic question on deferred renderers, so I decided to post it here.

I'm not very knowledgeable about deferred renderers, so correct me if I'm wrong. I understand that, for lighting, one pays in the number of pixels shaded instead of geometry complexity, so a deferred renderer is better for scenes where a lot of lights are involved. Somehow, I've got the impression (maybe because of S.T.A.L.K.E.R.?!) that a deferred renderer is hard to implement efficiently, which lessens the chances of achieving better performance, if any at all. Each certainly fits its own class of problems better, as there is never an absolute best.

My goal here is to get to know each one better and learn in which situations one is dominant. I particularly want to know if it is possible to achieve higher frame rates with a deferred renderer on ordinary scenes (and what is ordinary, you might ask? :)). Is it a good trade-off on current hardware? Any articles or benchmarks to justify the results are also very much appreciated. Thank you all.
I've implemented a really simple deferred renderer some time ago and am switching over to a forward renderer right now. The thing I liked the most about the deferred renderer was its simplicity, as you don't have to bother with which object is lit by which lights. That makes it rather easy to get working. Furthermore, you only have to do stuff like normal mapping, parallax mapping or similar things, skinning and so on once, no matter how many lights you have, which would allow you to use those things more extensively, I guess.

The main thing I didn't like about it was the rather high cost to get started, so it won't pay off in "simple scenes" (fewer lights, especially). Perhaps I've done something wrong, but I was under the impression that the geometry pass (filling the render targets with depth, normals, diffuse and specular color and so on) was rather expensive. I think this only pays off if you're going to have a large number of lights visible at once.
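
For reference, a G-buffer along the lines described above might be laid out something like this C++ sketch; the exact targets, formats, and byte counts are assumptions for illustration, not what Matthias actually used:

```cpp
// Hypothetical G-buffer layout for a simple deferred renderer.
// Formats and channel packing are assumptions for illustration only.
#include <cstdio>

struct RenderTarget {
    const char* contents;
    int bytesPerPixel;
};

int main() {
    const RenderTarget gbuffer[] = {
        { "depth (32-bit float, or reconstructed from the depth buffer)", 4 },
        { "view-space normal (RGBA16F)",                                  8 },
        { "diffuse albedo + material ID (RGBA8)",                         4 },
        { "specular color + glossiness (RGBA8)",                          4 },
    };

    int total = 0;
    for (const RenderTarget& rt : gbuffer) {
        std::printf("%-60s %d bytes/pixel\n", rt.contents, rt.bytesPerPixel);
        total += rt.bytesPerPixel;
    }
    std::printf("total: %d bytes written per pixel in the geometry pass\n", total);
    return 0;
}
```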
Another thing that is always brought up when a deferred renderer is discussed is the currently missing support for hardware anti-aliasing. I didn't like that either, but I guess it is up to you whether you need anti-aliasing or not.

As my project won't have that much skinning and/or normal mapping (and similar) going on and won't have _that_ many lights visible at once, I'm now giving a forward renderer a try.

Don't know if this post was helpful, but here it is ;)
Thank you, Matthias. I'm really not a big fan of anti-aliasing. I prefer to keep those extra frames rather than lose them in favor of removing the jaggies. So, for me, it's not a big deal. And yes, a deferred renderer probably won't pay off for scenes with a small number of visible lights. What most interested me in your post was this:

Quote:
Furthermore you only have to do stuff like normal mapping, parallax mapping or similar things, skinning and so on only once, no matter how many lights you have, which would allow you to use those things more extensively, I guess.


This is interesting. It will greatly reduce state switches. If I'm right, state switches are more of a driver overhead than a GPU bottleneck. They cause the CPU and GPU to get out of sync. So what we are actually doing in a deferred renderer is shifting the stress from the CPU to the GPU.
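
To make that refactoring concrete, here's a minimal C++ sketch of the two loop structures; every function in it (drawObjectLit, drawObjectToGBuffer, drawFullscreenLightPass) is a hypothetical placeholder standing in for real draw calls and state changes, not an actual API:

```cpp
// Minimal sketch of the two rendering loops. All functions are hypothetical
// placeholders standing in for real draw calls and state changes.
#include <vector>

struct Object {};
struct Light  {};

void drawObjectLit(const Object&, const Light&) {}  // forward: shade object with one light
void drawObjectToGBuffer(const Object&)         {}  // deferred: write depth/normal/albedo/spec
void drawFullscreenLightPass(const Light&)      {}  // deferred: read G-buffer, accumulate light

// Forward (multipass) rendering: the per-object work (skinning, normal
// mapping, state setup) is repeated once per light touching the object.
void forwardRender(const std::vector<Object>& objects, const std::vector<Light>& lights) {
    for (const Object& obj : objects)
        for (const Light& light : lights)       // in practice, only lights affecting obj
            drawObjectLit(obj, light);
}

// Deferred rendering: geometry is touched exactly once; lighting cost then
// depends only on how many pixels each light covers on screen.
void deferredRender(const std::vector<Object>& objects, const std::vector<Light>& lights) {
    for (const Object& obj : objects)
        drawObjectToGBuffer(obj);               // one geometry pass, few state switches
    for (const Light& light : lights)
        drawFullscreenLightPass(light);         // additively blended screen-space passes
}

int main() { return 0; }
```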
The other thorny issue with deferred rendering is supporting transparency. I've seen a few people suggesting dithering as a solution, but that breaks down when you get enough "layers" in there. So, it seems you'd have to implement a (partial) forward renderer for the transparent stuff, which is a shame, especially regarding the complexity explosion when you start dealing with lit transparency - that's two renderers you'll be maintaining.

Still, I really like the purity of deferred rendering - I hope someone comes up with a solution to these things. :)

stoo
Besides the AA and transparency problems with deferred rendering, my biggest problem with deferred is the single lighting model you get.

It's desirable to be able to have materials with Lambert, Blinn, Oren-Nayar or other general BRDFs. To support this you have to write an extra channel that represents the material ID. This ID doesn't just represent the lighting model, it's both the lighting model and its parameters, since something like Oren-Nayar has too many configurable constants to write to the initial buffers.

When lighting you need to have an uber lighting shader that has every possible lighting model. This can have problems when you start folding complex functions into texture lookups - sooner or later you'll run out of samplers. The other option is to draw N quads per light, where N is the number of lighting models. You can optimize some based on culling, but otherwise you have to do all N because you have no idea where materials fell in the framebuffer.
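
For illustration, the "N quads per light" option might be structured roughly like this on the CPU side; bindLightingShader, setLightUniforms, and drawFullscreenQuad are hypothetical helpers, and the per-pixel material-ID comparison would live in the lighting shader (e.g. discard when the G-buffer ID doesn't match the current model):

```cpp
// Sketch of drawing N fullscreen passes per light, one per lighting model.
// All helpers are hypothetical; the material-ID test happens per pixel in
// the lighting shader (discard when the G-buffer ID != current model).
#include <vector>

enum LightingModel { Lambert, BlinnPhong, OrenNayar, ModelCount };

struct Light {};

void bindLightingShader(LightingModel) {}  // selects the shader for this BRDF
void setLightUniforms(const Light&)     {} // position, color, attenuation, ...
void drawFullscreenQuad()               {} // or a tight light-volume mesh

void lightingPass(const std::vector<Light>& lights) {
    for (const Light& light : lights) {
        for (int m = 0; m < ModelCount; ++m) {
            // Without knowing where each material landed in the framebuffer,
            // every model's pass has to be issued for every light.
            bindLightingShader(static_cast<LightingModel>(m));
            setLightUniforms(light);
            drawFullscreenQuad();
        }
    }
}

int main() { return 0; }
```

Stencil-masking pixels by material ID can cut some of that wasted work, but that's an additional trick on top of the basic scheme.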

In this case the forward renderer just has the BRDF built into the material and all is well. Sure, you have 1000 shaders, but it works.

Then again, if Blinn is good enough for your renderer, then it might be viable. If only transparency wasn't such an issue.

-Pome
So that makes it three:
1) No anti aliasing
2) No out of the box transparency
3) Hard to implement a flexible lighting model.

Well, as I said, I'm not knowledgeable about deferred rendering, so I'm just throwing ideas around here. If the problem with transparency is caused by a deferred renderer not being able to render transparent objects in the correct order, can't the transparency issue be resolved with a depth peeling shader?
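
For context, depth peeling works roughly as in the sketch below: each pass re-renders the transparent geometry, keeps only the nearest fragments lying behind the depth captured by the previous pass, and the peeled layers are composited back to front. Every helper here is a hypothetical placeholder; note that each peel is effectively a forward-style shading pass, so lit transparency still costs one such pass per layer.

```cpp
// Rough sketch of depth peeling for order-independent transparency.
// Every function is a hypothetical placeholder for real render-target and
// shader setup; the "behind previous depth" test is done in the peel shader.
#include <vector>

struct Layer { /* color + depth render targets for one peeled layer */ };

Layer renderPeelPass(const Layer* previousDepth) {
    // Render all transparent geometry. The fragment shader discards any
    // fragment not strictly behind previousDepth (the first pass keeps the
    // nearest layer), and the depth test keeps the nearest survivor.
    (void)previousDepth;
    return Layer{};
}

void compositeOver(const Layer&) {}  // alpha-blend one layer over the framebuffer

void renderTransparency(int layerCount) {
    std::vector<Layer> layers;
    layers.reserve(layerCount);
    const Layer* previous = nullptr;
    for (int i = 0; i < layerCount; ++i) {       // each pass peels one layer deeper
        layers.push_back(renderPeelPass(previous));
        previous = &layers.back();
    }
    for (int i = layerCount - 1; i >= 0; --i)    // composite farthest-to-nearest
        compositeOver(layers[i]);
}

int main() { renderTransparency(4); return 0; }
```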

Anyway, that's really too much to handle! I especially dislike the third issue.
Oh boy, here we go again...when AndyTX finds this thread he'll be able to talk about it at greater length and possibly expand on my post, but here's a rundown of the pros and cons of deferred rendering that I know of, or that are most prominent in my mind.

Deferred rendering:

Pros:
-VERY easy and simple to make, and can be made fairly fast.
-Very tight boundaries on what pixels the light volumes affect (i.e. not many pixels, if any, are wasted).
-Fairly geometric complexity-agnostic because you're only doing a single draw.
-Low amount of state changes
-One-time generation of reasonably complex per-pixel data, e.g. calculation of normals for normal mapping, Fresnel specular coefficients, anisotropic filtering.
-Scales very nicely as number of lights increases

Cons:
-No hardware antialiasing.
-Fairly high memory cost, especially at higher resolutions.
-No hardware antialiasing.
-Difficult, but not impossible, to support a reasonably wide variety of materials.
-No hardware antialiasing.
-High initial cost of G-Buffer generation due to the large amount of data that must be generated. (the last pro above should be strongly noted in conjunction with this point)
-No hardware antialiasing.
-Very fillrate/memory-bandwidth intensive.
-No hardware antialiasing.
-[This is more of an anti-pro than a con...] If you're doing any kind of shadowing solution, the geometry complexity argument flies out the window.
-I'm not sure if I mentioned this or not, but there's also no hardware antialiasing.

Now, regarding the repeated con above, there are semi-decent solutions to the antialiasing issue. You can do supersampling, you can render the scene multiple times with jittered projection matrices, you can do post-process weighted blurs, and so on. However, imo at least, all of those are just absolute shit and BARELY justify the rest of the performance benefits that DR gives. Also, given that Red and Green (er..Green and Green?) have done tons and TONS of R&D in improving AA performance and quality, to the point that you can lose less than 10% of your performance and have the image look extremely sharp and crisp even at very high resolutions, I don't think it should go to waste.

Supersampling has the issue of multiplying your fillrate, memory bandwidth, and memory usage by a factor of 4, and it looks just terrible compared to the high-end AA that the G70, G80, and R520 provide.

Jittered projection matrices look a lot better, since you can do rotated-grid AA AND supersampling at the same time (aside: you can do as many or as few samples as you want with this method and have it be consistent across the whole screen). However, you still have an Nx jump in your memory bandwidth and, depending on your implementation, memory usage (if not the latter, then you'll have to re-render your frames a fair bit, since you'll need to make new G-buffers, shadow maps, and so on).

Lastly, weighted blurs: they blur the image a bit, don't provide good/any AA where it's needed most (usually high-contrast regions), and don't provide any sub-pixel granularity, all through the addition of a reasonably expensive post-process effect (that example in GPU Gems 2 for STALKER had, what, something like 20 texture reads?).

The pros and cons of forward rendering I'm too lazy to type out now...
Someone please explain to me why you supposedly don't shade unnecessary pixels with lights. You avoid the common pitfall of calculating a light for the whole object, even if only a small subset of the object is actually affected by the light. Instead, you render a screen quad or maybe a simple mesh resembling the light volume to catch all pixels affected by the light in screen space. But how do you avoid shading pixels far away from the viewer?

For example, have a look at the picture I linked to in the other thread. The three lights in front cover a lot of screen space. In a deferred renderer the light volume would cover almost the entire screen. The light shader will also be executed for all the pixels belonging to the dark hill behind the house. None of the lights reach this hill, yet the screen quad would try to light it as well, because the pixels are inside the lit area in screen space. This also counts as "wasted pixels" in my opinion, and quite a lot of them at once.

The simple way out of this is to use dynamic branching to bail out if the attenuation is zero. But this trick can also be used for a forward renderer. Maybe you can use the ZBuffer and a proper compare function to discard all pixels behind the light volume. If you configure the depth test to GREATER_THAN, all pixels behind the light volume are discarded and the hill side wouldn't be lighted. But then you run into problems if your light volume is behind a wall. Either way, with Deferred Rendering you're going to light up as many useless pixels as in a Forward Renderer.
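
The GREATER_THAN trick Thomas describes would be set up roughly like this in classic OpenGL; drawLightVolumeMesh is a hypothetical placeholder for submitting the light's bounding mesh:

```cpp
// Sketch of the "GREATER depth test on the light volume's back faces" trick
// described above, in classic OpenGL. drawLightVolumeMesh() is a hypothetical
// placeholder for submitting the convex light-bounding mesh.
#include <GL/gl.h>

void drawLightVolumeMesh() { /* submit the light's sphere/cone mesh */ }

void drawDeferredLight() {
    glEnable(GL_DEPTH_TEST);
    glDepthMask(GL_FALSE);           // read the depth buffer, never write it
    glDepthFunc(GL_GREATER);         // pass only where scene depth is in front of
                                     // the volume's back faces, so the hill far
                                     // behind the light is discarded
    glEnable(GL_CULL_FACE);
    glCullFace(GL_FRONT);            // rasterize the back faces of the volume

    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);     // additively accumulate this light

    drawLightVolumeMesh();           // the lighting shader reads the G-buffer here

    // Caveat from the post: a wall between the camera and the volume also
    // passes this test, so those pixels still get shaded for nothing.
    glDepthFunc(GL_LESS);
    glDepthMask(GL_TRUE);
    glCullFace(GL_BACK);
    glDisable(GL_BLEND);
}

int main() { return 0; }
```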

In addition, you have to calculate a lot of data in the pixel shader that is low-frequency enough to be calculated in the VS of a forward renderer. Attenuation is one example, shadow map texture coordinates another. And more. This is probably not a problem, because you have a lot of time to calculate that data while the texture units fetch all the necessary values from the various high-precision buffers. Still, I think this is a waste as well.

Bye, Thomas
Quote:Original post by Cypher19
Oh boy, here we go again...when AndyTX finds this thread he'll be able to talk about it at greater lengths and possibly expand on my post

Sigh... I don't have the energy for this any more. All of the information is out there - we've spoken at length about it in previous posts. Your rundown is as good as mine; we just differ in opinion on how important/costly various parts are. To be fair, that will vary a lot depending on your application/scene, so the best advice is to JUST TRY IT.

IMHO an efficient deferred renderer is a lot easier to get up and running than an efficient forward renderer. The former I wrote in less than two weeks, and the latter took more than a month, simply due to the complexity of efficiently handling the lighting/geometry complexity blowup. In the end, the deferred renderer was still hands-down the more flexible of the two, and had the most predictable performance characteristics. In some scenes the forward renderer was faster (even much faster for simpler scenes), but for others it was practically unusable.

If I'm writing an application that runs faster on a forward renderer, by all means I will use that (and rejoice at the nice hardware 8x MSAA!). The same applies to the deferred renderer. My point is that it is completely possible to write both renderers that deal with the same data sets and even the same shaders, producing the exact same output. Deferred shading isn't some fantastic (in the original meaning) way of reinventing rendering as we know it... it's really a simple refactoring of the rendering loop. The resulting image will be the same in both cases.

Anyways, as I mentioned, I'm getting tired of extolling the goodness of deferred rendering. In the end, I don't really care what you use... do what makes you happy. I'm only trying to help others learn from the experience that I have, and let people know that it may not be necessary to spend a month (or more) writing an efficient lighting system for your forward renderer.

So simply for the benefit of the original poster, I'll make one last reply here, but I don't intend to get sucked into another discussion/argument. To be honest, I'm getting bored of the topic... there are more important and exciting things to research nowadays; deferred rendering has been a known technique for twenty years now.

In any case:

Quote:Original post by Ashkan
2) No out of the box transparency

If you really want to use alpha blending, just do it with forward rendering at the end. This works perfectly fine. I have the privilege of access to an efficient GPU raytracer, so this hasn't been a problem for me. I suspect raytracing will become the preferred way to do secondary rays soon enough, as it is more flexible, and quite fast on modern hardware.
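
That hybrid usually amounts to something like the following sketch (all helpers hypothetical): deferred passes for the opaque geometry, then the transparent objects sorted back to front and forward-lit with alpha blending against the same depth buffer.

```cpp
// Sketch of the usual hybrid: deferred for opaque geometry, then a forward
// alpha-blended pass for transparent objects. All helpers are hypothetical.
#include <algorithm>
#include <vector>

struct Object { float viewDepth; bool transparent; };
struct Light  {};

void gBufferPass(const std::vector<Object>&)                      {}
void deferredLightingPass(const std::vector<Light>&)              {}
void forwardDrawBlended(const Object&, const std::vector<Light>&) {}

void renderFrame(std::vector<Object> objects, const std::vector<Light>& lights) {
    std::vector<Object> opaque, transparent;
    for (const Object& o : objects)
        (o.transparent ? transparent : opaque).push_back(o);

    gBufferPass(opaque);                 // depth/normal/albedo for opaque only
    deferredLightingPass(lights);        // accumulate lights in screen space

    // Transparent objects: sort back to front and forward-light them,
    // depth-testing (but not writing) against the opaque depth buffer.
    std::sort(transparent.begin(), transparent.end(),
              [](const Object& a, const Object& b) { return a.viewDepth > b.viewDepth; });
    for (const Object& o : transparent)
        forwardDrawBlended(o, lights);
}

int main() { return 0; }
```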

Quote:Original post by Cypher19
-No hardware antialiasing.

I'm the first to admit that this isn't ideal. Still, I'm not too fazed by it, since a good renderer will need other methods of anti-aliasing as well. Let me emphasize that MSAA helps to solve one aliasing problem - many more exist!

I'm somewhat partial to jittered projection matrices myself since they help to solve all aliasing problems (shader, texture, raytracing, etc). In particular almost all current renderers that I've seen have major issues with specular aliasing (due to normal maps). There are ways to help this, but I really don't want to have to attack every single aliasing problem one by one. With cards like the G80 available now, I'll gladly trade off my 300Hz for some extremely nicely anti-aliased 75Hz. This will only continue.
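
As a rough illustration of the jittered-projection approach (the frustum helper and sample pattern here are assumptions, not any particular renderer's code): shift the view frustum by a fraction of a pixel for each pass and average the results.

```cpp
// Sketch of sub-pixel jittered projections for supersampled anti-aliasing:
// render the frame N times with the frustum nudged by a fraction of a pixel
// each time, then average the results. Names and values are illustrative.
#include <cstdio>

struct Frustum { double left, right, bottom, top, zNear, zFar; };

// Shift an off-center frustum by (jx, jy) pixels, measured on the near plane.
Frustum jitterFrustum(Frustum f, double jx, double jy, int width, int height) {
    const double dx = jx * (f.right - f.left) / width;
    const double dy = jy * (f.top - f.bottom) / height;
    f.left   += dx;  f.right += dx;
    f.bottom += dy;  f.top   += dy;
    return f;  // pass these values to glFrustum / your projection builder
}

int main() {
    // A rotated-grid style 4-sample pattern (sub-pixel offsets around center).
    const double samples[4][2] = { { 0.125,  0.375}, { 0.375, -0.125},
                                   {-0.125, -0.375}, {-0.375,  0.125} };
    const Frustum base = { -1.0, 1.0, -0.75, 0.75, 1.0, 1000.0 };
    for (const auto& s : samples) {
        Frustum f = jitterFrustum(base, s[0], s[1], 1280, 960);
        // render the whole frame with f, accumulate into an average buffer
        std::printf("jittered left=%f right=%f\n", f.left, f.right);
    }
    return 0;
}
```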

Quote:Original post by Cypher19
-Fairly high memory cost, especially at higher resolutions.

Definitely true as well, but it shouldn't be overstated. Even at 1600x1200 with a fairly large G-buffer, you're still talking <60MB. That's quite reasonable for a 256MB card, and pebbles for a 512MB (or now 768MB) one.
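
As a quick sanity check of that figure, assuming (purely for illustration) four render targets at 8 bytes per pixel:

```cpp
// Back-of-envelope G-buffer size at 1600x1200, assuming four 8-byte-per-pixel
// targets (e.g. RGBA16F); the exact layout is an assumption for illustration.
#include <cstdio>

int main() {
    const long long pixels        = 1600LL * 1200LL;
    const long long bytesPerPixel = 4 * 8;                   // 4 targets x 8 bytes
    const double    megabytes     = pixels * bytesPerPixel / (1024.0 * 1024.0);
    std::printf("%.1f MB\n", megabytes);                     // ~58.6 MB
    return 0;
}
```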

Quote:Original post by Ashkan
3) Hard to implement a flexible lighting model.

Quote:Original post by Cypher19
-Difficult, but not impossible, to support a reasonably wide variety of materials.

I didn't find it that difficult, but your experience may vary. I admit that I had Libsh to help with shader encapsulation here, but Cg can probably do something similar (if not quite as powerful). This doesn't seem like a very tough problem to me, but enough people complain about it that I'm willing to admit that I may be wrong. Still, I wonder if all those who complain have actually sat down and tried to implement it...

Quote:Original post by Cypher19
-High initial cost of G-Buffer generation due to the large amount of data that must be generated.

Yes, definitely. The cost of rendering your scene with a forward renderer must be sufficiently high to hide this fixed cost. In my experience, that isn't as high as most people think, and indeed it has been getting lower and lower on newer hardware. I've not had time to play with it a lot, but it seems like the G80 is a deferred rendering dream machine due to its flexibility and architecture.

Quote:Original post by Cypher19
-Very fillrate/memory-bandwidth intensive.

Certainly true as well, but as I've mentioned several times, the reads are coherent and predictable, so this negative will drop off as memory architectures continue to improve. Already (on X1x00's and G7x and G80's) the latency is effectively hidden.

Quote:Original post by Cypher19
-If you're doing any kind of shadowing solution, the geometry complexity argument flies out the window.

I'm not sure I agree with you here. Even with shadowing, the relative rendering complexity of forward and deferred rendering is not changed. You simply add a fixed (potentially large) cost for generating shadow maps. Note that for forward rendering, this breaks batching and/or requires one to keep several shadow maps around at a time. This is not the case for deferred rendering, which deals with it rather naturally IMHO (shadowed light? render a shadow map -> render the light -> done with the shadow map).
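
That per-light flow maps to something like this sketch, with a single shadow map reused for every shadowed light; all helpers are hypothetical placeholders:

```cpp
// Sketch of shadow handling in a deferred renderer: one reusable shadow map,
// rendered immediately before each shadowed light's screen-space pass.
// All helpers are hypothetical placeholders.
#include <vector>

struct Light     { bool castsShadows; };
struct ShadowMap {};

void renderShadowMap(ShadowMap&, const Light&)                {}
void drawDeferredLight(const Light&, const ShadowMap* shadow) {}

void lightingPass(const std::vector<Light>& lights) {
    ShadowMap sharedShadowMap;                          // reused for every light
    for (const Light& light : lights) {
        if (light.castsShadows) {
            renderShadowMap(sharedShadowMap, light);    // shadowed light? render its map...
            drawDeferredLight(light, &sharedShadowMap); // ...render the light...
        } else {                                        // ...done with the shadow map
            drawDeferredLight(light, nullptr);
        }
    }
}

int main() { return 0; }
```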

Plus do note that one usually only needs a handful of shadowed lights in a scene... in my experience there are many more unshadowed lights than shadowed ones, but maybe I'm just working with the wrong scenes.

Quote:Original post by Cypher19
Also, given that Red and Green (er..Green and Green?) have done just tons and TONS of R&D in improving AA performance and quality to the point that it you can lose less than 10% of your performance and have the image look extremely sharp and crisp even at very high resolutions, I don't think it should go to waste.

I agree, and I hate that it goes to waste. Still, I wish they'd spend those transistors on more programmable hardware. I may have to bite my tongue, though, as there's some indication that D3D10 will support custom multi-sample resolves, which means that MSAA *could* work with deferred rendering. The fact that it hasn't worked so far is simply due to there being yet another chunk of non-programmable hardware sitting in GPUs.

Quote:Original post by Cypher19
Lastly, weighted blurs: blurs the image a bit

Weighted blurs are stupid, and they simply are not anti-aliasing. I agree with you 100% that they suck.

Anyways that's hopefully the end of my contribution to this topic. I'm just trying to give some feedback as I have done a considerable amount of research and prototype implementations on the subject. In the end though, use what's best for your application. I'd encourage you to try both, but simply being aware of the existence of alternatives is enough to make me happy.

It seems like most people (Cypher19 at least :) understand the basics now, and the spread of misinformation has been significantly reduced over the past few months, so hopefully deferred shading won't need a champion here anymore ;)
wow...quite a post for no energy :P...what the hell happens when you're caffeinated?

Quoting the best advice out of all this, imo:

Quote:
To be fair, that will vary a lot depending on your application/scene, so the best advice is to JUST TRY IT.


That way YOU understand which path is dominant in YOUR application.
