
Forward vs. Deferred Rendering


24 replies to this topic

#1 Ashkan   Members   -  Reputation: 450

Posted 21 November 2006 - 12:29 AM

I've been following another thread lately on whether to use a single-pass or a multipass lighting architecture, here: http://www.gamedev.net/community/forums/topic.asp?topic_id=424468 There's been some debate as to whether a deferred renderer can really help squeeze higher frame rates out of the graphics card. I didn't want to pollute that thread with an off-topic question on deferred renderers, so I decided to post it here. I'm not very knowledgeable about deferred renderers, so correct me if I'm wrong. As I understand it, the cost of lighting depends on the number of pixels shaded rather than on geometric complexity, so a deferred renderer is better for scenes where a lot of lights are involved. Somehow I've got the impression (maybe because of S.T.A.L.K.E.R.?!) that a deferred renderer is hard to implement efficiently, which lessens the chances of gaining any performance at all. Each approach certainly fits its own class of problems better, as there is never an absolute best. My goal here is to get to know each one better and learn the situations where one is dominant. I particularly want to know whether it is possible to achieve higher frame rates with a deferred renderer on ordinary scenes (and what is ordinary, you might ask? :)). Is it a good trade-off on current hardware? Any articles or benchmarks to back up the results are also very much appreciated. Thank you all.


#2 matches81   Members   -  Reputation: 474

Posted 21 November 2006 - 01:05 AM

I implemented a really simple deferred renderer some time ago and am switching over to a forward renderer right now. The thing I liked most about the deferred renderer was its simplicity, as you don't have to bother with which object is lit by which lights. That makes it rather easy to get working. Furthermore you only have to do stuff like normal mapping, parallax mapping or similar things, skinning and so on only once, no matter how many lights you have, which would allow you to use those things more extensively, I guess.

The main thing I didn't like about it was the rather high cost to get started, so it won't pay off in "simple scenes" (especially ones with fewer lights). Perhaps I did something wrong, but I was under the impression that the geometry pass (filling the render targets with depth, normals, diffuse and specular color and so on) was rather expensive. I think this only pays off if you're going to have a large number of lights visible at once.
Another thing that is always brought up when a deferred renderer is discussed is the currently missing support for hardware anti-aliasing. I didn't like that either, but I guess it is up to you whether you need anti-aliasing or not.
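
To make the "only pays off with many lights" intuition concrete, here is a rough cost model in Python. It is only a sketch: the function names and every cost number are made up for illustration, and it assumes a multipass forward renderer that re-submits the geometry once per light. Real numbers depend entirely on the scene and the hardware.

```python
def forward_cost(num_lights, geometry_cost, shade_cost):
    """Multipass forward rendering: geometry is drawn and shaded once per light."""
    return num_lights * (geometry_cost + shade_cost)


def deferred_cost(num_lights, geometry_cost, gbuffer_cost, light_cost):
    """Deferred rendering: one geometry pass plus a fixed G-buffer cost,
    then one screen-space pass per light."""
    return geometry_cost + gbuffer_cost + num_lights * light_cost


def break_even_lights(geometry_cost, shade_cost, gbuffer_cost, light_cost,
                      max_lights=64):
    """Smallest light count at which the deferred path becomes cheaper,
    or None if that never happens within max_lights."""
    for n in range(1, max_lights + 1):
        deferred = deferred_cost(n, geometry_cost, gbuffer_cost, light_cost)
        forward = forward_cost(n, geometry_cost, shade_cost)
        if deferred < forward:
            return n
    return None


# Hypothetical units: geometry pass 4, per-light shading 1, G-buffer fill 6.
print(break_even_lights(geometry_cost=4.0, shade_cost=1.0,
                        gbuffer_cost=6.0, light_cost=1.0))
```

With these invented numbers the deferred path only wins from the third light onward; with a cheaper geometry pass or a more expensive per-light screen pass it may never win at all.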

As my project won't have that much skinning and/or normal-mapping (and similar) going on and won't have _that_ many lights visible at once I'm now giving a forward renderer a try.

Don't know if this post was helpful, but here it is ;)

#3 Ashkan   Members   -  Reputation: 450

Posted 21 November 2006 - 01:33 AM

Thank you Matthias. I'm really not a big fan of anti-aliasing. I'd rather keep those extra frames than lose them in favor of removing the jaggies. So, for me, it's not a big deal. And yes, a deferred renderer probably won't pay off for scenes with a small number of visible lights. What interested me most in your post was this:

Quote:

Furthermore you only have to do stuff like normal mapping, parallax mapping or similar things, skinning and so on only once, no matter how many lights you have, which would allow you to use those things more extensively, I guess.


This is interesting. It will greatly reduce state switches. If I'm right, state switches are more of a driver overhead than a GPU bottleneck. They cause the CPU and GPU to get out of sync. So what we are actually doing in a deferred renderer is shifting the stress from the CPU to the GPU.

#4 Stuart Yarham   Members   -  Reputation: 174

Posted 21 November 2006 - 01:44 AM

The other thorny issue with deferred rendering is supporting transparency. I've seen a few people suggesting dithering as a solution but that breaks down when you get enough "layers" in there. So, it seems you'd have to implement a (partial) forward renderer for the transparent stuff, which is a shame, especially regarding the complexity explosion when you start dealing with lit transparencies - that's two renderers you'll be maintaining.

Still, I really like the purity of deferred rendering - I hope someone comes up with a solution to these things. :)

stoo

#5 pomesuba   Members   -  Reputation: 128

Posted 21 November 2006 - 02:01 AM

Besides the AA and transparency problems with deferred rendering, my biggest problem with deferred is the single lighting model you get.

It's desirable to be able to have materials with Lambert, Blinn, Oren-Nayar or other general BRDFs. To support this you have to write an extra channel that represents the material ID. This ID doesn't just represent the lighting model; it's both the lighting model and its parameters, since something like Oren-Nayar has too many configurable constants to write to the initial buffers.

When lighting you need to have an uber lighting shader that has every possible lighting model. This can have problems when you start computing complex functions into texture lookups - sooner or later you'll run out of samplers. The other option is to draw N quads per light where N is the number of lighting models. You can optimize some based on culling, but otherwise you have to do all N because you have no idea where materials fell in the framebuffer.

In this case the forward renderer just has the brdf built into the material and all is well. Sure you have 1000 shaders but it works.

Then again, if Blinn is good enough for your renderer, then it might be viable. If only transparency wasn't such an issue.
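
For illustration, the material ID idea could look roughly like the Python sketch below. Everything in it is hypothetical: the `MATERIALS` table, the parameter names and the two BRDF functions are stand-ins, and a real light pass would do this dispatch inside a shader rather than on the CPU.

```python
def lambert(n_dot_l, n_dot_h, params):
    """Plain diffuse term; ignores the half-vector."""
    return params["albedo"] * max(n_dot_l, 0.0)


def blinn_phong(n_dot_l, n_dot_h, params):
    """Diffuse plus a Blinn-Phong specular lobe."""
    if n_dot_l <= 0.0:
        return 0.0
    specular = params["spec"] * max(n_dot_h, 0.0) ** params["shininess"]
    return params["albedo"] * n_dot_l + specular


# The ID selects both the lighting model and its constants, so the
# G-buffer only needs to store one small integer per pixel.
MATERIALS = {
    0: (lambert, {"albedo": 0.8}),
    1: (blinn_phong, {"albedo": 0.5, "spec": 0.4, "shininess": 32.0}),
}


def shade_texel(material_id, n_dot_l, n_dot_h):
    """What an 'uber' light shader does per pixel: branch on the stored ID."""
    brdf, params = MATERIALS[material_id]
    return brdf(n_dot_l, n_dot_h, params)


print(shade_texel(0, 0.5, 0.0))
print(shade_texel(1, 1.0, 1.0))
```

The table grows with every new lighting model, which is exactly the uber-shader cost described above; the alternative of drawing one quad per lighting model trades that branch for N passes per light.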

-Pome

#6 Ashkan   Members   -  Reputation: 450

Posted 21 November 2006 - 02:29 AM

So that makes it three:
1) No anti aliasing
2) No out of the box transparency
3) Hard to implement a flexible lighting model.

Well, as I said, I'm not knowledgeable about deferred rendering, so I'm just throwing ideas out here. If the problem with transparency is caused by a deferred renderer not being able to render transparent objects in the correct order, can't the transparency issue be resolved with a depth peeling shader?

Anyway, that's really too much to handle! I especially dislike the third issue.


#7 Cypher19   Members   -  Reputation: 768

Posted 21 November 2006 - 04:06 AM

Oh boy, here we go again...when AndyTX finds this thread he'll be able to talk about it at greater lengths and possibly expand on my post, but here's a rundown of the pros and cons of deferred rendering that I know of, or are most prominent in my mind.

Deferred rendering:

Pros:
-VERY easy and simple to make, and can be made fairly fast.
-Very tight boundaries on what pixels the light volumes affect (i.e. not many pixels, if any, are wasted).
-Fairly geometric complexity-agnostic because you're only doing a single draw.
-Low amount of state changes
-One-time generation of reasonably complex perpixel data, e.g. calculation of normals for normal mapping, Fresnel specular coefficients, anisotropic filtering.
-Scales very nicely as number of lights increases

Cons:
-No hardware antialiasing.
-Fairly high memory cost, especially at higher resolutions.
-No hardware antialiasing.
-Difficult, but not impossible, to support a reasonably wide variety of materials.
-No hardware antialiasing.
-High initial cost of G-Buffer generation due to the large amount of data that must be generated. (the last pro above should be strongly noted in conjunction with this point)
-No hardware antialiasing.
-Very fillrate/memory-bandwidth intensive.
-No hardware antialiasing.
-[This is more of an anti-pro than a con...] If you're doing any kind of shadowing solution, the geometry complexity argument flies out the window.
-I'm not sure if I mentioned this or not, but there's also no hardware antialiasing.

Now, regarding the repeated con above, there are semi-decent solutions to the antialiasing issue. You can do supersampling, you can render the scene multiple times with jittered projection matrices, you can do post-process weighted blurs, and so on. However, imo at least, all of those are just absolute shit and BARELY justify the rest of the performance benefits that DR gives. Also, given that Red and Green (er..Green and Green?) have done just tons and TONS of R&D in improving AA performance and quality, to the point that you can lose less than 10% of your performance and have the image look extremely sharp and crisp even at very high resolutions, I don't think it should go to waste. Supersampling has the issue of multiplying your fillrate, memory bandwidth, and memory usage by a factor of 4, and it looks just terrible compared to the high-end AA that the G70, G80, and R520 provide. Jittered projection matrices look a lot better, since you can do rotated-grid AA AND supersampling at the same time; however, you still have an Nx jump in your memory bandwidth (aside: you can do as many or as few samples as you want with this method and have it be consistent across the whole screen) and, depending on your implementation, in memory usage (if not the latter, then you'll have to re-render your frames a fair bit, since you'll need to make new G-buffers, shadow maps, and so on). Lastly, weighted blurs: they blur the image a bit, don't provide good/any AA where it's needed most (usually high-contrast regions), and don't provide any sub-pixel granularity, all through the addition of a reasonably expensive post-process effect (that example in GPU Gems 2 for STALKER had, what, something like 20 texture reads?).
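
As a toy illustration of the jittered approach and its Nx cost, the Python sketch below renders a one-dimensional "scanline" containing a single hard edge several times with sub-pixel offsets and averages the frames. The `jitter` parameter only stands in for a jittered projection matrix, and the function names are invented for this example.

```python
def render_scanline(edge_x, num_pixels, jitter=0.0):
    """Point-sample a hard edge once per pixel: 1.0 left of the edge, 0.0 right.
    `jitter` shifts every sample position by a sub-pixel amount."""
    return [1.0 if (i + 0.5 + jitter) < edge_x else 0.0
            for i in range(num_pixels)]


def render_jittered(edge_x, num_pixels, num_samples):
    """Render the whole scene num_samples times with evenly spaced sub-pixel
    jitter and average the frames. Cost grows linearly with num_samples."""
    accum = [0.0] * num_pixels
    for s in range(num_samples):
        jitter = (s + 0.5) / num_samples - 0.5  # offsets inside (-0.5, 0.5)
        frame = render_scanline(edge_x, num_pixels, jitter)
        accum = [a + f for a, f in zip(accum, frame)]
    return [a / num_samples for a in accum]


print(render_scanline(2.25, 4))     # one sample per pixel: hard step
print(render_jittered(2.25, 4, 4))  # four full renders: partial coverage
```

The edge pixel ends up with a fractional value instead of a hard step, but only because the entire frame (and in a deferred renderer, the G-buffer fill and every light pass) was repeated once per sample.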

The pros and cons of forward rendering I'm too lazy to type out now...

#8 Schrompf   Prime Members   -  Reputation: 932

Posted 21 November 2006 - 05:15 AM

Someone please explain to me why you don't shade unnecessary pixels with lights. You avoid the common pitfall of calculating a light for the whole object, even if only a small subset of the object is actually affected by the light. Instead, you render a screen quad or maybe a simple mesh resembling the light volume to catch all pixels affected by the light in screen space. How do you avoid shading pixels far away from the viewer?

For example, have a look at the picture I linked to in the other thread. The three lights in front cover a lot of screen space. In a deferred renderer the light volume would cover almost the entire screen. The light shader will also be executed for all the pixels belonging to the dark hill behind the house. None of the lights reach this hill, yet the screen quad would try to light it as well, because the pixels are inside the lit area in screen space. This also counts as "wasted pixels" in my opinion, and quite a lot of them at once.

The simple way out of this is to use dynamic branching to bail out if the attenuation is zero. But this trick can also be used in a forward renderer. Maybe you can use the Z-buffer and a proper compare function to discard all pixels behind the light volume. If you configure the depth test to GREATER_THAN, all pixels behind the light volume are discarded and the hillside wouldn't be lit. But then you run into problems if your light volume is behind a wall. Either way, with deferred rendering you're going to light up as many useless pixels as in a forward renderer.
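
The attenuation early-out could be sketched like this in Python. The falloff function is a hypothetical one chosen so that it reaches exactly zero at the light radius, and `light_pass` is an invented name; on a GPU the same test would be a dynamic branch in the light shader.

```python
import math


def attenuation(distance, radius):
    """Hypothetical smooth falloff that reaches exactly zero at `radius`."""
    if distance >= radius:
        return 0.0
    return (1.0 - distance / radius) ** 2


def light_pass(gbuffer_positions, light_pos, radius):
    """Run one light over every covered pixel, bailing out early where the
    light cannot reach. Returns (per-pixel contribution, pixels fully shaded)."""
    contributions = []
    fully_shaded = 0
    for position in gbuffer_positions:
        att = attenuation(math.dist(position, light_pos), radius)
        if att == 0.0:
            contributions.append(0.0)  # early-out: skip the expensive shading
            continue
        fully_shaded += 1
        contributions.append(att)      # stand-in for the full lighting math
    return contributions, fully_shaded


# One pixel near the light, one on the distant "hill" behind it.
pixels = [(0.0, 0.0, 1.0), (0.0, 0.0, 50.0)]
print(light_pass(pixels, (0.0, 0.0, 0.0), 10.0))
```

Note that the distant pixel still pays for the position fetch and the distance test, so the branch reduces the wasted work without removing it.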

In addition you have to calculate a lot of data in the pixel shader that is low-frequency enough to be calculated in the VS of a forward renderer. The attenuation is one example, shadow map texture coordinates are another, and there are more. This is probably not a problem, because you have a lot of time to calculate that data while the texture units fetch all the necessary values from the various high-precision buffers. Still, I consider this a waste as well.

Bye, Thomas

#9 AndyTX   Members   -  Reputation: 802

Posted 21 November 2006 - 05:16 AM

Quote:
Original post by Cypher19
Oh boy, here we go again...when AndyTX finds this thread he'll be able to talk about it at greater lengths and possibly expand on my post

Sigh... I don't have the energy for this any more. All of the information is out there - we've spoken at length in previous posts about it. Your rundown is as good as mine, we just differ in opinion on how important/costly various parts are. To be fair, that will vary a lot depending on your application/scene, so the best advice is to JUST TRY IT.

IMHO an efficient deferred renderer is a lot easier to get up and running than an efficient forward renderer. The former I wrote in less than two weeks, and the latter took more than a month, simply due to the complexity of efficiently handling the lighting/geometry complexity blowup. In the end, the deferred renderer was still hands-down the more flexible of the two, and had the most predictable performance characteristics. In some scenes the forward renderer was faster (even much faster for simpler scenes), but for others it was practically unusable.

If I'm writing an application that runs faster on a forward renderer, by all means I will use that (and rejoice at the nice hardware 8x MSAA!). The same applies to the deferred renderer. My point is that it is completely possible to write both renderers that deal with the same data sets and even the same shaders, producing the exact same output. Deferred shading isn't some fantastic (in the original meaning) way of reinventing rendering as we know it... it's really a simple refactoring of the rendering loop. The resulting image will be the same in both cases.
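
The "simple refactoring of the rendering loop" can be shown with a toy software rasterizer in Python. This is only a sketch under heavy assumptions: fragments arrive pre-rasterized as `(pixel, depth, albedo, normal)` tuples, lights are directional `(dx, dy, intensity)` tuples, and shading is diffuse only. All names are invented for the example.

```python
import math


def shade(albedo, normal, lights):
    """Diffuse-only lighting from directional lights (dx, dy, intensity)."""
    total = 0.0
    for dx, dy, intensity in lights:
        total += intensity * max(normal[0] * dx + normal[1] * dy, 0.0)
    return albedo * total


def render_forward(fragments, lights, num_pixels):
    """Shade every fragment that passes the depth test as it arrives."""
    depth = [math.inf] * num_pixels
    color = [0.0] * num_pixels
    shade_calls = 0
    for pixel, z, albedo, normal in fragments:
        if z < depth[pixel]:
            depth[pixel] = z
            color[pixel] = shade(albedo, normal, lights)  # may be overdrawn
            shade_calls += 1
    return color, shade_calls


def render_deferred(fragments, lights, num_pixels):
    """First store surface attributes, then shade each visible pixel once."""
    depth = [math.inf] * num_pixels
    gbuffer = [None] * num_pixels
    for pixel, z, albedo, normal in fragments:
        if z < depth[pixel]:
            depth[pixel] = z
            gbuffer[pixel] = (albedo, normal)
    color = [0.0] * num_pixels
    shade_calls = 0
    for pixel, texel in enumerate(gbuffer):
        if texel is not None:
            color[pixel] = shade(texel[0], texel[1], lights)
            shade_calls += 1
    return color, shade_calls


fragments = [(0, 5.0, 0.5, (0.0, 1.0)),   # hidden by the next fragment
             (0, 2.0, 0.8, (1.0, 0.0)),
             (1, 1.0, 1.0, (0.0, 1.0))]
lights = [(0.0, 1.0, 1.0), (1.0, 0.0, 0.5)]
print(render_forward(fragments, lights, 3))
print(render_deferred(fragments, lights, 3))
```

Both functions produce the same image; the only difference in this toy is that the forward path also shades the fragment that is later overdrawn, while the deferred path pays for storing the G-buffer instead.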

Anyways as I mentioned I'm getting tired of extolling the goodness of deferred rendering. In the end, I don't really care what you use... do what makes you happy. I'm only trying to help others learn from the experience that I have, and let people know that it may not be necessary to spend a month (or more) writing an efficient lighting system for your forward renderer.

So simply for the benefit of the original poster, I'll make one last reply here, but I don't intend to get sucked into another discussion/argument. To be honest, I'm getting bored of the topic... there are more important and exciting things to research nowadays; deferred rendering has been a known technique for twenty years now.

In any case:

Quote:
Original post by Ashkan
2) No out of the box transparency

If you really want to use alpha blending, just do it with forward rendering at the end. This works perfectly fine. I have the privilege of access to an efficient GPU raytracer, so this hasn't been a problem for me. I suspect raytracing will become the preferred way to do secondary rays soon enough, as it is more flexible, and quite fast on modern hardware.

Quote:
Original post by Cypher19
-No hardware antialiasing.

I'm the first to admit that this isn't ideal. Still I'm not too fazed by it since a good renderer will need other methods of anti-aliasing as well. Let me emphasize that MSAA helps to solve one aliasing problem - many more exist!

I'm somewhat partial to jittered projection matrices myself since they help to solve all aliasing problems (shader, texture, raytracing, etc). In particular almost all current renderers that I've seen have major issues with specular aliasing (due to normal maps). There are ways to help this, but I really don't want to have to attack every single aliasing problem one by one. With cards like the G80 available now, I'll gladly trade off my 300Hz for some extremely nicely anti-aliased 75Hz. This will only continue.

Quote:
Original post by Cypher19
-Fairly high memory cost, especially at higher resolutions.

Definitely true as well, but it shouldn't be overstated. Even at 1600x1200 with a fairly large G-buffer, you're still talking <60MB. That's quite reasonable for a 256MB card, and pebbles for a 512MB (or now 768MB).
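
A quick back-of-the-envelope check of that figure in Python. The G-buffer layout here is an assumption made for the arithmetic (four 64-bit render targets, for example RGBA16F); the post does not say which layout was actually used.

```python
def gbuffer_bytes(width, height, bytes_per_target):
    """Total G-buffer size for a list of per-pixel render target sizes."""
    return width * height * sum(bytes_per_target)


# Assumed layout: four render targets of 8 bytes per pixel each.
layout = [8, 8, 8, 8]
size = gbuffer_bytes(1600, 1200, layout)
print(size)                  # bytes
print(size / (1024 * 1024))  # MiB
```

That works out to 61,440,000 bytes, or about 58.6 MiB, which is in line with the "<60MB" estimate for a fairly fat G-buffer. The depth buffer and any extra targets would come on top.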

Quote:
Original post by Ashkan
3) Hard to implement a flexible lighting model.

Quote:
Original post by Cypher19
-Difficult, but not impossible, to support a reasonably wide variety of materials.

I didn't find it that difficult, but your experience may vary. I admit that I had Libsh to help shader encapsulation here, but Cg can probably do something similar (if not quite as powerful). This doesn't seem like a very tough problem to me, but enough people complain about it that I'm willing to admit that I'm wrong. Still, I wonder if all those who complain have actually sat down and tried to implement it...

Quote:
Original post by Cypher19
-High initial cost of G-Buffer generation due to the large amount of data that must be generated.

Yes definitely. The cost of rendering your scene with a forward renderer must be sufficiently high to hide this fixed cost. In my experience, that isn't as high as most people think, and indeed it has been getting lower and lower on newer hardware. I've not had time to play with it a lot, but it seems like the G80 is a deferred rendering dream machine due to its flexibility and architecture.

Quote:
Original post by Cypher19
-Very fillrate/memory-bandwidth intensive.

Certainly true as well, but as I've mentioned several times, the reads are coherent and predictable, so this negative will drop off as memory architectures continue to improve. Already (on X1x00's and G7x and G80's) the latency is effectively hidden.

Quote:
Original post by Cypher19
-If you're doing any kind of shadowing solution, the geometry complexity argument flies out the window.

I'm not sure if I agree with you here. Even with shadowing the rendering complexity of forward and deferred rendering is not changed. You simply add a fixed (potentially large) cost for generating shadow maps. Note that for forward rendering, this breaks batching and/or requires one to keep several shadow maps around at a time. This is not the case for deferred rendering, which deals with it rather naturally IMHO (shadowed light? render a shadow map -> render the light -> done with shadow map).
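
The "render a shadow map -> render the light -> done with shadow map" loop might be sketched as follows in Python. The event names and the light dictionaries are hypothetical; the point is only that the loop never needs more than one shadow map alive at a time.

```python
def deferred_light_loop(lights):
    """Process lights one by one, reusing a single shadow map slot.
    Returns the list of steps taken and the peak number of live shadow maps."""
    events = []
    live_maps = 0
    peak_maps = 0
    for light in lights:
        if light["shadowed"]:
            live_maps += 1
            peak_maps = max(peak_maps, live_maps)
            events.append(("render_shadow_map", light["name"]))
        events.append(("accumulate_light", light["name"]))
        if light["shadowed"]:
            live_maps -= 1  # the slot can be reused by the next light
            events.append(("release_shadow_map", light["name"]))
    return events, peak_maps


lights = [{"name": "sun", "shadowed": True},
          {"name": "lamp", "shadowed": False},
          {"name": "torch", "shadowed": True}]
events, peak = deferred_light_loop(lights)
print(peak)
```

A forward renderer that wants to apply several shadowed lights in one pass over the geometry would instead need all of their shadow maps resident at once.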

Plus do note that one usually only needs a handful of shadowed lights in a scene... in my experience there are many more unshadowed lights than shadowed ones, but maybe I'm just working with the wrong scenes.

Quote:
Original post by Cypher19
Also, given that Red and Green (er..Green and Green?) have done just tons and TONS of R&D in improving AA performance and quality, to the point that you can lose less than 10% of your performance and have the image look extremely sharp and crisp even at very high resolutions, I don't think it should go to waste.

I agree, and I hate that it goes to waste. Still, I wish they'd spend their transistors on more programmable hardware. I may have to bite my tongue, though, as there's some indication that D3D10 will support custom multi-sample resolves, which means that MSAA *could* work with deferred rendering. The fact that it hasn't worked so far is simply due to there being yet another chunk of nonprogrammable hardware sitting in GPUs.

Quote:
Original post by Cypher19
Lastly, weighted blurs: blurs the image a bit

Weighted blurs are stupid, and they simply are not anti-aliasing. I agree with you 100% that they suck.

Anyways that's hopefully the end of my contribution to this topic. I'm just trying to give some feedback as I have done a considerable amount of research and prototype implementations on the subject. In the end though, use what's best for your application. I'd encourage you to try both, but simply being aware of the existence of alternatives is enough to make me happy.

It seems like most people (Cypher19 at least :) understand the basics now, and the spread of misinformation has been significantly reduced over the past few months, so hopefully deferred shading won't need a champion here anymore ;)

#10 daktaris   Members   -  Reputation: 145

Posted 21 November 2006 - 05:45 AM

wow...quite a post for no energy :P...what the hell happens when you're caffeinated?

Quoting the best advice out of all this, imo:

Quote:

To be fair, that will vary a lot depending on your application/scene, so the best advice is to JUST TRY IT.


That way YOU understand which path is dominant in YOUR application.

#11 Ashkan   Members   -  Reputation: 450

Posted 21 November 2006 - 06:28 AM

First of all I sincerely thank you all for your contribution to this thread. But may I release this built up tension by expressing my deepest burning urge to shout:

THIS IS ONE HELL OF A CRAZY THREAD!



But boy, did you crave for such a thread or what?

First, we have Mr. pomesuba, a Guinness record holder, who has been a registered member of GameDev.net since 2004, but whose contribution to this thread marks his very first post on the forums, almost two years after his initial registration! If he didn't wait two long years to post his first response just so he could contribute to a next-generation deferred vs. forward renderer flame war, then what did he wait for?

And you Cypher19 and AndyTx, you need more than shotguns to deal with each other.

But seriously guys, your posts were all very useful. I think I'll take your advice, AndyTx, and start implementing a deferred renderer of my own. I will, as soon as I recover from this hypovolaemic shock.

#12 Cypher19   Members   -  Reputation: 768

Posted 21 November 2006 - 06:34 AM

Quote:
It seems like most people (Cypher19 at least :) understand the basics now, and the spread of misinformation has been significantly reduced over the past few months, so hopefully deferred shading won't need a champion here anymore ;)


Nor forward rendering [wink]

Quote:
And you Cypher19 and AndyTx, you need more than shotguns to deal with each other.


Bleh, we really don't. A good chunk of the energy that Andy spent on DR discussions that he was referring to was in debates against me, I think [grin]

#13 wolf   Members   -  Reputation: 836

Posted 21 November 2006 - 07:07 AM

... I want to be the last person who says something in this thread:

It all depends... :
- on your hardware platform (target group etc.)
- on your game design

... there is no right or wrong, just compare the requirements with the abilities of each approach.


#14 AndyTX   Members   -  Reputation: 802

Posted 21 November 2006 - 07:11 AM

Quote:
Original post by Ashkan
And you Cypher19 and AndyTx, you need more than shotguns to deal with each other.

Heh, I'm out of ammo ;)

Seriously though, I have a lot of respect for Cypher19, and there's no need for us to "argue" any more. He (she?) represents exactly what I'm trying to accomplish: someone who knows and understands both techniques and their features well enough to make an informed decision when faced with a problem.

#15 rick_appleton   Members   -  Reputation: 857

Posted 24 November 2006 - 10:58 AM

I've been reading all your posts with great interest Andy, and am in the (slow) process of creating a deferred renderer myself, to get some familiarity with them.

You mentioned it took you a few weeks to create a good forward renderer that could handle all the shader permutations efficiently. How long did the forward rendering part of your DR take then? Are there so many fewer permutations for transparent objects? Do you move as much transparency as possible to alpha testing (which I think can be deferred)? What kind of complexity are we talking about for this part of the DR pipeline?

#16 Matt Aufderheide   Members   -  Reputation: 99

Posted 24 November 2006 - 11:39 AM

What are the benefits of a deferred renderer? When it comes down to it, it seems only useful for a large number of *non-shadow casting* lights in one scene. This seems to me not very useful in most applications. Unless your game features a lot of Christmas tree lights, why bother?

I fail to see any other major benefit. Maybe I'm being simplistic..but there it is.

#17 wolf   Members   -  Reputation: 836

Posted 24 November 2006 - 12:30 PM

Quote:
... it seems only useful for large number *non-shadow casting* lights in one scene. This seems to me not very useful in most applications.
I like that :-) ... if you want to handle more than four lights per-pixel, you will be concerned about the number of cached shadows you can handle :-) ... not the question if your renderer stores data in an intermediate step :-)

... in other words: the question if forward or deferred rendering is better, depends on the requirements of the game (> 4 per-pixel lights are necessary?) and what your target platform is offering you. if you are able to think about so many light sources, your smallest problem is how you designed the renderer ... your real concern is the number of shadows you can apply :-)




#18 superpig   Staff Emeritus   -  Reputation: 1825

Posted 24 November 2006 - 01:48 PM

Has anyone here tried implementing transparency in a deferred renderer using something akin to depth peeling?

I've never tried it, and I'd guess you find yourself having to do things like re-run all the lighting passes for each peel layer - and of course there's the problem whereby you don't easily know how many layers you need to do - but I'm just curious as to whether anyone here has tried it.

#19 AndyTX   Members   -  Reputation: 802

Posted 25 November 2006 - 09:14 AM

Quote:
Original post by rick_appleton
You mentioned it took you a few weeks to create a good forward renderer that could handle all the shader permutations efficiently. How long did the forward rendering part of your DR take then?

Alas I just used the forward renderer that I had already written :) Note that the number of transparent objects in a scene is usually pretty low, so the renderer here doesn't have to be full-featured or efficient. If you have a scene with lots of transparent objects, you probably shouldn't be using deferred (or forward) rendering anyways ;)

Quote:
Original post by rick_appleton
Are there so many fewer permutations for transparent objects?

No not really, although there are often many fewer transparent objects. Also note that lighting on glass and similar surfaces is usually either non-existent, or low frequency and easy to get away with really rough approximations (like simply an ambient term).

Quote:
Original post by rick_appleton
Do you move as much transparency as possible to alpha testing (which I think can be deferred)? What kind of complexity are we talking about for this part of the DR pipeline?

Alpha testing is fine where possible, but only really gives you predicated writes to the framebuffer, not compositing. The latter requires sorting and thus you'll need to do that somewhere... Either that or don't use compositing for alpha blending.
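
To show why compositing needs sorting, here is a minimal per-pixel sketch in Python of blending transparent surfaces over an already-lit opaque result, as a forward pass at the end of a deferred frame would. The tuple layout `(depth, color, alpha)` and the function name are assumptions made for this example, and colors are single floats to keep it short.

```python
def composite_transparent(opaque_color, opaque_depth, surfaces):
    """Blend transparent surfaces over an opaque pixel, farthest first.
    Each surface is (depth, color, alpha); smaller depth is nearer."""
    color = opaque_color
    for depth, src_color, alpha in sorted(surfaces, key=lambda s: s[0],
                                          reverse=True):
        if depth >= opaque_depth:
            continue  # hidden behind the opaque geometry
        color = src_color * alpha + color * (1.0 - alpha)  # "over" operator
    return color


# Opaque pixel with value 1.0 at depth 10, two half-transparent layers.
layers = [(2.0, 0.0, 0.5), (5.0, 0.5, 0.5)]
print(composite_transparent(1.0, 10.0, layers))
print(composite_transparent(1.0, 10.0, list(reversed(layers))))
```

Because the function sorts internally, both calls give the same result regardless of submission order. Blending the same two layers near-to-far instead would give 0.5 rather than 0.375, which is why alpha testing alone is not a substitute.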

Quote:
Original post by superpig
Has anyone here tried implementing transparency in a deferred renderer using something akin to depth peeling?

I've played with that a bit, but it's somewhat slow on current hardware. May be feasible to do a few layers on a G80 though.

Depth peeling really does turn out to be the "proper" answer to a lot of these problems though... that or having a layered Z/colour buffer, but I can't think of a good way of implementing that in hardware. Sorting fragments would be a major performance hit regardless of where it is in the pipeline.
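
For reference, the core of depth peeling for a single pixel can be sketched in Python as below. This is a CPU toy under stated assumptions: fragments are `(depth, payload)` tuples, depths at one pixel are distinct, and each loop iteration stands for one full re-render of the scene on the GPU, which is where the cost comes from.

```python
import math


def peel_layers(fragments, max_layers):
    """Extract up to max_layers fragments in front-to-back order.
    Each pass keeps the nearest fragment strictly behind the previous layer."""
    layers = []
    last_depth = -math.inf
    for _ in range(max_layers):
        candidates = [f for f in fragments if f[0] > last_depth]
        if not candidates:
            break  # nothing left to peel at this pixel
        nearest = min(candidates, key=lambda f: f[0])
        layers.append(nearest)
        last_depth = nearest[0]
    return layers


fragments = [(5.0, "c"), (1.0, "a"), (3.0, "b")]
print(peel_layers(fragments, 2))
```

In a deferred setting each peeled layer would need its own G-buffer fill and lighting passes, and the number of layers needed is not known up front, which matches the concerns raised above.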

#20 Schrompf   Prime Members   -  Reputation: 932

Posted 26 November 2006 - 01:05 AM

One question to satisfy my curiosity: How is ambient lighting done in a Deferred Renderer? As far as I understand it, everything beyond a constant ambient term can't be done in screen space. Do you simply add an ambient term to every light source? Do you write out the ambient term per object in a separate pass? What did you do in your Deferred Renderer?

Bye, Thomas



