Deferred Shading vs Pre-pass lighting

24 comments, last by MJP 12 years ago
I just changed my implementation of deferred shading to pre-pass lighting (the algo discussed here: http://diaryofagraphicsprogrammer.blogspot.com/2008/03/light-pre-pass-renderer.html). I still had the code for the old renderer, so I ran some tests to compare the two methods.

I first thought that pre-pass would be much faster when drawing many lights (I guess I was a bit naive), but when testing, deferred shading was almost always faster on my card. On some other cards the difference changed a bit, and on a laptop card with shared memory (one that has no dedicated GPU RAM), pre-pass lighting seems faster overall.

I have collected all my data here:
http://frictionalgames.blogspot.com/2010/10/pre-pass-lighting-redux.html

The biggest question for me right now is whether I should skip the early-z pass (as Crytek seems to be doing for Crysis 2), but that just seems problematic. If I skip early-z, then I have to choose between:
1) sorting by depth, resulting in a lot of render state changes.
2) sorting by render state, resulting in a lot of unneeded overdraw.

It would be very interesting to hear if anybody else has tested these two rendering algos against each other. I would also be interested in hearing how everyone else handles the early-z pass.
Interesting topic :) I've done testing with both, using similar G-buffer setups as you have (with depth being either an R32F rendertarget, or INTZ hardware depth stencil buffer) and found them to perform very close to each other, for the most part, as long as the batch & vertex count is moderately low.

About Z-prepass I've never bothered, so can't answer. Rather, I just sort by state first, and then draw each state "bucket" front to back to fill the G-buffer.
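
The trade-off between the two sort orders can be sketched in a few lines of C++. This is only an illustration, not code from any engine mentioned in this thread: `DrawItem`, `stateId`, `viewDepth` and the helper names are all hypothetical, and `stateId` is assumed to stand for one complete shader/texture/render-state combination.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw item; the field names are assumptions for illustration.
struct DrawItem {
    std::uint32_t stateId;  // one id per shader/texture/render-state combination
    float viewDepth;        // distance from the camera, assumed >= 0
    int objectId;
};

// Option 1: pure front-to-back order (least overdraw, most state changes).
void sortFrontToBack(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.viewDepth < b.viewDepth; });
}

// State first, then front to back inside each state "bucket".
void sortByStateThenDepth(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        if (a.stateId != b.stateId) return a.stateId < b.stateId;
        return a.viewDepth < b.viewDepth;
    });
}

// Counts how often the state differs between consecutive draws.
std::size_t countStateChanges(const std::vector<DrawItem>& items) {
    std::size_t changes = 0;
    for (std::size_t i = 1; i < items.size(); ++i) {
        assert(items[i].viewDepth >= 0.0f);
        if (items[i].stateId != items[i - 1].stateId) ++changes;
    }
    return changes;
}
```

With four objects that alternate between two states as they get farther from the camera, the pure depth sort switches state three times, while the state-then-depth sort switches once and still draws front to back within each bucket. How much overdraw that costs depends entirely on the scene.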

I can give data from one test scene, which is about 200K triangles and 110 objects in the initial view, and 600K triangles and 500 objects for the whole scene. It's lit by 20 unshadowed spotlights (without much overdraw, though), and on a GeForce 8800M GTS, the results are:
Prepass: 8.3 ms (initial) 9.9 ms (whole scene)
Deferred: 8.6 ms (initial) 6.4 ms (whole scene)

Light prepass is where I started with deferred lighting techniques, and I would want to like it :) as it's cool to have a slim G-buffer. However, according to my results, when it wins over deferred the wins are small, but when vertex counts grow, deferred can have a large advantage. Furthermore, with deferred shading it's easy to match the look of forward rendering, as the lighting pass accumulates the final framebuffer colors additively, whereas prepass has problems with the limited range of the light accumulation buffer and with constructing the specular highlights.
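
To illustrate the specular problem, here is a minimal C++ sketch of one commonly described light pre-pass layout: diffuse light accumulated in RGB, a single luminance-weighted specular scalar in alpha, and the specular color rebuilt from the chromaticity of the diffuse sum. This layout is an assumption for illustration, not necessarily what either renderer in this thread does, and `LightTerm`, `LightBuffer` and the function names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

using Rgb = std::array<float, 3>;

// Rec. 709 luma weights.
float luminance(const Rgb& c) {
    return 0.2126f * c[0] + 0.7152f * c[1] + 0.0722f * c[2];
}

// Hypothetical per-pixel contribution of one light.
struct LightTerm {
    Rgb color;
    float nDotL;     // clamped N.L, assumed >= 0
    float specular;  // scalar specular factor, e.g. pow(N.H, power)
};

// Assumed four-channel light buffer: RGB diffuse plus one specular scalar.
struct LightBuffer {
    Rgb diffuse{0.0f, 0.0f, 0.0f};
    float specular = 0.0f;
};

// Additive blending of one light into the buffer.
void accumulate(LightBuffer& buf, const LightTerm& l) {
    assert(l.nDotL >= 0.0f);
    for (int i = 0; i < 3; ++i) buf.diffuse[i] += l.color[i] * l.nDotL;
    buf.specular += luminance(l.color) * l.nDotL * l.specular;
}

// Rebuilds an RGB specular term by borrowing the hue of the diffuse sum.
Rgb reconstructSpecular(const LightBuffer& buf) {
    const float lum = std::max(luminance(buf.diffuse), 1e-6f);
    Rgb out{};
    for (int i = 0; i < 3; ++i) out[i] = buf.diffuse[i] / lum * buf.specular;
    return out;
}
```

With a single light the reconstruction matches the exact term `color * nDotL * specular`. With a red light that has a highlight and a blue light that has none, the rebuilt highlight picks up blue, because the hue comes from the summed diffuse light rather than from the light that actually produced the highlight.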
Seems like your results are the same as mine!

I also want to like pre-pass lighting :) But as you say, it is very rarely better than deferred, and when it is, the difference is slim.

Regarding the difference in light accumulation, I do not think this matters much, and pre-pass will give nicer results when many dark lights overlap. The screens I have tried so far with both algos look basically the same.

The big reason for me to stick with pre-pass lighting is that so much more material variety is possible. Materials with rim lighting and the like all require extra passes with deferred shading, and any kind of skin rendering is not really possible. But it also feels wrong to switch to what seems like the worse technique performance- and precision-wise.

AgentC, what did you end up using for the application?
At this stage there's no real application yet, it's just a result from my open-source engine Urho3D, which right now implements deferred & forward multipass shading.

However, I'm thinking that I might want to reimplement prepass mode to the public version just so that there's more choice, in case it's better for some (yet undecided) purpose!
I once had deferred rendering in my engine, too. And as you mentioned in your blog, deferred rendering was slower than LPP on older hardware and/or laptops.
Well, I am on an old laptop here! It just has a GeForce 8400GS Go which is pretty lame at EVERYTHING.
With deferred rendering I only had to place like 3 big lights into my scene and my FPS was at 8-10.

That's pretty much solved with the LPP. I get good framerates with it.
I would say, even if it is not faster than deferred rendering (Which it is not in my case), the freedom you get with your shaders is reason enough to use it.

Our artist always said: "I need emissive here, I need specular power there, and this thing would be cool and that even cooler, etc etc" and I ended up having 5 MRTs for stuff.
That's pretty much solved with the LPP.
Well, it definitely depends on a lot of factors. Deferred rendering is typically very heavy on bandwidth, due to the fat G-Buffer (MSAA makes this even worse). Light prepass, on the other hand, requires rendering most of your geo twice, and this can hurt a lot depending on your scene. Multiple render targets can also be unusually painful on certain platforms, which makes light prepass more attractive. Then you have PS3, where SPUs can be used to handle the lighting pass of a light prepass renderer.

The important point here is that like with anything performance-related, it always varies depending on the usage and the target hardware.
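
As a rough back-of-the-envelope for the bandwidth point, the C++ sketch below computes how many bytes it takes to fill a set of color targets once. The function name, the resolution and the target layouts are assumptions for illustration only; it ignores depth, overdraw, hardware compression and the reads done by the lighting pass, and it does not capture the cost of light prepass drawing the geometry a second time.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Bytes written when every sample of every color target is filled exactly once.
std::uint64_t gbufferWriteBytes(std::uint32_t width, std::uint32_t height,
                                std::uint32_t msaaSamples,
                                const std::vector<std::uint32_t>& bytesPerPixelPerTarget) {
    assert(msaaSamples >= 1);
    const std::uint64_t bytesPerSample =
        std::accumulate(bytesPerPixelPerTarget.begin(), bytesPerPixelPerTarget.end(),
                        std::uint64_t{0});
    return std::uint64_t{width} * height * msaaSamples * bytesPerSample;
}
```

For example, at 1280x720 a hypothetical deferred layout of four 32-bit targets comes to 14,745,600 bytes per fill, a single 32-bit normal target for light prepass comes to 3,686,400 bytes, and the four-target layout with 4x MSAA comes to 58,982,400 bytes.
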
I just went back from a lighting prepass system to an ubershader that only supports a finite number of lights.

Just ask yourself if you really need 2000 lights in a scene and when you give the honest answer, long single pass shaders suddenly look more attractive again.

For a start, you can do lighting effects on translucent materials (which was my main reason for switching back tbh).

You just can't get faster than one pass. And shader execution speed is more likely to keep on going up in the future, at a faster rate than bandwidth widening, because faster shading is just cores++ which is more or less an unlimited upgrade path.

Your only real limit now is interpolators, and they added a shitload more with DX10, along with an infinite shader length (sort of).

But the above isn't the current trendy thing, so I guess now I get flamed to death for having crap tech... :)
------------------------------Great Little War Game
Nope, you're not alone. At least on current generation consoles, forward rendering & ubershaders are still very common. Many people have switched to LPP, especially some of the major PS3 studios, but I think the majority of 360 & PS3 titles are still using more traditional techniques. (My last shipped title had an ubershader with ~12 lights.)

Yes, the artists want more lights in the scene, and we're always looking at ways to achieve that. But they also love transparency, so it's a balancing act. I will say that the many recent advances in post-process AA have pretty much eliminated that as a point in favor of forward rendering. But I don't think that the translucency solutions (Inferred lighting, or the similar technique from Little Big Planet) have caught on that fast.
Some other food for thought:

Complex forward rendering needs a Z-pre-pass to reduce shading costs. This is less important with deferred, and not required in LPP.

Most deferred/LPP renderers also support forward lighting (e.g. for translucent objects). If a particular object needs a special shader, it can use forward lighting while the rest of the scene uses deferred/LPP.

LPP can be implemented without MRT, meaning it works on older cards (e.g. DX8).
LPP works with MSAA on DX9 (deferred needs DX10).

Inferred rendering (basically an extension of LPP) supports lighting of translucent surfaces (without requiring a forward-rendering path!), and also allows you to scale the cost of the lighting passes with mixed resolution rendering.

[Edit]
Quote:Original post by Hardguy
I have collected all my data here:
http://frictionalgames.blogspot.com/2010/10/pre-pass-lighting-redux.html
This isn't the best competitive benchmark of both algorithms - it would be nice to see CPU ms and GPU ms reported separately.

In the "4000 x boxes, 1 x point light" test, is that 4000 draw-calls, each with 12 triangles? If so, it would be a much fairer test if the boxes were drawn using 40 draw-calls, each with 1200 triangles...

[Edited by - Hodgman on October 22, 2010 11:20:13 PM]
Sorry for digging this topic up, but recently I've been thinking about LPP (a.k.a. deferred lighting) vs deferred shading. I read some material on the net and eventually came to the conclusion that deferred shading is indeed "better" in many ways. While I was still hesitating over which technique would be nicer to use, I considered how adding parallax mapping would affect both pipelines. Now, tell me if I am wrong but... isn't it necessary in deferred lighting to do parallax mapping (or relief mapping, or any similar technique) twice, first in the g-buffer pass (to offset the normals), and then in the shading pass (to offset the diffuse, specular and other textures)? If so, it seems like a huge waste of GPU time, given that parallax/relief mapping techniques can be very expensive.

On the other hand, how would you handle cube-mapped materials and emissive materials in deferred shading? When it comes to cube maps I think I would simply blend them with the diffuse map in the g-buffer pass, but what about emissive? Using another render target to store emissive doesn't sound nice. The only thing I can think of is to just re-render all emissive objects again (with emissive lighting only) after the deferred shading pass.

