Forward+ vs Deferred rendering

11 comments, last by Eklipse 9 years, 1 month ago

Hi,

As an independent study for college I am implementing a deferred renderer into the game engine I have been developing. I've been reading a lot about the details of this, but something that caught my eye is "forward+" rendering. It would seem to me that with the rise of physically based materials in video games, deferred rendering should be an obvious choice moving forward (as PBR relies on a unified shading model, which deferred rendering addresses wonderfully). I'm having a little bit of trouble understanding the concept behind forward+, but from what I've read it seems like "partially deferred rendering"... or something. Could anyone explain this more clearly? Are there any significant advantages to forward+ over deferred (or the other way around)? Thanks


Hey, it's been a little while since I've looked at an implementation, but IIRC it's like this:

Generate a G-buffer containing only normals and depth.

Generate an irradiance buffer for all the lights.

Forward-render your scene using the irradiance buffer as an input to your lighting.

Deferred shading, by contrast, has you write out all your material/albedo/depth/etc. parameters first and then render the lights to shade them. I hope that was clear.
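As a rough engine-side outline of that pass sequence (all of these types and function names are made up purely for illustration; note that this variant, with its separate irradiance buffer, is closer to what's usually called "light pre-pass" rendering than to the tiled light-list flavor described in the replies below):

```cpp
// Hypothetical types and function names, just to show the pass ordering;
// real signatures depend entirely on your engine.
struct Texture2D { /* GPU texture handle, elided */ };
struct Scene  {};
struct Camera {};

Texture2D renderNormalDepthPass(const Scene&, const Camera&) { return {}; }
Texture2D renderLightBuffer(const Scene&, const Texture2D& /*normalDepth*/,
                            const Camera&) { return {}; }
void renderForwardPass(const Scene&, const Camera&,
                       const Texture2D& /*irradiance*/) {}

void renderFrame(const Scene& scene, const Camera& camera)
{
    // Pass 1: thin G-buffer holding only normals and depth.
    Texture2D normalDepth = renderNormalDepthPass(scene, camera);

    // Pass 2: accumulate per-pixel irradiance from all lights, reading
    // the normal/depth buffer.
    Texture2D irradiance = renderLightBuffer(scene, normalDepth, camera);

    // Pass 3: forward-render the scene, sampling the irradiance buffer
    // while shading each material.
    renderForwardPass(scene, camera, irradiance);
}
```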

Perception is when one imagination clashes with another

This is my understanding, without having implemented it: the core concept of Forward+ is ordinary forward rendering, but with lights culled per region of the screen so you can skip computing the influence of lights on pixels they don't affect. In practice this is done by dividing the screen into equally sized rectangles ('tiles'), computing which lights affect each tile, and storing a per-tile light list that a later pass uses to compute the shading.

The culling is done in two steps. First you have a z-prepass, where you render your scene to a depth-only buffer (similar to shadow mapping). Then the culling for a single tile is done by using that z-prepass buffer to determine the smallest and largest (i.e. nearest and furthest) depth found in that tile. You then use that [min, max] depth range and the tile's extents in screen space to construct a frustum that covers everything that can be lit in that tile. You then cull all the lights in your scene against that frustum and store the lights that pass the test in a list for later use. The culling can be efficiently implemented in a compute shader. For reference, these slides show how Battlefield 3 uses compute-shader-based light culling, although they do it for tiled deferred rendering. It should be similar (identical?) for Forward+.
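If it helps, here's a minimal CPU-side C++ sketch of that per-tile culling step, assuming view-space point lights with a position and radius. A real implementation would run in a compute shader with one thread group per tile, and building the four side planes from the tile's screen rect is elided here:

```cpp
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct PointLight { Vec3 viewPos; float radius; };

// Plane with dot(n, p) + d >= 0 meaning "inside the frustum".
struct Plane { Vec3 n; float d; };

static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// A sphere touches the positive half-space if its center is no farther
// than its radius behind the plane.
static bool sphereTouchesPlane(const PointLight& l, const Plane& p)
{
    return dot(p.n, l.viewPos) + p.d > -l.radius;
}

std::vector<uint32_t> cullLightsForTile(const std::vector<PointLight>& lights,
                                        const Plane sidePlanes[4],
                                        float minDepth, float maxDepth)
{
    // Near/far planes come from the tile's [min, max] depth range, which
    // the compute shader finds by reducing the z-prepass buffer.
    Plane nearPlane{{0, 0, 1}, -minDepth};  // keeps z >= minDepth
    Plane farPlane{{0, 0, -1}, maxDepth};   // keeps z <= maxDepth

    std::vector<uint32_t> visible;
    for (size_t i = 0; i < lights.size(); ++i) {
        const PointLight& l = lights[i];
        bool inside = sphereTouchesPlane(l, nearPlane) &&
                      sphereTouchesPlane(l, farPlane);
        for (int p = 0; p < 4 && inside; ++p)
            inside = sphereTouchesPlane(l, sidePlanes[p]);
        if (inside)
            visible.push_back(static_cast<uint32_t>(i));  // global light index
    }
    return visible;
}
```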

Then, as a separate rendering pass, you go over those tiles and for a tile only need to compute the shading for the lights that actually affect it (using the light list).

As for separate G-buffers or something like that: as far as I understand it, at least the basic versions don't have them. You basically just run a second geometry pass when you do the actual shading, which of course costs some performance, but apparently not enough to make the method unattractive. In return you get all the advantages of forward rendering.

Well those are the basics anyways. These things are always implemented slightly differently depending on who is doing it. If MJP pops up, he's probably a good reference since his team used it to make the prettiest game to date.

(as PBR relies on a unified shading model, which deferred rendering addresses wonderfully).

I want to quickly mention that PBR doesn't rely on whether you're using deferred, forward, their tiled variants, or even some ray-tracing algorithm to render your images. I think a state-of-the-art, real-time PBR renderer is also pretty likely to have a lot of varying parts (e.g. different BRDFs for different special cases of materials, or forward passes in a renderer that mainly does deferred shading, for example to handle transparency), so there's not a lot of unification going on. What PBR really shines at (I'd say) is consistency in the results it produces, independent of whether you're going for photorealism or something artsy.

The idea behind forward+ is that instead of putting material properties into a G-buffer and applying lighting to those properties, you figure out a list of lights that affect each pixel (or group of pixels). Then you do a normal forward rendering pass, where each pixel being shaded loops over the list of lights and computes the final reflectance. The most common way to do this is to use a compute shader that computes lists of lights per screen-space tile, where each tile is something like 16x16 pixels in size. With forward+ you typically also have a depth prepass before computing your tile lighting lists. Having depth information lets you do a better job of culling lights, and it also ensures that you don't have overdraw during your main forward pass.
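To make that loop concrete, here's a small sketch in plain C++ standing in for the pixel shader; it assumes 16x16 tiles, a flattened offset/count light-index list, and deliberately trivial Lambert shading (all names here are illustrative, not from any particular engine):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x+b.x, a.y+b.y, a.z+b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x-b.x, a.y-b.y, a.z-b.z}; }
static Vec3 operator*(Vec3 a, float s) { return {a.x*s, a.y*s, a.z*s}; }
static Vec3 operator*(Vec3 a, Vec3 b) { return {a.x*b.x, a.y*b.y, a.z*b.z}; }
static float dot(Vec3 a, Vec3 b)       { return a.x*b.x + a.y*b.y + a.z*b.z; }

struct PointLight { Vec3 pos; Vec3 color; float radius; };

struct TileLightLists {
    std::vector<uint32_t> offset;   // per tile: start into 'indices'
    std::vector<uint32_t> count;    // per tile: number of lights
    std::vector<uint32_t> indices;  // flattened light-index lists
};

constexpr int kTileSize = 16;

Vec3 shadePixel(int px, int py, int tilesPerRow,
                Vec3 worldPos, Vec3 normal, Vec3 albedo,
                const std::vector<PointLight>& lights,
                const TileLightLists& tiles)
{
    // Which tile does this pixel fall in?
    int tile = (py / kTileSize) * tilesPerRow + (px / kTileSize);

    Vec3 result{0, 0, 0};
    uint32_t begin = tiles.offset[tile];
    uint32_t end   = begin + tiles.count[tile];

    // Loop only over the lights the culling pass kept for this tile.
    for (uint32_t i = begin; i < end; ++i) {
        const PointLight& light = lights[tiles.indices[i]];
        Vec3 toLight = light.pos - worldPos;
        float dist = std::sqrt(dot(toLight, toLight));
        if (dist < 1e-5f || dist >= light.radius) continue;  // out of range
        Vec3 dir = toLight * (1.0f / dist);
        float ndotl = std::max(0.0f, dot(normal, dir));
        float atten = 1.0f - dist / light.radius;  // crude linear falloff
        result = result + albedo * light.color * (ndotl * atten);
    }
    return result;
}
```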

Also, PBR does not imply a unified shading model for all materials. There are plenty of material types that don't fit well within the standard microfacet specular models, for instance hair and cloth.

Cool, I think I get it now. Thanks!

With forward+ in mind, is there any reason why you would still choose deferred rendering? Also, why couldn't you do tile-based light culling with deferred rendering as well?

Sometimes rendering the scene twice can be too expensive.

Perception is when one imagination clashes with another

With forward+ in mind, is there any reason why you would still choose deferred rendering?

  1. With forward+ you need to rasterize your geometry twice if you want to eliminate overdraw when computing lighting, and use depth bounds for culling lights. With deferred, even if you rasterize once you won't have any overdraw during the lighting computation. Note that it's not strictly necessary to have a depth prepass for forward+: Forza 5 shipped without a Z prepass, and they used a clustering system to bin lights along the Z axis as well as in X and Y in order to get better culling. I would imagine that they also put effort into making sure that they rendered in mostly front-to-back order, so that they could reduce overdraw.
  2. Deferred is generally less susceptible to efficiency loss from partial pixel-quad occupancy and sub-pixel triangles. This means that with forward+, having a good LOD system is very important (although it can still be important for deferred, of course).
  3. Having a G-Buffer can be really useful for other purposes. For example: SSAO, screen-space reflections, as well as decals and other effects that work by modifying the G-Buffer properties.


Also, why couldn't you do tile-based light culling with deferred rendering as well?

You absolutely can, and many games already do this. Doing it lets you do all of your lighting in a single pass, which keeps you from sampling your G-buffer over and over again.
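Roughly, the win looks like this (GBufferSample, decodeGBuffer, and evaluateLight below are hypothetical stand-ins): the G-buffer gets decoded once per pixel, and every light in the tile's list is applied in that one pass, instead of re-sampling the G-buffer in a separate pass per light volume:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; a real decode would read the packed G-buffer
// render targets for this pixel.
struct Color { float r, g, b; };
struct GBufferSample { /* albedo, normal, roughness, ... elided */ };
struct Light { /* position, radius, color, ... elided */ };

GBufferSample decodeGBuffer(int, int) { return {}; }
Color evaluateLight(const GBufferSample&, const Light&) { return {0, 0, 0}; }

// Tiled deferred: decode the G-buffer once, then loop the tile's list.
Color shadeTiledDeferred(int px, int py,
                         const std::vector<Light>& lights,
                         const std::vector<uint32_t>& tileLightIndices)
{
    GBufferSample g = decodeGBuffer(px, py);  // sampled exactly once
    Color sum{0, 0, 0};
    for (uint32_t idx : tileLightIndices) {
        Color c = evaluateLight(g, lights[idx]);
        sum.r += c.r; sum.g += c.g; sum.b += c.b;
    }
    return sum;
}
```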

Also, why couldn't you do tile-based light culling with deferred rendering as well?

You can! Battlefield 3 used it, they go over it in the slides I linked.

Cool, I think I get it now. Thanks!

With forward+ in mind, is there any reason why you would still choose deferred rendering? Also, why couldn't you do tile-based light culling with deferred rendering as well?

Because you can just do clustered rendering and do both: http://www.humus.name/Articles/PracticalClusteredShading.pdf

It's a very similar idea to tiled rendering: you just extend the tiles into 3D space, do your standard G-buffer pass for most stuff, and a forward loop pass for the forward-rendered stuff. You get transparency, highly complex materials, screen-space decals, a unified lighting model, and so on, plus tighter depth ranges and better light culling. Which, as MJP pointed out for Forza 5, means you can get away with no Z-prepass and avoid submitting your geometry twice, even for the forward rendering pass.
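For a feel of the clustering part, here's a minimal C++ sketch of mapping a pixel to a cluster index, assuming 16-pixel tiles extended into exponentially spaced depth slices (a common choice, though the exact binning varies from engine to engine):

```cpp
#include <algorithm>
#include <cmath>

struct ClusterGrid {
    int tilesX, tilesY, slicesZ;
    float nearZ, farZ;  // view-space depth range covered by the grid
};

constexpr int kTilePixels = 16;

int clusterIndex(const ClusterGrid& g, int px, int py, float viewZ)
{
    int cx = px / kTilePixels;
    int cy = py / kTilePixels;
    // Exponential slicing: equal depth ratios per slice, so clusters
    // stay tight near the camera where precision matters most.
    float t = std::log(viewZ / g.nearZ) / std::log(g.farZ / g.nearZ);
    int cz = std::clamp(static_cast<int>(t * static_cast<float>(g.slicesZ)),
                        0, g.slicesZ - 1);
    return (cz * g.tilesY + cy) * g.tilesX + cx;
}
```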

Frankly, for storing material properties in a G-buffer I've found even a single 8-bit channel to be more than you need for a single parameter. You can do clever schemes wherein one render-target channel is used for multiple material types, splitting the 0-255 range into multiple material descriptions and/or using that channel for differing material types that aren't going to appear on the same model. You don't really need 8-bit precision for metalness, after all; even half that precision, or less, will probably be fine for your artists. Other engines like Lords of the Fallen just use a channel as an 8-bit LUT of materials, and Destiny manages to be extra clever and pack a 10-bit LUT into an 8-bit channel. So if you're clever enough, getting multiple material types into a reasonable G-buffer footprint isn't the hardest thing to do.
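As a tiny sketch of that channel-splitting idea, here's one way to pack a 2-bit material type plus a 6-bit parameter into a single 8-bit channel (the 2/6 split is arbitrary, purely for illustration; it's not how Lords of the Fallen or Destiny actually pack theirs):

```cpp
#include <algorithm>
#include <cstdint>

// Top 2 bits = material type (4 entries), bottom 6 bits = a [0,1]
// parameter such as metalness quantized to 64 steps.
uint8_t packMaterialChannel(uint8_t materialType /*0..3*/, float param /*0..1*/)
{
    uint8_t quantized = static_cast<uint8_t>(
        std::clamp(param, 0.0f, 1.0f) * 63.0f + 0.5f);  // round to 6 bits
    return static_cast<uint8_t>(((materialType & 0x3) << 6) | quantized);
}

void unpackMaterialChannel(uint8_t packed, uint8_t& materialType, float& param)
{
    materialType = packed >> 6;                          // top 2 bits
    param = static_cast<float>(packed & 0x3F) / 63.0f;   // back to [0, 1]
}
```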

Because you can just do clustered rendering and do both: http://www.humus.name/Articles/PracticalClusteredShading.pdf

It's a very similar idea to tiled rendering: you just extend the tiles into 3D space, do your standard G-buffer pass for most stuff, and a forward loop pass for the forward-rendered stuff. You get transparency, highly complex materials, screen-space decals, a unified lighting model, and so on, plus tighter depth ranges and better light culling. Which, as MJP pointed out for Forza 5, means you can get away with no Z-prepass and avoid submitting your geometry twice, even for the forward rendering pass.

Very cool, I'll take a look at that!

