[DX11] Tile-based Deferred Shading in BF3 discussion


It's not clear to me why rendering translucent geo into a render target with the blend mode set to multiply wouldn't work.


I'm sorry, I misunderstood your approach. Never mind that part about the blending modes.



So I bind the depth buffer as an SRV and run the pixel shader at per-pixel frequency by not specifying SV_SampleIndex as an input to the shader? Then simply read the depth texture and write it out to SV_Depth?

It sounds like this method (a depth buffer resolve shader) is a better choice for our application. We draw a lot of translucent particles like smoke, so rendering those into a non-MSAA buffer should use less bandwidth. And since the particles tend to have smooth texture edges, MSAA probably wouldn't benefit us much.


Yup. In our engine at work we actually take this concept a step further and downsample the depth buffer to half resolution, so that we can render expensive things (volumetrics, really dense smoke, etc.) to a half-sized render target and save performance.
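
For reference, here's a minimal sketch of that full-resolution resolve pass, assuming the MSAA depth buffer is bound as an SRV and the shader is drawn with a full-screen triangle. The resource/entry-point names and the choice of sample 0 are my own assumptions (a min or max across samples is another common choice, and the same idea extends to the half-resolution downsample):

Texture2DMS<float> g_MSAADepth : register(t0); // MSAA depth bound as an SRV (name is illustrative)

float ResolveDepthPS(float4 pos : SV_Position) : SV_Depth
{
    // No SV_SampleIndex input, so this runs at per-pixel frequency.
    // Copy sample 0 of the pixel's depth samples into the non-MSAA target.
    return g_MSAADepth.Load(int2(pos.xy), 0);
}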
Hi,

Just thought I'd point you towards a paper about tiled shading, and an associated OpenGL demo, by, *ahem*, myself. The paper is sadly paywalled by JGT, but I've put up a preprint on my web site, which is not hugely different from the published paper (it contains some bonus listings that were removed due to space restrictions). You may be able to access the published paper from a university library or similar.

http://www.cse.chalm...d=tiled_shading

The main takeaways are a much more thorough performance evaluation and analysis, and the introduction of tiled forward shading (which enables easy handling of transparent geometry).

In relation to the discussion here: I take a different approach to the tile intersection than the others, by first transforming the lights to screen space and then testing the screen-space extents against each tile. On the CPU I do it in scan-line fashion, which is as efficient as it gets, but somewhat hard to parallelize. The GPU version therefore does a brute-force tiles-test-all-lights approach, much like others have done, but with a much cheaper AABB/AABB test (2D extents + depth range). This saves constructing and testing identical planes all over the place.
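
To make that concrete, here is a minimal sketch of the AABB/AABB test, assuming both the tile and the light have already been reduced to screen-space 2D extents plus a depth range; the struct and function names are just for illustration:

struct ScreenBounds
{
    float2 minXY; // screen-space 2D extents
    float2 maxXY;
    float  minZ;  // depth range
    float  maxZ;
};

// The light can affect the tile only if the 2D rectangles overlap
// and the depth ranges overlap.
bool Overlaps(ScreenBounds tile, ScreenBounds light)
{
    return all(tile.minXY <= light.maxXY) &&
           all(light.minXY <= tile.maxXY) &&
           tile.minZ <= light.maxZ &&
           light.minZ <= tile.maxZ;
}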

The demo only implements the CPU variety, and without depth range (though I may update that).

Hope you find this useful.

Cheers
.ola
I am working on a deferred pipeline for PC. Since the tile-based technique has been implemented on X360, can anyone tell me the advantages and disadvantages of tile-based over quad-based deferred shading in DirectX 10?
I haven't used it to optimise my deferred shading yet (I'm planning on it and have high hopes), but applying the same tile-based optimisations to shadow-filtering, DOF, SSAO and FXAA has been a huge win for me on DX9-PC and the 360/PS3.

Mmm, interesting. I'm going to implement a light-volume technique first (I understand it better), and then I will try to implement the tile-based approach to see the performance difference.

Thanks for the answers!


So, to underline the main difference: traditional deferred shading is typically memory bound, whereas tiled deferred shading completely eliminates this bottleneck and is squarely compute bound. Given this, you can get an idea of how much better it will perform on your platform, either by looking at performance numbers or by simple experimentation (e.g. varying the G-buffer bit depth). Both the Xbox 360 and PS3 have a very high compute-to-bandwidth ratio, and this is true for modern GPUs as well, and increasingly so.

As I found in my experiments, going from a GTX 280 to a GTX 480, shading performance doubles for tiled deferred, whereas my implementation of traditional deferred shading scales by the expected 30%, corresponding to the increase in memory bandwidth.

Anyway, of course, if you have massively complex shaders you may not be memory bandwidth bound (yet), but it's a pretty safe bet you will be sooner or later as memory bandwidth falls further and further behind. If rumours about the GTX 680 are to be believed, we'll see this gap widen significantly again in this new generation.

Cheers
.ola
