# DX11 Stippled Deferred Translucency


How do you handle SSAO and particle effects in this approach? Do you keep the stippled depth buffer?

We use stippling, limited to 2 layers. It is good enough for what we need, but SSAO and particles in particular are really bothering me.

##### Share on other sites

For alpha lighting, I generate a volume texture locked to the camera with lighting information (warped to match the frustum). At the moment I fill this in a compute shader, similar to the usual CS light culling pass. I store both a single non-directional approximated lighting value and a separate set of directional values. This lets me do either a single texture fetch to get rough lighting info when applying it (for particles, for example), or a higher-quality directional application with 3 texture fetches and a few ALU ops.

It's a pretty simple system at the moment. The downsides are lower lighting resolution in the distance, and that it's not exactly free to generate (though it might be possible to optimize that by at least partially generating it on the CPU). Also, no specular yet...

Pros are cheap lighting, even for a huge number of particles, and semi-cheap volumetric lighting / light shafts for any shadow-casting light in the scene, as I also march through the volume when I apply my directional light volumetric shadows (a simple raymarch through the shadow map).
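
One way to picture the "warped to match the frustum" volume is as a stack of depth slices that get thicker with distance, which would also explain the lower lighting resolution far from the camera. The sketch below is only an illustration under that assumption: the post does not say how depth is distributed, so the exponential slicing and the helper names `DepthToSlice` and `SliceToDepth` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: maps a view-space depth to a slice index of a
// camera-locked lighting volume. Exponential slicing is an ASSUMPTION,
// not something stated in the thread.
inline int DepthToSlice(float viewZ, float nearZ, float farZ, int numSlices)
{
    assert(nearZ > 0.0f && farZ > nearZ && numSlices > 0);
    // Clamp first so the logarithm is always well defined.
    const float z = std::min(std::max(viewZ, nearZ), farZ);
    // t is 0 at the near plane and 1 at the far plane.
    const float t = std::log(z / nearZ) / std::log(farZ / nearZ);
    const int slice = static_cast<int>(t * static_cast<float>(numSlices));
    return std::min(std::max(slice, 0), numSlices - 1);
}

// Hypothetical inverse: view-space depth at the near boundary of a slice.
inline float SliceToDepth(int slice, float nearZ, float farZ, int numSlices)
{
    assert(nearZ > 0.0f && farZ > nearZ && numSlices > 0);
    const float t = static_cast<float>(slice) / static_cast<float>(numSlices);
    return nearZ * std::pow(farZ / nearZ, t);
}
```

With `nearZ = 1`, `farZ = 16` and 4 slices, the slice boundaries fall at depths 1, 2, 4, 8 and 16, so each slice covers twice the depth range of the one before it.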

##### Share on other sites

It might be just as fast, but with better quality, to create a separate texture for alpha-blended objects.

1. As GPUs work on 2x2 quads, you poison the depth buffer and any common deferred optimization (depth bounds check, stencil culling), because a 2x2 pixel quad is processed even if just one pixel is valid. The results of the other pixels are discarded, of course -> a separate target is not slower

2. You still have the full-res backbuffer for solids, but now also a separate target with alpha -> better quality

3. You could use a lower-res alpha buffer. You're interleaving anyway, which is like having less resolution, so why not use a lower-res alpha buffer from the beginning? You'd need to render into the high-res buffer first, then do a resolve where you combine the alpha G-buffer (kind of filling in the empty pixels), and then shade this lower-res buffer -> I estimate this to be faster than interleaving.

##### Share on other sites

How do you handle SSAO and particle effects in this approach?

I do SSAO using a half-resolution depth buffer (and bilaterally upsample the results). To get the half-res depth, I take the minimum value in each 2x2 quad, so SSAO is only computed for the closest layer.
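
The min-of-quad downsample described above can be sketched on the CPU as follows. This is an illustration only: it assumes a standard depth convention where smaller values are closer to the camera (with reversed-Z you would take the maximum instead), and `DownsampleDepthMin` is a hypothetical name, not the poster's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a half-resolution depth buffer by keeping the minimum depth of each
// 2x2 quad. ASSUMPTION: smaller depth means closer to the camera.
std::vector<float> DownsampleDepthMin(const std::vector<float>& depth, int width, int height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    assert(depth.size() == static_cast<std::size_t>(width) * height);

    const int halfW = width / 2;
    const int halfH = height / 2;
    std::vector<float> out(static_cast<std::size_t>(halfW) * halfH);

    for (int y = 0; y < halfH; ++y)
    {
        for (int x = 0; x < halfW; ++x)
        {
            // Top and bottom rows of the source quad.
            const float* row0 = &depth[static_cast<std::size_t>(2 * y) * width + 2 * x];
            const float* row1 = row0 + width;
            out[static_cast<std::size_t>(y) * halfW + x] =
                std::min(std::min(row0[0], row0[1]), std::min(row1[0], row1[1]));
        }
    }
    return out;
}
```

Because each stippled quad holds up to four different layers, taking the minimum picks whichever layer is nearest in that quad, which is why the SSAO ends up being computed for the closest layer only.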

I haven't yet decided how best to solve interactions with traditional alpha surfaces, which I guess is what you mean by particle issues.

Ideally, when rendering particles into the stippled lighting buffer (before the reconstruction filter is executed), the particles would perform a depth test against all 4 layers, and then only write/blend over the pixels that correspond with the furthest layer that they are in front of...

For alpha lighting, I generate a volume texture locked to the camera with lighting information (warped to match the frustum).

That's similar to clustered shading, but instead of storing a list of lights per cell/texel in the volume, you're storing the (approximate) radiance at that location. I was thinking of using something similar for things where approximate lighting is ok, like smoke particles. Does it work well for you in these cases?

BTW, if you stored the light in each cell as SH, you could extract the dominant light direction and colour from this representation, and use it for some fake specular highlights ;)

1. As GPUs work on 2x2 quads, you poison the depth buffer and any common deferred optimization (depth bounds check, stencil culling), because a 2x2 pixel quad is processed even if just one pixel is valid. The results of the other pixels are discarded, of course -> a separate target is not slower

I'm currently using tiled/clustered deferred shading, so loss of depth/stencil optimizations isn't a worry ;) The varying depths in the layers do still cause data inefficiencies in both tiled (tile frustums with large depth ranges) and clustered (data/branch coherency) though.
I was thinking about addressing these by running a de-stippling pass over the G-buffers, which simply rearranges them before the lighting step so that the top-left pixels from each 2x2 quad fill the top-left quarter of the buffer, and so on for the other three.
E.g. the transform would rearrange the pixels as shown below, so it looks like you've got 4 half-res views of the scene packed/atlased next to each other.

    121212    111222
    343434 -> 111222
    121212    333444
    343434    333444

The coherency of each of these sub-buffers would then be improved during lighting. Afterwards, I'd have to run the inverse of this transformation over the lighting result to get the actual image, instead of 4 near-identical sub-images.
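
The rearrangement and its inverse amount to a fixed index permutation. Below is a minimal CPU-side sketch of it, assuming even buffer dimensions and using `int` as a stand-in pixel type; the function names are hypothetical, and a real version would presumably run as a full-screen or compute pass over the G-buffer targets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Maps pixel (x, y) of the stippled buffer to its slot in the rearranged
// buffer: the position inside the 2x2 quad selects the quarter, and the quad
// coordinate selects the position inside that quarter.
static std::size_t DeinterleavedIndex(int x, int y, int width, int height)
{
    const int dx = (x % 2) * (width / 2) + x / 2;
    const int dy = (y % 2) * (height / 2) + y / 2;
    return static_cast<std::size_t>(dy) * width + dx;
}

// Stippled layout -> four half-res sub-images packed next to each other.
std::vector<int> Deinterleave2x2(const std::vector<int>& src, int width, int height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> dst(src.size());
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            dst[DeinterleavedIndex(x, y, width, height)] =
                src[static_cast<std::size_t>(y) * width + x];
    return dst;
}

// Inverse transform, to be applied to the lighting result.
std::vector<int> Interleave2x2(const std::vector<int>& src, int width, int height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> dst(src.size());
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            dst[static_cast<std::size_t>(y) * width + x] =
                src[DeinterleavedIndex(x, y, width, height)];
    return dst;
}
```

Feeding the 6x4 pattern from the diagram through `Deinterleave2x2` produces the four packed blocks, and `Interleave2x2` restores the original layout.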

3. You could use a lower-res alpha buffer. You're interleaving anyway, which is like having less resolution, so why not use a lower-res alpha buffer from the beginning?

In the case where there's only one layer of translucency, the background is at half resolution (1/4 as many pixels), but the translucent layer is only missing 1/4 of its pixels (still almost full resolution). In this case, the alpha layer does not look like a half-res rendering; the quality is very sharp.

Also in the case of two layers, the front-most layer is only half-res in one axis instead of both axes (1/2 as many pixels).

I just realized I had some debug code enabled when taking the above screenshot, where it behaves as if every layer is half res, even if more information about that layer is available.
Here's a close-up (zoomed 4x) of the front layer having 3/4 of its pixels, and 1/4 of its pixels: http://i.imgur.com/TLUGQHh.png
As the background is blurred, the lack of resolution there isn't as much of an issue.

Edited by Hodgman

##### Share on other sites

You may be interested in what Bungie is doing for Destiny: http://advances.realtimerendering.com/s2013/Tatarchuk-Destiny-SIGGRAPH2013.pdf

For transparencies they are essentially using multiple tiny spherical harmonic light probes to light them (a bit like what ATEfred is doing). So instead of lower resolution you get more of a proxy lighting, and a somewhat more complex pipeline. They also have material on using an eighth-res (1/8th!) buffer for particles, and how they manage to avoid aliasing artefacts on the edges.

The other thing I can think of is what Epic apparently does with UE4, and that's just brute-forcing more G-buffer layers altogether. A layer of transparency in front? An entire other filled G-buffer for it. A rather direct way to use all that memory and bandwidth, I suppose. Still, you could combine it with stippling, going for up to twice the buffer, keeping more resolution for more important layers or going for up to eight layers if you want.

If you go with multiple G-buffers you could also consider thin G-buffers. Compact normals down to X and Y and reconstruct Z. Use Bungie's "material ID" to compact that down to 1 channel, and compact color down à la what Crytek does for Crysis 3 (there are so many presentations on it, I'm not sure which one it is). Bungie also skips separate specular channels by just hacking spec color based on diffuse color. The point would be to save as much as you can on your extra G-buffers.
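
As a rough illustration of the "store X and Y, reconstruct Z" idea, here is a minimal sketch. It assumes unit-length normals whose Z component is non-negative (for example view-space normals facing the camera), since the sign of Z is lost by this encoding; the type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Float2 { float x, y; };
struct Float3 { float x, y, z; };

// Keeps only X and Y of a unit normal.
// ASSUMPTION: n is normalized and n.z >= 0, because the sign of Z is not stored.
inline Float2 EncodeNormalXY(const Float3& n)
{
    assert(n.z >= 0.0f);
    return Float2{n.x, n.y};
}

// Rebuilds Z from the unit-length constraint x^2 + y^2 + z^2 = 1.
inline Float3 DecodeNormalXY(const Float2& e)
{
    // Clamp to zero so quantization error cannot produce sqrt of a negative value.
    const float zz = std::max(0.0f, 1.0f - e.x * e.x - e.y * e.y);
    return Float3{e.x, e.y, std::sqrt(zz)};
}
```

Precision is worst for normals nearly perpendicular to the view direction, where small errors in X and Y cause large errors in the reconstructed Z.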

Edited by Frenetic Pony

##### Share on other sites

That's similar to clustered shading, but instead of storing a list of lights per cell/texel in the volume, you're storing the (approximate) radiance at that location. I was thinking of using something similar for things where approximate lighting is ok, like smoke particles. Does it work well for you in these cases?

BTW, if you stored the light in each cell as SH, you could extract the dominant light direction and colour from this representation, and use it for some fake specular highlights ;)

That's pretty much it. It works really well for particles and fog with the single directionless approximated value, and it's lightning fast, once it is generated. I'll have to get a video capture done at some point.

At the moment I use the HL2 basis rather than SH (simply because it was easier to prototype, and for alpha geo I only really care about camera-facing stuff). Getting the dominant direction from SH sounds like a good idea, but I'm not sure how computationally expensive it is. I'll need to look it up.

##### Share on other sites

At the moment I use the HL2 basis rather than SH (simply because it was easier to prototype, and for alpha geo I only really care about camera-facing stuff). Getting the dominant direction from SH sounds like a good idea, but I'm not sure how computationally expensive it is. I'll need to look it up.

It's very cheap.

    //-------------------------------------------------------------------------------------------------
    // Computes the "optimal linear direction" for a set of SH coefficients
    // (SH4Color, ProjectOntoSH4, SHDotProduct and Pi are defined elsewhere in the poster's code)
    //-------------------------------------------------------------------------------------------------
    float3 OptimalLinearDirection(in SH4Color sh)
    {
        // Average the RGB channels of each linear (L1) coefficient
        float x = dot(sh.c[3], 1.0f / 3.0f);
        float y = dot(sh.c[1], 1.0f / 3.0f);
        float z = dot(sh.c[2], 1.0f / 3.0f);
        return normalize(float3(x, y, z));
    }

    //-------------------------------------------------------------------------------------------------
    // Computes the direction and color of a directional light that approximates a set of SH
    // coefficients. Uses Peter-Pike Sloan's method from "Stupid SH Tricks"
    //-------------------------------------------------------------------------------------------------
    void ApproximateDirectionalLight(in SH4Color sh, out float3 direction, out float3 color)
    {
        direction = OptimalLinearDirection(sh);
        SH4Color dirSH = ProjectOntoSH4(direction, 1.0f);
        // Ignore the constant (ambient) term on both sides
        dirSH.c[0] = 0.0f;
        sh.c[0] = 0.0f;
        color = SHDotProduct(dirSH, sh) * 867.0f / (316.0f * Pi);
    }
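
To see why the direction extraction works, here is a small CPU-side C++ sketch of the same idea for a single colour channel. Everything in it is an assumption made for illustration: the monochrome `SH4` struct, this `ProjectOntoSH4` (a stand-in, not the poster's HLSL function), the coefficient ordering (`c[1]` ~ y, `c[2]` ~ z, `c[3]` ~ x, chosen to match the indexing in the post), the positive basis signs, and the fallback direction for an empty linear band.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Monochrome L0 + L1 spherical harmonics (4 coefficients).
// The post's SH4Color holds an RGB triple per coefficient instead.
struct SH4 { float c[4]; };

// Hypothetical stand-in for the post's ProjectOntoSH4.
// ASSUMED convention: c[0] constant, c[1] ~ y, c[2] ~ z, c[3] ~ x, all positive signs.
inline SH4 ProjectOntoSH4(const Vec3& dir, float intensity)
{
    // The input direction is expected to be normalized.
    assert(std::fabs(dir.x * dir.x + dir.y * dir.y + dir.z * dir.z - 1.0f) < 1e-3f);
    return SH4{{0.282095f * intensity,
                0.488603f * dir.y * intensity,
                0.488603f * dir.z * intensity,
                0.488603f * dir.x * intensity}};
}

// The linear band is a scaled copy of the intensity-weighted average light
// direction, so normalizing it yields the dominant direction.
inline Vec3 OptimalLinearDirection(const SH4& sh)
{
    const Vec3 d{sh.c[3], sh.c[1], sh.c[2]};
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    if (len <= 0.0f)
        return Vec3{0.0f, 0.0f, 1.0f}; // arbitrary fallback when there is no directional content
    return Vec3{d.x / len, d.y / len, d.z / len};
}
```

Projecting a single light and extracting the direction returns that light's direction; summing the coefficients of two equal lights along +X and +Y yields the diagonal between them. The cost is a handful of multiply-adds and one normalize, which fits the "very cheap" remark above.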


##### Share on other sites

Awesome, thanks for the info! I'll give this a whirl this weekend!
