Rendertarget switching with deferred lighting

6 comments, last by Tasche 11 years, 1 month ago

hello again, this time im doing deferred lighting following this concept:

http://kayru.org/articles/deferred-stencil/

ive got it implemented, but some questions remain...

-in the first pass you need to turn off color writes, while the second one draws to the rendertarget. at the moment i achieve this by setting the rendertarget to 0 (null) during the first pass. however, rendertarget switching is expensive, so i wonder if i should maybe do this with alpha blending instead. i know i can simply try it out, but i don't want a day's worth of bughunting just to find out my original solution was better, and would rather profit from someone else's experience. any other clever (i.e. fast) ways to mask out color drawing?

-also, when doing multiple lights, i save a light color and specular factor in the rendertarget and accumulate values for all lights in the same target. is this the right way to do it?

in pseudocode, i do a loop over my lights which looks something like this:

for (all lights)
{
    set_rendertarget(null, depthstencil)        // pass 1: no color target, stencil only
    renderfrontfaces()
    set_rendertarget(colorbuffer, depthstencil) // pass 2: accumulate light color
    renderbackfaces()
}

so my loop always switches between those targets, which feels kinda expensive. atm my fps drops from 200 to 60 with 50 lights each covering ~30% of the screen, even though their pixel shader does almost nothing, only a tex lookup (overdraw also kills fps, but that much?). doing the first pass for all lights and then the second pass for all lights obviously doesn't work, because the backfaces of one light may coincide with the frontfaces of another drawn earlier, and produce a draw area which should've been stenciled out.
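The batching problem described above can be reproduced on the CPU for a single pixel. This is only a sketch under assumptions: `LightSpan`, `shadedInterleaved` and `shadedBatched` are hypothetical names, smaller depth means closer to the camera, pass 1 marks the stencil where a front face passes the depth test, pass 2 shades where the back face lies at or behind the scene depth, and the stencil is reset between lights in the interleaved version.

```cpp
#include <cassert>
#include <vector>

// Depth range covered by one light volume at a single pixel
// (smaller value = closer to the camera). Hypothetical type for this sketch.
struct LightSpan {
    float frontDepth; // depth of the volume's front face at this pixel
    float backDepth;  // depth of the volume's back face at this pixel
};

// Interleaved passes: mark and test the stencil one light at a time.
int shadedInterleaved(float sceneDepth, const std::vector<LightSpan>& lights) {
    int shaded = 0;
    for (const LightSpan& l : lights) {
        assert(l.frontDepth <= l.backDepth);
        // Pass 1: front face is in front of the scene, so mark the stencil.
        const bool stencil = l.frontDepth <= sceneDepth;
        // Pass 2: back face is at or behind the scene and the stencil is marked.
        if (stencil && l.backDepth >= sceneDepth) ++shaded;
        // The stencil mark does not survive into the next light.
    }
    return shaded;
}

// Batched passes: all front faces first, then all back faces.
int shadedBatched(float sceneDepth, const std::vector<LightSpan>& lights) {
    bool stencil = false;
    for (const LightSpan& l : lights) {
        if (l.frontDepth <= sceneDepth) stencil = true; // marks from any light
    }
    int shaded = 0;
    for (const LightSpan& l : lights) {
        if (stencil && l.backDepth >= sceneDepth) ++shaded; // may reuse a foreign mark
    }
    return shaded;
}
```

With the scene at depth 0.5, one light spanning 0.1 to 0.3 (entirely in front of the geometry) and one spanning 0.6 to 0.9 (entirely behind it), the interleaved version shades the pixel zero times, while the batched version shades it once: the first light's stencil mark lets the second light's back face through.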

im using dx11 btw, if it matters...

any thoughts on how to do this properly are appreciated

cheers,

tasche


that is a lot of rendertarget switching. why not drop the stencil buffer and just handle it in the light shader. unproject from screen space coord and depth value to world space position and discard fragment if its outside the bounds of the light (super easy to calculate for boxes, spheres, cones and probably doing it anyway when computing attenuation). then you can just set the render target and render all your lights in one go.
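The unprojection step mentioned here could look roughly like the following, written as CPU-side C++ rather than shader code so it can be run standalone. It is a sketch under assumptions: `Camera` and `viewPosFromDepth` are hypothetical names, the projection is a symmetric D3D-style perspective with depth stored in [0, 1], and the result is a view-space position (getting to world space would additionally need the inverse view matrix).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical camera description for this sketch.
struct Camera {
    float tanHalfFovY; // tan(fovY / 2)
    float aspect;      // width / height
    float zNear;
    float zFar;
};

// ndcX and ndcY are in [-1, 1]; depth is the [0, 1] value read from the
// depth buffer. Returns the view-space position of that pixel.
Vec3 viewPosFromDepth(float ndcX, float ndcY, float depth, const Camera& cam) {
    assert(cam.zNear > 0.0f && cam.zFar > cam.zNear);
    // Invert depth = zFar / (zFar - zNear) * (1 - zNear / viewZ).
    const float viewZ =
        cam.zNear * cam.zFar / (cam.zFar - depth * (cam.zFar - cam.zNear));
    // Scale the NDC coordinates back out by the frustum extent at viewZ.
    return { ndcX * viewZ * cam.tanHalfFovY * cam.aspect,
             ndcY * viewZ * cam.tanHalfFovY,
             viewZ };
}
```

A depth of 0 maps back to the near plane and a depth of 1 to the far plane, which is a quick sanity check for whichever projection convention is actually in use.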

If you have to switch render targets anyways (i.e. from the previous final frame rendering) then you aren't really losing anything by only binding the depth buffer and then after the first step binding the render target.

However, if you really want to try something else, you could always use the RenderTargetWriteMask in the blend state. I'm pretty sure using the render target set to null is going to be significantly faster than this though, since this will allow your pixel shader to continue operating and just throws away the results (whereas with no render target the pixel shader probably won't execute at all).
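To make the difference between the two options concrete, here is a small CPU-side model, not D3D11 code. The mask bit values mirror `D3D11_COLOR_WRITE_ENABLE_RED/GREEN/BLUE/ALPHA/ALL`; everything else (`Stats`, `shade`, `drawWithWriteMask`, `drawWithNullTarget`) is hypothetical, and the null-target function simply encodes the assumption made above that no pixel shader work happens without a bound target.

```cpp
#include <cassert>
#include <cstdint>

// Bit values mirror D3D11_COLOR_WRITE_ENABLE_RED/GREEN/BLUE/ALPHA/ALL.
enum WriteMask : std::uint8_t { kRed = 1, kGreen = 2, kBlue = 4, kAlpha = 8, kAll = 15 };

struct Color { float r, g, b, a; };

// Hypothetical counter standing in for pixel shader work.
struct Stats { int shaderRuns = 0; };

// Stand-in for the light shader: counts the invocation and returns white.
Color shade(Stats& stats) {
    ++stats.shaderRuns;
    return {1.0f, 1.0f, 1.0f, 1.0f};
}

// Write-mask path: the shader runs, then only enabled channels are stored.
Color drawWithWriteMask(Color dst, std::uint8_t mask, Stats& stats) {
    assert(mask <= kAll);
    const Color src = shade(stats);
    if (mask & kRed)   dst.r = src.r;
    if (mask & kGreen) dst.g = src.g;
    if (mask & kBlue)  dst.b = src.b;
    if (mask & kAlpha) dst.a = src.a;
    return dst;
}

// Null-target path, under the assumption that no shader work is done.
Color drawWithNullTarget(Color dst, Stats& /*stats*/) {
    return dst;
}
```

In this model a mask of 0 leaves the destination untouched just like the null target does, but still pays for one shader invocation per pixel. Whether a real driver skips that work is exactly the thing worth measuring.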

so first thanks for those quick responses...

that is a lot of rendertarget switching. why not drop the stencil buffer and just handle it in the light shader. unproject from screen space coord and depth value to world space position and discard fragment if its outside the bounds of the light (super easy to calculate for boxes, spheres, cones and probably doing it anyway when computing attenuation). then you can just set the render target and render all your lights in one go.

hm i heard about people doing something like this, but to be honest i couldn't figure out how. doing a bounding volume test on a pixel's world position seems rather expensive compared to rendering a low poly sphere twice, since it involves trigonometry, and has to be done for every pixel covered by the sphere (which still has to be rendered), not only the lit areas. but i probably am missing some integral part of the algorithm. got any links? googling something like 'deferred lighting without stencil' lists a bunch of stencil algorithms -.-' but as soon as i find some info on this ill definitely look into it.

If you have to switch render targets anyways (i.e. from the previous final frame rendering) then you aren't really losing anything by only binding the depth buffer and then after the first step binding the render target.

However, if you really want to try something else, you could always use the RenderTargetWriteMask in the blend state. I'm pretty sure using the render target set to null is going to be significantly faster than this though, since this will allow your pixel shader to continue operating and just throws away the results (whereas with no render target the pixel shader probably won't execute at all).

to the first part: but the color buffer always gets bound and unbound (the depthstencil stays bound) in every light iteration. if the card/driver is clever, it will optimize this to just skip the pixel shader, in which case i get optimal performance (best case). if not, it will move the entire light accumulation buffer into cache and out again (worst case), which i think is what's actually happening.

to the second part: hm so you are saying the way i got it at the moment is faster? that word 'allow' confuses me, since it implies the opposite =)

pls post back once you read this guys! thanks again by the way...

That's right - I meant that using the blend state would allow your pixel shader to run --> meaning it will be slower than just enabling and disabling the whole render target. I know you don't want to hear this, but the best way is just to try it out - it should be very easy to test out, and you can verify that you are doing things correctly with PIX / Graphics Debugger too.

so for a point light rendered as a sphere, you're probably already computing the attenuation with some linear falloff that hits zero at the light radius. if the distance between the light pos and the world pos is greater than the light radius, you don't want to light that point, so you either fully attenuate or discard (depending on the expense of the rest of your shader).
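A minimal sketch of that test, again as CPU-side C++ so it runs standalone; `pointLightAttenuation` is a hypothetical name and the linear falloff is the one assumed in the post above. Note that the bounds check is a squared-distance comparison, so it needs no trigonometry, and the square root is only taken for points that are actually inside the light.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Linear falloff that reaches zero at the light radius. Returning 0 for
// points outside the sphere is the "fully attenuate" option; a shader
// could discard the fragment at that point instead.
float pointLightAttenuation(Vec3 pos, Vec3 lightPos, float radius) {
    assert(radius > 0.0f);
    const float dx = pos.x - lightPos.x;
    const float dy = pos.y - lightPos.y;
    const float dz = pos.z - lightPos.z;
    const float distSq = dx * dx + dy * dy + dz * dz;
    if (distSq >= radius * radius) {
        return 0.0f; // outside the light volume, no sqrt needed
    }
    return 1.0f - std::sqrt(distSq) / radius;
}
```

For a light at the origin with radius 6, a point 3 units away gets an attenuation of 0.5 and a point 10 units away gets 0.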

consider following situation (picture done in pov-ray, just for demo purposes)

brick wall + blue stuff = a floor and a ...well ... a wall in the gbuffer

white sphere = my pointlight bounding object

small ellipses = intersections of the bounding object with wall and floor

http://imagebin.org/248058 (sry somehow i cant post images directly)

my code will only run the pixel shader for the small ellipses in the white sphere. if i understand your suggestion correctly, the pixel shader will run for every pixel in the white sphere, and do at least a distance testing (trigonometry) before exiting.

since the sphere covers nearly the entire screen, this will be very expensive compared to my method. of course its a constructed situation, but i wouldn't say its uncommon.

the way i do it will always cost the same or less (since it only runs a pixel shader on intersections with the light sphere).

true, the sphere has to be rendered twice, but a modern GPU tears through a low vertex count vertex shader like a knife through butter, and aside from that rendertarget switch and some depthstencil settings, the data remains in cache for both passes. it's definitely the target switch that is painful (provided that it is needed at all).

i'm still not 100% sure i got your method right, because i know a lot of people do deferred lighting in one pass, i just can't find proper info on it.

guess ill just have to try some alpha technique... if anyone knows anything else on how to mask color buffer (i may need it for something else, you never know :D) pls share!

on a side note, i noticed my severe frame rate drops were due to me loading and unloading the gbuffer as resource (3 fullHD size 32 bit textures) for every light, after fixing that i can render up to 250 lights at otherwise same settings/resulting fps. stupid me. i'm still very interested in the answers to my question though, but with this i can work.

ah sry to dig out this old one, but just for completeness, i tried the alpha = 0 version and just setting first pass target to 0 is marginally quicker.

so if anyone ever wondered, go for a nulltarget =)
