
Hi, I am currently trying to implement deferred shading in DX9.0 and have run into some problems and queries with lighting/shadow mapping. I understand that deferred shading generally works as:

Render to g-buffer
For each light:
    Use g-buffer to calculate result and merge with frame buffer.

However, the article 6800_Leagues_Deferred_Shading.pdf says to keep diffuse and specular separate and then merge them into the framebuffer as a final pass. Does that mean I actually have to do:

Render to g-buffer
For each light:
    Calculate and write:
        Color0 = Diffuse
        Color1 = Specular
Do a final pass after all light calculations to merge the values

Another problem I have is with shadow maps. How do I integrate them? I was thinking that I need to generate them before creating the g-buffer, but how do I use them? Say I have 3 shadow maps; how do I write the information to the g-buffer telling it whether a pixel is in shadow? Thanks

##### Share on other sites
Quote:
 Original post by littlekid: Another problem I have is with shadow maps. How do I integrate them? I was thinking that I need to generate them before creating the g-buffer. But how do I use them? Say I have 3 shadow maps, how do I write the information to the g-buffer telling it whether the pixel is in shadow?
I use Horde3D, which supports deferred lighting.
Off the top of my head, their algorithm is:
Render scene to g-buffer (from camera's perspective)
For each light:
    Render scene to shadow-buffer (from light's perspective)
    Use g-buffer and shadow-buffer to calculate result and add to frame buffer.

##### Share on other sites
Quote:
 ...Use g-buffer and shadow-buffer to calculate result and add to frame buffer

But how do I know which shadow-map buffer to use? If I have 3 lights, that means 3 shadow-map buffers. How do you identify which pixel uses which shadow map?

##### Share on other sites
Sorry, I should have explained that better.

There is only one shadow-buffer, not one-per-light.

After the G-Buffer has been filled in, the frame-buffer is cleared to black and each light is processed in turn.

When processing a light, first its shadow map is generated, and then this temporary shadow map and the G-Buffer are used to produce the lighting values for that light source, and these values are additively blended into the frame-buffer.

Pixels that are in shadow will add (0,0,0) to the frame-buffer (i.e. have no effect).

So with two lights it looks like:
Render scene to g-buffer (from camera's perspective)
Clear frame-buffer to black
    Render scene to temp shadow-buffer (from light 1's perspective)
    Use g-buffer, temp shadow-buffer and light 1's properties to calculate result and add to frame buffer.
    Render scene to temp shadow-buffer (from light 2's perspective)
    Use g-buffer, temp shadow-buffer and light 2's properties to calculate result and add to frame buffer.

##### Share on other sites
Hodgman very nicely described the standard way of doing it, which I think is the best one, too.

The "6800 Leagues" recipe that you mentioned should not be confused with that. It is a different approach, which involves a small trade-off (monochromatic specular highlights) but supposedly offers better memory bandwidth.
However, I honestly don't see how it is any better; in my opinion it is only worse. You certainly save some texture reads (albedo color) from the G-Buffer while calculating lights, but you pay for that with an even higher number of additional pixel writes (which is doubly bad).
I asked a few weeks back if someone could explain how this approach could be superior (since I really don't see it), but without success.

##### Share on other sites
Thanks, I think I am getting it.

However, won't it be expensive considering I have to redraw the geometry once per light? Even after my Light Tree culling, I might be left with around 1-4 lights.

##### Share on other sites
Quote:
 Original post by littlekid: However, won't it be expensive considering I have to redraw the geometry once per light? Even after my Light Tree culling, I might be left with around 1-4 lights.

This is no different from forward rendering... you need a shadow map for each light that you want to cast shadows; there's no getting around that. In fact, with deferred rendering you save memory, since you only need one shadow map at a time.

##### Share on other sites
Quote:
 Original post by littlekid: However, won't it be expensive considering I have to redraw the geometry once per light? Even after my Light Tree culling, I might be left with around 1-4 lights.

When drawing the geometry for each light, all that needs to be calculated is the depth values (for the shadow map). This is super fast compared to regular rendering with complicated shaders, etc.

The up-side is that once you've got your G-buffer (and shadow buffer), (almost) no geometry needs to be drawn to do the lighting calculations.

To make an unfair example, let's say you have 100 point lights that don't cast shadows.

With forward shading, you can only handle a certain number of lights in each pass. So you might have to draw the entire scene anywhere from 10 to 100 times.

With deferred shading, you only draw the scene once (to the G-buffer), and then for each light you just need to render a quad that encompasses the area of the screen affected by that light.

Obviously, once you take shadows into account it's a bit more complex:
Forward = 100 Shadow-buffer passes + ~10 to ~100 geometry passes.
Deferred = One geometry pass + 100 Shadow-buffer passes + 100 light passes.

The best lighting approach depends on many things, such as how many lights you can process in a forward-shaded pass, how many lights need shadows, how big the area of effect of each light is, how many lights you want visible at once, the required resolution of the shadow buffers, whether you have lots of transparent geometry, if support for old hardware is required, etc...
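
The pass-count arithmetic above can be sketched in a couple of lines (a hypothetical helper, where `lightsPerPass` stands for however many lights your forward shader can evaluate in one pass):

```cpp
#include <cassert>

// Hypothetical sketch of the pass-count arithmetic above.
// lightsPerPass = how many lights the forward shader handles per pass.
int forwardGeometryPasses(int numLights, int lightsPerPass)
{
    // ceiling division: every group of lightsPerPass lights costs
    // one full redraw of the scene geometry
    return (numLights + lightsPerPass - 1) / lightsPerPass;
}

int deferredGeometryPasses(int /*numLights*/)
{
    // the scene geometry is drawn only once, into the G-buffer;
    // each light afterwards is just a screen-space pass
    return 1;
}
```

With 100 lights this reproduces the "~10 to ~100 geometry passes" range quoted above, depending on how many lights fit in a forward pass.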

[Edited by - Hodgman on February 13, 2008 10:16:26 PM]

##### Share on other sites
Quote:
 Original post by Hodgman: With deferred shading, you only draw the scene once (to G-buffer), and then for each light you just need to render a tiny sprite that encompasses the area of the screen affected by that light.

I don't really get what you mean by this sentence. Don't we draw the whole full screen quad for every light we process?

##### Share on other sites
Quote:
 Original post by littlekid: Don't we draw the whole full screen quad for every light we process?

Sorry, I'm getting ahead of myself again!

Yes, in the basic implementation you draw a full-screen quad to calculate the lighting.

This can then be optimised, though: if you find the bounding sphere / bounding box of the light source, and then find the top-left/bottom-right extents of this bounding area in screen space, then instead of drawing a full-screen quad you can draw a smaller quad that only covers the area that the light will affect.

This optimisation is very important if you have lots of small light sources.
If you've only got a few large light sources which are going to cover the whole screen anyway then there isn't much point in doing this though.
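
As a rough sketch of that optimisation (assuming a simple symmetric perspective projection with the camera at the origin looking down +Z; all names here are illustrative, not from any real API), you can project the corners of the light's world-space bounding box and take the min/max in normalised screen coordinates:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Rect { float x0, y0, x1, y1; };  // normalised screen coords, [-1,1]

// Conservative screen-space rectangle covered by a point light's bounding
// sphere: project the 8 corners of its world-space AABB with a simple
// perspective divide. focal = 1/tan(fov/2). Illustrative sketch only.
Rect lightScreenRect(float cx, float cy, float cz, float radius, float focal)
{
    Rect r = { 1e9f, 1e9f, -1e9f, -1e9f };
    for (int i = 0; i < 8; ++i) {
        float x = cx + ((i & 1) ? radius : -radius);
        float y = cy + ((i & 2) ? radius : -radius);
        float z = cz + ((i & 4) ? radius : -radius);
        if (z < 0.1f) z = 0.1f;           // clamp corners behind the near plane
        float sx = focal * x / z;          // perspective divide -> [-1,1]
        float sy = focal * y / z;
        r.x0 = std::min(r.x0, sx); r.y0 = std::min(r.y0, sy);
        r.x1 = std::max(r.x1, sx); r.y1 = std::max(r.y1, sy);
    }
    // clip the rectangle to the screen
    r.x0 = std::max(r.x0, -1.0f); r.y0 = std::max(r.y0, -1.0f);
    r.x1 = std::min(r.x1,  1.0f); r.y1 = std::min(r.y1,  1.0f);
    return r;
}
```

The resulting rectangle is then drawn as the per-light quad instead of a full-screen one.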

Another form of this same optimisation is to use 3D geometry instead of a screen-space quad.
E.g. for a spot-light, you would draw an actual 3D cone that encompasses the area affected by the light source - this way (assuming the frame-buffer's z-buffer is correct) you can cut down the number of pixel operations even further, as some of the pixels will be rejected by the z-test before the pixel shader is executed.

##### Share on other sites
Thanks Hodgman, that sounds like a pretty nifty method. However, I have some queries about the 2 methods you have just described.

For the first method, where I would draw a smaller screen-space quad, is it right to say I have to do some calculations to get the tex coords of the smaller quad, since I don't think setting the texcoords from (0,0) to (1,1) is going to work in all cases?

For the second method, how does it work and what exactly do you mean by drawing it in 3D? Do you mean I draw the full screen quad as normal, then for each spotlight I draw a 3D cone?

##### Share on other sites
As for rendering light volumes, you simply create a mesh representing the shape of a given light type. For a point light that would be a sphere, not a cone.

In the lighting pass, just render the light volume. A 3D volume: [1] automatically limits the amount of processed pixels to only those which lie in the light's range; [2] automatically rejects those pixels in the light's range which won't be visible to the observer because they are hidden behind geometry between the light and the observer (they are simply culled based on the z-buffer contents).

In my deferred renderer I store depth in camera space and perform all lighting calculations in camera space. Computing a pixel's position in camera space is as simple as one MAD in the pixel shader...

There are a few papers covering deferred shading and 3D light volumes + the stencil technique used in that case; however, I can always present you a simple sketch of all the necessary steps, if you want.

##### Share on other sites
Sure, thanks, I would be glad if you could outline those steps. I am still pondering how drawing the cone/sphere allows rejection of the unwanted pixels, since my z-buffer contains only the z-values of the screen quad, which are all at 0.0f.

##### Share on other sites
Surely I can outline it (however - you've read 6800_Leagues and it's all in there ;)).

But first of all - why do you say that your z-buffer contains only 0.0 (from a fullscreen quad)? Normally your z-buffer should contain scene depth information after the g-buffer pass, and that's what you use for all the lighting steps... Well, I might not be that experienced with deferred lighting, since I'm still working on some features (in parallel - a simple app for raw tests + one complete engine library with all caps and features).

But that's what I do in g-pass:

g-pass, keep the z-buffer filled with depth values of the whole scene;
   - RT#0 = albedo (RGB) + shininess (A)
   - RT#1 = normal (RGB) + spare space (A)
   - RT#2 = depth in camera space
   Use R32F for depth if possible.
   I chose camera space for all lighting calculations (seemed to be the simplest choice).

Now - I have a unit sphere for point lights. In the lighting pass I just scale it by the light's range, transform it to the correct place in the scene and render...

l-pass, for each light do:
   - if camera lies outside the light volume
      - render front faces of the light volume and mask them out in the stencil buffer;
      - render back faces of the light volume, comparing against stencil
   - if camera lies inside the light volume
      - render back faces of the light volume (ignore stenciling)

Option #1 (camera outside the light volume): the z-buffer culls out all those pixels which lie behind occluding geometry and should not be visible - those pixels will not take part in further lighting. You mark the surviving pixels in the stencil buffer (set the stencil reference value to Light.ID - you get 255 lights before a stencil clear is required :)). Next, you render back faces. The z-buffer culls all those pixels which do not "touch" any geometry (they float in space), and then the stencil comparison culls all those pixels which are hidden behind visible geometry (from the first step).

Option #2 (camera inside the light volume): you cannot render front faces, so you don't bother with stenciling. Instead, you render back faces and cull all those pixels which float in space and don't touch any geometry.
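
To make option #1 concrete, here is a tiny CPU-side sketch (illustrative names, not real D3D calls) of what the two stencil/z passes decide for a single pixel:

```cpp
#include <cassert>

// CPU-side sketch of option #1 above (camera outside the light volume),
// evaluated for one pixel. sceneZ comes from the z-buffer after the g-pass;
// frontZ/backZ are the depths of the light volume's front and back faces.
bool pixelIsLit(float sceneZ, float frontZ, float backZ)
{
    // pass 1: front faces, z-func = LESS -> mark the stencil only where
    // the front face is NOT occluded by scene geometry
    bool stencilMarked = frontZ < sceneZ;

    // pass 2: back faces, z-func = GREATEREQUAL, stencil-func = EQUAL ->
    // shade only where the back face is at or behind the geometry AND the
    // stencil was marked, i.e. the geometry lies inside the volume
    return stencilMarked && (backZ >= sceneZ);
}
```

Geometry in front of the volume fails pass 1, geometry beyond the volume fails pass 2, and only geometry between the two faces gets shaded.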

Hmmm... I think, I'll prepare some screenshots and come back a bit later...

##### Share on other sites
Ok. First of all, results of g-pass:

1) diffuse only (RT#0)

2) perturbed normal (RT#1)

3) depth in camera space (RT#2)

4) final product for 1 point light (no shadowing though)

5) final product for 4 point lights (no shadowing though)

And now, as for lighting pass... The most important is option #1 (camera outside light volume).

- render front faces
   - disable color writes
   - disable z-buffer writes
   - z-function = LESS
   - stencil enabled
   - stencilfunc = ALWAYS
   - stencilops = all KEEP but STENCILPASS = REPLACE
   - stencil reference = Light.ID (<>0)

This fragment culls all those pixels which lie behind occluding geometry. Only the visible part of the light volume will be marked in the stencil buffer.

- render back faces
   - color writes enabled
   - z-buffer writes disabled
   - z-function = GREATEREQUAL
   - stencilfunc = EQUAL
   - stencilops = all KEEP
   - stencil reference = Light.ID
   - alpha blending enabled for lighting accumulation

This is the real lighting... We've masked out all pixels of the light volume occluded by geometry. Now we eliminate those which are not affected by the light volume (those lying inside the area of the projected light volume, but "behind" /deeper/ than the range of the light).

Here you go:

It's the result of culling (one light, diffuse factor /perturbed normal dot light vector/, no attenuation). Note how the middle column in the room occludes light...

As for Option #2 - since the camera is inside the light volume, you should render backfaces only (no stenciling, just z-func = GREATEREQUAL)...

Camera space seems to be very accommodating for all lighting calculations... Perturbed normals are stored in camera space (that's a simple transformation). Depth is stored in camera space... For each vertex of the light volume you can calculate an eye vector (the light volume's vertex position in camera space *is* the eye vector...). Normalize the eye vector to the maximum depth (the far clipping plane's Z in camera space) and pass it as a texcoord to the pixel shader... Now you only have to do vEye * pixDepth and you have the position in camera space... Just add one more operation (light.pos - (vEye * pixDepth)) and you have the pixel->light vector for diffuse lighting... That is a single MAD instruction:

MAD vLit, -pixelDepth, vEye, light.pos

which should be:

MAD r1.xyz, -r0.r, t1, c0

And all in PS2.0... No higher version required... (at least at this stage of complexity)...
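
The reconstruction above can be checked on the CPU. A small sketch (using one convention of the idea: the eye ray is scaled so that ray.z == 1 and the G-buffer stores linear camera-space Z; the far-plane normalisation described above is the same idea with depth stored as z/far):

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Position reconstruction: interpolated eye ray times stored depth.
Vec3 reconstructPosition(Vec3 eyeRay, float depth)
{
    return Vec3{ eyeRay.x * depth, eyeRay.y * depth, eyeRay.z * depth };
}

// The pixel->light vector, matching "MAD vLit, -pixelDepth, vEye, light.pos"
Vec3 pixelToLight(Vec3 lightPos, Vec3 eyeRay, float depth)
{
    return Vec3{ lightPos.x - eyeRay.x * depth,
                 lightPos.y - eyeRay.y * depth,
                 lightPos.z - eyeRay.z * depth };
}
```

For a camera-space point (2, 3, 4) the ray is (0.5, 0.75, 1) and the stored depth is 4, and multiplying them back recovers the original position exactly.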

As for framerates - the screenshots were taken at 800x600, but usually I work at 1280x1024 (native for my LCD) and I get ~170 FPS for 4 point lights (good old GF 7900GS)... And these lights are huge... Since I added linear attenuation, I had to increase the range of each light... Even with geometry culling, these lights cover almost the whole screen (the last, sixth screenshot was taken with the light's range decreased by half, in order to show the culling better).

Images are blocky due to poor compression quality. The real product is smooth, not pixelated... :)

##### Share on other sites
Wow, I will have to slowly digest this information.

A question about the case of the light volume being outside the camera: why can't we just not draw this light, since it is not in the camera volume anyway?

What I mean is that I have a spatial tree that contains all the light volumes; before doing any light pass, I cull this spatial tree with my view frustum, and those lights that get drawn are automatically within the camera frustum.

Will that work?

##### Share on other sites
Well... It's not a question of a light outside the camera volume (frustum) - such lights should be rejected as soon as possible, of course. It's a question of the camera lying inside the light volume. Think about the position of the camera, not the whole frustum. In other words - the camera position lies within the light's range (the light's sphere, in the case of point lights). We should be clear now :)

If the camera (observer) is within the light's volume, we don't draw front faces (since they will be partly or totally clipped by the near plane and thus not visible, and we would have some pixels not lit at all, even though they should be).

As a matter of fact, you don't have to differentiate these states. You can skip stenciling and render backfaces in both situations (whether you are inside or outside the light's volume). But usually differentiation is a win if there is occluding geometry between the camera and the light (stenciling works here like a depth-only pass before drawing to the backbuffer).
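
The inside/outside decision itself is just a distance check with a small safety margin for the near plane. A hedged sketch (the exact margin depends on your projection; names are illustrative):

```cpp
#include <cassert>
#include <cmath>

// Decide between the two pass setups above. The camera counts as "inside"
// when its distance to the light is within the light's range plus a
// near-plane margin, so front faces can never be clipped while we still
// believe we're outside. Margin handling here is illustrative.
bool cameraInsideLightVolume(float camX, float camY, float camZ,
                             float lx, float ly, float lz,
                             float range, float nearPlane)
{
    float dx = camX - lx, dy = camY - ly, dz = camZ - lz;
    float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
    return dist <= range + nearPlane;
}
```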

##### Share on other sites
Ok, let me see if I got the whole process right.

Generate 4 G-Buffers (Depth or Pos, Diffuse, Normal, Others)
For each directional light:
    Draw full screen quad, with lighting and stuff.
For each point or spot light:
    Draw light volume, back faces only. If the back face's z-value is less than
    the current g-buffer z-value, update the stencil value, else ignore.
    Draw the full screen quad. This time we compare the stencil value as
    equal, and only those pixels that match the stencil value get to the
    shader. The result is additively blended to the buffer.

Is it something along these lines, or am I missing something?

A side-track question: if I want to do HDR, is it correct that instead of drawing the result to the default backbuffer, I create another floating-point render target to store all the final lighting results, and then proceed to do HDR with this RT?

##### Share on other sites
I think you're still missing a part of the idea :)

As for directional lights - yup, you still need a fullscreen quad. That's correct.

As for point and spot lights - you partly got it, but don't draw a fullscreen quad in the second step!

Look at the sketches:

1) You have a light volume and three pieces of geometry. One lies behind the light, beyond its range. One lies inside the light volume, and one (on the right side) occludes the light. It is obvious that you want to shade and light only those pixels which touch the geometry (geometry that lies within the light's range, inside the light's volume). The blue line on the top represents the backbuffer, and the region which should be processed is marked in red:

2) You draw the light volume's front faces with stenciling and the depth function set to LESS (color writes disabled this time, so you don't affect the backbuffer contents). All the pixels occluded by the geometry on the right side (LESS = false) are left unaffected. All front faces on the left are marked in the stencil buffer because LESS = true.

But that's still too much. There is a significant number of pixels which would be lit for nothing (some of the geometry lies behind the front faces of the light's volume, but not within the range of the light!). The truth is, you could leave it if you compute the attenuation (the attenuation factor will be 0, so these pixels will not be lit anyway), but they will still be processed! And that's what this is about - don't process things you don't need to process...

3) This time you draw the backfaces of the light's volume (depth function = GREATEREQUAL) and filter by the stencil buffer contents. You need stenciling because of the occluding geometry (the right part of the picture would be lit if not stenciled, since for that geometry the backfaces of the light's volume pass GREATEREQUAL):

And just remember - you have light's volume, so you don't want to draw a fullscreen quad in the second step!

As for HDR rendering - yes. If you want to do HDR, you don't draw to the backbuffer directly; you render to an additional (accumulation) render target with, ideally, higher precision (16+ bits per component).
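
Once the lights are accumulated into the floating-point target, a final pass maps it back to displayable range. A minimal sketch using the simple Reinhard operator (one common choice, not the only one):

```cpp
#include <cassert>

// Minimal Reinhard tone-mapping sketch: maps an HDR intensity in [0, inf)
// to a displayable value in [0, 1). Applied per channel in a final
// fullscreen pass over the HDR accumulation target.
float tonemapReinhard(float hdr)
{
    return hdr / (1.0f + hdr);
}
```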

Good luck! :)

##### Share on other sites
Ok, but I am kind of stuck where you say we draw the light volume in the second part and not the full screen quad. If I draw the light volume, how do I get the texture coordinates to sample the g-buffers for the information on position/normals/diffuse etc.?

float4x4 g_LightVolumeWorldViewProj;

void vs_draw_lvolume(float4 Position      : POSITION0,
                     out float4 oPosition : POSITION0)
{
    oPosition = mul(Position, g_LightVolumeWorldViewProj);
}

void ps_draw_lvolume(out float4 oColor0 : COLOR0)
{
    float2 GBufferTexCoord; // How do I get these coordinates?
    float4 Diffuse = tex2D(g_DiffuseBuffer, GBufferTexCoord);

    // ... other lighting calculations, e.g. Phong shading etc.
    oColor0 = float4(Diffuse.rgb * Color, 1.0f);
}

##### Share on other sites
All you have to do is project the texture coordinates (use texldp instead of texld / tex2Dproj instead of tex2D).

The math is quite simple; you need to know what happens "behind the scenes", though:

1) In the normal rendering queue, position is transformed to projection space (projection transformation, projection matrix). Next, it is divided by Direct3D by the Z value (stored in the W component after the projection transformation). That gives you the position in clipping space (X and Y are clipped to the [-1.0,+1.0] range):

X[c] = X[p] / W[p]
Y[c] = Y[p] / W[p]

2) Next step - Direct3D transforms the position to viewport space (from the [-1.0,+1.0] range to [0.0, screen.width/screen.height]):

X[v] = X[c] * S[w]/2 + S[w]/2
Y[v] = -Y[c] * S[h]/2 + S[h]/2
so:
X[v] = (X[c] + 1) * S[w]/2
Y[v] = (1 - Y[c]) * S[h]/2

That gives you the position on screen. Just note that the Y component must be inverted (since Y in world/view/projection space increases in the opposite direction to screen coordinates).

3) We don't need screen coordinates, but texture coordinates. That means we won't multiply by the screen dimensions, but rather transform the coordinates from clipping space (range [-1.0,+1.0]) to texture space (range [0.0,+1.0]). Thus, we want to do this:

T[u] = X[c] * 1/2 + 1/2
T[v] = -Y[c] * 1/2 + 1/2
so:
T[u] = (X[c] + 1) * 1/2
T[v] = (1 - Y[c]) * 1/2

4) It is common that render target dimensions are powers of 2, while screen dimensions aren't. That means, in such a case, our texture coordinates should not be in the range [0.0,+1.0] but rather [0.0, Screen.size / Target.size]:

T[u] = (X[c] + 1) * 1/2 * (S.width / T.width)
T[v] = (1 - Y[c]) * 1/2 * (S.height / T.height)
so:
T[u] = (X[c] + 1) * S[w]/(2*T[w])
T[v] = (1 - Y[c]) * S[h]/(2*T[h])

5) Last but not least, g-buffer coordinates should be pixel/texel exact. Due to the texturing algorithm used by Direct3D, we need to adjust the texture coordinates by half a texel (pixel 0.0 has to map to 1/(2*T.size)):

T[u] = (X[c] + 1) * S[w]/(2*T[w]) + 1/(2*T[w])
T[v] = (1 - Y[c]) * S[h]/(2*T[h]) + 1/(2*T[h])

Now, just substitute for X[c] and Y[c]:

T[u] = (X[p]/W[p] + 1) * S[w]/(2*T[w]) + 1/(2*T[w])
T[v] = (1 - Y[p]/W[p]) * S[h]/(2*T[h]) + 1/(2*T[h])

6) One last thing... We are dividing the position by the Z value (stored in the W component). We cannot interpolate X/W and Y/W (interpolated 1/W is not the same as 1 / interpolated W). Luckily, the pixel shader offers a projected texture lookup (texldp / tex2Dproj) which divides the texture coordinates by the W element before using the UV coords... So we will use a projected lookup in the pixel shader, but we have to factor the /W[p] out of our calculations:

T[u] = 1/W[p] * ((X[p] + W[p]) * S[w]/(2*T[w]) + W[p]/(2*T[w]))
T[v] = 1/W[p] * ((W[p] - Y[p]) * S[h]/(2*T[h]) + W[p]/(2*T[h]))

1/W will be provided in the pixel shader, so:

T[u] = (X[p] + W[p]) * S[w]/(2*T[w]) + W[p]/(2*T[w])
T[v] = (W[p] - Y[p]) * S[h]/(2*T[h]) + W[p]/(2*T[h])

so:

T[u] = 1/(2*T[w]) * (X[p]*S[w] + W[p]*S[w] + W[p])
T[v] = 1/(2*T[h]) * (W[p]*S[h] - Y[p]*S[h] + W[p])

That's all...
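
The rearrangement in step 6 can be sanity-checked numerically: computing T[u] directly from steps 3-5, and via the per-vertex form plus a divide-by-w, must give the same value (screen and target sizes below are illustrative):

```cpp
#include <cassert>
#include <cmath>

// Direct evaluation of steps 3-5: clip space -> texture space + half texel.
float texcoordDirect(float Xp, float Wp, float Sw, float Tw)
{
    float Xc = Xp / Wp;                         // clip-space X
    return (Xc + 1.0f) * Sw / (2.0f * Tw)       // scale into texture space
         + 1.0f / (2.0f * Tw);                  // half-texel adjustment
}

// Step 6: the part computed per vertex (and interpolated), with the
// division by W deferred to tex2Dproj in the pixel shader.
float texcoordProjected(float Xp, float Wp, float Sw, float Tw)
{
    float interpolated = ((Xp + Wp) * Sw + Wp) / (2.0f * Tw);
    return interpolated / Wp;                   // what tex2Dproj does
}
```

Both paths agree for any W > 0, which is why the per-vertex form can safely be interpolated.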

For each vertex of the light's volume, just do this in the vertex shader:

float4x4 WVPMatrix;
float4 Screen;   // place screen dimensions here
float4 Target;   // place 1/(2*target_dimensions) here

VS_OUTPUT main(VS_INPUT In)
{
    VS_OUTPUT Output;
    float4 oPosition = mul(In.Position, WVPMatrix);
    Output.Position = oPosition;

    // I still write asm shaders and only a little HLSL, so I'm not sure if it's correct...
    // But you will know what to do, since you know the math...
    oPosition.x = ((oPosition.x + oPosition.w) * Screen.x + oPosition.w) * Target.x;
    oPosition.y = ((oPosition.w - oPosition.y) * Screen.y + oPosition.w) * Target.y;

    // These operations could be vectorized, but my HLSL is too weak ;)
    Output.TexCoords = oPosition;
    return Output;
}

PS_OUTPUT main(VS_OUTPUT In)
{
    float4 diffuseSample = tex2DProj(diffuseRTsampler, In.TexCoords);
    // and so on...
}

Sorry for my HLSL code :)

[Edited by - mikaelc on February 17, 2008 3:52:25 PM]

##### Share on other sites
Thanks for the help guys, I shall go try and implement it and see if I can completely grasp the concept and stuff.

A huge thanks to all the posters that replied.

##### Share on other sites
Sorry for waking up this thread, but I have a problem with the texcoord part that mikaelc posted:
[code]
// VS
oPosition.x = ((oPosition.x + oPosition.w) * Screen.x + oPosition.w) * Target.x;
oPosition.y = ((oPosition.w - oPosition.y) * Screen.y + oPosition.w) * Target.y;

Output.TexCoords = oPosition;
// PS
float4 diffuseSample = tex2DProj(diffuseRTsampler, In.TexCoords);
[/code]

This is not working for me, as I get the wrong result out. Can someone confirm whether this is the right way to obtain TexCoords in the light pass, to be able to sample position/normal (which are in view space) from the texture?

This is the diffuse tex from the g-buffer:
[img]http://www.dodaj.rs/f/2U/mU/3J2W4YPX/diff.jpg[/img]

And I put a light in between two columns; here is the result:
[img]http://www.dodaj.rs/f/H/tD/3DTKFf7S/light.jpg[/img]

[code]
float4 diff = tex2DProj(diffuseRTsampler, In.TexCoords);
...
OUT.Color = float4(diff.rgb, 1.0f);// Just to test if texcoords are right
[/code]

##### Share on other sites

[s]Actually it has some offset, and I can't see how this can be, as I don't move my camera between the g-buffer & light pass:[/s]

Ahhhh. Solved. Wrong position for a brace:
[code]
// wrong: 1.0f / ((float)TARGET_SIZE.cx * 2.0f), 1.0f / ((float)TARGET_SIZE.cy) * 2.0f
// right: 1.0f / ((float)TARGET_SIZE.cx * 2.0f), 1.0f / ((float)TARGET_SIZE.cy * 2.0f)
[/code]
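
The bug is pure operator grouping, and it is easy to see numerically (using an illustrative target height of 512):

```cpp
#include <cassert>

// Illustrative demo of the brace bug above, with a target height of 512.
// The misplaced brace multiplies by 2 instead of dividing by it.
float wrongHalfTexel = 1.0f / ((float)512) * 2.0f;  // = (1/512) * 2 = 1/256
float rightHalfTexel = 1.0f / ((float)512 * 2.0f);  // = 1/1024
```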
