
The rendering process

Now we need to cover the more interesting aspects of shadow rendering - the actual rendering process! I'm going to dive straight in with this - you may wish to look through the DirectX-SDK shadow volume sample to get familiar with 1-light shadowing, although this isn't required.

It is crucial to understand that the shadow rendering CANNOT be done all in one go. It is a built-up process of rendering the correct geometry in the correct way, stage by stage. Because of this, it may well be worth investing some time in pre-computing regularly used data. As you'll see a bit later on, the geometry may be transformed and rendered several times each frame, so any complex animation system will have a much heavier impact on the frame rate than it would in a single-pass renderer. One solution is to animate the mesh once at the start of the frame, and then render every pass from this static mesh.
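The "animate once, render many times" idea can be sketched as a small per-frame cache. This is only an illustration: Vec3, AnimatedMesh, verticesForFrame and the trivial vertical-offset "animation" are hypothetical stand-ins, not code from the sample program.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical mesh wrapper: the expensive animation step runs at most
// once per frame, and every render pass reuses the cached result.
class AnimatedMesh {
public:
    explicit AnimatedMesh(std::vector<Vec3> bindPose)
        : bindPose_(std::move(bindPose)), cached_(bindPose_) {}

    // Returns the animated vertices for 'frame', recomputing them only
    // when the frame number differs from the one already cached.
    const std::vector<Vec3>& verticesForFrame(unsigned frame, float time) {
        if (!valid_ || frame != cachedFrame_) {
            animate(time);
            cachedFrame_ = frame;
            valid_ = true;
        }
        return cached_;
    }

    // How many times the expensive step has actually run.
    int animateCalls() const { return animateCalls_; }

private:
    // Stand-in for a real animation system: offsets each vertex in y.
    void animate(float time) {
        assert(cached_.size() == bindPose_.size());
        for (std::size_t i = 0; i < bindPose_.size(); ++i) {
            cached_[i] = { bindPose_[i].x, bindPose_[i].y + time, bindPose_[i].z };
        }
        ++animateCalls_;
    }

    std::vector<Vec3> bindPose_;
    std::vector<Vec3> cached_;
    unsigned cachedFrame_ = 0;
    bool valid_ = false;
    int animateCalls_ = 0;
};
```

Every pass in a frame (Z-fill, ambient, and each light) would ask for the same frame number, so only the first request pays for the animation.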

The following is an overview of how the algorithm will work:

[Frame Starts]

1. Clear colour, depth and stencil buffers
2. Disable all forms of lighting
3. Disable texturing and all shaders
4. Disable writing to the colour buffer and stencil buffer
5. Render the ENTIRE scene

6. Enable ambient light only
7. Enable all texturing and shading
8. Configure the colour buffer for additive rendering
9. Disable Z-writing (Z-testing remains on)
10. Render the ENTIRE scene

11. Disable ambient lighting
12. For each light in the scene
    a. Clear the stencil buffer
    b. Disable colour buffer rendering
    c. Disable texturing and lighting
    d. Configure the stencil buffer for first pass
    e. Render shadow geometry with CCW culling
    f. Configure the stencil buffer for second pass
    g. Render shadow geometry with CW culling
    h. Disable all lights EXCEPT the current light
    i. Enable texturing and shading
    j. Configure colour buffer for additive rendering
    k. Render all geometry influenced by the current light

[Frame Ends]

The above may look complicated, and in truth it is - however, as soon as you understand the general idea of a single pass, it's only a case of doing the same thing over and over again.

The above steps have been divided into 3 main sections; this is deliberate, as each section does a distinct job.

Steps 2 - 5 are often referred to as the "Z-Fill Pass"; that is, we fill the Z-buffer with a correct representation of the scene. Notice that for the rest of the algorithm we only test against it and never actually write to it again. This is one of a few bandwidth-saving tricks.

Steps 6 - 10 are for ambient lighting; obviously these can be skipped if you don't want to use ambient lighting for the current frame/world. This light source is separated from the others because ambient lights can't cast shadows - so we don't need to complicate issues by including shadow volumes/stencil operations.

Steps 11 - 12 (inc. sub-parts of 12) are the real meat of the algorithm - this is where you will spend the majority of your execution time. I will discuss this in more detail shortly, but for now just think of it like this: each pass adds the contribution of the selected light, but certain areas are masked off as shadowed (no contribution from the selected light).
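To make the "one additive contribution per light, masked by shadow" idea concrete, here is a minimal CPU-side sketch of what happens to a single 8-bit colour channel of one pixel. LightSample and accumulatePixel are hypothetical names used for illustration only; real hardware does this per channel through the blend unit.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Contribution of one light at a single pixel (hypothetical type).
struct LightSample {
    int  intensity;  // 0..255 contribution if the pixel is lit
    bool inShadow;   // true when the stencil mask rejects the pixel
};

// Accumulates the ambient term plus each unshadowed light using
// saturating addition, mirroring ONE/ONE blending on an 8-bit channel.
int accumulatePixel(int ambient, const std::vector<LightSample>& lights) {
    assert(ambient >= 0);
    int colour = std::min(ambient, 255);        // ambient pass
    for (const LightSample& l : lights) {       // one pass per light
        if (l.inShadow) continue;               // masked: no contribution
        colour = std::min(colour + l.intensity, 255);
    }
    return colour;
}
```

A pixel that is shadowed from every light ends up with the ambient value only, which is why the ambient pass is rendered without any stencil masking.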

Z-Fill Rendering Pass
The following code is taken from the sample program, and represents the Z-Fill stage of the rendering process.

//colour buffer OFF
//(blend result = src*0 + dest*0, so the buffer stays at its cleared black)
pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ZERO );
pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ZERO );
//lighting OFF
pDev->SetRenderState( D3DRS_LIGHTING, FALSE );
//depth buffer ON (write+test)
pDev->SetRenderState( D3DRS_ZENABLE, TRUE );
pDev->SetRenderState( D3DRS_ZWRITEENABLE, TRUE );
//stencil buffer OFF
pDev->SetRenderState( D3DRS_STENCILENABLE, FALSE );

//FIRST PASS: render scene to the depth buffer
for ( int i = 0; i < 8; i++ ) {
  mshCaster[i]->Render( pDev );
}

It's actually a very trivial piece of code, and just relies on a good knowledge of the available render state configurations. The render state changes are basically there to stop the scene rendering affecting anything other than the depth buffer.

The reason for doing this stage is twofold - firstly, we need the Z-data when it comes to rendering the shadow volumes, and secondly it greatly reduces overdraw later on. A crude approach to this technique would be to re-fill the depth buffer for each light pass, but that would be rather pointless.

Overdraw is a big issue in current real-time environments where complex shaders and texture stages are routinely used; most modern hardware won't process the texturing/pixel-shading stage if the pixel fails the Z-test (that is, it is behind other pixels). Because all passes in this algorithm can Z-test against a properly filled depth buffer we can hope that the GPU will reject a reasonable percentage of pixels without wasting time on any complex texturing effects.
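The effect of the Z-fill pass on overdraw can be illustrated with a small software model. This is a sketch under simplifying assumptions: each Layer (a hypothetical type) covers every pixel at a constant depth, and "shading" is just counted rather than performed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One layer of geometry covering every pixel at a constant depth.
struct Layer { float depth; };

// Counts how many fragments would reach the (expensive) shading stage
// when the layers are drawn in the given order over 'pixelCount' pixels.
// With zPrefill, the depth buffer is filled first and shading only tests
// against it; without it, depth is tested and written while shading.
int countShadedFragments(const std::vector<Layer>& layers,
                         std::size_t pixelCount, bool zPrefill) {
    std::vector<float> depth(pixelCount, 1.0f);  // cleared to the far plane

    if (zPrefill) {
        // Z-fill pass: depth writes only, no shading cost.
        for (const Layer& l : layers) {
            assert(l.depth >= 0.0f && l.depth <= 1.0f);
            for (float& d : depth) {
                if (l.depth <= d) d = l.depth;
            }
        }
    }

    int shaded = 0;
    for (const Layer& l : layers) {
        for (float& d : depth) {
            if (l.depth <= d) {              // less-or-equal depth test
                ++shaded;                    // fragment is textured and lit
                if (!zPrefill) d = l.depth;  // Z-writes stay off after a prefill
            }
        }
    }
    return shaded;
}
```

With three layers drawn back to front over four pixels, the model shades 12 fragments without the prefill and only 4 with it, since just the nearest layer survives the depth test.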

Ambient Lighting Pass

Of the rendering steps that actually affect the final image, this is the simplest.

//colour buffer ON
pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ONE );
pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );
//ambient lighting ON
pDev->SetRenderState( D3DRS_LIGHTING, TRUE );
pDev->SetRenderState( D3DRS_AMBIENT, AMB_LIGHT );
pDev->LightEnable( 0, FALSE );
pDev->LightEnable( 1, FALSE );
pDev->LightEnable( 2, FALSE );
pDev->LightEnable( 3, FALSE );

//SECOND PASS: render scene to the colour buffer
for ( int i = 0; i < 8; i++ ) {
  mshCaster[i]->Render( pDev );
}

Ambient light is an interesting issue in that it can be thought of as a light source - yet it doesn't follow the same rules (or syntax) as the majority of lights that you'll be using. Also, due to its nature it doesn't cast any shadows. Therefore it needs to be separated from the rest of the lighting passes.

By the time we get to this stage of the rendering process the colour buffer and stencil buffer should be empty (black and '0' respectively), yet the depth buffer should have a perfect image of our scene. When we render our entire scene for the second time we should only need to render the pixels that are actually visible (that is, there should be no, or minimal, overdraw). Rendering these pixels should be very fast - with no other lighting, the calculations are only held back by any texture combinations being used. Some of the more elaborate shading effects could be skipped for this stage - bump mapping, for example, has little effect when using only an ambient light.

Each Light Rendering Pass
Introduction

int iSVolIdx = 0;
for ( int i = 0; i < iLightCount; i ++ ) {

  //if  enabled && casting shadows
  if ( lList[i]->bEnabled && lList[i]->bCastsShadows ) {

    //clear stencil buffer
    pDev->Clear( 0, NULL, D3DCLEAR_STENCIL, D3DCOLOR_XRGB(0,0,0), 1.0f, 0 );

    //turn OFF colour buffer
    pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
    pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ZERO );
    pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );

    //disable lighting (not needed for stencil writes!)
    pDev->SetRenderState( D3DRS_LIGHTING, FALSE );

    //turn ON stencil buffer
    pDev->SetRenderState( D3DRS_STENCILENABLE, TRUE );
    pDev->SetRenderState( D3DRS_STENCILFUNC,  D3DCMP_ALWAYS );

    //render shadow volume
    //OPTIMISED: use support for 2-sided stencil
    if ( b2SidedStencils ) {
      //USE THE LATEST 1-PASS METHOD
        
      //configure the necessary render states
      pDev->SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_INCR );
      pDev->SetRenderState( D3DRS_CULLMODE, D3DCULL_NONE );
      pDev->SetRenderState( D3DRS_TWOSIDEDSTENCILMODE, TRUE );

      //render the geometry once to the stencil buffer
      for ( int j = iSVolIdx; j < ( iSVolIdx + 4 ); j++ ) {
        pVols[j]->Render( pDev );
      }

      //reset any necessary states
      pDev->SetRenderState( D3DRS_TWOSIDEDSTENCILMODE, FALSE );
      
    } else {
      //USE THE TRADITIONAL 2-PASS METHOD

      //set stencil to increment
      pDev->SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_INCR );
      //render front faces
      pDev->SetRenderState( D3DRS_CULLMODE, D3DCULL_CCW );
      for ( int j = iSVolIdx; j < ( iSVolIdx + 4 ); j++ ) {
        pVols[j]->Render( pDev );
      }
      //set stencil to decrement
      pDev->SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_DECR );
      //render back faces
      pDev->SetRenderState( D3DRS_CULLMODE, D3DCULL_CW );
      for ( int j = iSVolIdx; j < ( iSVolIdx + 4 ); j++ ) {
        pVols[j]->Render( pDev );
      }
      //restore the default cull mode
      pDev->SetRenderState( D3DRS_CULLMODE, D3DCULL_CCW );
    }

    //Increment shadow volume index, slightly hack-ish but it'll do.
    //basically, there is no formula for the idx into pVols[] so we
    //just have to keep on counting...
    iSVolIdx += 4;

    //alter stencil buffer
    pDev->SetRenderState( D3DRS_STENCILFUNC, D3DCMP_GREATER );
    pDev->SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_KEEP );

    //turn on CURRENT light, turn off all others
    pDev->SetRenderState( D3DRS_LIGHTING, TRUE );
    pDev->SetRenderState( D3DRS_AMBIENT, D3DCOLOR_XRGB(0,0,0) );
    pDev->LightEnable( 0, i == 0 ? TRUE : FALSE );
    pDev->LightEnable( 1, i == 1 ? TRUE : FALSE );
    pDev->LightEnable( 2, i == 2 ? TRUE : FALSE );
    pDev->LightEnable( 3, i == 3 ? TRUE : FALSE );
    
    //turn ON colour buffer
    pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
    pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ONE );
    pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );
    
    //render scene
    for ( int k = 0; k < 8; k++ ) {
      mshCaster[k]->Render( pDev );
    }

    //reset any necessary render states
    pDev->SetRenderState( D3DRS_STENCILENABLE, FALSE );

  } //if(enabled&&caster)

} //for(each light)

The code above looks far more complicated than it actually is. Unfortunately, because it has been cut-and-pasted from the sample program, it refers to a few variables and objects that won't make any sense on their own. For this reason I strongly suggest that you download the sample program.

The core of the above code is a simple for loop that goes through each light source in the "world"; it is this loop you may want to alter if you implement some of the optimizations discussed later on. Inside the loop is a simple if block; it contains all the code to render one pass, but only runs if the light is turned on and set to cast shadows (a feature of the lights in the sample code is that you can turn off their shadow-casting ability).

Once we've reached the code for actually rendering the pass with shadows, it is simply a case of following the list of instructions outlined at the very start of this section. The only really important thing to note is that I've included the code for a 2-sided stencil operation; this is a new feature in Direct3D 9 (provided driver support exists) and effectively eliminates one transform/render of the shadow volume(s).

The above code is optimised for render states; that is, I've removed any unnecessary state changes and moved some standard calls to the initialisation section of the sample. The depth/stencil configuration found in the initialisation code is as follows:

pDev->SetRenderState( D3DRS_STENCILREF, 0x1 );
pDev->SetRenderState( D3DRS_STENCILMASK, 0xffffffff );
pDev->SetRenderState( D3DRS_STENCILWRITEMASK, 0xffffffff );
pDev->SetRenderState( D3DRS_CCW_STENCILFUNC, D3DCMP_ALWAYS );
pDev->SetRenderState( D3DRS_CCW_STENCILZFAIL, D3DSTENCILOP_KEEP );
pDev->SetRenderState( D3DRS_CCW_STENCILFAIL, D3DSTENCILOP_KEEP );
pDev->SetRenderState( D3DRS_CCW_STENCILPASS, D3DSTENCILOP_DECR );
b2SidedStencils = ( ( caps.StencilCaps & D3DSTENCILCAPS_TWOSIDED ) != 0 );

It is the D3DRS_CCW_STENCILZFAIL line that is the key to realising why this technique is called depth-pass. It may seem odd that I'm talking about a "ZFAIL" render state with respect to a depth-pass name; however, if you think about it, the normal D3DRS_STENCILPASS state could just as well be called D3DRS_STENCILZPASS, because it only applies when both the stencil test and the depth test pass. The bottom line is that in the above code we alter the stencil buffer on a _STENCILPASS but do nothing on a _STENCILZFAIL; a depth-fail algorithm would do the opposite.
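The depth-pass counting rule can be modelled for a single pixel as follows. This is a simplified sketch, not the sample's code: VolumeFace, depthPassStencil and pixelIsLit are hypothetical names, and a signed int stands in for the stencil value so the order in which faces are processed does not matter.

```cpp
#include <cassert>
#include <vector>

// One shadow-volume face crossed by the ray from the eye through a pixel.
struct VolumeFace {
    float depth;        // distance from the eye along the ray
    bool  frontFacing;  // true if the face points towards the eye
};

// Depth-pass counting: only faces that pass the depth test (they lie in
// front of the visible surface) change the count. Front faces increment,
// back faces decrement; faces that fail the depth test leave it alone.
int depthPassStencil(const std::vector<VolumeFace>& faces, float surfaceDepth) {
    assert(surfaceDepth >= 0.0f);
    int stencil = 0;
    for (const VolumeFace& f : faces) {
        if (f.depth <= surfaceDepth) {
            stencil += f.frontFacing ? 1 : -1;
        }
        // else: Z-fail, equivalent to D3DSTENCILOP_KEEP
    }
    return stencil;
}

// The lighting pass uses D3DCMP_GREATER with a reference value of 1,
// which passes when ref > stencil, i.e. only where the count is zero.
bool pixelIsLit(int stencil, int ref = 1) { return ref > stencil; }
```

For a volume whose front face is at depth 2 and back face at depth 8, a surface at depth 5 ends with a count of 1 (shadowed), while surfaces at depth 1 or depth 10 end with a count of 0 (lit).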




