Crucial Optimization Tips

I'm going to use this section to briefly round up a few optimization notes that you really should take the time to implement if you are going to make serious use of this technique.

I've found in my testing that the real key to good performance with this technique is to manage your resources well. I know this holds true for most 3D applications, but in this case we will be using the same (or similar) geometry over and over again on each frame. Because that work is repeated for every active light, any saving we can make is usually multiplied by the number of lights.

I mentioned earlier in the article that caching animated meshes at the start of the frame (where possible) might be beneficial. The idea of caching geometry doesn't have to stop there: it can be applied both to the construction of shadow volumes and to the persistence of shadow volumes across frames (if the light doesn't move, why update the shadow volume?). If the frame rate is particularly poor then a movement delta can be employed: if you calculate the speed at which the light and/or geometry is moving, you may choose to skip the recalculation of shadow volumes and wait until a noticeable change in geometry or lighting has occurred. The sample code has this as an option (update the shadow volumes only every n frames), and it works quite well. With the delay set to "2" (update every 3rd frame) you can gain 40fps without much loss in quality.
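
The frame delay and the movement delta described above could be combined roughly as follows. This is only a sketch under stated assumptions, not the code from the sample: Vec3, ShadowVolumeUpdatePolicy, skipFrames and minMove are hypothetical names, and only the light's position is tracked (a moving shadow caster would need the same kind of test applied to its own position).

```cpp
#include <cassert>

// Hypothetical minimal vector type, used only for this illustration.
struct Vec3 { float x, y, z; };

inline float DistanceSq( const Vec3& a, const Vec3& b ) {
  const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
  return dx * dx + dy * dy + dz * dz;
}

// Decides, once per frame, whether a cached shadow volume should be rebuilt.
class ShadowVolumeUpdatePolicy {
public:
  // skipFrames: frames to skip between rebuilds (2 => at most every 3rd frame).
  // minMove:    light movement at or below this distance is ignored.
  ShadowVolumeUpdatePolicy( int skipFrames, float minMove )
    : m_skipFrames( skipFrames ), m_minMoveSq( minMove * minMove ),
      m_framesSinceBuild( 0 ), m_hasBuilt( false ),
      m_lastLightPos{ 0.0f, 0.0f, 0.0f } {
    assert( skipFrames >= 0 && minMove >= 0.0f );
  }

  // Returns true when the shadow volume should be rebuilt this frame.
  bool ShouldRebuild( const Vec3& lightPos ) {
    if ( !m_hasBuilt ) return Commit( lightPos );       // nothing cached yet
    ++m_framesSinceBuild;
    if ( m_framesSinceBuild <= m_skipFrames ) return false;  // inside the delay
    if ( DistanceSq( lightPos, m_lastLightPos ) <= m_minMoveSq )
      return false;                                     // light has barely moved
    return Commit( lightPos );
  }

private:
  // Records the light position used for the rebuild and restarts the delay.
  bool Commit( const Vec3& lightPos ) {
    m_lastLightPos = lightPos;
    m_framesSinceBuild = 0;
    m_hasBuilt = true;
    return true;
  }

  int   m_skipFrames;
  float m_minMoveSq;
  int   m_framesSinceBuild;
  bool  m_hasBuilt;
  Vec3  m_lastLightPos;
};
```

Calling ShouldRebuild( ) once per light per frame and regenerating the volume only when it returns true keeps the old volume on screen in between, which is where the saving comes from. Larger values of skipFrames or minMove trade shadow accuracy for speed.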

The other really crucial optimization - one that far outweighs any other - is to render only the geometry that you must. This is an old adage really, and applies across the computer graphics spectrum; however with shadow rendering it requires a slightly different look at what geometry is rendered. Key points to note are:

1. Geometry that is off-screen can still cast shadows on-screen, so view-frustum culling must be used carefully if you use it to select which shadow volumes are to be rendered.

2. A light can only cast shadows as far as it can illuminate. Thus there is no point generating a shadow volume for an object 200m away from a point light with a range of 100m.

3. Each light requires a whole rendering pass, so we only want to enable lights that will have a visible effect. This is actually very easy: comparing a sphere (point light) against the view-frustum planes is trivial, and we know that directional lights and ambient lights will almost always have some influence.
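
Points 2 and 3 above can be sketched as two small tests. This is an illustration under assumptions rather than code from the sample: Float3, Plane, PointLight and both function names are hypothetical, and the frustum planes are assumed to be stored with unit-length normals pointing into the frustum. Note that, in keeping with point 1, the caster is tested against the light's range and not against the view frustum.

```cpp
#include <cassert>

// Hypothetical minimal types, used only for this illustration.
struct Float3 { float x, y, z; };

// Plane dot(n, p) + d = 0; n is assumed unit length and pointing INTO the frustum.
struct Plane { Float3 n; float d; };

struct PointLight { Float3 pos; float range; };

inline float SignedDistance( const Plane& pl, const Float3& p ) {
  return pl.n.x * p.x + pl.n.y * p.y + pl.n.z * p.z + pl.d;
}

// Point 3: if the light's sphere of influence lies entirely behind any one
// frustum plane, the light cannot affect a visible pixel and its pass can be
// skipped. The test is conservative: a sphere near a frustum corner may be
// accepted even though it does not actually touch the frustum.
bool LightTouchesFrustum( const PointLight& light, const Plane frustum[6] ) {
  for ( int i = 0; i < 6; ++i ) {
    if ( SignedDistance( frustum[i], light.pos ) < -light.range ) return false;
  }
  return true;
}

// Point 2: a caster whose bounding sphere is completely outside the light's
// range receives no light from it, so it needs no shadow volume for that light.
bool CasterInLightRange( const PointLight& light,
                         const Float3& centre, float radius ) {
  assert( radius >= 0.0f && light.range >= 0.0f );
  const float dx = centre.x - light.pos.x;
  const float dy = centre.y - light.pos.y;
  const float dz = centre.z - light.pos.z;
  const float reach = light.range + radius;
  return dx * dx + dy * dy + dz * dz <= reach * reach;
}
```

With a light of range 100 and a small object 200 units away, CasterInLightRange( ) returns false, which matches the example given in point 2.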

The final optimization tip to note can be quite powerful: with a few clever changes to the code you can eliminate the need for separate Z-Fill and ambient lighting passes. The net result is that you save the transform time for an entire render of the scene. The following code shows how you do this:

if ( bAmbientLight ) {
  //colour buffer ON: blending is disabled, so the ambient-lit colour is written as-is
  //(the blend factors below are ignored while blending is disabled)
  pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, FALSE );
  pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ONE );
  pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );
  //ambient lighting ON, with light indices 0-3 switched off
  pDev->SetRenderState( D3DRS_LIGHTING, TRUE );
  pDev->SetRenderState( D3DRS_AMBIENT, AMB_LIGHT );
  pDev->LightEnable( 0, FALSE );
  pDev->LightEnable( 1, FALSE );
  pDev->LightEnable( 2, FALSE );
  pDev->LightEnable( 3, FALSE );

} else {
  //colour buffer OFF: ZERO/ZERO blend factors write black to every covered pixel
  pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
  pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ZERO );
  pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ZERO );
  //lighting OFF
  pDev->SetRenderState( D3DRS_LIGHTING, FALSE );

} //if(bAmbientLight)

//depth buffer ON (write+test)
pDev->SetRenderState( D3DRS_ZENABLE, TRUE );
pDev->SetRenderState( D3DRS_ZWRITEENABLE, TRUE );
//stencil buffer OFF
pDev->SetRenderState( D3DRS_STENCILENABLE, FALSE );

//FIRST PASS: render scene to the depth buffer
for ( int i = 0; i < 8; i++ ) {
  mshCaster[i]->Render( pDev );
}

pDev->SetRenderState( D3DRS_ZWRITEENABLE, FALSE );

Combined with this tip, I also want to suggest looking at Direct3D state blocks for render states. Note that it's a fairly repetitive algorithm we use each frame, such that it may make good sense to use state blocks instead of the multiple SetRenderState( ) calls.

Considerations when using this technique

I've decided to separate this section from the optimizations, as this is more of a conceptual view of the algorithm. The following issues could possibly be optimised or made more efficient, but it's unlikely.

The biggest consideration, and one that I've brushed over a couple of times now, is that the algorithm requires "n+1" passes, both for rasterization and for transformation. In a well-lit environment it is possible to have n+1 overdraw on the majority of pixels, meaning that your wonderful pixel shader or texture stage setup could hurt you many times more than it did when you didn't have shadows.

The lights in this sample program use additive blending - that is, if you put lots of lights in a scene you are likely to end up with a very bright final image. As a consequence you may wish to reduce the brightness of some lights, or even look at a different combiner (modulate instead of additive for example).
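
One simple way to reduce the brightness of the lights is to scale them all by a common factor so that their worst-case additive sum stays within range. The sketch below is an assumption-laden illustration, not part of the sample: LightColour and LimitAdditiveBrightness are hypothetical names, and the worst case assumed is every light reaching the same pixel of a white surface at full intensity.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical RGB light colour, each channel nominally in [0, 1].
struct LightColour { float r, g, b; };

// Scales every light by one common factor so that, even where all lights
// overlap, the additive sum of no colour channel exceeds 1.0.
// Returns the factor that was applied (1.0 when no scaling was needed).
float LimitAdditiveBrightness( std::vector<LightColour>& lights ) {
  float r = 0.0f, g = 0.0f, b = 0.0f;
  for ( const LightColour& c : lights ) {
    assert( c.r >= 0.0f && c.g >= 0.0f && c.b >= 0.0f );
    r += c.r;
    g += c.g;
    b += c.b;
  }

  // The brightest channel decides how much headroom has been used.
  const float peak = std::max( r, std::max( g, b ) );
  if ( peak <= 1.0f ) return 1.0f;   // already within range

  const float scale = 1.0f / peak;
  for ( LightColour& c : lights ) {
    c.r *= scale;
    c.g *= scale;
    c.b *= scale;
  }
  return scale;
}
```

Scaling uniformly keeps the relative colour and strength of the lights intact. In most scenes the lights do not all overlap, so this worst-case limit will tend to leave the image darker than strictly necessary, and a hand-tuned scale may look better.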

Source code for this article

You can download the source code for this article by clicking on this link: here.

The source code should be fairly straightforward to follow. It was written with Visual C++ .Net (2002), but should compile fine with Visual C++ 6 and/or other compilers. The code is commented throughout, and there is a set of #defines at the top of each source module allowing you to customise various properties.

When you have the source code running, you can press "T" twice to get a list of the controls for the sample.

References

The following selection of references is by no means a definitive list, more of a list of those that I found useful when I did my research and wrote this article…

Books

Real-Time Rendering Tricks and Techniques in DirectX (ISBN: 1-931841-27-6), Kelly Dempski
I found this book to be a good overview of all algorithms at an applied level. Chapter 27 covers simple planar shadows, chapter 28 covers shadow volumes (as in this article) and chapter 29 covers shadow maps. An interesting thing to note is that the author's coverage of shadow volumes includes the use of vertex shader extrusion.

Real-Time Rendering, Second Edition (ISBN: 1-56881-182-9), Tomas Akenine-Möller and Eric Haines
This is one of those books that all serious graphics programmers seem to have, and whilst it has little applied content (and no samples), it is a great overview and all round discussion of real-time computer graphics. Chapter 6, part 12 has a good overview of real-time shadow rendering research. Makes for good background reading.

Game Programming Gems (ISBN: 1-58450-049-2), Edited by Mark DeLoura
An excellent all round book, chapter 5.7 by Yossarian King covers planar shadows ("Ground-Plane Shadows"). Chapter 5.8 by Gabor Nagy covers projective shadows ("Real-Time Shadows on Complex Objects").

Game Programming Gems 2 (ISBN: 1-58450-054-9), Edited by Mark DeLoura
The sequel to the previously listed book is also very good and has another couple of articles on shadowing. Chapter 4.10, "Self Shadowing Characters", by Alex Vlachos, David Gosselin and Jason L. Mitchell (ATI Research) isn't the best article around, but might be interesting to some people. Chapter 5.6, "Practical Priority Buffer Shadows", by Sim Dietrich (Nvidia) introduces a more optimal way of rendering projective shadows (although the only hardware I know of that supports this is Nvidia's).

Game Programming Gems 3 (ISBN: 1-58450-233-9), Edited by Dante Treglia
The second sequel in the ever-popular series, this time featuring only one article on shadowing. Chapter 4.6, "Computing Optimized Shadow Volumes for Complex Data Sets", by Alex Vlachos and Drew Card (ATI Research) explains an algorithm for picking the correct triangles to render for projective shadow rendering. Despite the name, it's not amazingly useful when using the technique explained in this article.

Websites

"The Theory of Stencil Shadow Volumes" by Hun Yen Kwoon
http://www.gamedev.net/columns/hardcore/shadowvolume/

"Cg Shadow Volumes" by Razvan Surdulescu
http://www.gamedev.net/columns/hardcore/cgshadow/

Nvidia's Robust Shadow Volumes paper (includes link to Carmack's Reverse)
http://developer.nvidia.com/object/robust_shadow_volumes.html

"Stencil Shadow Volumes with Shadow Extrusion using ASM or HLSL"
http://www.booyah.com/article04-dx9.html

"Z Pass to Z Fail - Capping shadow volumes" - An interesting topic if you wish to extend the code in this article
http://www.gamedev.net/community/forums/topic.asp?topic_id=179947

About the author

Jack Hoxley is currently studying for a BSc in Computer Science at the University of Nottingham, England; and has been interested in computer graphics for a long time now. Jack also runs (time permitting) www.DirectX4VB.com - a collection of over 100 tutorials regarding all aspects of Microsoft's DirectX API. He can be contacted via email: Jack.Hoxley@DirectX4VB.com or jjh02u@cs.nott.ac.uk



