Crucial Optimization Tips

I'm going to use this section to briefly round up a few optimization notes that you really should take the time to implement if you are going to make serious use of this technique.

I've found in my testing that the real key to good performance with this technique is to manage your resources well. I know this holds true for most 3D applications, but in this case we will be using the same (or similar) geometry over and over again on each frame. Because the work is repeated for every light, any reduction in processing we achieve is often multiplied by the number of active lights.

I mentioned earlier in the article that caching animated meshes at the start of the frame (where possible) might be beneficial. The idea of caching geometry doesn't have to stop there - it can be applied both to the construction of shadow volumes and to the persistence of shadow volumes across frames (if the light doesn't move, why update the shadow volume?). If the frame rate is particularly poor then a movement delta can be employed: if you calculate the speed at which the light and/or geometry is moving, you may choose to skip the re-calculation of shadow volumes until a noticeable change in geometry or lighting has occurred. The sample code has this as an option - updating the shadow volumes only every n frames - and it works quite well. Setting the delay to "2" (update every 3rd frame) can gain 40fps without much of a loss in quality.

The other really crucial optimization - one that far outweighs any other - is to render only the geometry that you must. This is an old adage really, and applies across the computer graphics spectrum; however, with shadow rendering it requires a slightly different look at what geometry is rendered. Key points to note are:

1. Geometry off screen can cast shadows on-screen, so view-frustum culling should be used carefully if you use it to select which shadow volumes are to be rendered.
2. A light can only cast shadows as far as it can illuminate. Thus there is no point generating a shadow volume for an object 200m away from a point light with a range of 100m.
3. Each light requires a whole rendering pass, so we only want to enable lights that will have a visible effect. This is actually very easy: comparing a sphere (point light) against the view-frustum planes is trivial, and we know that directional lights and ambient lights will almost always have some influence.

The final optimization tip can be quite powerful - with a few clever changes to the code you can eliminate the need for separate Z-Fill and Ambient lighting passes. The net result is that you save the transform time for an entire render of the scene. The following code shows how you do this:
if ( bAmbientLight ) {
    //colour buffer ON
    pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, FALSE );
    pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ONE );
    pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );

    //ambient lighting ON
    pDev->SetRenderState( D3DRS_LIGHTING, TRUE );
    pDev->SetRenderState( D3DRS_AMBIENT, AMB_LIGHT );
    pDev->LightEnable( 0, FALSE );
    pDev->LightEnable( 1, FALSE );
    pDev->LightEnable( 2, FALSE );
    pDev->LightEnable( 3, FALSE );
} else {
    //colour buffer OFF
    pDev->SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
    pDev->SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ZERO );
    pDev->SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ZERO );

    //lighting OFF
    pDev->SetRenderState( D3DRS_LIGHTING, FALSE );
} //if(bAmbientLight)

//depth buffer ON (write+test)
pDev->SetRenderState( D3DRS_ZENABLE, TRUE );
pDev->SetRenderState( D3DRS_ZWRITEENABLE, TRUE );

//stencil buffer OFF
pDev->SetRenderState( D3DRS_STENCILENABLE, FALSE );

//FIRST PASS: render scene to the depth buffer
for ( int i = 0; i < 8; i++ ) {
    mshCaster[i]->Render( pDev );
}

pDev->SetRenderState( D3DRS_ZWRITEENABLE, FALSE );
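Returning to the earlier suggestion of skipping shadow volume updates, the frame delay and the movement delta can be combined into one small check. The following is only a minimal sketch of that idea and is not taken from the sample code: the names ShadowUpdatePolicy, ShouldRebuild, frameDelay and moveThreshold are all hypothetical, and it only tracks the light's position (moving geometry would need a similar test).

```cpp
// Hypothetical helper, NOT part of the article's sample code.
struct Vec3 { float x, y, z; };

// Squared distance between two points (avoids a sqrt in the per-frame check).
inline float DistanceSq(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

class ShadowUpdatePolicy {
public:
    // frameDelay:    frames to skip between rebuilds (2 => every 3rd frame).
    // moveThreshold: light movement, in world units, treated as "noticeable".
    ShadowUpdatePolicy(int frameDelay, float moveThreshold)
        : frameDelay_(frameDelay),
          moveThresholdSq_(moveThreshold * moveThreshold) {}

    // Call once per frame; returns true when the volume should be rebuilt.
    bool ShouldRebuild(const Vec3& lightPos) {
        if (!hasVolume_) {                      // nothing cached yet
            Remember(lightPos);
            return true;
        }
        if (framesSinceRebuild_ <= frameDelay_) // count frames, without overflow
            ++framesSinceRebuild_;
        if (framesSinceRebuild_ <= frameDelay_)
            return false;                       // still inside the skip window
        if (DistanceSq(lightPos, lastLightPos_) < moveThresholdSq_)
            return false;                       // light has not moved noticeably
        Remember(lightPos);
        return true;
    }

private:
    void Remember(const Vec3& lightPos) {
        lastLightPos_ = lightPos;
        framesSinceRebuild_ = 0;
        hasVolume_ = true;
    }

    int   frameDelay_;
    float moveThresholdSq_;
    Vec3  lastLightPos_{0.0f, 0.0f, 0.0f};
    int   framesSinceRebuild_ = 0;
    bool  hasVolume_ = false;
};
```

In use you would keep one policy object per light (or per light/caster pair), call ShouldRebuild( ) once per frame, and only regenerate the shadow volume when it returns true; otherwise the cached volume from the last rebuild is rendered again. A suitable threshold depends entirely on the scale of your scene.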
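Points 2 and 3 of the "render only what you must" list earlier both reduce to simple sphere tests. The sketch below shows one possible way to write them, under some assumptions of my own: the types Vec3, Plane and Sphere and both function names are hypothetical (they are not from the sample code or from Direct3D), the frustum planes are assumed to be normalised with their normals pointing into the frustum, and each shadow caster is assumed to have a bounding sphere.

```cpp
#include <array>

// Hypothetical types for illustration only.
struct Vec3   { float x, y, z; };
struct Plane  { float a, b, c, d; };   // a*x + b*y + c*z + d = 0, normal points inside
struct Sphere { Vec3 centre; float radius; };

// Signed distance from a point to a normalised plane (positive = inside).
inline float SignedDistance(const Plane& p, const Vec3& v) {
    return p.a * v.x + p.b * v.y + p.c * v.z + p.d;
}

// Point 3: a point light (sphere of influence) is worth a rendering pass only
// if its sphere touches the view frustum. This test is conservative: it can
// keep a sphere that sits just outside a frustum corner, but it never rejects
// a light that is actually visible.
inline bool SphereTouchesFrustum(const Sphere& s,
                                 const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum) {
        if (SignedDistance(p, s.centre) < -s.radius)
            return false;              // entirely behind this plane
    }
    return true;
}

// Point 2: a caster needs a shadow volume only if its bounding sphere
// overlaps the light's range.
inline bool CasterInLightRange(const Sphere& light, const Sphere& caster) {
    const float dx = caster.centre.x - light.centre.x;
    const float dy = caster.centre.y - light.centre.y;
    const float dz = caster.centre.z - light.centre.z;
    const float reach = light.radius + caster.radius;
    return dx * dx + dy * dy + dz * dz <= reach * reach;
}
```

Note that the frustum test is applied to the light and not to the caster, which respects point 1: an off-screen object can still cast a shadow on-screen, so casters are only rejected by light range here.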
Combined with the single Z-Fill/Ambient pass tip, I also want to suggest looking at Direct3D state blocks for render states. The algorithm we use each frame is fairly repetitive, so it may make good sense to use state blocks instead of the multiple SetRenderState( ) calls.

Considerations when using this technique

I've decided to separate this section from optimizations, as this is more of a conceptual view of the algorithm. The following things could possibly be optimised or have their efficiency improved, but it's unlikely.

The most major consideration, and one that I've brushed over a couple of times now, is that the algorithm requires "n+1" passes (one per light, plus the Z-Fill/Ambient pass) - both with rasterization and transformation. In a well lit environment it is possible to have n+1 overdraw on the majority of pixels - meaning that your wonderful pixel shader or texture stage setup could hurt you many times more than it did when you didn't have shadows.

The lights in this sample program use additive blending - that is, if you put lots of lights in a scene you are likely to end up with a very bright final image. As a consequence you may wish to reduce the brightness of some lights, or even look at a different combiner (modulate instead of additive, for example).

Source code for this article

You can download the source code for this article by clicking on this link: here.

The source code should be fairly straightforward to follow. It was written with Visual C++ .Net (2002), but should compile fine with Visual C++ 6 and/or other compilers. The code is commented throughout, and there is a set of #defines at the top of each source module allowing you to customise various properties. When you have the source code running, you can press "T" twice to get a list of the controls for the sample.
References

The following selection of references is by no means a definitive list, more of a list of those that I found useful when I did my research and wrote this article…

Books

Real-Time Rendering Tricks and Techniques in DirectX (ISBN: 1-931841-27-6), Kelly Dempski
Real-Time Rendering, Second Edition (ISBN: 1-56881-182-9), Tomas Akenine-Möller and Eric Haines
Game Programming Gems (ISBN: 1-58450-049-2), Edited by Mark DeLoura
Game Programming Gems 2 (ISBN: 1-58450-054-9), Edited by Mark DeLoura
Game Programming Gems 3 (ISBN: 1-58450-233-9), Edited by Dante Treglia

Websites

"The Theory of Stencil Shadow Volumes" by Hun Yen Kwoon
"Cg Shadow Volumes" by Razvan Surdulescu
Nvidia's Robust Shadow Volumes paper (includes link to Carmack's Reverse)
"Stencil Shadow Volumes with Shadow Extrusion using ASM or HLSL"
"Z Pass to Z Fail - Capping shadow volumes" - An interesting topic if you wish to extend the code in this article

About the author

Jack Hoxley is currently studying for a BSc in Computer Science at the University of Nottingham, England, and has been interested in computer graphics for a long time now. Jack also runs (time permitting) www.DirectX4VB.com - a collection of over 100 tutorials regarding all aspects of Microsoft's DirectX API. He can be contacted via email: Jack.Hoxley@DirectX4VB.com or jjh02u@cs.nott.ac.uk