The above image is a screenshot from the sample program. It shows 4 shadow-casting objects and 4 dynamic lights (plus 1 ambient), running at approx. 65 fps on a 3.06 GHz Pentium 4 with a Radeon 9800 Pro. Notice how the shadows from different lights and objects interact and overlap each other.

Overview

Algorithm: Depth-pass stencil shadow volume rendering
Target Audience: Intermediate to advanced graphics programmers
Experience/Skill: Experience with Direct3D essential
Target Platform: Microsoft Direct3D 9.0 / Visual C++ 7

Introduction

Welcome to this article on shadow rendering in Direct3D 9.0. Shadow rendering is not a particularly new effect - real-time applications have been implementing various forms of shadows for many years now. However, it is only in the last couple of years that it has been possible (at reasonable speeds) on the majority of commercially available hardware. You can now expect to implement good quality shadows as a standard feature, and not just a fancy special effect.

Shadows are one of the most important lighting effects - far more so than fancy reflectance and BRDF models. The reason for this is that our brain makes great use of shadows to aid our spatial awareness. For example, if an object is hovering very slightly above a flat plane, the addition of a shadow makes this much easier to see. As well as being useful for realism, shadows also look pretty cool - and you will see many of the next generation of games creating incredible atmospheres using shadows and lights (think Doom 3, Half-Life 2 and Deus Ex: Invisible War).

The purpose of this article is to provide a solid example of "Depth-pass shadow volume rendering with multiple light sources". You'll see, as this article progresses, that rendering with one light source is very easy by comparison to rendering with 4 dynamic lights. You'll find lots and lots of examples online that cover the implementation of one light (there has been a good example in the last 3 versions of the DirectX SDK), but I aim to take it one step further.

The other important aim of this example is to primarily cover the implementation, and not get too bogged down in graphics theory. Theory will be discussed as and when it's needed, but for a deeper discussion you may wish to read one or more of the references.

The result of this article will be a working example with several shadow casting objects and up to 5 light sources moving dynamically each frame. Using the code in this article/download you should be able to implement the technique in your own applications. Whilst the article is aimed at Direct3D 9.0 hardware you could, without much work, get it working under Direct3D 8.0.

Introduction to shadow volumes

What is a shadow volume? This is obviously a crucial question; luckily the answer is remarkably simple. The volume part refers to a "piece" of 3D space that is somehow identified as being different from the rest, much like you have an "area" marking different sections of 2D space (e.g. a box drawn on a piece of paper). The shadow part refers to whether a certain point is inside or outside of a shadow (by the very nature of light, a point is either in or out of shadow - there is no in between [1]). Combine these two facts and you have a shadow volume defined as a piece of 3D space identified as being a shadow.

See the following diagram:


On the left of the diagram above, we see the shadow volumes. On the right we see just the final result. In the left-hand image, any geometry that intersects (or is totally inside) the darkened areas will be shadowed.

When we use this in a real-time scenario we can almost think of it as a very simple ray tracer. For every pixel that we draw to the screen we check whether it is in a shadowed area, and to do this we compare it against a shadow volume. The "ray tracing" takes the form of a line from the camera to the pixel, which is tested for entering and exiting a shadow volume. Technically speaking it isn't using conventional ray-tracing algorithms, but I find it a useful analogy to help explain the idea to people.

This next diagram shows how the ray-tracing aspect works. Take the blue circle to be where the camera is, with the black lines coming from it indicating the edges of the view frustum (we only see what is between these two lines). The magenta shapes indicate potential geometry to be rendered (we'll only consider this in 2D for now, as 3D gets too difficult for diagrams). The magenta circle is casting a grey shadow across the scene - this is the shadow volume. In simplistic terms, when we render a pixel to the screen it can be traced back as a ray to a tiny part of an object in the scene - and this is where my ray tracing analogy comes from. If we draw a line (3 in this case) from the camera to each object we can tell if it intersects the grey area (shadow). More importantly, we can determine how many "sides" it intersects. If it intersects one side of the volume it has entered the shadow, so the point is in shadow; if it intersects two sides it has entered and then left the shadow, so the point is out of shadow. In the diagram, the line to the square crosses 0 sides of the volume, so the square is not in shadow; the line to the pentagon crosses 1 side of the volume, so it's in shadow. The line to the final object, a star, crosses 2 sides of the volume (it enters the volume and leaves it), so the star is deemed to be out of the shadow.
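The counting rule from the diagram can be written down in a few lines. This is only a sketch of the analogy, not how the hardware does it: the `Crossing` enum and the `inShadow` function are hypothetical names, and I'm assuming that entering a volume adds one to a counter and leaving it subtracts one.

```cpp
#include <vector>

// Hypothetical labels for the two kinds of volume side a ray can cross.
enum class Crossing { Enter, Exit };

// Returns true when the point at the end of the ray is inside at least one
// shadow volume. 'crossings' lists every volume side the ray passes through
// between the camera and the point.
bool inShadow(const std::vector<Crossing>& crossings)
{
    int count = 0;
    for (Crossing c : crossings)
        count += (c == Crossing::Enter) ? 1 : -1;   // enter: +1, exit: -1
    return count != 0;                              // unbalanced: still inside
}
```

With this rule the square (no crossings) and the star (one enter, one exit) come out as lit, while the pentagon (one enter) comes out as shadowed. Using a counter rather than a single flag also copes with rays that pass through several overlapping volumes.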

Shadow volume rendering is quite a fashionable algorithm currently; many of the graphically intensive games due to be released towards the end of 2003 will be using this technique. However, if you monitor the more advanced graphics programmers and the work they publish you will find that they have already moved on. Various clever techniques are becoming increasingly popular that give better and better results.

There are quite a few limitations to shadow volumes. In many respects they are the medium level of shadow rendering - they aren't the fastest (planar shadows generally are) and they don't look the best (soft shadowing/projective shadows generally look better). However, for the speed and features you get, they are probably the most practical to implement currently.

An overview of the main limitations:

1. Hardware Stencil Buffering
As you'll see later on, the application requires the use of a stencil buffer. The majority of hardware from the GeForce / Radeon generation onwards has reasonable support, but you can still find a few obscure chipsets that lack it. More importantly, you want as many bits for the stencil buffer as you can get; you may be able to work with a 1-bit stencil, but ideally you want either a 4 or an 8 bit stencil buffer. Support for 8-bit stencils has only been common in the more recent hardware.

2. Bandwidth Intensive
The algorithm will chew up as much graphics card bandwidth as you can throw at it, especially when it comes to rendering with multiple light sources. Fill rate in particular is very heavily used, with a possible n+1 overdraw (where n is the number of lights). There are a few tricks you can use to reduce the trouble this causes.

3. No Soft Shadowing
The mathematical nature of a shadow volume dictates that there are no intermediate values - it's a Boolean operation - pixels are either in shadow or they aren't. Therefore you can often see distinct aliased lines around shadows. Apart from relying on anti-aliasing, the main way to get soft shadowing is to use multiple additional lights and get a "jittered" sample for the shadowed region; this can be computationally expensive.

4. Complicated for Multiple Light Sources
The majority of real-time scenes have several lights (4-5) enabled at any one point in time. With this technique there is no limit to the number of lights (even if the device caps indicate a fixed number), but the more lights that are enabled the slower the system goes. The two factors to watch are geometric complexity (including the number of meshes) and light count; the best systems will use an algorithm to select only the most important lights, and only the affected geometry.

5. Problems When the Camera Intersects the Shadow Volume
There are quite serious issues with camera movement when using depth-pass shadow volume rendering. When the camera is positioned inside a shadow volume you will get noticeable artefacts appearing on-screen. There aren't any good solutions to this problem apart from switching to another, similar, rendering algorithm: depth-fail. A combination of depth-fail rendering, shadow volume capping and projection tweaks allows for a robust shadow rendering procedure. The only trade-off is that it is more complicated to implement, and marginally slower. This article focuses on depth-pass rendering; for depth-fail examples/information you'll have to check out the references and/or look around online.

As hinted at earlier, several other techniques exist to render shadows in a 3D scene. Covering them here in any detail would be far too lengthy - have a look at the references section if you want some more detail.

The easiest shadow technique is known as "planar shadows" where the casting geometry is flattened onto a plane using some clever matrix maths. The result is a very simple shadow, only particularly good for drop shadows. It has been extensively used in racing games, as the road can often be thought of as flat.

A more advanced technique is to use shadow-maps, and project them onto the scene as a texture stage. This gets quite complicated and can rely on extremely large render targets (1024x1024 and above) for good results. When combined with several clever shading and blending tricks it is possible to get very impressive results with this method.

Back to shadow volumes: in order for the algorithm to give acceptable results there are several requirements. Firstly, the hardware must support stencil buffers (preferably with 4 or 8 bits of precision); this isn't too hard to find on modern hardware. The second requirement is a little less obvious, but important nonetheless: the geometry casting shadows must be of a medium to high tessellation. At low tessellations the shadow volume generation doesn't always give correct results (look at the teapot mesh at low detail in this article's sample program), and the errors are usually very obvious to the user. However, at high tessellations it can be quite demanding on processing power. The third requirement is that you don't use any clever tessellation methods - in particular high-order primitives and displacement mapping. Because these are calculated on the GPU it is very difficult to calculate the correct silhouette to use. Methods do exist, but they are prohibitively slow.

How to create a shadow volume

The algorithm behind creating a shadow volume is actually extremely simple; the time it takes is due to the potentially large amount of data it has to process while generating the volume. Take the following image:

The left-hand image shows the original source mesh, along with the position of the light - this is all the information we have available to us. In the right-hand image we see the same mesh, but with a shadow volume extruded. Basically, we pick the edges highlighted in red, and extrude them some distance beyond the mesh.

The key point to appreciate is that when you view the scene from the light's point of view (looking at the mesh) you should see no shadow. When you view the mesh this way it should be fairly easy to tell where the edge of the mesh is - the silhouette of the object. Take the following illustration:

Notice in the above illustration that the highlighted edges correspond with distinct edges and vertices on the mesh. All we need from our algorithm is a way to identify these edges and we'll be in business.

If we think about the basic geometry maths - vectors and matrices - we will often come across the dot product of two vectors. In this particular case it is the key equation for identifying the silhouette edges. If we take the normal of the triangle that the edge belongs to (processing the edge twice if it belongs to two triangles) and 'dot' it with the vector from the vertex to the light, we get a value proportional to the cosine of the angle between them (exactly the cosine if both vectors are normalised). If this value is less than or equal to 0.0 then the triangle is facing away from the light. We can then take its edges as candidate silhouette edges.

Once we've collected a list of silhouette edges, which can also be thought of as the start or "top" of the shadow volume, we need to extrude them. It is this process that actually gives us the geometry we can use to calculate where shadows lie in our scene. This is done by taking the vertex at each end of the edge and subtracting a multiple of the vertex-to-light vector. This has the effect of extruding it into the distance - preferably a long way into the distance.
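As a rough illustration of the extrusion step, here is a minimal sketch for a point light. The `Vec3` and `Quad` types and the `extrudeVertex` / `extrudeEdge` functions are hypothetical names for this example and are not taken from the sample program; `scale` is the multiple of the vertex-to-light vector that gets subtracted. The winding order of the two triangles is also an assumption and may need flipping to suit your culling setup.

```cpp
struct Vec3 { float x, y, z; };

Vec3 operator-(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(const Vec3& v, float s)       { return { v.x * s, v.y * s, v.z * s }; }

// Pushes a silhouette vertex directly away from a point light by
// subtracting a multiple of the vertex-to-light vector.
Vec3 extrudeVertex(const Vec3& vertex, const Vec3& lightPos, float scale)
{
    Vec3 toLight = lightPos - vertex;   // vertex-to-light vector
    return vertex - toLight * scale;    // move away from the light
}

// One silhouette edge (v0, v1) becomes a quad: the original edge plus its
// two extruded copies, stored as two triangles (6 vertices).
struct Quad { Vec3 v[6]; };

Quad extrudeEdge(const Vec3& v0, const Vec3& v1, const Vec3& lightPos, float scale)
{
    Vec3 e0 = extrudeVertex(v0, lightPos, scale);
    Vec3 e1 = extrudeVertex(v1, lightPos, scale);
    return { { v0, v1, e0,   v1, e1, e0 } };
}
```

A large `scale` sends the far end of the volume a long way into the distance, which is what we want, but it should stay inside the far clip plane.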

As an overview:
1. For every triangle in the mesh, calculate its normal.
2. Calculate the dot-product of this normal with the vertex-to-light vector.
3. If the dot-product is negative or zero, add the triangle's edges to a list.
4. Where possible, remove duplicate edges and/or "interior" edges.
5. When we have a list of silhouette edges, extrude them some distance.
6. Store the resulting triangles in a vertex buffer.
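Steps 1 to 4 above can be sketched in plain C++ with no Direct3D involved. This is an illustration under stated assumptions rather than the code from the download: `Vec3`, `Edge`, `addOrCancel` and `findSilhouette` are hypothetical names, the mesh is assumed to be an indexed triangle list, and the normal is taken as cross(p1 - p0, p2 - p0), so the sign of the test depends on your winding convention.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
using Edge = std::pair<std::uint32_t, std::uint32_t>;

static Vec3 sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Step 4: an edge seen twice (in either direction) is shared by two
// triangles that both passed the test, so it is interior and cancels out.
static void addOrCancel(std::vector<Edge>& edges, std::uint32_t a, std::uint32_t b)
{
    for (std::size_t i = 0; i < edges.size(); ++i) {
        if ((edges[i].first == a && edges[i].second == b) ||
            (edges[i].first == b && edges[i].second == a)) {
            edges[i] = edges.back();
            edges.pop_back();
            return;
        }
    }
    edges.push_back({ a, b });
}

std::vector<Edge> findSilhouette(const std::vector<Vec3>& verts,
                                 const std::vector<std::uint32_t>& indices,
                                 const Vec3& lightPos)
{
    std::vector<Edge> edges;
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3) {
        const std::uint32_t i0 = indices[i], i1 = indices[i + 1], i2 = indices[i + 2];
        const Vec3& p0 = verts[i0];
        // Step 1: face normal (unnormalised, only its sign matters here).
        const Vec3 normal = cross(sub(verts[i1], p0), sub(verts[i2], p0));
        // Step 2: dot with the vertex-to-light vector.
        const float facing = dot(normal, sub(lightPos, p0));
        // Step 3: negative or zero means the triangle faces away.
        if (facing <= 0.0f) {
            addOrCancel(edges, i0, i1);
            addOrCancel(edges, i1, i2);
            addOrCancel(edges, i2, i0);
        }
    }
    return edges;
}
```

The linear search in `addOrCancel` keeps the sketch short; for large meshes a hash set keyed on the sorted index pair would be the obvious replacement.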

As an important note, this is just a programming algorithm - it can be implemented in almost any programming language, for almost any graphics platform / API.

There are various optimizations that can be implemented in this algorithm, some more complicated than others.

The most obvious optimization to make is that of caching regularly used data. In the code I wrote for this article, the mesh data never actually changes - so there is no point in identifying all the edges (locking buffers, processing all vertices/triangles etc…) each frame / update. Instead, it makes sense to create a list of edges that is updated only if/when the mesh data is altered. We can then run through this array on each frame, rather than mess about with Direct3D vertex buffers and waste time calculating another 2000 normals.

A slightly modified version of caching the data is to reduce the amount of data cached. During testing I changed the algorithm to weld edges where the two normals were within 30° of each other. Given that an awful lot of meshes have quite highly tessellated yet smooth surfaces, you may not need to include every single edge. My testing found that I could knock 20-30% of the edges off my list with this technique - with minimal effect on the shadow volume generated.
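One way to express the angle test is to compare the dot product of the two face normals against the cosine of the threshold. This is a sketch of one reading of the idea, assuming both normals are already unit length; `normalsWithinAngle` is a hypothetical helper, not a function from the sample program.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Returns true when two unit-length normals are within 'maxDegrees' of each
// other. For unit vectors the dot product equals the cosine of the angle
// between them, so no acos() call is needed.
bool normalsWithinAngle(const Vec3& n0, const Vec3& n1, float maxDegrees)
{
    const float kPi = 3.14159265f;
    const float cosLimit = std::cos(maxDegrees * kPi / 180.0f);
    const float d = n0.x * n1.x + n0.y * n1.y + n0.z * n1.z;
    return d >= cosLimit;
}
```

An edge whose two adjacent faces pass this test with a 30 degree threshold would be a candidate for dropping from the cached list.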

The final optimization I wish to discuss is that of vertex shaders. If you are using a programmable architecture (I chose not to for this sample) then you could develop a vertex shader to extrude shadows on the GPU. There are a few inefficiencies in doing this (multiple extrusions required); however, the speed of GPUs almost eliminates these side effects. Discussing vertex shader based extrusion is a bit too lengthy for this article, so you'll have to look at the references and/or search for some different articles.

---

[1] People often make a big thing about so-called "soft shadows", where the edges around a shadow appear to blend smoothly between shadowed and lit. My statement that there is no in-between therefore sounds a little odd. The reasoning for this is actually very simple: soft shadows are a feature of (comparatively) very complicated global illumination systems - ones that factor in the reflection and transmission of light as energy (ray tracing and radiosity are good examples). If you calculate a light as being able to reflect off surrounding surfaces, or to be emitted from area lights, you will get these blurred edges. At the time of writing this article, real-time dynamic global illumination is difficult if not impossible.




