sross

Optimizing Shadow Volume

sross    109
Hi,

OK, I just succeeded in implementing shadow volumes in my scene, but it is so damn slow: around 1-2 fps when there are shadows on all my objects. What I don't understand is that I took the same algorithm from the DX9 SDK, and the DX9 SDK sample runs at 20+ fps even though the shadow it has to render has more vertices/triangles than the shadows in my scene.

Also, I'm not even rebuilding the shadow volume every time the object rotates, because I just rotate the world and the lights rotate with the object. In the DX9 SDK sample they rebuild it every time the object is rotated, and it doesn't even cause any lag.

All the lag in my app seems to be caused by the DrawPrimitives line in my render shadow function. If I comment out that line, my app runs at around 40-45 fps, and all that line has to render is around 500 to 1k triangles. So why does my framerate drop by 40 fps just to render 1k triangles?

Here's the code I'm using.

Class Shadows: this class contains all the shadow objects in my scene, and all it does is render the array of shadows.
	public class Shadows
	{
		private ArrayList shadows = new ArrayList();

		public Shadows()
		{
			//do nothing

		}

		public ArrayList GetShadows
		{
			get
			{
				return this.shadows;
			}
		}

		public void Render()
		{
			if( this.shadows.Count > 0 )
			{
				this.RenderShadow();
				this.DrawShadow();
			}
		}

		private void RenderShadow()
		{
			// Disable z-buffer writes (note: z-testing still occurs), and enable the
			// stencil buffer
			Graphics.device.RenderState.ZBufferWriteEnable = false;
			Graphics.device.RenderState.StencilEnable = true;

			// Don't bother with interpolating color
			Graphics.device.RenderState.ShadeMode = ShadeMode.Flat;

			// Set up stencil compare function, reference value, and masks.
			// Stencil test passes if ((ref & mask) cmpfn (stencil & mask)) is true.
			// Note: since we set up the stencil test to always pass, the STENCILFAIL
			// render state is really not needed.
			Graphics.device.RenderState.StencilFunction = Compare.Always;
			Graphics.device.RenderState.StencilZBufferFail = StencilOperation.Keep;
			Graphics.device.RenderState.StencilFail = StencilOperation.Keep;

			// If the z-test passes, increment/decrement the stencil buffer value
			Graphics.device.RenderState.ReferenceStencil = 0x1;
			Graphics.device.RenderState.StencilMask = unchecked((int)0xffffffff);
			Graphics.device.RenderState.StencilWriteMask = unchecked((int)0xffffffff);
			Graphics.device.RenderState.StencilPass = StencilOperation.Increment;

			// Make sure that no pixels get drawn to the frame buffer
			Graphics.device.RenderState.AlphaBlendEnable = true;
			Graphics.device.RenderState.SourceBlend = Blend.Zero;
			Graphics.device.RenderState.DestinationBlend = Blend.One;

			// With 2-sided stencil, we can avoid rendering twice:
			Graphics.device.RenderState.TwoSidedStencilMode = true;
			Graphics.device.RenderState.CounterClockwiseStencilFunction = Compare.Always;
			Graphics.device.RenderState.CounterClockwiseStencilZBufferFail = StencilOperation.Keep;
			Graphics.device.RenderState.CounterClockwiseStencilFail = StencilOperation.Keep;
			Graphics.device.RenderState.CounterClockwiseStencilPass = StencilOperation.Decrement;

			Graphics.device.RenderState.CullMode = Cull.None;

			// Draw both sides of shadow volume in stencil/z only
			//Graphics.device.Transform.World = objectMatrix;
			Graphics.device.VertexFormat = VertexFormats.Position;
			
			for( int i = 0; i < this.shadows.Count; i++ )
			{
				Shadow shadow = (Shadow)this.shadows[i];
				Graphics.device.SetStreamSource( 0, shadow.ShadowVertices, 0 );

				// If I comment out the next line I get 40-45 fps; otherwise I only get 1-2 fps
				Graphics.device.DrawPrimitives( PrimitiveType.TriangleList, 0, shadow.NumShadowVertices / 3 );
			}

			Graphics.device.RenderState.TwoSidedStencilMode = false;

			// Restore render states
			Graphics.device.RenderState.ShadeMode = ShadeMode.Gouraud;
			Graphics.device.RenderState.CullMode = Cull.CounterClockwise;
			Graphics.device.RenderState.ZBufferWriteEnable = true;
			Graphics.device.RenderState.StencilEnable = true;
			Graphics.device.RenderState.AlphaBlendEnable = false;
		}

		private void DrawShadow()
		{
			// Set render states (disable z-buffering, enable stencil, disable fog, and
			// turn on alpha blending)
			Graphics.device.RenderState.ZBufferEnable = false;
			Graphics.device.RenderState.StencilEnable = true;
			Graphics.device.RenderState.FogEnable = false;
			Graphics.device.RenderState.AlphaBlendEnable = true;
			Graphics.device.RenderState.SourceBlend = Blend.SourceAlpha;
			Graphics.device.RenderState.DestinationBlend = Blend.InvSourceAlpha;

			Graphics.device.TextureState[0].ColorArgument1 = TextureArgument.TextureColor;
			Graphics.device.TextureState[0].ColorArgument2 = TextureArgument.Diffuse;
			Graphics.device.TextureState[0].ColorOperation = TextureOperation.Modulate;
			Graphics.device.TextureState[0].AlphaArgument1 = TextureArgument.TextureColor;
			Graphics.device.TextureState[0].AlphaArgument2 = TextureArgument.Diffuse;
			Graphics.device.TextureState[0].AlphaOperation = TextureOperation.Modulate;

			// Only write where stencil val >= 1 (count indicates # of shadows that
			// overlap that pixel)
			Graphics.device.RenderState.ReferenceStencil = 0x1;
			Graphics.device.RenderState.StencilFunction = Compare.LessEqual;
			Graphics.device.RenderState.StencilPass = StencilOperation.Keep;
			
			// Draw a big, gray square
			Graphics.device.VertexFormat = VertexFormats.Transformed | VertexFormats.Diffuse;
			Graphics.device.SetTexture( 0, null );

			// Note: this draws one full-screen quad per shadow, so a pixel in shadow
			// is darkened once for every entry in the list
			for( int i = 0; i < this.shadows.Count; i++ )
			{
				Graphics.device.SetStreamSource( 0, ((Shadow)this.shadows[i]).ShadowVertexBuffer, 0 );
				Graphics.device.DrawPrimitives( PrimitiveType.TriangleStrip, 0, 2 );
			}

			// Restore render states

			Graphics.device.RenderState.ZBufferEnable = true;
			Graphics.device.RenderState.StencilEnable = false;
			Graphics.device.RenderState.FogEnable = true;
			Graphics.device.RenderState.AlphaBlendEnable = false;
		}
	}
  
Class Shadow: this class builds the vertex buffers of a shadow volume.
	public class Shadow
	{

		private VertexBuffer shadowVertexBuffer = null;
		private ArrayList shadowVerts;
		private VertexBuffer shadowVertices = null;

		private int numShadowVertices = 0;

		public Shadow()
		{
			
		}

		public int NumShadowVertices
		{
			get
			{
				return this.numShadowVertices;
			}
		}


		public VertexBuffer ShadowVertices
		{
			get
			{
				return this.shadowVertices;
			}
		}

		public VertexBuffer ShadowVertexBuffer
		{
			get
			{
				return this.shadowVertexBuffer;
			}
		}

		public void Create( VertexBuffer vb, int[] ib, PrimitiveType type )
		{
			// Note: the MeshVertex format depends on the FVF of the mesh

			// Transform the light position into object space with the inverse world matrix
			Vector3 light = new Vector3();
			Matrix m = Matrix.Invert(Graphics.device.Transform.World);
			float x = Graphics.device.Lights[0].Position.X;
			float y = Graphics.device.Lights[0].Position.Y;
			float z = Graphics.device.Lights[0].Position.Z;
			light.X = x*m.M11 + y*m.M21 + z*m.M31 + m.M41;
			light.Y = x*m.M12 + y*m.M22 + z*m.M32 + m.M42;
			light.Z = x*m.M13 + y*m.M23 + z*m.M33 + m.M43;

			CustomVertex.PositionNormalTextured[] tempVertices = null;
			int[] index = null;
			int[] edges = null;

			int numFaces;
			if( type == PrimitiveType.TriangleList )
			{
				numFaces = ib.Length / 3;
			}
			else
			{
				numFaces = ib.Length - 2;
			}

			// Assumes 32 bytes per vertex: position (12) + normal (12) + texture coordinates (8)
			int numVerts = vb.Description.Size / 32;
			int numEdges = 0;

			// Allocate a temporary edge list
			edges = new int[numFaces * 6];

			// Lock the geometry buffers
			tempVertices = (CustomVertex.PositionNormalTextured[])vb.Lock( 0, 0 );

			// For each face
			for(int i=0; i<numFaces; i++)
			{
				int face0;
				int face1;
				int face2;
				if( type == PrimitiveType.TriangleList )
				{
					face0 = ib[3*i+0];
					face1 = ib[3*i+1];
					face2 = ib[3*i+2];
				}
				else
				{
					face0 = ib[i+0];
					face1 = ib[i+1];
					face2 = ib[i+2];
				}

				
				Vector3 v0 = new Vector3( tempVertices[face0].X, tempVertices[face0].Y, tempVertices[face0].Z );
				Vector3 v1 = new Vector3( tempVertices[face1].X, tempVertices[face1].Y, tempVertices[face1].Z );
				Vector3 v2 = new Vector3( tempVertices[face2].X, tempVertices[face2].Y, tempVertices[face2].Z );

				// Transform vertices or transform light?

				Vector3 vCross1 = v2 - v1;
				Vector3 vCross2 = v1 - v0;
				Vector3 vNormal = Vector3.Cross(vCross1, vCross2);

				if (Vector3.Dot(vNormal, light) >= 0.0f)
				{
					AddEdge(edges, ref numEdges, face0, face1);
					AddEdge(edges, ref numEdges, face1, face2);
					AddEdge(edges, ref numEdges, face2, face0);
				}
			}

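			// Start from an empty vertex list, and reset the count so that calling
			// Create() again does not accumulate vertices from a previous build
			shadowVerts = new ArrayList();
			numShadowVertices = 0;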

			for (int i=0; i<numEdges; i++)
			{
				Vector3 v1 = new Vector3( tempVertices[edges[2*i+0]].X, tempVertices[edges[2*i+0]].Y, tempVertices[edges[2*i+0]].Z );
				Vector3 v2 = new Vector3( tempVertices[edges[2*i+1]].X, tempVertices[edges[2*i+1]].Y, tempVertices[edges[2*i+1]].Z );
				Vector3 v3 = v1 - light*10;
				Vector3 v4 = v2 - light*10;

				// Add a quad (two triangles) to the vertex list
				shadowVerts.Add( v1 );
				shadowVerts.Add( v2 );
				shadowVerts.Add( v3 );

				shadowVerts.Add( v2 );
				shadowVerts.Add( v4 );
				shadowVerts.Add( v3 );

				numShadowVertices += 6;
			}

			// Unlock the geometry buffers
			vb.Unlock();

			if ((shadowVertexBuffer == null) || (shadowVertexBuffer.Disposed))
			{
				// Create a big square for rendering the shadow. We don't need to recreate this every time;
				// it is only rebuilt if the VertexBuffer is missing or has been disposed (after a call to Reset, for example).
				shadowVertexBuffer = new VertexBuffer(typeof(CustomVertex.TransformedColored), 4, Graphics.device, Usage.WriteOnly, VertexFormats.Transformed | VertexFormats.Diffuse, Pool.Default);
				CustomVertex.TransformedColored[] verts = (CustomVertex.TransformedColored[])shadowVertexBuffer.Lock(0, 0);
				float xpos = (float)Graphics.device.PresentationParameters.BackBufferWidth;
				float ypos = (float)Graphics.device.PresentationParameters.BackBufferHeight;
				verts[0] = new CustomVertex.TransformedColored( new Vector4(0, ypos, 0.0f, 1.0f), Color.FromArgb( 70, Color.Black ).ToArgb() );
				verts[1] = new CustomVertex.TransformedColored( new Vector4(0,  0, 0.0f, 1.0f), Color.FromArgb( 70, Color.Black ).ToArgb() );
				verts[2] = new CustomVertex.TransformedColored( new Vector4(xpos, ypos, 0.0f, 1.0f), Color.FromArgb( 70, Color.Black ).ToArgb() );
				verts[3] = new CustomVertex.TransformedColored( new Vector4(xpos,  0, 0.0f, 1.0f), Color.FromArgb( 70, Color.Black ).ToArgb() );
				shadowVertexBuffer.Unlock();
			}

			this.shadowVertices = new VertexBuffer( typeof( Vector3 ), shadowVerts.Count, Graphics.device, 0, VertexFormats.Position, Pool.Default );
			this.shadowVertices.SetData( shadowVerts.ToArray( typeof( Vector3 ) ), 0, 0 );
		}

		private void AddEdge(int[] edges, ref int numEdges, int v0, int v1)
		{
			// Remove interior edges (which appear in the list twice)
			for(int i=0; i < numEdges; i++)
			{
				if ((edges[2*i+0] == v0 && edges[2*i+1] == v1) ||
					(edges[2*i+0] == v1 && edges[2*i+1] == v0))
				{
					if (numEdges > 1)
					{
						edges[2*i+0] = edges[2*(numEdges-1)+0];
						edges[2*i+1] = edges[2*(numEdges-1)+1];
					}
					numEdges--;
					return;
				}
			}
			edges[2*numEdges+0] = v0;
			edges[2*numEdges+1] = v1;
			numEdges++;
		}
	}
Does anyone have any clue why it's so slow?

Stéphane Ross
Game Gurus Entertainment

[edited by - sross on September 1, 2003 3:20:01 PM]
[edited by - sross on September 1, 2003 3:20:55 PM]

Guest Anonymous Poster
When you use the stencil buffer on some cards, you need to run in 32-bit color mode (and/or with a 32-bit depth buffer), or it will be slow.

sross    109
My video card does not support the 32-bit depth buffer mode, so that's not the problem. The DX9 sample would also be slow if that were the problem, because it's using the D24S8 mode when I'm running it, the same one I'm using in my app.

Stéphane Ross
Game Gurus Entertainment

mohamed adel    174
How many pixels of the screen are occupied by the shadow volume? The stencil test is a per-pixel operation, and the shadow volume in the DirectX sample occupies a small number of pixels on the screen (you can't count the exact number, but you can judge it by eye).
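
To put rough numbers on the fill-rate argument, here is a back-of-the-envelope sketch. The resolution, screen coverage, number of overlapping volume layers, and frame rate are all made-up illustrative values, not measurements from either program, and `StencilPixelsPerSecond` is a hypothetical helper name.

```csharp
using System;

static class FillRateEstimate
{
    // Rough count of pixels that shadow-volume polygons touch per second.
    // coverage: fraction of the screen the volumes cover (0..1).
    // layers:   average number of volume polygons stacked on each covered pixel.
    // All inputs are hypothetical; real costs depend on the hardware.
    public static double StencilPixelsPerSecond(int width, int height, double coverage, int layers, double fps)
    {
        return width * (double)height * coverage * layers * fps;
    }

    static void Main()
    {
        // A small volume, as in a sample where the shadow covers little of the screen
        double small = StencilPixelsPerSecond(1024, 768, 0.05, 4, 60);

        // Several long volumes that each stretch across the whole screen
        double large = StencilPixelsPerSecond(1024, 768, 1.0, 24, 60);

        Console.WriteLine("small volume:        {0:N0} pixels/s", small);
        Console.WriteLine("full-screen volumes: {0:N0} pixels/s", large);
    }
}
```

The triangle count never appears in the formula, which is the point: a few hundred triangles can still be expensive if each one covers a large part of the screen.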

Guest Anonymous Poster
How many shadows do you render?

I use the same technique as you do, and with 10 objects in the scene it runs at 50 fps.

But I'm planning on batching visible shadows together in order to minimize the number of DrawPrimitives calls.

Maybe you could consider doing the same
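
A minimal sketch of what such batching could look like, assuming every volume's vertices are already in the same coordinate space (for example, world space). `Vec3` and `ShadowBatcher` are hypothetical stand-ins so the example runs without Direct3D; they are not part of the code posted above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a position-only vertex
struct Vec3
{
    public float X, Y, Z;
    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

static class ShadowBatcher
{
    // Concatenates the triangle lists of several shadow volumes into one array,
    // so they can share a single vertex buffer and a single draw call.
    public static Vec3[] Merge(IList<Vec3[]> volumes)
    {
        int total = 0;
        foreach (Vec3[] v in volumes)
        {
            total += v.Length;
        }

        Vec3[] merged = new Vec3[total];
        int offset = 0;
        foreach (Vec3[] v in volumes)
        {
            Array.Copy(v, 0, merged, offset, v.Length);
            offset += v.Length;
        }
        return merged;
    }

    static void Main()
    {
        Vec3[] a = { new Vec3(0, 0, 0), new Vec3(1, 0, 0), new Vec3(0, 1, 0) };
        Vec3[] b = { new Vec3(2, 0, 0), new Vec3(3, 0, 0), new Vec3(2, 1, 0) };

        Vec3[] merged = Merge(new List<Vec3[]> { a, b });
        Console.WriteLine(merged.Length / 3); // 2 triangles in the single batch
    }
}
```

The merged array would then be uploaded once and drawn with one DrawPrimitives call covering `merged.Length / 3` triangles. This only cuts per-call overhead; it does not reduce the number of pixels the volumes cover.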

sross    109
Thanks for the replies.

quote:

how many pixels of the screen are occupied by the shadow volume?



Well, do you mean only the visible shadow pixels, or the hidden ones too? I think a lot of the shadow is hidden behind parts of the model.

quote:

How many shadows do you render?



Well, there is one shadow per side of a window in my scene, so for a square window there are 4 shadows. There is also 1 shadow per grill, so in a normal scene I end up with around 6 shadows.

quote:

But I'm planning on batching visible shadows together in order to minimize the number of draw primitive calls



Yeah, that's probably something I should try, but how would you go about removing the invisible shadows and the overlapping shadows from the vertex buffers?

EDIT: here are some screenshots if you want to see the shadows in my scene:
Window with Shadow Volume 1
Window with Shadow Volume 2
Window with Shadow Volume 3

Stéphane Ross
Game Gurus Entertainment

[edited by - sross on September 3, 2003 9:33:26 AM]

mohamed adel    174
I mean the number of pixels the shadow volume occupies on the screen, because the depth test is done whether the shadow volume is visible or hidden behind something. You can check this by turning the camera away so that the shadow volume is behind the camera; if the frame rate increases, then this is the cause of the problem.

sross    109
OK, yeah, that seems to be the problem: the framerate is at 55 fps when the whole shadow volume is behind the camera. So if I understand what you mean, I would have to remove all the shadow that is not visible in my scene? How should I go about eliminating that useless shadow? Z-buffer testing? A BSP tree? Some other method? What would you suggest?

Thanks for the help.

Stéphane Ross
Game Gurus Entertainment

mohamed adel    174
You can turn off color writes by setting the D3DRS_COLORWRITEENABLE render state, as you are just filling the stencil buffer. I did so using a vertex shader, and it increased the frame rate slightly, from 240 to 270 fps. Also, decreasing the resolution would affect the frame rate dramatically. I haven't used a hidden surface removal technique before, so I can't tell you which is the best.

sross    109
I was just looking at my code. Could changing this line (in the Shadows.RenderShadow method shown above):

Graphics.device.RenderState.StencilZBufferFail = StencilOperation.Keep;

to

Graphics.device.RenderState.StencilZBufferFail = StencilOperation.Zero;

remove all the shadow that is not visible? I'm not at home right now so I can't try it. I'll try it tonight.

Stéphane Ross
Game Gurus Entertainment

[edited by - sross on September 4, 2003 11:29:19 AM]
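
For what it's worth, here is a toy single-pixel model of z-pass stencil counting that can be used to reason about the question above. It is not Direct3D: `StencilToy`, `VolumeFace`, and `ZFailOp` are hypothetical names, the stencil value is a plain int rather than a clamped 8-bit value, and faces are processed in array order. Under those assumptions, it suggests that Zero on z-fail would wipe the count for a shadowed pixel rather than skip hidden geometry.

```csharp
using System;

enum ZFailOp { Keep, Zero }

// One shadow-volume polygon covering the pixel (hypothetical helper type)
struct VolumeFace
{
    public bool FrontFacing;
    public float Depth;
    public VolumeFace(bool frontFacing, float depth)
    {
        FrontFacing = frontFacing;
        Depth = depth;
    }
}

static class StencilToy
{
    // Returns the final stencil count for one pixel whose scene depth is sceneDepth.
    public static int Count(float sceneDepth, VolumeFace[] faces, ZFailOp zFail)
    {
        int stencil = 0;
        foreach (VolumeFace f in faces)
        {
            if (f.Depth <= sceneDepth)
            {
                // Depth test passes: front faces increment, back faces decrement
                stencil += f.FrontFacing ? 1 : -1;
            }
            else if (zFail == ZFailOp.Zero)
            {
                // Depth test fails: the Zero operation clears the count
                stencil = 0;
            }
            // With Keep, a failed depth test leaves the count untouched
        }
        return stencil;
    }

    static void Main()
    {
        // The pixel lies inside the volume: front face in front of it, back face behind it
        VolumeFace[] faces = { new VolumeFace(true, 0.3f), new VolumeFace(false, 0.8f) };

        Console.WriteLine(Count(0.5f, faces, ZFailOp.Keep)); // 1: pixel is in shadow
        Console.WriteLine(Count(0.5f, faces, ZFailOp.Zero)); // 0: shadow is lost
    }
}
```

In both cases every face covering the pixel is still visited and depth tested, so in this model the change does not reduce the amount of work per pixel.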
