kovacsp

Shadow volume optimization problem


Hi all,

I'm trying to optimize my code for FPS, since I think it's terribly slow. As far as I can tell, the main bottleneck is the stencil shadow part. I've created a benchmark that rotates the camera around an object in 1-degree steps, so I have 360 RenderScreen calls, and I measure the total time (and fps).

The scene consists of one object with 5534 vertices and 3088 tris. It uses about 5 materials, and the parts are rendered sorted by material. I have one pass for rendering the non-transparent parts and another for the transparent ones. I use only one light source.

If I don't switch on anything except transparency, I get 2.35 secs (152 fps). That's not much, but it would be enough. But when I switch on z-fail shadows, it drops to 23.68 secs (15.2 fps). So I tried commenting out all render state changes connected with the stencil shadows and rendered the shadow volume as it is (white): I got 9.3 secs (38 fps). The object and the shadow volume together consist of 21100 vertices and 11536 tris.

I'm posting the relevant code here; maybe you can spot something. (It compiles with both DX8 and DX9 via some #defines; I'm talking about the DX9 part here, since that's what I'm measuring.)

Device creation params:
void DDRENDERING_VIEW::BuildPresentParamsFromSettings(CD3DSettings& d3dSettings)
{
    d3dParams.Windowed               = true;
    d3dParams.BackBufferCount        = 1;    // means double buffering (1 back buffer)
    d3dParams.MultiSampleType        = D3DMULTISAMPLE_NONE;

    d3dParams.SwapEffect             = D3DSWAPEFFECT_DISCARD;
    d3dParams.EnableAutoDepthStencil = true;
    d3dParams.hDeviceWindow          = m_pView->GetSafeHwnd();
#if defined(DX81M_MODULE)
    d3dParams.Flags                  = 0;
    d3dParams.FullScreen_PresentationInterval = 0;
    d3dParams.SwapEffect             = D3DSWAPEFFECT_COPY;
#elif defined(DX9M_MODULE)
    d3dParams.Flags                  = D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL;
    d3dParams.MultiSampleQuality     = d3dSettings.MultisampleQuality();
    d3dParams.PresentationInterval   = D3DPRESENT_INTERVAL_IMMEDIATE;
#endif
    d3dParams.AutoDepthStencilFormat = d3dSettings.DepthStencilBufferFormat();
    d3dParams.BackBufferWidth        = clientRect.right - clientRect.left;
    d3dParams.BackBufferHeight       = clientRect.bottom - clientRect.top;
    d3dParams.BackBufferFormat       = d3dSettings.PDeviceCombo()->BackBufferFormat;
    d3dParams.FullScreen_RefreshRateInHz = 0;
} // BuildPresentParamsFromSettings


This is how I create the shadow volume to render. (The shadow volume is computed in our own geometry module, and the data is loaded from there. That code is irrelevant here, since the volume is computed only once and then just rendered: the world is static and only the camera moves.) I'm posting this part because of the D3DXCreateMeshFVF and OptimizeInplace calls:
void GEOMETRYELEMDD::InitializeShadowVolumes(void)
{
	EM::GeometryElem* ge = GetGeometryElem();
	assert(ge != NULL);
	if (ge == NULL)
		return;

	for (long iShadowVol = 0; iShadowVol < ge->GetShadowVolumeCount(); iShadowVol++) {
		if (ge->GetShadowVolume(iShadowVol) == NULL) 
			continue;

		const Mesh3D& mesh = *(ge->GetShadowVolume(iShadowVol));

		nShadowTriangleVertices	= mesh.GetTriVertexCount();
		nShadowTriangles			= mesh.GetTriangleCount();

		HRESULT hr;
		ID3DXMesh* shadowVolumeMesh = NULL;
		if (FAILED(hr = D3DXCreateMeshFVF(nShadowTriangles, nShadowTriangleVertices, D3DXMESH_SYSTEMMEM | D3DXMESH_WRITEONLY, D3DFVF_XYZ, rView->pd3dDevice, &shadowVolumeMesh)))
			continue;

		assert(shadowVolumeMesh != NULL);
		if (shadowVolumeMesh == NULL)
			continue;

		// II.1. Fill in the vertex data
		D3DXVECTOR3*  triVertices = NULL;
		if (FAILED(hr = shadowVolumeMesh->LockVertexBuffer(0, (DX89(BYTE,void)**)&triVertices))) {
			shadowVolumeMesh->Release();	// don't leak the mesh on failure
			continue;
		}

		for (long i = 0; i < nShadowTriangleVertices; i++) {
			const GM::TriVertex3D& trivertex = mesh.GetTriVertex(i);
			CCVector3D pos = trivertex.pos;
			// write to the i-th vertex (the [i] index was lost when the code was posted)
			ConvertVector3D2D3DXVector(pos, triVertices[i]);
			ConvertMeter2MiliMeter(triVertices[i]);
		}

		// II.2. Fill in index, texture data
		WORD* triangleIndices = NULL;
		m_userIdPerTriangle = new long[nTriangles];
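		if (FAILED(hr = shadowVolumeMesh->LockIndexBuffer(0, (DX89(BYTE,void)**)&triangleIndices))) {
			// don't leave the vertex buffer locked or leak the mesh on failure
			shadowVolumeMesh->UnlockVertexBuffer();
			shadowVolumeMesh->Release();
			continue;
		}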

		for (long i = 0; i < nShadowTriangles; i++) {
			const GM::Triangle3D& triangle = mesh.GetTriangle(i);
			triangleIndices[3 * i]	    = triangle.triVert1Id;
			triangleIndices[3 * i + 1]	= triangle.triVert2Id;
			triangleIndices[3 * i + 2]	= triangle.triVert3Id;
		}

		shadowVolumeMesh->UnlockVertexBuffer();
		shadowVolumeMesh->UnlockIndexBuffer();

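		// optimize: compact, sort by attribute and reorder for the vertex cache
		DWORD *adjacency = new DWORD[3*shadowVolumeMesh->GetNumFaces()];
		hr = shadowVolumeMesh->GenerateAdjacency(0.0f, adjacency);
		hr = shadowVolumeMesh->OptimizeInplace(D3DXMESHOPT_COMPACT | D3DXMESHOPT_ATTRSORT | D3DXMESHOPT_VERTEXCACHE, adjacency, NULL, NULL, NULL);
		delete[] adjacency;	// the adjacency array is only needed for the optimization step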

		dxShadowVolumeMeshes.Add(shadowVolumeMesh);
	}
} // InitializeShadowVolumes


This is where I set up things for shadow rendering (both single and two sided stencil):
void DDRENDERING_VIEW::RenderZFailShadow(const EM::Camera& camera)
{
	CCDX9::CD3DStateGuard guard(pd3dDevice);

	long currLight = 0;
	long maxLights = GetShadowCastableLightMaxIndex();
	for (long iLight = 0; iLight < maxLights; iLight++) {
		EM::Light* light = m_elementManager->GetLight(iLight);
		if (light == NULL || !light->IsCastShadow())
			continue;
		SetLight(iLight, m_graphicsSettings.MustRenderShadow());
		//pd3dDevice->LightEnable(0, FALSE);

		// depth buffer writing OFF
		guard.SetRenderState( D3DRS_ZWRITEENABLE, FALSE );
		// clear stencil buffer
		//pd3dDevice->Clear(0, NULL, D3DCLEAR_STENCIL, D3DCOLOR_XRGB(0,0,0), 1.0f, 0);
		// turn OFF colour buffer (now we wanna write to the stencil buffer only)
		guard.SetRenderState(D3DRS_ALPHABLENDENABLE, TRUE);
		guard.SetRenderState(D3DRS_SRCBLEND, D3DBLEND_ZERO); 
		guard.SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE); 
		// disable lighting (not needed for stencil writes anyway)
		guard.SetRenderState(D3DRS_LIGHTING, FALSE);
		// turn ON stencil buffer
		guard.SetRenderState( D3DRS_STENCILENABLE, TRUE );
		guard.SetRenderState( D3DRS_STENCILFUNC,  D3DCMP_ALWAYS );
		guard.SetRenderState( D3DRS_CCW_STENCILFUNC,  D3DCMP_ALWAYS );

		if (m_use2SidedStencil) {
			//set stencil to increment if z test fails, else keep
			guard.SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_STENCILFAIL, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_STENCILZFAIL, D3DSTENCILOP_INCR );
			//set stencil to decrement if z test fails, else keep
			guard.SetRenderState( D3DRS_CCW_STENCILPASS, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_CCW_STENCILFAIL, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_CCW_STENCILZFAIL, D3DSTENCILOP_DECR );
			//render both front and back faces in one pass
			guard.SetRenderState( D3DRS_CULLMODE, D3DCULL_NONE );
			guard.SetRenderState( D3DRS_TWOSIDEDSTENCILMODE, TRUE );

			// render shadow volumes for this light
			for (long i = 0; i < m_elementManager->GetElemCount(); i++) {
				EM::Renderable* elem = dynamic_cast<EM::Renderable*>(m_elementManager->GetElem(i));
				if(elem == NULL || !elem->IsDrawable() || (elem->GetShowType() != EM::Shaded))
					continue;
				guard.SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, F2DW(m_zSlopeScaleZFail));
				guard.SetRenderState(D3DRS_DEPTHBIAS, F2DW(m_zBiasZFail)); // 0.000001 works quite well
				RenderShadow(camera, elem, currLight);
				guard.SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, F2DW(0.0));
				guard.SetRenderState(D3DRS_DEPTHBIAS, F2DW(0.0));
			}

			// switch off 2 sided stencil
			guard.SetRenderState( D3DRS_TWOSIDEDSTENCILMODE, FALSE );
			// from now, render front faces again (normal operation)
			guard.SetRenderState( D3DRS_CULLMODE, D3DCULL_CCW );
		}
		else {
			//set stencil to increment if z test fails, else keep
			guard.SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_STENCILFAIL, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_STENCILZFAIL, D3DSTENCILOP_INCR );
			//render back faces
			guard.SetRenderState( D3DRS_CULLMODE, D3DCULL_CW );

			// render shadow volumes for this light
			for (long i = 0; i < m_elementManager->GetElemCount(); i++) {
				EM::Renderable* elem = dynamic_cast<EM::Renderable*>(m_elementManager->GetElem(i));
				if(elem == NULL || !elem->IsDrawable() || (elem->GetShowType() != EM::Shaded))
					continue;
				guard.SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, F2DW(m_zSlopeScaleZFail));
				guard.SetRenderState(D3DRS_DEPTHBIAS, F2DW(m_zBiasZFail));
				RenderShadow(camera, elem, currLight);
				guard.SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, F2DW(0.0));
				guard.SetRenderState(D3DRS_DEPTHBIAS, F2DW(0.0));
			}

			//set stencil to decrement if z test fails, else keep
			guard.SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_STENCILFAIL, D3DSTENCILOP_KEEP );
			guard.SetRenderState( D3DRS_STENCILZFAIL, D3DSTENCILOP_DECR );
			//render front faces
			guard.SetRenderState( D3DRS_CULLMODE, D3DCULL_CCW );

			// render shadow volumes for this light
			for (long i = 0; i < m_elementManager->GetElemCount(); i++) {
				EM::Renderable* elem = dynamic_cast<EM::Renderable*>(m_elementManager->GetElem(i));
				if(elem == NULL || !elem->IsDrawable() || (elem->GetShowType() != EM::Shaded))
					continue;
				guard.SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, F2DW(m_zSlopeScaleZFail));
				guard.SetRenderState(D3DRS_DEPTHBIAS, F2DW(m_zBiasZFail));
				RenderShadow(camera, elem, currLight);
				guard.SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, F2DW(0.0));
				guard.SetRenderState(D3DRS_DEPTHBIAS, F2DW(0.0));
			}

			// from now, render front faces again (normal operation)
			guard.SetRenderState( D3DRS_CULLMODE, D3DCULL_CCW );
		} // if (m_use2SidedStencil) 

		// switch lighting ON
		guard.SetRenderState(D3DRS_LIGHTING, TRUE);

		// now draw only if stencil buffer enables it
		guard.SetRenderState( D3DRS_STENCILENABLE, TRUE );

		// reset stencil ops
		guard.SetRenderState( D3DRS_STENCILREF,  0x0 );
		guard.SetRenderState( D3DRS_STENCILFUNC, D3DCMP_EQUAL );
		guard.SetRenderState( D3DRS_STENCILPASS, D3DSTENCILOP_KEEP );
		guard.SetRenderState( D3DRS_STENCILFAIL, D3DSTENCILOP_KEEP );
		guard.SetRenderState( D3DRS_STENCILZFAIL, D3DSTENCILOP_KEEP );
		guard.SetRenderState( D3DRS_CCW_STENCILPASS, D3DSTENCILOP_KEEP );
		guard.SetRenderState( D3DRS_CCW_STENCILFAIL, D3DSTENCILOP_KEEP );
		guard.SetRenderState( D3DRS_CCW_STENCILZFAIL, D3DSTENCILOP_KEEP );

		// turn on the current light, disable all others
		EM::DirectionalLight* dlight = dynamic_cast<EM::DirectionalLight*>(m_elementManager->GetLight(iLight));
		if (dlight != NULL && iLight == 0 && dlight->GetDirection().z > 0)		// Sun doesn't shine at night
			pd3dDevice->LightEnable(0, FALSE);
		else
			pd3dDevice->LightEnable(0, TRUE);
		guard.SetRenderState( D3DRS_AMBIENT, 0x00000000 );

		//turn ON colour buffer, additive
		guard.SetRenderState( D3DRS_ALPHABLENDENABLE, TRUE );
		guard.SetRenderState( D3DRS_SRCBLEND, D3DBLEND_ONE );
		guard.SetRenderState( D3DRS_DESTBLEND, D3DBLEND_ONE );

		// grass, render if only no background
		Grid*	grid = m_elementManager->GetGrid();
		if (m_backgroundDD == NULL && grid->GetShowType() == EM::Shaded && grid->IsDrawable())
			Render(grid);

		// render scene
		for (long i = 0; i < m_elementManager->GetElemCount(); i++) {
			EM::Renderable* elem = dynamic_cast<EM::Renderable*>(m_elementManager->GetElem(i));
			if (elem == NULL || !elem->IsDrawable() || elem->GetShowType() == EM::LinesOnly)
				continue;
			Render(elem);
		}

		SetLight(iLight, false);
		currLight++;
	}	// END FOR

	SetLight(0, m_graphicsSettings.MustRenderShadow());
}
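
For readers who want to see what the stencil states above are meant to compute, here is a minimal CPU-side sketch of z-fail counting for a single pixel. It is not taken from the post: the `VolumeFace` struct and both functions are hypothetical names, it assumes a less-or-equal depth test, and it uses the usual z-fail convention (back faces increment and front faces decrement when the depth test fails). A plain int stands in for the stencil value, so wrap-around is ignored.

```cpp
#include <cassert>
#include <vector>

// One shadow-volume face as seen through a single pixel (hypothetical type).
struct VolumeFace {
    float depth;       // depth of the face at this pixel (smaller = nearer)
    bool frontFacing;  // true if the face points toward the camera
};

// Z-fail counting for one pixel. Only faces that FAIL the depth test
// (they lie behind the visible scene surface) change the count:
// back faces add one, front faces subtract one.
int zFailStencil(float sceneDepth, const std::vector<VolumeFace>& faces) {
    int stencil = 0;
    for (const VolumeFace& f : faces) {
        const bool depthTestFails = f.depth > sceneDepth;  // assumes a <= depth test
        if (!depthTestFails) {
            continue;  // corresponds to the KEEP op on depth pass
        }
        stencil += f.frontFacing ? -1 : +1;
    }
    return stencil;
}

// The lighting pass in the post draws only where the stencil value equals 0,
// so any non-zero count means the pixel is in shadow.
bool inShadow(float sceneDepth, const std::vector<VolumeFace>& faces) {
    return zFailStencil(sceneDepth, faces) != 0;
}
```

With a closed volume whose front face is at depth 2 and back face at depth 8, a surface at depth 5 sits inside the volume and gets a non-zero count, while surfaces at depth 1 or 10 end up with 0 and are lit.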


And this is the actual shadow volume rendering (reached via the RenderShadow call above):
void GEOMETRYELEMDD::RenderShadowVolume(long iLight)
{
	if (iLight < 0 || iLight >= dxShadowVolumeMeshes.GetCount())
		return;

	SetTransform(NULL);
	dxShadowVolumeMeshes[iLight]->DrawSubset(0);

	rView->m_verticesDrawn += nShadowTriangleVertices;
	rView->m_trisDrawn += nShadowTriangles;
}


Now I've tried everything I've ever read about on forums and in tutorials. I know that DrawSubset isn't that good, but it has to draw the entire shadow volume here, so I don't think anything else would perform significantly better. As you can see, I tried optimizing the mesh and creating the vertex and index buffers write-only. I don't think I have too many render state changes (I can't get by with fewer; I need all of them for correct operation). So why is this code so terribly slow? Any ideas?

Thanks for your kind help,
Peter
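
For reference, the three benchmark totals quoted in this post can be converted into average per-frame times, which makes the cost of each step easier to compare than fps. The sketch below only does that arithmetic; the function names are made up for illustration, and the split between stages is rough, since the plain-mesh run also writes color and depth.

```cpp
#include <cassert>
#include <cmath>

// Average milliseconds per frame for a run of `frames` frames.
double msPerFrame(double totalSeconds, int frames) {
    return totalSeconds * 1000.0 / frames;
}

// Average frames per second for a run of `frames` frames.
double framesPerSecond(double totalSeconds, int frames) {
    return frames / totalSeconds;
}

// Extra milliseconds per frame that a stage adds, given the total time
// with and without that stage over the same number of frames.
double stageCostMs(double secondsWith, double secondsWithout, int frames) {
    return msPerFrame(secondsWith, frames) - msPerFrame(secondsWithout, frames);
}
```

Over 360 frames, 2.35 s, 9.3 s and 23.68 s work out to roughly 6.5 ms, 25.8 ms and 65.8 ms per frame. So drawing the volumes as plain meshes adds about 19 ms per frame, and switching on the stencil states adds about another 40 ms.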

For your information:

The application runs windowed at around 1100x800, with R8G8B8 color and a D24S8 depth/stencil buffer.
The machine has a Radeon 9600 and a 2.4 GHz P4.

Peter

I have only one suggestion.

In my engine I use a shadow buffer for every critical VB. That means I keep another copy in system memory, so I don't need any lock to get at the primitives.
I think it would boost your fps too.
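
One way to read the suggestion above is sketched here, under stated assumptions: the GPU buffer is replaced by a stand-in class (`FakeGpuBuffer`, hypothetical) that only counts how often it is locked, and `ShadowedVertexBuffer` is an illustrative name, not an engine or Direct3D type. The point is just that reads are served from the system-memory copy and never lock the GPU-side buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Stand-in for a GPU vertex buffer (hypothetical). A real renderer would wrap
// an actual vertex buffer here; this one only records how often it is locked.
class FakeGpuBuffer {
public:
    void upload(const std::vector<Vec3>& src) {
        ++lockCount_;  // one lock/unlock pair per upload
        data_ = src;
    }
    int lockCount() const { return lockCount_; }
private:
    std::vector<Vec3> data_;
    int lockCount_ = 0;
};

// Vertex buffer with a system-memory "shadow" copy: writes go to both copies,
// reads come from the CPU copy and never touch the GPU buffer.
class ShadowedVertexBuffer {
public:
    void setVertices(const std::vector<Vec3>& v) {
        cpuCopy_ = v;    // keep the shadow copy in system memory
        gpu_.upload(v);  // single lock/upload per write
    }
    const Vec3& vertex(std::size_t i) const { return cpuCopy_.at(i); }
    std::size_t size() const { return cpuCopy_.size(); }
    int gpuLockCount() const { return gpu_.lockCount(); }
private:
    std::vector<Vec3> cpuCopy_;
    FakeGpuBuffer gpu_;
};
```

The cost is that the vertex data is stored twice, so this is usually reserved for buffers the CPU really needs to read back.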

Ironically, I was just reading an entire chapter in the book GPU Gems about optimizing shadow volumes. I would pick it up at the bookstore and check it out. It has numerous pages on this subject and is very informative. I'm sure that if you follow the guide you'll achieve significant improvements in performance.

Thanks for your answers, guys!

Mohamed: Yes, fps decreases when the shadows are visible. What's the implication then?

EverIce: how do I do that exactly?

toysnob: I'll try to find the book. Unfortunately, books like this aren't sold in bookstores here. Thanks for the tip!

Peter

Just one thing I want to make sure of from your message: when you disable the stencil buffer, the frame rate is 38 fps? That means that when rendering a simple mesh (or meshes) with no materials or effects, you get 38 fps?

I get 38 fps when I render the meshes as they are and render the shadow volumes as simple meshes too (that is, I don't switch on stenciling, don't change lights, alpha blending, zbias, nothing). This way the shadow volumes appear as white meshes.

What are you thinking about? I'm very interested now :)

Peter

I'm thinking that the problem is not in the shadow volume rendering but in the meshes themselves.
The impact of shadow volume rendering should be much smaller than what you're seeing, especially since your shadow volume is a static mesh and your card has a high fill rate.
Try to find out why rendering the shadow volume mesh itself is slow (regardless of any shadow volume issues), because I couldn't find a reason why it is so slow.
Another thing: do you get any messages from the debug runtime in the output window?
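
To isolate the shadow volume mesh as suggested, each pass can be timed on its own with the same 360-frame loop. The harness below is a small sketch of that idea; `PassTiming` and `timePass` are hypothetical names, and the callable stands in for whatever draws one frame of the pass under test. It measures CPU wall-clock time only, so the pass should include the present/flush step if GPU time is to be counted.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Result of timing one pass over a fixed number of frames.
struct PassTiming {
    int frames;
    double totalSeconds;
    double fps() const { return totalSeconds > 0.0 ? frames / totalSeconds : 0.0; }
};

// Calls `renderPass` once per frame and returns the wall-clock total.
PassTiming timePass(int frames, const std::function<void()>& renderPass) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < frames; ++i) {
        renderPass();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return PassTiming{frames, elapsed.count()};
}
```

Timing "scene only", "shadow volume mesh only" and "both" separately shows whether the volume mesh is slow by itself or only in combination with the rest of the frame.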

Thanks for your answer. I will try rendering only the shadow volume and nothing else, but unfortunately not until tomorrow.
Since the code isn't here, I can't tell you exactly, but as far as I remember I get some redundant render state warnings (only a few), and sometimes "unable to create hardware indexbuffer", which I was told on the dxdev list not to care about. Maybe I should care?

Peter
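
On the redundant render state warnings mentioned above: a common way to get rid of them is a small cache in front of the device that drops calls which would not change anything. The sketch below assumes a stand-in device (`FakeDevice`, hypothetical) that only counts the calls reaching it; `RenderStateCache` is an illustrative name, and whether the `CD3DStateGuard` class in the posted code already does something similar is not known from the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Stand-in for the device (hypothetical); counts the calls that reach it.
struct FakeDevice {
    int stateCalls = 0;
    void SetRenderState(std::uint32_t /*state*/, std::uint32_t /*value*/) {
        ++stateCalls;
    }
};

// Remembers the last value set for each state and forwards only real changes.
class RenderStateCache {
public:
    explicit RenderStateCache(FakeDevice& device) : device_(device) {}

    // Returns true if the call was forwarded, false if it was redundant.
    bool set(std::uint32_t state, std::uint32_t value) {
        const auto it = current_.find(state);
        if (it != current_.end() && it->second == value) {
            return false;  // same value as last time: skip the device call
        }
        current_[state] = value;
        device_.SetRenderState(state, value);
        return true;
    }

private:
    FakeDevice& device_;
    std::unordered_map<std::uint32_t, std::uint32_t> current_;
};
```

A cache like this is only valid if every state change goes through it; anything that sets states behind its back leaves the cached values stale.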
