Paul C Skertich

Why does my static mesh drop FPS from 60 to 4?


At first I thought my graphics card was dying. The static mesh I created is a cup with 18,000 polygons in total. When I render it, the FPS drops from 60 to 4-8. The cup is being rendered through the deferred renderer.

 

I also wanted to see how Unreal Engine 4 handles it, so I loaded the mesh into Unreal Engine 4 and it ran just fine. I'm not at Epic's level of expertise, but my game's rendering is suffering heavy frame drops.

 

I also loaded a game character from another game that is about 12K polygons total. With three of them in my level editor, the frame rate dropped from 60 to 10 FPS. In Amnesia there are multiple such creatures chasing the player at once, so obviously my rendering system is the problem.

 

I disabled deferred shading and lowered the shadow map resolution from 2048 x 2048 to 1024 x 1024. That didn't really help much.

 

I use std::vectors to store the mesh data and such.

 

This is called for each scene object that gets rendered (I will get around to restructuring it):

void SceneObject::RenderSceneMesh(GraphicsDevice *device, MaterialShader &mShader, XMMATRIX &world, XMMATRIX &view, XMMATRIX &proj) {
    if (isCulled) {
        // Bind the mesh geometry and input layout.
        UINT stride = sizeof(ObjectVertexData);
        UINT offset = 0;

        device->devcon->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);
        device->devcon->IASetIndexBuffer(indexBuffer, DXGI_FORMAT_R32_UINT, 0);
        device->devcon->IASetInputLayout(mShader.eInputLayout);

        // Update the per-object constant buffer.
        D3D11_MAPPED_SUBRESOURCE map;
        device->devcon->Map(PrimaryConstantBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &map);
        MESHES_CB_BUFFER *sceneCBuffer = (MESHES_CB_BUFFER*)map.pData;

        world = scaleMatrix * rotationMatrix * translationMatrix;
        XMMATRIX wvp = world * view * proj;

        sceneCBuffer->WVP = XMMatrixTranspose(wvp);
        sceneCBuffer->WorldMatrix = XMMatrixTranspose(world);
        sceneCBuffer->viewMatrix = XMMatrixTranspose(view);
        sceneCBuffer->projectionMatrix = XMMatrixTranspose(proj);
        sceneCBuffer->modelWorld = XMMatrixTranspose(world);

        XMVECTOR det;
        XMMATRIX invWorld = XMMatrixInverse(&det, world);
        sceneCBuffer->invWorldMatrix = invWorld;

        sceneCBuffer->UVTile.x = gMaterial.TextureTile.x;
        sceneCBuffer->UVTile.y = gMaterial.TextureTile.y;

        if (!isPlaced) {
            sceneCBuffer->ghostModeEnabled = XMFLOAT2(1.0f, 0.0f);
        }
        else {
            sceneCBuffer->ghostModeEnabled = XMFLOAT2(0.0f, 0.0f);
            isPlaced = true;
        }
        sceneCBuffer->isSelected = XMFLOAT2(isSelected, 0.0f);
        sceneCBuffer->padding = XMFLOAT2(0, 0);

        device->devcon->Unmap(PrimaryConstantBuffer, 0);

        // Update the camera constant buffer.
        D3D11_MAPPED_SUBRESOURCE camMap;
        device->devcon->Map(cameraConstantBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &camMap);
        cameraConstantBuff *cameraCBuffer = (cameraConstantBuff*)camMap.pData;

        //cameraCBuffer->reflectionMatrix = XMMatrixLookAtLH(reflectionPosition, reflectionLookAt, reflectionAim);
        cameraCBuffer->cameraPosition = cameraPosition;
        cameraCBuffer->padding = XMFLOAT4(0, 0, 0, 0);

        device->devcon->Unmap(cameraConstantBuffer, 0);

        // Update the light/shadow constant buffer.
        D3D11_MAPPED_SUBRESOURCE lightMapped;
        device->getDeviceContext()->Map(lightCB, 0, D3D11_MAP_WRITE_DISCARD, 0, &lightMapped);
        lightConstantBuffer *lightCBuff = (lightConstantBuffer*)lightMapped.pData;

        XMMATRIX lightProjWS = XMLoadFloat4x4(&ShadowProjWS);
        XMMATRIX ShadowWS = world * lightProjWS;
        lightCBuff->lightViewMatrix = XMMatrixTranspose(ShadowWS);
        lightCBuff->lightProjMatrix = XMLoadFloat4x4(&ShadowProjWS);

        device->getDeviceContext()->Unmap(lightCB, 0);

        // Bind shaders, constant buffers, textures, and samplers.
        device->devcon->VSSetShader(mShader.eVertexShader, 0, 0);
        device->devcon->PSSetShader(mShader.ePixelShader, 0, 0);

        ID3D11Buffer *constantbuffers[3] = { PrimaryConstantBuffer, cameraConstantBuffer, lightCB };
        device->devcon->VSSetConstantBuffers(0, 3, constantbuffers);
        device->devcon->PSSetConstantBuffers(0, 3, constantbuffers);

        ID3D11ShaderResourceView *srvs[7] = { diffuseSRV, normalSRV, specularSRV, ambientOccSRV, displacementSRV, ShadowMapSRV, SRV3D };
        device->devcon->PSSetShaderResources(0, 7, srvs);

        ID3D11SamplerState *samplers[2] = { device->pointTextureSampleState, device->clampTextureSampleState };
        device->devcon->PSSetSamplers(0, 2, samplers);

        //device->devcon->RSSetState(device->SoldMode);
        device->enableDepthBuffer();

        // Draw, then unbind the pixel shader resources.
        device->devcon->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
        device->devcon->DrawIndexed(renderObject.numIndices(), 0, 0);

        aabb.isBuilt = true;
        ID3D11ShaderResourceView *nullSRVS[7] = { 0 };
        device->getDeviceContext()->PSSetShaderResources(0, 7, nullSRVS);
    }
}

Could it be the dynamic constant buffers that are hurting performance? Thanks for chiming in.

 


You set a lot of GPU parameters and resources, apparently for each object. One improvement would be to sort your objects by shader and by texture. If you have a lot of objects using the same shader, the same view/projection, etc., there's no need to update those resources or reload the same shader and textures again for each one.
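As a rough illustration of that kind of sort (DrawItem and the ID fields here are stand-ins for whatever shader/texture handles your engine uses):

#include <algorithm>
#include <vector>

// Stand-in for a queued draw call; in practice the IDs would be your
// shader and texture pointers or handles.
struct DrawItem {
    int shaderID;
    int textureID;
    // SceneObject *object; // the thing to draw
};

void SortForBatching(std::vector<DrawItem> &items) {
    // Group draws by shader first, then by texture, so that when you walk
    // the sorted list you only rebind state when it actually changes.
    std::sort(items.begin(), items.end(),
              [](const DrawItem &a, const DrawItem &b) {
                  if (a.shaderID != b.shaderID) return a.shaderID < b.shaderID;
                  return a.textureID < b.textureID;
              });
}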

 

I can't for the life of me find the list of rendering tips here on GameDev, but among those that may be applicable:

 

- Don't use 32-bit indices; use 16-bit. Why send 4 bytes per index to the GPU when 2 will suffice? E.g., 18,000 triangles is 54,000 indices; as long as the vertex count stays below 65,536, a 16-bit index buffer works fine. 54,000 indices = 216K @ 32-bit but only 108K @ 16-bit: half the throughput for a single object. (See the sketch after this list.)

- Sort opaque objects front-to-back to reduce overdraw. You can do a lot of CPU sorting in the time it takes the GPU to overdraw an occluded object.

- If you have several duplicate objects (characters, etc.) which share the same vertex/index buffer, render them together with the same shader and vertex/index buffers, so that state is only set once.
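A minimal sketch of the 16-bit conversion (assuming every index value fits in 16 bits; indices32, device, and devcon are placeholder names):

#include <cstdint>
#include <vector>
#include <d3d11.h>

// Narrow the 32-bit indices; only valid if every index fits in 16 bits,
// i.e. the mesh has fewer than 65,536 vertices.
std::vector<uint16_t> indices16(indices32.begin(), indices32.end());

D3D11_BUFFER_DESC ibd = {};
ibd.Usage = D3D11_USAGE_IMMUTABLE;
ibd.ByteWidth = (UINT)(indices16.size() * sizeof(uint16_t));
ibd.BindFlags = D3D11_BIND_INDEX_BUFFER;

D3D11_SUBRESOURCE_DATA init = { indices16.data(), 0, 0 };
ID3D11Buffer *indexBuffer16 = nullptr;
device->CreateBuffer(&ibd, &init, &indexBuffer16);

// The bind must use the matching format, or the mesh will render garbled:
devcon->IASetIndexBuffer(indexBuffer16, DXGI_FORMAT_R16_UINT, 0);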

Edited by Buckeye


Can you confirm that the presented code is the bottleneck?

 

I can see that you are setting lots of redundant parameters, such as render targets and the camera and light constants. Still, I'm not totally convinced that the above code could hurt the FPS as much as you have observed. How many times is the above code called when rendering your big object?

 

Are you using the debug flag for the device? If not, you should, so you can check for invalid function calls. If you are, be aware that the debug layer hurts performance too.
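For reference, enabling it looks roughly like this (a minimal sketch; the dev/ctx names are placeholders):

#include <d3d11.h>

// Create the device with the debug layer in debug builds only,
// since the validation itself costs performance.
UINT flags = 0;
#ifdef _DEBUG
flags |= D3D11_CREATE_DEVICE_DEBUG;
#endif

ID3D11Device *dev = nullptr;
ID3D11DeviceContext *ctx = nullptr;
HRESULT hr = D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
                               flags, nullptr, 0, D3D11_SDK_VERSION,
                               &dev, nullptr, &ctx);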

 

Cheers!

Edited by kauna


Why are you using:

 

device->devcon

 

In part of the code, but also using:


device->getDeviceContext()

 

In other parts of it? Does your getDeviceContext call do anything special? At the very least this is a code smell; at worst, getDeviceContext may be doing a whole lot of unnecessary extra work.


Have you tried the frame analyzer from Visual Studio 2013, or RenderDoc? With these tools you can analyze a frame and see the GPU cost of every draw call. It is very useful for finding exactly where the bottleneck is.


Quote (Paul C Skertich): "I use std::vectors to store the mesh data and such."

 

I'm no expert, but maybe it's something simple, like accidentally recreating your resources every frame instead of once. I'm most likely wrong, but I figured I'd try to help.

 

edit - I'm wrong; I misread you and didn't realize you said the framerate drops. Sorry.

Edited by Infinisearch


Quote (Buckeye): "One improvement would be to sort your objects by shader and by texture. [...] Don't use 32-bit indices; use 16-bit. [...] If you have several duplicate objects which share the same vertex/index buffer, render them together."

I attempted to use 16-bit indices and got back a weird-looking mesh. I also made the constantbuffers[3], srvs[7], and samplers[2] arrays global instead of initializing them in the render loop. That didn't help either.


Quote: "Have you tried the frame analyzer from Visual Studio 2013, or RenderDoc?"

I looked at RenderDoc for a few minutes. My notion was that setting the srvs[7], samplers[2], and constantbuffers[3] arrays every call was killing performance, but it wasn't. However, I do get these D3D11 warning messages from the debug layer:

 

D3D11 WARNING: ID3D11DeviceContext::PSSetShaderResources: Resource being set to PS shader resource slot 5 is still bound on output! Forcing to NULL. [ STATE_SETTING WARNING #7: DEVICE_PSSETSHADERRESOURCES_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets: Resource being set to OM RenderTarget slot 0 is still bound on input! [ STATE_SETTING WARNING #9: DEVICE_OMSETRENDERTARGETS_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::OMSetRenderTargets[AndUnorderedAccessViews]: Forcing PS shader resource slot 5 to NULL. [ STATE_SETTING WARNING #7: DEVICE_PSSETSHADERRESOURCES_HAZARD]
D3D11 WARNING: ID3D11DeviceContext::PSSetShaderResources: Resource being set to PS shader resource slot 5 is still bound on output! Forcing to NULL. [ STATE_SETTING WARNING #7: DEVICE_PSSETSHADERRESOURCES_HAZARD]
The program '[1096] SICStudio.exe' has exited with code 0 (0x0).
 

 

I'm trying to fix them.
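From what I can tell, the fix is to never leave the shadow map bound as a pixel shader input while it is also bound for output. Roughly like this (shadowRTV, shadowDSV, and backbufferRTV are placeholder names for my views):

// Before rendering INTO the shadow map, unbind it from PS input slot 5.
ID3D11ShaderResourceView *nullSRV = nullptr;
device->devcon->PSSetShaderResources(5, 1, &nullSRV);
device->devcon->OMSetRenderTargets(1, &shadowRTV, shadowDSV);
// ... render the shadow pass ...

// Before sampling the shadow map again, unbind it as a render target.
device->devcon->OMSetRenderTargets(1, &backbufferRTV, depthStencilView);
device->devcon->PSSetShaderResources(5, 1, &ShadowMapSRV);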


I disabled the shadow map, and the FPS went up and the D3D11 warning messages went away. So the issue also lies in deferredRendering.cpp.

 

So you set the render targets:

- clear render targets

- render scene to render targets

- reset render target

- reset viewport

- set back to backbuffer and depth stencil

- render scene to default render target

 

How would I normally reset the render target for the shadow map, so the shadow map can then be used as a shader resource input?
