GalacticCrew

DX11 Why does Indexed Drawing slow down when I zoom in?


Hi folks,

I have a problem and could really use some ideas from other professionals! I am developing my video game Galactic Crew, including its own game engine. I am currently working on improved graphics, which includes shadows (I use Shadow Mapping for that). I observed that the game lags when I use shadows, so I started profiling my source code. I used DirectX 11 queries to measure the time my GPU spends on different tasks and to search for bottlenecks. I found several small issues and solved them. As a result, the GPU needs around 10 ms per frame, which is good enough for 60 FPS (1 s / 60 frames ≈ 16.7 ms/frame). See attachment Scene1 for the default view.
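The frame budget arithmetic above can be written out as a tiny sketch. The function names are made up for illustration; the only inputs are the 60 FPS target and the roughly 10 ms of GPU time quoted in the post.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / target_fps


def headroom_ms(gpu_ms: float, target_fps: float = 60.0) -> float:
    """Positive result: the frame fits the budget. Negative: it does not."""
    return frame_budget_ms(target_fps) - gpu_ms


if __name__ == "__main__":
    print(f"60 FPS budget: {frame_budget_ms(60):.2f} ms")
    print(f"headroom at 10 ms GPU time: {headroom_ms(10.0):.2f} ms")
```

With vsync enabled, a frame that misses this budget typically waits for the next refresh, which is why a few extra milliseconds can be so visible.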

However, when I zoom into my scene, it starts to lag. See attachment Scene2 for the zoomed view. I compared the times spent on the GPU for both cases: default view and zoomed view. I found out that the render passes in which I render the full scene take much longer (~11 ms instead of ~2 ms). One of these render stages is the conversion of the depth information to the Shadow Map, and the second one is the final draw of the scene.

So, I added even more GPU profiling to find the exact problem. After several iteration steps, I found this call to be the bottleneck:

if (model.UseInstancing)
    _deviceContext.DrawIndexedInstanced(modelPart.NumberOfIndices, model.NumberOfInstances, 0, 0, 0);
else
    _deviceContext.DrawIndexed(modelPart.NumberOfIndices, 0, 0);

Whenever I render a scene, I iterate through all visible models in the scene, set the proper vertex and pixel shaders for each model and update the constant buffer of the vertex shader (if required). After that, I iterate through all positions of the model (if it does not use instancing) and through all parts of the model. For each model part, I set the texture maps it uses (diffuse, normal, ...), set the vertex and index buffers and finally draw the model part by calling the code above. In one frame, for example, 11.37 ms were spent drawing all models and their parts when I zoomed in. Of these 11.37 ms, 11.35 ms were spent in the draw calls I posted above.
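As a rough sketch of the loop described above, the number of draw calls per pass can be counted like this. The `Model` class and its fields are hypothetical stand-ins, not the engine's real types; the point is only that the count depends on the scene content and not on the zoom level.

```python
from dataclasses import dataclass


@dataclass
class Model:
    parts: int            # model parts, each with its own buffers and textures
    positions: int        # how many times the model appears in the scene
    use_instancing: bool  # True: one DrawIndexedInstanced per part


def count_draw_calls(models) -> int:
    """Count draw calls issued by the loop described in the post."""
    total = 0
    for model in models:
        if model.use_instancing:
            # One instanced draw per part covers every position at once.
            total += model.parts
        else:
            # One DrawIndexed per part, repeated for every position.
            total += model.positions * model.parts
    return total


if __name__ == "__main__":
    scene = [Model(parts=3, positions=10, use_instancing=True),
             Model(parts=2, positions=5, use_instancing=False)]
    print(count_draw_calls(scene))
```

If all models are visible in both views, this number is identical for the default and the zoomed view, so the extra GPU time has to come from the work done inside each draw.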

As a test, I simplified my rather complex pixel shader to a simple function that returns a fixed color, to make sure the pixel shader is not responsible for my performance problem. As it turned out, the GPU time wasn't reduced.

Does anyone of you have any idea what causes my lag, i.e. my long GPU time in the drawing calls? I don't use LOD or anything comparable and I also don't use my BSP scene graph in this scene. It is exactly the same content, but with different zooms. Maybe I missed something very basic. I am grateful for any help!!

Scene1.png

Scene2.png


Have you tried using the Visual Studio Graphics Debugger? I'm not sure why it's slower when you zoom in, but it does make sense that the draw calls are slower: everything you tell the GPU to do before you call draw gets put into a command list and is only executed once you call draw, and the CPU waits for the draw call to finish before it continues executing.

This might be obvious, but do you have a scissor rect specified?

8 hours ago, GalacticCrew said:

Does anyone of you have any idea what causes my lag, i.e. my long GPU time in the drawing calls?

Draw calls (and copies) are the only functions that actually ask the GPU to do drawing work. They should account for 99% of your GPU time.

In the two pictures you posted, one has the entire screen filled with 3D objects, while the other has large amounts of background pixels not covered by 3D geometry. Is it possible that you're simply drawing more pixels?


I haven't tried the graphics debugger of Visual Studio. I will try it out on Thursday (I'll take a day off tomorrow).

As it turned out, I did not use Scissor Rectangles so far. I tested it today by setting it to the same size as my viewport. I did not see any change in the GPU time for the different scenarios.

The background is a set of images drawn with Direct2D. Everything except the background is in 3D. When I zoom in, I draw more pixels in 3D than when I zoom out. Is this a problem? I thought the calls for the vertex shader are the same, because all models are drawn in both cases (zoom in and zoom out). The pixel shader for my tests was set to a simple return of a static color.


Lag is a confusing word to use here. In games it normally refers to delay, such as that due to internet ping, as opposed to a straight-out performance problem, which is what you seem to be describing.

10 hours ago, GalacticCrew said:

When I zoom in, I draw more pixels in 3D than when I zoom out. Is this a problem? I thought the calls for the vertex shader are the same, because all models are drawn in both cases (zoom in and zoom out).

Absolutely, drawing pixels can be, and often is, the bottleneck.
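To make that concrete, here is a back-of-the-envelope sketch of how pixel shader work scales with screen coverage while the vertex work stays the same. The 1920x1080 resolution, the coverage fractions and the function name are assumptions for illustration, not measurements from the game.

```python
def pixel_shader_invocations(width, height, coverage, overdraw=1.0):
    """Rough number of pixel shader runs for one full-scene pass.

    coverage: fraction of the screen covered by 3D geometry (0..1)
    overdraw: average number of shaded layers per covered pixel
    """
    return round(width * height * coverage * overdraw)


if __name__ == "__main__":
    # Assumed values: same models and vertices, different zoom levels.
    zoomed_out = pixel_shader_invocations(1920, 1080, coverage=0.5)
    zoomed_in = pixel_shader_invocations(1920, 1080, coverage=1.0)
    print(zoomed_out, zoomed_in, zoomed_in / zoomed_out)
```

Overdraw usually grows as well when zooming into a dense model, so the real factor can be larger than the coverage ratio alone.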

10 hours ago, GalacticCrew said:

The pixel shader for my tests was set to a simple return of a static color.

This can be helpful, although you were still presumably running the shadow mapping? Have you tried turning off the shadow mapping and repeating the test?

The simplest way to determine whether you are fill-rate bound is usually just to render to a postage-stamp-sized window and compare the frame rate to running full screen (uncapped from vsync, of course).
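The postage stamp test boils down to one comparison, sketched below. The helper name, the 50% threshold and the sample timings are all assumptions chosen for illustration; real measurements should be averaged over many frames with vsync off.

```python
def looks_fill_rate_bound(full_ms: float, small_ms: float,
                          threshold: float = 0.5) -> bool:
    """Heuristic: if shrinking the window removes more than `threshold`
    of the frame time, per-pixel work dominates the frame."""
    saved_fraction = (full_ms - small_ms) / full_ms
    return saved_fraction > threshold


if __name__ == "__main__":
    # Hypothetical timings: full-screen frame vs. tiny-window frame.
    print(looks_fill_rate_bound(full_ms=11.0, small_ms=2.5))   # mostly pixel cost
    print(looks_fill_rate_bound(full_ms=10.0, small_ms=9.5))   # mostly other cost
```

If the frame time barely changes with window size, the bottleneck is more likely vertex processing, draw-call overhead or CPU work.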


The term "lag" is older than internet video games. It simply describes a situation in which there is no fluent movement. It can be caused by different situations. In my case, rendering a frame simply takes too long.

GPU Profiler

I started my game using Microsoft Visual Studio's built-in GPU performance profiler. After the game loaded, I waited in the default view for several seconds before zooming into the scene. After that, I stopped the profiler and checked the numbers. First, I had a look at a 1-second interval when my game was rendered in the default view. I sorted all tasks by their GPU time. The most time-consuming tasks were "GPU Work". The first DrawIndexed and DrawIndexedInstanced calls (the ones that took the most time) used 745,467 ns of GPU time.

Then, I selected a 1-second interval from the time when I had zoomed in. The most time-consuming calls were DrawIndexed and DrawIndexedInstanced with 4,370,433 ns. So it is safe to say that drawing my models is what makes the game slow when I zoom in.
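For comparison with the millisecond figures used elsewhere in this thread, the two profiler readings can be converted and compared as follows. Only the two nanosecond values come from the profiler; the helper name is made up.

```python
def ns_to_ms(nanoseconds: int) -> float:
    """Convert a profiler reading from nanoseconds to milliseconds."""
    return nanoseconds / 1_000_000


default_ms = ns_to_ms(745_467)    # most expensive draw, default view
zoomed_ms = ns_to_ms(4_370_433)   # most expensive draw, zoomed view
slowdown = zoomed_ms / default_ms

if __name__ == "__main__":
    print(f"default: {default_ms:.3f} ms, zoomed: {zoomed_ms:.3f} ms, "
          f"factor: {slowdown:.2f}x")
```

So the most expensive draw call goes from well under 1 ms to more than 4 ms, a bit less than a sixfold increase.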

Scissor Rectangle

I must have missed this topic when I created the foundation of my game engine three years ago. I've read several tutorials and guides about it and added support for Scissor Tests to my game engine. As a first test, I used the same rectangle for scissoring as I use for my viewport. When I zoom in, I get basically the same numbers as I had before I used the Scissor Rectangle. However, when I zoom out into the default view, all numbers are the same except for the SwapChain Present call. It takes around 11 ms instead of the usual 0.5 ms. I use the Present call like this:

_swapChain.Present(1, SharpDX.DXGI.PresentFlags.None);

As a test, I decreased the edge size of the Scissor Rectangle by 50% and placed it in the center of the screen, so only the central 25% of the screen is drawn. I started the game again and checked the numbers. I have the same result for the default view, which makes sense because almost the entire spaceship is in the central area of the screen. When I zoom in, my shadow map rendering stage takes 3.3 ms instead of 11 ms, which is around 28%. This means that rendering too many 3D pixels causes my increased GPU time.
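The scissor experiment can be checked with a short calculation: halving both edges leaves a quarter of the area, and the measured pass time drops to roughly the same fraction. The function names are made up, and the 11 ms figure is the rounded value from the post, so the ratio here is approximate.

```python
def area_fraction(edge_scale: float) -> float:
    """Fraction of the original area left after scaling both edges."""
    return edge_scale * edge_scale


def time_fraction(after_ms: float, before_ms: float) -> float:
    """Fraction of the original pass time that remains."""
    return after_ms / before_ms


if __name__ == "__main__":
    print(f"area kept: {area_fraction(0.5):.0%}")
    print(f"time kept: {time_fraction(3.3, 11.0):.0%}")
```

A time fraction close to the area fraction is what one would expect if the pass cost is dominated by per-pixel work rather than per-vertex work.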

Conclusion

After my tests, I know that drawing my (instanced) models uses most of the GPU time. I also know that using more screen space to render 3D models increases the GPU time. Now I need to figure out how to reduce the GPU time for my draw calls.

Render steps

When rendering my scene, I do the following steps:

  • Get depth info. I render my scene from the point of view of my primary light source into a depth buffer. I will use this buffer for Shadow Mapping. This step takes around 2.2 ms in both scenarios (zoomed in and zoomed out).
  • Get lighting info. I render the scene from the point of view of the player with the depth buffer texture I obtained in the previous step to create a light map. This light map indicates which pixels are colored and which are not. This is the step that is causing my troubles. It takes 2.5 ms when zoomed out, but more than 11 ms when zoomed in.
  • Render scene. I render the scene again and use the light map to create shadows (Shadow Mapping). It takes 1.9 ms when zoomed out and 4 ms when zoomed in.
  • SwapChain.Present. This has become more time-consuming since I started using Scissor Rectangles, as described before.

Open question

When zoomed in, rendering my scene with texturing, lighting, Shadow Mapping, etc. takes about twice as long. However, building the light map (which does not use multi-texturing, lighting, etc.) takes almost 5 times as long. I need to figure out why! The vertex and pixel shaders are not that complicated.
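The per-pass timings listed under "Render steps" can be turned into slowdown factors with a few lines. The pass labels are informal names; the light map value of 11 ms is a lower bound, since the measurement was "more than 11 ms".

```python
# (zoomed out ms, zoomed in ms) per render pass, taken from the list above.
PASS_TIMES_MS = {
    "depth from light": (2.2, 2.2),
    "light map": (2.5, 11.0),   # "more than 11 ms", so a lower bound
    "final scene": (1.9, 4.0),
}


def slowdown(zoomed_out_ms: float, zoomed_in_ms: float) -> float:
    """How many times longer a pass takes after zooming in."""
    return zoomed_in_ms / zoomed_out_ms


if __name__ == "__main__":
    for name, (out_ms, in_ms) in PASS_TIMES_MS.items():
        print(f"{name}: {slowdown(out_ms, in_ms):.1f}x")
```

The pass rendered from the light's point of view does not change at all, which fits the idea that only the passes rendered from the camera, where zooming changes the pixel coverage, get slower.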


Hodgman's question holds: in the zoomed-out shot, only about 50% of the screen has pixels that require shadow depth tests, whereas in the closer shot all rendered pixels require shadow depth testing. It might be best to just show your shader code.


Sure! The code is pretty similar to examples you can find on sites like Rastertek. I moved the creation of a light map into its own render pass to enable operations on the light map like smoothing. However, I need to make it more efficient first.

Vertex shader

cbuffer ConstantBuffer : register(b0)
{
	matrix World;
	matrix View;
	matrix Projection;
	float4 Transparency;
	matrix ReflectionView;
	float4 LightPosition;
	matrix LightViewMatrix;
	matrix LightProjectionMatrix;
	float4 Instancing;
}

struct VS_IN
{
	float4 pos : POSITION;
	float3 nor : NORMAL;
	float3 tan : TANGENT;
	float3 bin : BINORMAL;
	float4 col : COLOR0;
	float4 TextureIndices : COLOR1;
	float2 TextureUV : TEXCOORD0;
	matrix TestMatrix : POSITION1;
};

struct PS_IN
{
	float4 pos : SV_POSITION;
	float3 nor : NORMAL;
	float4 lightViewPosition : TEXCOORD1;
};

PS_IN VS(VS_IN input)
{
	PS_IN output = (PS_IN)0;

	matrix worldMatrix = World;
	if (Instancing.r == 1.0f)
		worldMatrix = input.TestMatrix;

	// Calculate position.
	matrix worldViewProjection = mul(mul(Projection, View), worldMatrix);
	output.pos = mul(worldViewProjection, input.pos);

	// Calculate the position of the vertex as viewed by the light source.
	matrix lightViewProjection = mul(mul(LightProjectionMatrix, LightViewMatrix), worldMatrix);
	output.lightViewPosition = mul(lightViewProjection, input.pos);

	// Calculate the normal vector against the world matrix only and then normalize the final value.
	output.nor = mul((float3x3)worldMatrix, input.nor);
	output.nor = normalize(output.nor);

	return output;
}

Pixel shader

cbuffer ConstantBuffer : register(b0)
{
	float4 Settings;
	float4 CameraDir;
	float4 ViewDir;
	float4 LightPos;
	float4 LightDir;
	float4 LightCol;
}

Texture2DArray ShadowTextures : register(t4);

SamplerState SamplerWrap : register(s0);
SamplerState SamplerClamp : register(s1);

struct PS_IN
{
	float4 pos : SV_POSITION;
	float3 nor : NORMAL;
	float4 lightViewPosition : TEXCOORD1;
};

float4 PS(PS_IN input) : SV_Target
{
	// Set the default output color to the ambient light value for all pixels.
	float4 result = float4(0.2f, 0.2f, 0.2f, 1.0f);

	// Calculate the projected texture coordinates.
	float2 projectTexCoord;
	projectTexCoord.x = input.lightViewPosition.x / input.lightViewPosition.w / 2.0f + 0.5f;
	projectTexCoord.y = -input.lightViewPosition.y / input.lightViewPosition.w / 2.0f + 0.5f;

	// Determine if the projected coordinates are in the 0 to 1 range.  If so then this pixel is in the view of the light.
	if ((saturate(projectTexCoord.x) == projectTexCoord.x) && (saturate(projectTexCoord.y) == projectTexCoord.y))
	{
		// Sample the shadow map depth value from the depth texture using the sampler at the projected texture coordinate location.
		float shadowValue = ShadowTextures.Sample(SamplerClamp, float3(projectTexCoord, 0)).r;

		// Calculate the depth of the light.
		float lightDepthValue = input.lightViewPosition.z / input.lightViewPosition.w;

		// Set the bias value for fixing the floating point precision issues.
		float bias = 0.001f;

		// Subtract the bias from the lightDepthValue.
		lightDepthValue = lightDepthValue - bias;

		// Compare the depth of the shadow map value and the depth of the light to determine whether to shadow or to light this pixel.
		// If the light is in front of the object then light the pixel, if not then shadow this pixel since an object (occluder) is casting a shadow on it.
		if (lightDepthValue < shadowValue)
		{
			// Invert the light direction for calculations.
			// LightDir is a float4; take .xyz explicitly to avoid an implicit truncation.
			float3 lightDir = -LightDir.xyz;
			lightDir = normalize(lightDir);

			// Calculate the amount of light on this pixel based on the interpolated vertex normal.
			float lightIntensity = saturate(dot(input.nor, lightDir));

			// Determine the final diffuse color based on the diffuse color and the amount of light intensity.
			if (lightIntensity > 0.0f)
				result = float4(1.0f, 1.0f, 1.0f, 1.0f);
		}
	}

	return result;
}
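For checking individual values on the CPU, the per-pixel logic of the pixel shader above can be mirrored in a few lines. This is only a reference sketch: the function names are made up, the shadow map lookup is replaced by a plain `shadow_depth` argument, and `n_dot_l` stands in for the dot product of the normal and the inverted light direction.

```python
def project_to_shadow_uv(x: float, y: float, z: float, w: float):
    """Mirror of the shader's projection from light clip space to
    shadow map texture coordinates plus the depth to compare."""
    u = x / w / 2.0 + 0.5
    v = -y / w / 2.0 + 0.5
    depth = z / w
    return u, v, depth


def is_lit(light_view_pos, shadow_depth: float, n_dot_l: float,
           bias: float = 0.001) -> bool:
    """True where the shader would output white, False for ambient."""
    u, v, depth = project_to_shadow_uv(*light_view_pos)
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return False  # outside the light's view: ambient only
    # Lit only if nothing closer to the light is stored in the shadow map
    # and the surface faces the light.
    return (depth - bias) < shadow_depth and n_dot_l > 0.0


if __name__ == "__main__":
    print(is_lit((0.0, 0.0, 0.5, 1.0), shadow_depth=0.6, n_dot_l=0.8))
    print(is_lit((0.0, 0.0, 0.7, 1.0), shadow_depth=0.6, n_dot_l=0.8))
```

Every pixel covered by geometry runs this logic once per shaded layer, including one texture sample when the coordinates are in range.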

 


 

7 hours ago, GalacticCrew said:

It takes 2.5 ms when zoomed out, but more than 11 ms when zoomed in.

You could try a depth-only pre-pass. Then overdrawn pixels would be skipped before the shadow calculation. If this turns out to be a win, replace the light map pass with a deferred approach that uses the depth buffer instead of geometry.
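A toy model shows why a depth pre-pass can help with overdraw. It is a simplified sketch: each pixel gets a list of fragment depths in submission order, the expensive shader runs for every fragment that passes a standard "less than" depth test, and real hardware details such as early-Z granularity are ignored. The names and sample depths are made up.

```python
def shaded_fragments(pixels, use_prepass: bool) -> int:
    """Count expensive pixel shader runs.

    pixels: one list of fragment depths per pixel, in submission order.
    """
    shaded = 0
    for depths in pixels:
        if not depths:
            continue  # background pixel, nothing to shade
        if use_prepass:
            # The depth buffer already holds the nearest depth, so only
            # the fragment matching it survives the depth test.
            shaded += 1
        else:
            nearest = float("inf")
            for depth in depths:
                if depth < nearest:  # passes the depth test, shader runs
                    nearest = depth
                    shaded += 1
    return shaded


if __name__ == "__main__":
    # Three pixels: back-to-front overdraw, front-to-back, and empty.
    pixels = [[0.9, 0.5, 0.2], [0.3, 0.6], []]
    print(shaded_fragments(pixels, use_prepass=False))
    print(shaded_fragments(pixels, use_prepass=True))
```

The pre-pass itself still costs vertex work and depth writes, so whether it pays off depends on how much overdraw the zoomed view really has.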

You could also use just a single float instead of a float4 to store the light map.

 

But I still do not understand the large difference in performance. Perhaps your geometry is too fine, and in the zoomed-out view the GPU is able to skip over many tiny triangles that become smaller than a pixel.


Hi,
I am no expert but,
I am thinking that work done in the pixel shader is expensive since it is done per pixel.
Could it have something to do with the two if statements you have?

My next step would be to comment out the two if statements, always run all the pixel shader code and see what processing time you get then. (I know the graphics will not look as intended, but it could give hints.)

/Kim
