# DrawIndexedPrimitives takes 20 to 40 FPS to draw

Hello,

I'm trying to display a terrain made of 512*512 vertices in my XNA game.

On my laptop, without the terrain, I have 60 FPS. When I display it, I'm running between 25 and 40 FPS.

I'm drawing the terrain twice: once for the camera, and once for the water reflection (I have an ocean).

I have optimized my shaders as much as I can, and I can tell you the lag doesn't come from there.

My draw function consists of just this call:

GraphicsDevice.DrawIndexedPrimitives(PrimitiveType.TriangleList, 0, 0, numberVertices, 0, numberIndices / 3);

I've seen a lot of people drawing millions of faces in their games. I "only" have 524288 vertices to draw...

What could I do to optimize this function and get some more FPS?

Thank you!

##### Share on other sites

I have optimized my shaders as much as I can, and I can tell you the lag doesn't come from there.

If that's an assumption I suggest you use a profiler like Intel GPA to check it.

##### Share on other sites

No, it's not an assumption. I tried removing them and the problem didn't go away.

##### Share on other sites

There are so many things that could be causing this. With the amount of code you've posted, the best you're going to get are guesses, so here are a few to start with:

• Try lower resolution textures.
• Try drawing without your water reflection.
• Are you reading anything back from the GPU?  (Perhaps for that reflection?)
• Are you using Sleep calls to control framerate?

It's highly unlikely that the DrawIndexedPrimitives call itself is the culprit. Instead, there's something running slow that the call depends on, or something that depends on the call and must wait for it to fully finish before it can start, or something bad in your main loop that's being highlighted by the call. That's the kind of thing you need to look for.

Edited by mhagain

##### Share on other sites

When you have trouble determining the root cause of why <x> is slowing your game down, you need to profile it. That is, use one of the various tools out there to get the numbers behind all of the DirectX calls you are making and determine what exactly is at fault. I would recommend the Intel Graphics Performance Analyzer (it's free), as mentioned above. I have just started using it and I can tell you it is amazing: once you capture a frame for analysis, it displays all of the render calls grouped by the render target they operate on. It lets you edit the shader code live and perform experiments such as swapping in a simple pixel shader or using 2x2 input textures. It even lets you edit the render states bound to each section of the pipeline.

##### Share on other sites

I guess you're not culling anything?

##### Share on other sites

No I'm not culling anything.

RnaodmBiT, I used your profiler, and like nVidia's, it tells me DrawIndexedPrimitives is taking all the GPU resources.

So I really don't know where it comes from...

##### Share on other sites

Are you recreating and/or resending data between draw calls, rather than once up front?

##### Share on other sites

RnaodmBiT, I used your profiler, and like nVidia's, it tells me DrawIndexedPrimitives is taking all the GPU resources.

I'm not sure about nVidia's tool, but Intel GPA should tell you whether the time is spent in the vertex shader or the pixel shader. It should also let you perform experiments to see what happens if you swap things out with a 1x1 texture (to test whether the texture cache is the bottleneck), for instance. You might need an Intel GPU for some of the more advanced features to work.

You can also do this manually of course. For instance, replace your terrain pixel shader with a dummy one that just outputs a color. How does that affect your frame time?

All we can do on this forum is make wild guesses for you.

Edited by phil_t

##### Share on other sites

512*512 is actually a lot of vertices. That translates to 262,144 vertices, which forces you to use 32-bit indices.

Use instancing to draw four chunks of 256x256 verts (which allows 16-bit indices) in one call.
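
To make the arithmetic concrete, here is a minimal sketch in plain C# (no XNA types; the class and method names are made up for illustration). It only checks whether a square vertex grid can be addressed with 16-bit indices, assuming every index value from 0 to 65535 is usable.

```csharp
using System;

static class IndexWidth
{
    // A 16-bit index can address vertices 0..65535, i.e. at most 65536 vertices.
    const int MaxVerticesFor16Bit = ushort.MaxValue + 1;

    // Hypothetical helper: true when a side x side grid fits in 16-bit indices.
    public static bool FitsIn16Bit(int verticesPerSide)
    {
        return verticesPerSide * verticesPerSide <= MaxVerticesFor16Bit;
    }

    static void Main()
    {
        Console.WriteLine(512 * 512);          // 262144
        Console.WriteLine(FitsIn16Bit(512));   // False
        Console.WriteLine(FitsIn16Bit(256));   // True
    }
}
```

A 256x256 chunk has exactly 65536 vertices, so it sits right at the limit of what a 16-bit index can reach.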

Furthermore, if your terrain vertices are generated in a random order, the mesh could be cache-unfriendly.

Also, if you're updating your terrain vertices once per frame (especially with a buffer created with the wrong flags, i.e. static vs dynamic), that could be causing the hit. When you remove the DrawIndexedPrimitives call you notice the performance goes up because the driver may be skipping your CPU->GPU terrain vertex uploads entirely.

It could also be that once the draw call is in place, your bad uploads cause the driver to stall (forcing the CPU to wait for the GPU to complete).

Finally, you don't say that a car takes 100 km/h to arrive at its destination, but rather that it took an hour.

Same here: measure your frames as time taken in milliseconds, not in frames per second.
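
As a rough illustration, the figures from the first post (60 FPS without the terrain, 25 to 40 FPS with it) can be converted to frame times like this. This is a plain C# sketch with made-up names, and it assumes the 60 FPS baseline is a real measurement rather than a vsync cap.

```csharp
using System;

static class FrameTime
{
    // Milliseconds spent on one frame at the given frame rate.
    public static double Milliseconds(double fps)
    {
        return 1000.0 / fps;
    }

    static void Main()
    {
        double baseline = Milliseconds(60);   // about 16.7 ms without the terrain
        double best = Milliseconds(40);       // 25 ms
        double worst = Milliseconds(25);      // 40 ms

        // Extra time per frame attributable to drawing the terrain.
        Console.WriteLine("{0:F1} ms to {1:F1} ms", best - baseline, worst - baseline);
    }
}
```

Under that assumption the terrain costs roughly 8 ms to 23 ms per frame, which is a number you can compare directly against a profiler's per-call timings.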

##### Share on other sites

Thank you very much for this idea!

I tried to create chunks for my terrain, but it doesn't render exactly the way I want. It works pretty well, except for one thing: there are lines between the chunks...

Screen:

http://puu.sh/6MD7M.jpg

Code:

http://pastebin.com/i8i0ajn9

I've been on it for 7 hours and I just can't find the cause. Help!

I can see that it's not coming from the distance between the chunks, but from missing vertices.

Edited by LemonBiscuit

##### Share on other sites

How large are your chunks? If you're using 32x32 chunks, for instance, you'll need each chunk to have 33x33 vertices.
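
A small sketch of that off-by-one, in plain C# with hypothetical names: a chunk of N x N quads needs N + 1 vertices per side, and neighboring chunks advance by N so that they share their border vertices.

```csharp
using System;

static class ChunkLayout
{
    // A chunk of N x N quads needs N + 1 vertices along each side.
    public static int VerticesPerSide(int cellsPerSide)
    {
        return cellsPerSide + 1;
    }

    // Height-map column (or row) of the first vertex of a chunk.
    // Chunks advance by cellsPerSide, not by VerticesPerSide, so the last
    // vertex of one chunk coincides with the first vertex of the next.
    public static int FirstVertex(int chunkIndex, int cellsPerSide)
    {
        return chunkIndex * cellsPerSide;
    }

    public static int LastVertex(int chunkIndex, int cellsPerSide)
    {
        return FirstVertex(chunkIndex, cellsPerSide) + cellsPerSide;
    }

    static void Main()
    {
        const int cells = 32;
        Console.WriteLine(VerticesPerSide(cells));   // 33
        Console.WriteLine(LastVertex(0, cells));     // 32
        Console.WriteLine(FirstVertex(1, cells));    // 32, shared with chunk 0
    }
}
```

If chunk 1 started at vertex 33 instead, the quad strip between columns 32 and 33 would never be drawn, which would show up as a line between chunks.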

Another option - instead of having to deal with discrete chunks - is to instead have a fixed grid that is just large enough to cover the viewable area, and which moves along with the camera. Then (assuming a heightmap texture) use vertex texture fetch, and calculate the height of the vertex in the vertex shader, sampling from the appropriate point in the heightmap.

##### Share on other sites

I did set 33x33 vertices, but I guess the problem comes from the getHeights() function.

As you can see, I loop to length-1 and width-1; otherwise it crashes (overflow).

What should I do?

##### Share on other sites

As you can see, I loop to length-1 and width-1; otherwise it crashes (overflow).

Yeah, that's probably the problem. You need length X width heights in each chunk's height data, not (length - 1) X (width - 1).

So of course if your entire map is 512x512 (composed of 32x32 chunks, which is 33x33 vertices), then you'll need 513x513 height points. You can just "clamp" the last row/column of your height map to accommodate this. So loop to length/width, and then this:

    // Get color value (0 - 255)
    float amt = heightMapData[(+ offsetX) * oWidth + x + offsetY].R;

becomes

    // Get color value (0 - 255)
    float amt = heightMapData[Math.Min(+ offsetX, oHeight - 1) * oWidth + Math.Min(x + offsetY, oWidth - 1)].R;
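
The same clamping idea as a self-contained sketch, in plain C# with a float array standing in for the height map's color data. The helper name and the tiny 2x2 map are made up for illustration; only the clamp-to-last-row/column behavior comes from the snippet above.

```csharp
using System;

static class ClampedHeights
{
    // Reads a height from a row-major map, clamping out-of-range coordinates
    // to the last row or column instead of reading past the array.
    public static float Sample(float[] heights, int width, int height, int x, int y)
    {
        int cx = Math.Min(Math.Max(x, 0), width - 1);
        int cy = Math.Min(Math.Max(y, 0), height - 1);
        return heights[cy * width + cx];
    }

    static void Main()
    {
        // Tiny 2x2 map standing in for the 512x512 one.
        float[] map = { 1f, 2f,
                        3f, 4f };

        // A 3x3 vertex grid needs x and y up to 2, one past the map's edge.
        Console.WriteLine(Sample(map, 2, 2, 2, 0));   // 2 (repeats the last column)
        Console.WriteLine(Sample(map, 2, 2, 2, 2));   // 4 (repeats the corner)
    }
}
```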

Also note that you don't need a separate index buffer for each chunk. Every chunk's index buffer will be identical, so you just need one.
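
For illustration, here is one way such a shared index list could be generated, as a plain C# sketch. The names are hypothetical and the winding order is an assumption, so flip it if your cull mode needs the opposite. The result depends only on the chunk size, which is why one buffer can serve every chunk.

```csharp
using System;

static class SharedIndices
{
    // Builds a triangle-list index array for a grid of cells x cells quads.
    // Indices are relative to the chunk's own first vertex.
    public static ushort[] Build(int cells)
    {
        int side = cells + 1;                       // vertices per side
        ushort[] indices = new ushort[cells * cells * 6];
        int i = 0;
        for (int y = 0; y < cells; y++)
        {
            for (int x = 0; x < cells; x++)
            {
                ushort topLeft = (ushort)(y * side + x);
                ushort topRight = (ushort)(topLeft + 1);
                ushort bottomLeft = (ushort)(topLeft + side);
                ushort bottomRight = (ushort)(bottomLeft + 1);

                // Two triangles per quad (winding order is an assumption).
                indices[i++] = topLeft;
                indices[i++] = bottomLeft;
                indices[i++] = topRight;
                indices[i++] = topRight;
                indices[i++] = bottomLeft;
                indices[i++] = bottomRight;
            }
        }
        return indices;
    }

    static void Main()
    {
        ushort[] indices = Build(32);
        Console.WriteLine(indices.Length);        // 6144
        Console.WriteLine(indices.Length / 3);    // 2048 triangles per chunk
    }
}
```

With 33x33 = 1089 vertices per chunk, the largest index is 1088, comfortably inside the 16-bit range.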

Edited by phil_t

##### Share on other sites

Aaaaand it works! Thank you very much for your help!

I think I'll manage to get all the things working the way I want now :)

##### Share on other sites

I'm still stuck on one point and I need your help again.

I managed to divide my weightmap (colors associated with textures) between the chunks, but I still have a tiling offset:

http://i.imgur.com/NLmNczt.png

(It is the separation between two chunks)

Do you know how I could fix it?

Thank you

Edited by LemonBiscuit

##### Share on other sites

Is one of your sampler states not set to wrap or something? I can't tell you how to fix it, but I can tell you how to start diagnosing it. Just have your shader directly output one of your 5 input textures and see which ones have that discontinuity.

##### Share on other sites

Here are my samplers:

http://pastebin.com/0nCE4nzL

I tried removing them one by one, but nothing seems to change.

I have to say I followed a tutorial for this part of texturing (as I said before, I'm a newbie with HLSL :( )

##### Share on other sites

If you just output your weightmap, it still has that obvious discontinuity? Then your UV coordinates are wrong.
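
One possible cause, which the thread does not confirm, is that each chunk computes its weightmap UVs from chunk-local positions. A minimal sketch of the alternative, in plain C# with hypothetical names, assuming the weightmap spans the whole terrain from 0 to 1: derive each coordinate from the vertex's global position, so both sides of a chunk border agree.

```csharp
using System;

static class ChunkUv
{
    // Weight-map coordinate (0..1 over the whole terrain) of a chunk-local vertex.
    public static float Global(int chunkIndex, int localVertex, int cellsPerChunk, int totalCells)
    {
        int globalVertex = chunkIndex * cellsPerChunk + localVertex;
        return (float)globalVertex / totalCells;
    }

    static void Main()
    {
        const int cellsPerChunk = 32;
        const int totalCells = 512;

        // The last vertex of chunk 0 and the first vertex of chunk 1 are the
        // same point, so both calls yield 32 / 512 = 0.0625.
        Console.WriteLine(Global(0, 32, cellsPerChunk, totalCells));
        Console.WriteLine(Global(1, 0, cellsPerChunk, totalCells));
    }
}
```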

##### Share on other sites

I can't get them right, but anyway, now even the normals along the chunk edges are incorrect (I would have to get vertices from neighboring chunks to compute the normals, etc.). It's getting a little too complex for the small optimization I want to make.

I have another idea for optimizing this.

I generate a few bounding boxes all over my map (16 or 32, for instance).

Then, each frame, I check which boxes are in view and adapt the number of vertices to draw.

For instance:

If the bounding box containing the last vertices of my terrain is not in view, then the number of primitives to draw is the total number of vertices minus the number of vertices contained in that box.

Same thing for the first ones.

Edited by LemonBiscuit

##### Share on other sites

I can't get them right, but anyway, now even the normals along the chunk edges are incorrect (I would have to get vertices from neighboring chunks to compute the normals, etc.). It's getting a little too complex for the small optimization I want to make.

Yeah, you need to use vertices outside the chunk to calculate normals. But you can do this offline and make it part of your heightmap data, so it doesn't really complicate your chunk algorithm too much.
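
A sketch of what that offline step might look like, in plain C# with hypothetical names. It computes a central-difference normal from the full height map, so a vertex on a chunk border automatically uses neighbors that belong to the adjacent chunk. It assumes Y is up and a uniform cell size; at the outer terrain edge the neighbor lookup is clamped, which makes those normals slightly less accurate.

```csharp
using System;

static class TerrainNormals
{
    // Clamped lookup into the full (not per-chunk) height map.
    static float H(float[] heights, int width, int height, int x, int z)
    {
        x = Math.Min(Math.Max(x, 0), width - 1);
        z = Math.Min(Math.Max(z, 0), height - 1);
        return heights[z * width + x];
    }

    // Central-difference normal (Y up) at (x, z), returned as { nx, ny, nz }.
    public static float[] NormalAt(float[] heights, int width, int height,
                                   int x, int z, float cellSize)
    {
        float nx = H(heights, width, height, x - 1, z) - H(heights, width, height, x + 1, z);
        float ny = 2f * cellSize;
        float nz = H(heights, width, height, x, z - 1) - H(heights, width, height, x, z + 1);
        float length = (float)Math.Sqrt(nx * nx + ny * ny + nz * nz);
        return new[] { nx / length, ny / length, nz / length };
    }

    static void Main()
    {
        // 3x3 map rising by 1 per step along X and flat along Z.
        float[] map = { 0f, 1f, 2f,
                        0f, 1f, 2f,
                        0f, 1f, 2f };
        float[] n = NormalAt(map, 3, 3, 1, 1, 1f);

        // The normal leans toward -X because the slope rises toward +X.
        Console.WriteLine("{0} {1} {2}", n[0], n[1], n[2]);
    }
}
```

The resulting normals can then be stored alongside the heights, and each chunk just copies the values for its own vertices.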

If the bounding box containing the last vertices of my terrain is not in view, then the number of primitives to draw is the total number of vertices minus the number of vertices contained in that box.

I don't quite understand how this would work. Like, how you would order your vertices in order to make this work.

Another alternative, which I suggested before, is to have a terrain grid just large enough to cover the area viewable by the camera, and move it with the camera. This is what I used in my game engine, and in terms of managing stuff on the CPU, it's extremely straightforward. I just have a single vertex and index buffer, and I set a world matrix to offset the terrain grid by the right amount when drawing. Height calculation is done by sampling the heightmap in the vertex shader. But I think it works best if you have a camera that doesn't allow a great variety in viewing angles (otherwise you'll have to account for different numbers of vertices being seen).

##### Share on other sites

Well here's how I do it:

I generate a certain amount of bounding boxes, each containing a certain amount of vertices.

They are generated in the same order as the vertices:

The first bounding box contains the first 1000 vertices, etc.

Then, for each draw, I check which bounding box is the first in view and which is the last.

Example: if the first one is not in view but the second one is, then I start rendering from the 1001st vertex.

If the last bounding box is not in view but the one before it is, then I skip rendering the last 1000 vertices.

This is not a huge optimization at all, as it won't work for many camera angles, but it was the easiest and fastest way I found, and the deadline for my project is pretty soon :( I may get back to this on another project anyway.
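
The scheme described above could be sketched like this in plain C#. The names are hypothetical, the per-box visibility flags are assumed to come from a frustum test done elsewhere, and the 1000-vertex box size is the example figure from this post.

```csharp
using System;

static class VisibleRange
{
    // Finds the contiguous span of boxes from the first visible one to the
    // last visible one. Returns false when nothing is visible.
    public static bool TryGet(bool[] boxVisible, int verticesPerBox,
                              out int firstVertex, out int vertexCount)
    {
        int first = Array.IndexOf(boxVisible, true);
        int last = Array.LastIndexOf(boxVisible, true);
        if (first < 0)
        {
            firstVertex = 0;
            vertexCount = 0;
            return false;
        }
        firstVertex = first * verticesPerBox;
        vertexCount = (last - first + 1) * verticesPerBox;
        return true;
    }

    static void Main()
    {
        // Four boxes of 1000 vertices; the first and last are out of view.
        bool[] visible = { false, true, true, false };
        int start, count;
        if (TryGet(visible, 1000, out start, out count))
        {
            Console.WriteLine(start);   // 1000
            Console.WriteLine(count);   // 2000
        }
    }
}
```

Note that any hidden box lying between the first and last visible ones is still drawn, which is why this only trims the two ends of the vertex range.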

But I didn't really get your other alternative..

##### Share on other sites

This is not a huge optimization at all, as it won't work for many camera angles, but it was the easiest and fastest way I found, and the deadline for my project is pretty soon. I may get back to this on another project anyway.

So each bounding box contains a grid of vertices? And they are all the same size? And they are arranged top to bottom, left to right or something?

But I didn't really get your other alternative..

Suppose the largest patch of terrain you ever see at one time in your world is 200 X 200. Make a 200 X 200 terrain vertex grid, positioned at (-100, -100) to (100, 100), say (assuming each grid square corresponds to one world unit). So now, if your camera is centered looking at (173, 192), draw the grid centered at that location (So (73, 92) to (273, 292)).

This means you just need to add an offset to your vertex positions in your vertex shader (to move the grid so it's centered where the camera is looking). And of course you can't associate a height with your grid vertices, since the same grid is used to render any patch of terrain. So instead you sample the height from your height map in your vertex shader.
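
Here is a CPU-side sketch of that idea in plain C#, using the numbers from the example above. The names are hypothetical, rounding the offset to whole cells is an assumption of this sketch, and the array lookup merely stands in for the vertex shader's height-map fetch.

```csharp
using System;

static class FollowGrid
{
    // World-space offset to apply to a grid built around the origin.
    // Rounding to whole cells keeps grid vertices on height-map samples
    // (an assumption of this sketch, not something stated in the post).
    public static void Offset(float lookAtX, float lookAtZ, out int offsetX, out int offsetZ)
    {
        offsetX = (int)Math.Round(lookAtX);
        offsetZ = (int)Math.Round(lookAtZ);
    }

    // Height of a grid vertex: move it by the offset, then look up the map.
    // Coordinates outside the map are clamped to its border.
    public static float HeightForGridVertex(float[] heights, int mapWidth, int mapHeight,
                                            int localX, int localZ, int offsetX, int offsetZ)
    {
        int x = Math.Min(Math.Max(localX + offsetX, 0), mapWidth - 1);
        int z = Math.Min(Math.Max(localZ + offsetZ, 0), mapHeight - 1);
        return heights[z * mapWidth + x];
    }

    static void Main()
    {
        int ox, oz;
        Offset(173f, 192f, out ox, out oz);

        // Grid corners (-100, -100) and (100, 100) after the offset.
        Console.WriteLine("({0}, {1}) to ({2}, {3})", -100 + ox, -100 + oz, 100 + ox, 100 + oz);
        // (73, 92) to (273, 292)
    }
}
```

The vertex and index buffers never change; only the offset does, once per frame.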

I render all my terrain with a single 97x97 grid (so 9409 vertices):

If I zoom out, you can see the patch:

##### Share on other sites

That looks pretty nice, but my game is an FPS, so the viewing angles will vary a lot, and I'd need to modify the grid every time (as you stated before).

Btw, nice render!

Edited by LemonBiscuit

##### Share on other sites

Hey, I'm back again.

The problem comes from the tex2D calls in my shader.

    float div = clamp(0.2*clamp(0.7f*(input.Depth), 1, 400), 1, 400);
    if(div < 10)
    {
        rTex = tex2D(RTextureSampler, input.UV * TextureTiling) / div;
        gTex = tex2D(GTextureSampler, input.UV * TextureTiling) / div;
        bTex = tex2D(BTextureSampler, input.UV * TextureTiling) / div;
        base = tex2D(BaseTextureSampler, input.UV * TextureTiling) / div;
    }

    float clamp2 = clamp(0.01f*input.Depth, 0, 1);
    float3 rTex2 = tex2D(RTextureSampler, input.UV * 0.1) * clamp2;
    float3 gTex2 = tex2D(GTextureSampler, input.UV * 0.1) * clamp2;
    float3 bTex2 = tex2D(BTextureSampler, input.UV * 0.1) * clamp2;
    float3 base2 = tex2D(BaseTextureSampler, input.UV * 0.1) * clamp2;

    float3 weightMap = tex2D(WeightMapSampler, input.UV);

    float3 output = clamp(1.0f - weightMap.r - weightMap.g - weightMap.b, 0, 1)
        * base
        + base2 + weightMap.r * rTex2 + weightMap.g * gTex2 + weightMap.b * bTex2
        + weightMap.r * rTex + weightMap.g * gTex + weightMap.b * bTex;


As my terrain is pretty big, I only draw highly tiled textures around the player (div < 10). For far textures, I draw only 1 texture all over the map.
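
To see which depths actually enter the if block, the "div" expression can be evaluated on the CPU. This is a plain C# sketch that mirrors the shader line above (the class name and the sample depths are made up), not a replacement for profiling the shader.

```csharp
using System;

static class DetailFade
{
    static float Clamp(float v, float min, float max)
    {
        return Math.Max(min, Math.Min(max, v));
    }

    // Same expression as the shader's "div", evaluated on the CPU.
    public static float Div(float depth)
    {
        return Clamp(0.2f * Clamp(0.7f * depth, 1f, 400f), 1f, 400f);
    }

    static void Main()
    {
        foreach (float depth in new[] { 5f, 50f, 100f })
        {
            float div = Div(depth);
            Console.WriteLine("depth {0}: div = {1}, detail textures sampled: {2}",
                              depth, div, div < 10f);
        }
    }
}
```

Since div works out to 0.14 * depth once it exceeds 1, the detail branch is taken for depths below roughly 71 units, and div stays at 1 (no fading) up to a depth of about 7.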

If I remove the if block above, I gain almost 30 FPS (~20 FPS to ~50 FPS).

Here are my samplers:

http://pastebin.com/nxJNeyXG

Why does tex2D take so much GPU time?