DrawPrimitive CPU load

Started by
6 comments, last by Zoner 11 years, 11 months ago
Hello!

I noticed that drawing tens of thousands of triangles with DrawPrimitive produces a high CPU load. With an AMD X4 955 the number of triangles is limited to ~300,000 at 30 fps.

The triangles are used for terrain and are stored in a vertex buffer which is filled at startup. There are no textures (only color and lighting). The terrain is divided into pieces of 600 vertices (which make 200 triangles) to reduce the computation needed when parts of the terrain change.

Important code:
[spoiler]types:

struct CUSTOMVERTEX {FLOAT X, Y, Z; D3DVECTOR NORMAL; DWORD COLOR;};
#define CUSTOMFVF (D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_DIFFUSE)


init:

d3ddev->SetRenderState(D3DRS_LIGHTING, TRUE);
d3ddev->SetRenderState(D3DRS_ZENABLE, TRUE);
d3ddev->SetRenderState(D3DRS_CULLMODE, D3DCULL_NONE);
d3ddev->SetRenderState(D3DRS_AMBIENT, D3DCOLOR_XRGB(50, 50, 50));
// D3DMCS_COLOR1 has the same value (1) as the original 'true': ambient is taken from the vertex diffuse color
d3ddev->SetRenderState(D3DRS_AMBIENTMATERIALSOURCE, D3DMCS_COLOR1);

CUSTOMVERTEX verticles[10*10*6];
// vertex filling

void* pvoid;
vBuffer->Lock(0, 0, &pvoid, 0);
std::memcpy(pvoid, verticles, countPrimitives*3*sizeof(CUSTOMVERTEX));
vBuffer->Unlock();


render:

// view etc

d3ddev->SetFVF(CUSTOMFVF);
d3ddev->SetStreamSource(0, vBuffer, 0, sizeof(CUSTOMVERTEX));
d3ddev->DrawPrimitive(D3DPT_TRIANGLELIST, 0, countPrimitives);

[/spoiler]


How can I increase the performance? Currently it's not sufficient, especially on slower PCs. Will the behavior change if I use larger or smaller terrain chunks? Thank you.
How many draw calls do you make per frame?

I mean, you show your draw code, and there isn't much to improve there. You should show a bigger part of the logic.
The key is to reduce the number of draw calls. Either use larger chunks or some form of geometry instancing.

Also, can you verify the CPU load while the program is running? (Checking in Task Manager gives you a rough indication of whether you are CPU bound.)

It seems that you push roughly 9 million triangles per second, which is quite low for 2012.

Best regards!
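
The figures in the reply above can be checked with a few lines of arithmetic. This is only a rough sketch: the function names are made up for illustration, the 200 triangles per chunk come from the original post, and it assumes every chunk is drawn with its own DrawPrimitive call.

```cpp
#include <cassert>
#include <cstdint>

// Triangles submitted per second at a given frame rate.
inline std::uint64_t trianglesPerSecond(std::uint64_t trianglesPerFrame,
                                        std::uint64_t framesPerSecond) {
    return trianglesPerFrame * framesPerSecond;
}

// Draw calls per frame when each chunk gets its own draw call (rounded up).
inline std::uint64_t drawCallsPerFrame(std::uint64_t trianglesPerFrame,
                                       std::uint64_t trianglesPerChunk) {
    assert(trianglesPerChunk > 0); // a chunk must contain at least one triangle
    return (trianglesPerFrame + trianglesPerChunk - 1) / trianglesPerChunk;
}
```

With 300,000 triangles per frame at 30 fps, trianglesPerSecond gives 9,000,000, and drawCallsPerFrame(300000, 200) gives 1,500 draw calls per frame. A hypothetical chunk ten times larger (2,000 triangles) would bring that down to 150.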
One thing to note is that DrawIndexedPrimitive() is significantly faster than DrawPrimitive() when rendering larger numbers of primitives because the hardware vertex cache doesn't work for non-indexed primitives.
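
To make the indexed variant more concrete, here is a minimal sketch of how the indices for one terrain chunk could be generated. It assumes a chunk is a 10 x 10 grid of quads (the 10*10*6 array size in the original post suggests this, but it is a guess), that vertices are stored row by row, and that neighbouring triangles can share a vertex (same position, normal and color). buildGridIndices is a hypothetical helper, not part of D3D.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds 16-bit triangle-list indices for a grid of cellsX * cellsZ quads.
// Vertices are assumed to be laid out row by row, (cellsX + 1) per row.
std::vector<std::uint16_t> buildGridIndices(unsigned cellsX, unsigned cellsZ) {
    const unsigned vertsPerRow = cellsX + 1;
    // 16-bit indices can address at most 65536 vertices.
    assert(vertsPerRow * (cellsZ + 1) <= 65536u);

    std::vector<std::uint16_t> indices;
    indices.reserve(static_cast<std::size_t>(cellsX) * cellsZ * 6);
    for (unsigned z = 0; z < cellsZ; ++z) {
        for (unsigned x = 0; x < cellsX; ++x) {
            const std::uint16_t i0 = static_cast<std::uint16_t>(z * vertsPerRow + x); // this row, left
            const std::uint16_t i1 = static_cast<std::uint16_t>(i0 + 1);              // this row, right
            const std::uint16_t i2 = static_cast<std::uint16_t>(i0 + vertsPerRow);    // next row, left
            const std::uint16_t i3 = static_cast<std::uint16_t>(i2 + 1);              // next row, right
            // Two triangles per quad; corners are shared through the indices.
            indices.insert(indices.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return indices;
}
```

For a 10 x 10 chunk this yields 600 indices that reference only 11 * 11 = 121 unique vertices instead of 600. In D3D9 the array would be copied into an index buffer created with D3DFMT_INDEX16, bound with SetIndices, and drawn with something like DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, 121, 0, 200). If each triangle needs its own flat normal, vertices cannot be shared this way and the saving is smaller.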
Seems odd. When creating your VB, what usage flag do you set? Do you update the VB every frame? I'd also switch to DrawIndexedPrimitive(), as suggested in a previous post.

With an AMD X4 955 the number of triangles is limited to ~300,000 at 30 fps.

The triangles are used for terrain and are stored in a vertex buffer which is filled at startup. There are no textures (only color and lighting). The terrain is divided into pieces of 600 vertices (which make 200 triangles) to reduce the computation needed when parts of the terrain change.
300,000 triangles : 200 triangles per batch = 1,500 batches / frame. I think you need to at least halve them. At least.
Rendering fewer than 1000 unique vertices per batch is a waste on anything that is at least a GeForce 4 - it will never reach optimal performance. Even in GL - which is far more efficient than D3D9 at dispatching batches - there is going to be a performance loss.

Previously "Krohm"
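
As a rough illustration of the "1000 unique vertices per batch" rule of thumb, the sketch below estimates how large a square, indexed chunk would have to be. The helper names are invented for this example, and it assumes a chunk is a regular grid of quads whose corner vertices are shared through an index buffer.

```cpp
#include <cassert>

// Unique vertices in an indexed grid of cells x cells quads.
unsigned uniqueVertices(unsigned cells) {
    return (cells + 1) * (cells + 1);
}

// Triangles in the same grid (two per quad).
unsigned trianglesInChunk(unsigned cells) {
    return cells * cells * 2;
}

// Smallest square grid (quads per side) with at least minVertices unique vertices.
unsigned minCellsPerSide(unsigned minVertices) {
    assert(minVertices <= 65536u); // stay within the range of 16-bit indices
    unsigned cells = 1;
    while (uniqueVertices(cells) < minVertices) {
        ++cells;
    }
    return cells;
}
```

Under these assumptions minCellsPerSide(1000) is 31, so a 31 x 31 chunk has 1,024 unique vertices and 1,922 triangles. The 10 x 10 chunks from the original post would have only 121 unique vertices each if they were indexed.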

Reducing the number of DrawPrimitive calls helped.

But what helped a lot more was switching from D3DCREATE_SOFTWARE_VERTEXPROCESSING to D3DCREATE_HARDWARE_VERTEXPROCESSING.


Thank you for the help!

But what helped a lot more was switching from D3DCREATE_SOFTWARE_VERTEXPROCESSING to D3DCREATE_HARDWARE_VERTEXPROCESSING.


Well, that definitely has something to do with the poor performance ...
Avoiding API calls that set the same state as the previous draw call is pretty important in D3D8 and D3D9. Basically, D3D keeps a set of dirty flags for any state changes made since the last draw call, and the Draw functions look at what is dirty and blindly do the additional processing. D3D generally does not check whether things really changed, since the application is better placed to do that at a higher level and earlier, and calling D3D itself has a lot of overhead that is good to avoid. As a result, unnecessarily setting things like the FVF or vertex declaration, textures, or buffers has a pretty large impact on performance.
http://www.gearboxsoftware.com/
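
One common way to follow the advice above is to filter redundant state changes on the application side. The sketch below is a hypothetical wrapper, not a D3D API: RenderStateCache and its Setter callback are made-up names, and the callback stands in for a call such as d3ddev->SetRenderState(state, value). The same idea could be applied to SetFVF and SetStreamSource from the posted render code.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Forwards a render-state change only when the value differs from the last one sent.
class RenderStateCache {
public:
    using Setter = std::function<void(std::uint32_t state, std::uint32_t value)>;

    explicit RenderStateCache(Setter setter) : setter_(std::move(setter)) {
        assert(setter_ != nullptr); // a callback that reaches the device is required
    }

    // Returns true if the call was forwarded, false if it was filtered out as redundant.
    bool set(std::uint32_t state, std::uint32_t value) {
        const auto it = last_.find(state);
        if (it != last_.end() && it->second == value) {
            return false;
        }
        last_[state] = value;
        setter_(state, value);
        return true;
    }

    // Forget all cached values, e.g. after a device reset, when they can no longer be trusted.
    void invalidate() { last_.clear(); }

private:
    Setter setter_;
    std::unordered_map<std::uint32_t, std::uint32_t> last_;
};
```

Setting the same state twice in a row reaches the device only once; the second call returns false without touching the API.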
