RobMaddison

DrawIndexedPrimitive spike


This may be a rhetorical question, but DrawIndexedPrimitive is synchronous when called from C++, isn't it? I'm assuming it is, but some strange behaviour is making me think maybe it's not. If I start a high-performance timer before a bunch of DrawIndexedPrimitive calls (around 25 or so) and take the reading when the calls have returned, does that tell me how long, in milliseconds, my rendering actually takes? I'm running in optimized release mode using DX9 and unmanaged C++.

Without going into too much detail, my view is drawing a terrain which is held in around 25 vertex buffers (60 bytes per vertex, 37249 vertices per buffer). The terrain is drawn using simple filled tris (no textures) and everything is indexed. There are around 288 tris per vertex buffer, so around 7200 tris in total for the entire draw set. I know this can [and will] be optimized, but as I haven't implemented the different LODs yet, I took the average LOD for the whole view.

The issue I'm seeing is a sporadic spike in the time it takes to draw the terrain (25 DrawIndexedPrimitive calls). Mostly it's around 0.5-1ms, but every second or so it jumps up to between 10 and 30ms. I haven't put NVPerfHUD on it yet, that's my next task, but I thought I'd throw it out there to see if anyone can think of anything or has seen similar things happen.

(I'm running on a DELL Latitude D820 with 2 GB RAM, and a NV250 Quadro 120M with 250 MB of VRAM, which I think may literally be 128 MB; apparently some kind of marketing ploy.)

Thanks for any help/suggestions.

DrawIndexedPrimitive is asynchronous. However, it takes a reasonably long time to make the call. DrawIndexedPrimitive makes a transition to kernel mode, I believe, which is a pretty expensive thing to do. It then pushes the data into a buffer for the video card driver to handle, and returns. The video card then reads batches out of the buffer and renders them.

Timing the duration of individual D3D calls is pointless, because they're mostly async. NVPerfHUD and/or PIX should help point out what the problem is.
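To make the point about asynchronous submission concrete, here is a minimal toy model, not the Direct3D API. ToyDevice, draw(), flush() and executedBeforeFlush() are hypothetical names invented for this sketch. It only assumes what the post describes: a draw call records work in a buffer and returns, and the work is carried out later.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a driver command buffer. This is NOT the Direct3D API;
// ToyDevice, draw() and flush() are hypothetical names used for illustration.
class ToyDevice {
public:
    // Models a draw call: record the work and return immediately.
    void draw(std::function<void()> work) {
        pending_.push_back(std::move(work));
    }

    // Models the later point at which the queued work is really carried out.
    void flush() {
        for (auto& work : pending_) {
            work();
        }
        pending_.clear();
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

// Submits `calls` draws and returns how many had executed by the time
// submission returned. A timer stopped at that point would only cover
// the cost of queueing, not the cost of the work itself.
inline int executedBeforeFlush(ToyDevice& device, int calls, int& executed) {
    assert(calls >= 0);
    for (int i = 0; i < calls; ++i) {
        device.draw([&executed] { ++executed; });
    }
    return executed;
}
```

In this model, a timer wrapped around the 25 submissions sees none of the queued work, which is why a stopwatch around the draw calls says little about rendering cost.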


How many DrawIndexedPrimitive calls are you making per frame? You really want to keep it under 500 or so.

Hi, thanks for the response

I'm only making 25 calls per frame - it's the only thing I'm drawing at the moment. If it's asynchronous, that makes more sense, but I'll hook up NVPerfHUD and see if I can see what's happening.

I would have thought 25 DrawIndexedPrimitive calls with a total of 7200 tris with no texturing would be super quick, so I guess there must be something not quite right.

Cheers

I believe DrawIndexedPrimitive() doesn't always do the swap to kernel mode - it'll only do that every so often to improve performance. Also some of the rendering will probably be done during the Present().

Those long delays could well be texture uploads, or vertex buffer uploads - you have around 56MB of vertex buffers there...
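The 56MB figure can be checked from the numbers in the first post (25 buffers, 37249 vertices per buffer, 60 bytes per vertex). A small sketch of that arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Bytes used by `bufferCount` vertex buffers of equal size.
// terrainBytes and toMegabytes are hypothetical helpers for this sketch.
inline std::uint64_t terrainBytes(std::uint64_t bufferCount,
                                  std::uint64_t verticesPerBuffer,
                                  std::uint64_t bytesPerVertex) {
    assert(bufferCount > 0 && verticesPerBuffer > 0 && bytesPerVertex > 0);
    return bufferCount * verticesPerBuffer * bytesPerVertex;
}

// Decimal megabytes (1 MB = 1,000,000 bytes).
inline double toMegabytes(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1000000.0;
}
```

With the values from the thread, terrainBytes(25, 37249, 60) is 55,873,500 bytes, a little under 56 MB. That is a large share of the video memory on a card that may only have 128 MB.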

Actually, thinking about this a bit more, my loop is currently simple:

1) Adjust geometry and indices (~1ms) (this includes locking vertex/index buffers)
2) Draw geometry (DrawIndexedPrimitive x 25)

I assume my timing of part 1) is realistic, unless vertex/index buffer locking is asynchronous, which it can't be. I don't do any kind of frame pacing at the moment, so frames are drawn back to back as fast as they can be submitted.

So if the DIP calls are asynchronous, it could be that the second, third, umpteenth, etc. call to DIP is locking against itself causing a build-up and, consequently, the spike.

How are you supposed to know when it is okay to make the next set of DIP calls? Is there some kind of signal or callback in the API that will let you know when it's next available for drawing?

Hi Adam

Yes, it's around 56 MB of vertices (xyz, normal, diffuse + 4 sets of texture coords), but shouldn't these stay in VRAM once created?

I do lock a number of the vertex buffers each frame (the new ones that come into the view frustum) and adjust the y values (on the CPU, not in the vertex shader). If I lock the entire contents of a vertex buffer and change the values, does that mean it has to go across the bus to the GPU again when it's used in a DIP call? If so, that'll be my problem.

They should stay in VRAM as long as you haven't run out of VRAM; once you do, data will be swapped out on a least-recently-used basis. Managed resources may also not be uploaded until they are first used.

Locking a non-dynamic buffer will force an upload of the new vertices to the GPU the next time it's used. I'd suggest either doing all the modification up front, or using the vertex shader, since you probably don't want the terrain in a dynamic buffer.

My terrain is completely represented by dynamic vertex buffers. I don't see any other way of doing it. To have static buffers pre-built would take up far too much memory, so I rotate the use of a set number of vertex buffers. Most of them don't get updated each frame (the portions that remain on screen); I only update the buffers that hold new 'chunks' coming into the frustum.

You should make sure you read and understand the Accurately Profiling Direct3D API Calls (Direct3D 9) paper in the SDK documentation. Performance measurement for D3D applications is an extremely complex beast - not only have you got coarse level parallelism (CPU/GPU) but the GPU has a complex pipeline that can be off-balance...

PIX for Windows and NVPerfHUD are invaluable [grin]

Jack

Ahh, in that case there's a simple way to speed things up.

The simplest solution is to make sure you're locking with the NOOVERWRITE or DISCARD flags as appropriate, and that the vertex buffer is write only.
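One common way to decide between the two flags is an append-then-wrap cursor: keep writing into unused space with no-overwrite semantics, and request a discard only when the buffer is full. The sketch below models just that decision in portable C++. LockMode, LockPlan and DynamicBufferCursor are hypothetical names. In real D3D9 code the two modes would correspond to passing D3DLOCK_NOOVERWRITE or D3DLOCK_DISCARD to IDirect3DVertexBuffer9::Lock on a buffer created with D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY. Whether this pattern suits the terrain code in this thread is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the two lock behaviours; not the real D3D9 flag values.
enum class LockMode { Discard, NoOverwrite };

struct LockPlan {
    LockMode mode;       // which lock behaviour to request
    std::size_t offset;  // first vertex to write, in vertices
};

// Tracks the write position inside one dynamic buffer (hypothetical helper).
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity) : capacity_(capacity) {}

    // Reserve room for `count` vertices and say how the buffer should be locked.
    LockPlan reserve(std::size_t count) {
        assert(count <= capacity_);
        // Default: append after data the GPU may still be reading.
        LockPlan plan{LockMode::NoOverwrite, cursor_};
        if (cursor_ + count > capacity_) {
            // No room left: ask for fresh storage and start from the beginning.
            cursor_ = 0;
            plan = LockPlan{LockMode::Discard, 0};
        }
        cursor_ += count;
        return plan;
    }

private:
    std::size_t capacity_;
    std::size_t cursor_ = 0;
};
```

When a whole buffer is rewritten in one go, as with a terrain chunk that fills its buffer, every reserve() after the first wraps, so the plan is effectively always Discard.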

If that doesn't fix it, or isn't possible then you'll probably need to manage the buffers manually:

What's probably happening is that when you go to lock your dynamic VB the hardware is sometimes still drawing from it. When that happens the CPU has to wait for the GPU to finish before it can obtain the write lock.

Normally using the right lock flags will get round that, as long as the buffer is write only. The other way around that is to avoid reusing a buffer until a frame or two after it was last drawn from, which means you'll probably need an extra buffer or two spare for that purpose.
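The "don't reuse a buffer until a frame or two after it was last drawn" idea can be sketched as a small bookkeeping class. This is a portable model only: BufferPool, markDrawn() and acquireForWrite() are hypothetical names, and the reuse delay of two frames is an assumption taken from the "frame or two" above, not a measured value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marker for a buffer that has not been drawn from yet.
constexpr long long kNeverDrawn = -1;

// Remembers the last frame each buffer was drawn from (hypothetical helper).
class BufferPool {
public:
    BufferPool(std::size_t bufferCount, long long reuseDelayFrames)
        : lastDrawnFrame_(bufferCount, kNeverDrawn),
          reuseDelayFrames_(reuseDelayFrames) {
        assert(reuseDelayFrames >= 0);
    }

    // Record that buffer `index` was used by a draw call in `frame`.
    void markDrawn(std::size_t index, long long frame) {
        assert(index < lastDrawnFrame_.size());
        lastDrawnFrame_[index] = frame;
    }

    // Index of a buffer that is old enough to rewrite in `frame`,
    // or -1 if every buffer was drawn from too recently.
    int acquireForWrite(long long frame) const {
        for (std::size_t i = 0; i < lastDrawnFrame_.size(); ++i) {
            const long long last = lastDrawnFrame_[i];
            if (last == kNeverDrawn || frame - last >= reuseDelayFrames_) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    std::vector<long long> lastDrawnFrame_;
    long long reuseDelayFrames_;
};
```

A return value of -1 is the signal that the pool needs the "extra buffer or two" mentioned above. The alternative is to lock a recently drawn buffer and risk the stall.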
