
Vertexbuffer - huge lag.


I'm using XNA but I assume the problem is analogous in DX9?

 

So I'm having a huge problem rendering a model that's created dynamically at runtime. It renders fine but it's creating a huge lag.

 

Here's the issue illustrated through a comparison:

 

1) Model with 10,000 vertices created in 3ds max and rendered with shader X in XNA ---> 200fps (after everything else has happened in the game)

 

2) Similar 10,000 vertex model constructed at runtime with a dynamic vertex buffer and rendered with shader X ---> 100fps

This is a completely unacceptable drop and I assume I'm doing something wrong. Something like Minecraft would be impossible to run if this was a necessary drop. I can post my code if necessary but I'm just using the same approach used by the 3D particles sample. I've profiled the project with NProf and all of the time is being spent in GraphicsDevice.Present().

 

Help?


Are you setting things up to get debug warnings? When I use a debug device in DX11 (I don't remember how that works in DX9) and I do something not completely legit, my framerate plummets from all the reporting of warnings every frame.


No, I'm not setting up any debug warnings. Like I say, I'm just using the same code found in the 3D particles sample from Microsoft so there's really no room for me to be doing anything wrong in that sense. It must be more of a conceptual problem that I'm fundamentally missing.

And I've also tried using the 'NoOverwrite' option as described in Shawn Hargreaves' blog. No change.

Edited by gchewood


2) Similar 10,000 vertex model constructed at runtime with a dynamic vertex buffer and rendered with shader X ---> 100fps


Are you recreating/reuploading the model each frame? Uploading a model takes time, so uploading once will necessarily be faster than uploading it repeatedly.

Did you create the vertex buffer with similar properties for both models? The properties of a vertex buffer (e.g. dynamic vs non-dynamic) can affect speed.

Are you using the same vertex attributes between both models? The same shaders? Different shaders have different performance characteristics.
 

Like I say, I'm just using the same code found in the 3D particles sample from Microsoft so there's really no room for me to be doing anything wrong in that sense.


That's so cute that you think Microsoft's code samples are necessarily the best way to do things. :P


Thanks for the reply Shaun, even if you called me cute (maybe I should change my avatar)

No, the SetData function is only called when necessary. Not every frame.
No, I didn't create the vertex buffer similarly for both models. I just use the inbuilt approach for rendering a model in the pre-fabricated example. I wasn't making the comparison to say they should be identical in the frame rate. If the run-time version was at 180fps or something, I'd just assume that was a necessary price to pay. But 100fps seems criminal. Otherwise, the vertex types and shaders are the same.

I don't think Microsoft's code is necessarily the best. But the fact that I've used that particular 3D particle sample a lot, and have found it performs very nicely, seems to be a good sign.


It's normal enough to see all (or most) of your time being spent in Present: have a read of this: http://tomsdxfaq.blogspot.com/

 

As for causes of your performance drop, the first thing to do is check the vertex buffer creation and locking flags.  For a dynamic buffer used in this manner, you should be creating with D3DUSAGE_WRITEONLY | D3DUSAGE_DYNAMIC, and locking with D3DLOCK_DISCARD.  You most definitely should not be calling CreateVertexBuffer each frame; create it once and reuse it (with a discard lock) each time you need to update.  Also be careful that you don't attempt to read from the buffer while you have it locked.

 

Assuming that these are all correct, you'll need to talk a little about how you're creating the new vertex data (CPU-side) to load into the buffer.  My own guess is that you're possibly doing CPU-side skinning or frame interpolation; if the former then there's a high probability that the slow down is not from your usage of a dynamic buffer, but more simply because CPU-side skinning is slow.  If the latter you can quite easily switch frame interpolation to run on the GPU and thereby keep your vertex data entirely static.  Either way, there is a probability that your performance issue is coming from extra CPU-side work associated with using dynamic data, and thinking a little about how you can make this data (or as much of it as possible) static can reap huge rewards.


Cool, I'll have a read through that page.

 

Yep, those flags are all set (or their equivalents in XNA) and I've tried with the Discard option. 

I'm not calling CreateVertexBuffer every frame, just once at the beginning. I'm not entirely clear on the part where you say "careful that you don't attempt to read from the buffer while you have it locked". What do you mean by 'read' in this context? As I say, the model is dynamically being added to. But not each frame.

 

The vertex creation on the CPU side is quite simple. There are about 4-6 base models that I'm combining to make a larger structure. Those base models are loaded into vertex and index arrays at the beginning. Based on the user input, these base arrays are then combined into the larger array which is set to the vertex/index buffers when it's changed.
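
The combining step described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: MeshBuilder and Append are made-up names, and TVertex stands in for a vertex struct such as VertexPositionNormalTexture. The detail that is easy to get wrong is that each piece's indices are relative to its own vertex array, so they have to be shifted by the number of vertices already in the combined array.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the combined mesh. TVertex stands in for a
// vertex struct such as VertexPositionNormalTexture.
public sealed class MeshBuilder<TVertex>
{
    public readonly List<TVertex> Vertices = new List<TVertex>();
    public readonly List<int> Indices = new List<int>();

    // Appends one base piece. Its indices refer to its own vertex array,
    // so each one is shifted by the number of vertices already stored.
    public void Append(TVertex[] pieceVertices, int[] pieceIndices)
    {
        int baseVertex = Vertices.Count;
        Vertices.AddRange(pieceVertices);
        foreach (int index in pieceIndices)
        {
            Indices.Add(index + baseVertex);
        }
    }
}
```

Moving each piece to its location (transforming its vertices) is left out here. The index count of the result should equal the sum of the index counts of the pieces that were appended.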


From what you've mentioned, it seems like you are not doing anything out of the ordinary. Do you have the actual draw timings for comparison? FPS is not a reliable performance metric, especially when developing.
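
For reference, timing a section of the frame in milliseconds could look roughly like this. It is a minimal sketch built on System.Diagnostics.Stopwatch; SectionTimer is a hypothetical helper, not part of XNA, and it only measures CPU-side time, so GPU work that shows up inside Present() still needs a GPU profiling tool.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: accumulates wall-clock time for one section of the
// frame and reports the average in milliseconds.
public sealed class SectionTimer
{
    private readonly Stopwatch watch = new Stopwatch();
    private double totalMs;
    private int samples;

    // Call right before the code being measured.
    public void Begin()
    {
        watch.Restart();
    }

    // Call right after the code being measured.
    public void End()
    {
        watch.Stop();
        totalMs += watch.Elapsed.TotalMilliseconds;
        samples++;
    }

    public int Samples
    {
        get { return samples; }
    }

    public double AverageMs
    {
        get { return samples == 0 ? 0.0 : totalMs / samples; }
    }
}
```

Wrapping the draw code in Begin() and End() and printing AverageMs every few hundred frames gives a number that can be compared directly between the two rendering paths.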


What do you mean by the draw timings? As in, the number of ms taken in the Present() function? 

Why will that give any more info than the FPS I've already mentioned? Everything else about the solution is identical.


these base arrays are then combined into the larger array which is set to the vertex/index buffers when it's changed.

 

Can you give more detail on how you're doing this part?  Are you, for example, combining them to an std::vector (or whatever the equivalent container object is in your C# code), then copying them to a locked vertex buffer?  You'll get better performance if you lock the buffer first, then write directly to the locked memory; for one you'll avoid an extra memory copy, another thing is that you'll also avoid a lot of runtime allocation and garbage collection.
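
To illustrate the allocation point in XNA terms: as far as I know, XNA only exposes SetData, which takes a managed array, so there is no locked pointer to write into directly. A close approximation is to allocate one staging array up front and refill it on every rebuild instead of building a new array or list each time. A rough sketch, where StagingBuffer and its members are hypothetical names:

```csharp
using System;

// Hypothetical staging area that is allocated once and refilled on every
// rebuild, so rebuilding the mesh produces no garbage.
public sealed class StagingBuffer<TVertex>
{
    private readonly TVertex[] data;
    private int count;

    public StagingBuffer(int capacity)
    {
        data = new TVertex[capacity];
    }

    public TVertex[] Data
    {
        get { return data; }
    }

    public int Count
    {
        get { return count; }
    }

    // Starts a new rebuild without releasing the array.
    public void Clear()
    {
        count = 0;
    }

    // Copies a piece straight into the preallocated array.
    public void Append(TVertex[] piece)
    {
        if (count + piece.Length > data.Length)
        {
            throw new InvalidOperationException("Staging buffer capacity exceeded.");
        }
        Array.Copy(piece, 0, data, count, piece.Length);
        count += piece.Length;
    }
}
```

The filled portion could then be uploaded with something like vb.SetData(staging.Data, 0, staging.Count, SetDataOptions.Discard), the same SetData overload that appears elsewhere in this thread.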


Yeah, I'm combining them into an array and then copying that to the vertex buffer. But that's not every frame, it's just when the model is occasionally updated. So I assumed that couldn't be the cause of the frame rate issue?

As for your suggestion, can you explain that a little more in depth? It sounds like you just said the same thing twice. How is copying them to a locked vertex buffer different from locking the buffer and then writing directly to the locked memory?

Edited by gchewood


1) Model with 10,000 vertices created in 3ds max and rendered with shader X in XNA ---> 200fps (after everything else has happened in the game)

 

2) Similar 10,000 vertex model constructed at runtime with a dynamic vertex buffer and rendered with shader X ---> 100fps

I wouldn't call a difference of 5 ms a huge lag when you've changed from a static to a dynamic vertex buffer and the amount of data you send per frame is 10,000 vertices. Not to mention that the GPU now has to sync more often (i.e. wait for the CPU's data to arrive, or the CPU has to wait for the GPU to finish), which means that if you do more work, the framerate won't drop (because one of your components was sitting idle, waiting).

 

If your game was originally running at 60 fps (no vsync), it would've dropped to roughly 46 fps. Measure your timings in milliseconds.

 

As long as you're not recreating the vertex buffer every frame, and you're creating it with D3DUSAGE_WRITEONLY | D3DUSAGE_DYNAMIC and locking with D3DLOCK_DISCARD, there's not much else you can do.

Reducing the size of the vertex should help.


Thanks for the response Matias. I'm quite shocked that you say that's not a big drop actually. 

 

Further evidence to support my general suspicion that I'm fundamentally doing something wrong:

 

I just re-ran the game and drew each of the individual pieces (the base pieces I mentioned previously) separately with the appropriate locations and the same shader.

As I'd mentioned before, if I drew everything as a single model, I got 200fps. With the vertex buffer I was getting 100fps. With this new test I just ran (which I would think should be the most inefficient by far), I got about 180fps. 

 

So yeah, I'm not buying that it's a necessary cost at all, there must be something else to it!

Edited by gchewood


I got 200fps. With the vertex buffer I was getting 100fps.

I repeat: Do not time in FPS, time in milliseconds.
You did not drop by 100 FPS, you dropped by 5 milliseconds.
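
To make the arithmetic concrete, here is a small sketch of the conversion (FrameTime is just an illustrative name). Frame time in milliseconds is 1000 divided by the frame rate, so the cost of a change is the difference between two frame times, not between two frame rates.

```csharp
using System;

public static class FrameTime
{
    // Milliseconds spent on one frame at the given frame rate.
    public static double FpsToMs(double fps)
    {
        return 1000.0 / fps;
    }

    // Extra milliseconds per frame when the frame rate falls
    // from fpsBefore to fpsAfter.
    public static double CostMs(double fpsBefore, double fpsAfter)
    {
        return FpsToMs(fpsAfter) - FpsToMs(fpsBefore);
    }
}
```

With this, 200 -> 100 FPS comes out as 5 ms of extra frame time, 200 -> 180 FPS as roughly 0.56 ms, and a drop from 1000 to 900 FPS (also "100 FPS") as only about 0.11 ms.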
 
Before we can really give you any meaningful advice you are long overdue in giving us real information.
How often is “occasionally” updating?
Show the update routine.
Show the code for making the vertex buffer.
Are you using index buffers?
 
 

As I'd mentioned before, if I drew everything as a single model, I got 200fps.

What does that mean?
A single draw call?

 

With the vertex buffer I was getting 100fps.

What does that mean? You were drawing without a vertex buffer before?

 

So yeah, I'm not buying that it's a necessary cost at all, there must be something else to it!

Without clarification on what you actually changed to go from 5 milliseconds per frame to 5.55555 milliseconds per frame (see how we don’t measure things in FPS because it’s misleading?), I assume:
#1: You were drawing everything in a single draw call originally.
#2: You went from that to drawing some things in a single draw call and other things in multiple draw calls with a dynamic buffer. That is when you noticed an increase of 5 milliseconds per frame (200 FPS -> 100 FPS).


Then it makes perfect sense.
Drawing everything all at once with a single draw call and no state changes is of course absurdly fast.  It’s also impossible in a real game, which means there is no point in benchmarking it.

When you break the scene into multiple draw calls with state changes, that is where your performance is going.

There is a heavy hit for changing shaders (even if it is the same shader being set twice in a row), textures, render targets, vertex buffers, etc.

It’s not just because you switched to a dynamic vertex buffer.  You also added multiple draw calls and state changes.
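
One common way to cut down on redundant state sets, offered here only as a generic sketch and not as something anyone in this thread is confirmed to be doing, is to remember the last state applied and skip the call when it has not changed. StateCache is a hypothetical helper:

```csharp
using System;

// Hypothetical filter that remembers the last state object applied and
// reports whether a new request actually changes anything.
public sealed class StateCache<TState> where TState : class
{
    private TState current;

    // Number of real changes that passed through the filter.
    public int Changes { get; private set; }

    // Returns true when the state differs from the current one and
    // therefore has to be applied to the device.
    public bool Set(TState next)
    {
        if (ReferenceEquals(current, next))
        {
            return false;
        }
        current = next;
        Changes++;
        return true;
    }
}
```

The draw loop would only apply a shader, texture or buffer when Set returns true. Sorting draw calls so that equal states sit next to each other makes the filter skip more often.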

 

That is, assuming my #1 and #2 are correct.

Again, the amount of information we don’t have here is extremely high.

 

 

L. Spiro


But that's not every frame, it's just when the model is occasionally updated. So I assumed that couldn't be the cause of the frame rate issue?

 

It should be very easy to test this. Just update your dynamic vertex buffer once. What's the framerate then? (Put a breakpoint on the code that updates it to be absolutely sure it's not being called accidentally).

 

If it really is a 5ms drop even without updating the vertex buffer, then yes, you're doing something wrong somewhere. That wouldn't be expected. (edit: but as L. Spiro mentioned, there's so much we don't know about how you're drawing things).

Edited by phil_t


@phil_t

 

Ok, yep tried just updating the vertex buffer in 1 big jump. Same outcome. Around 100fps.

 

@L.Spiro

 

Yes, I realise fps isn't a linear measurement. But I assumed everyone knows that and therefore there would be no issue? Or am I missing something other than the semantic preference?

 

By occasional, I mean it's not a set time interval. It's based on the user input. Like in something like minecraft. Yet the framerate remains entirely proportional to the vertex count. Whether or not it's being constantly updated or left for a few minutes.

 

Yes, I'm using an index buffer. And the code for the vertex buffer is 


vb = new DynamicVertexBuffer(GraphicsDevice, VertexPositionNormalTexture.VertexDeclaration, vertices.Length*100, BufferUsage.WriteOnly);

By 'single model', I mean it was loaded as the standard XNA model and only uses a single material. That means just 1 draw call right? Yes, I realise that uses a vertex buffer too, I meant 'vertex buffer' for the explicit one that I created, as opposed to the one generated automatically by an XNA model.

And I'm pretty sure the number of state changes was identical in both situations. Why do you say I added multiple draw calls and state changes?
As for you not having much information, obviously the issue is that it's currently part of a fairly complicated program. Distilling the part that's problematic so I could show you the code takes time, so I was hoping it would be resolved easily without requiring that. But now that it hasn't, maybe I should just upload some code?

Edited by gchewood


What happens if the only thing you do is draw the stuff in your DynamicVertexBuffer? Does it still take 5ms?

 

Does your model use textures? Are the textures identical in both cases (mipmaps, dimensions,  format, etc...). Are you using the same sampling states? What happens if you change the shader so it doesn't sample textures? What's the difference between the two methods then?

 

Take a capture in PIX and compare the two scenarios. Does anything stand out w.r.t draw calls, state changes, etc?

 

Have you tried anything that can profile GPU performance, like Intel GPA?


If the only thing I draw is the dynamic vertex buffer, it jumps to 10ms (for the same 10,000 vertex buffer). Isn't that to be expected? There were various other things going on as well.

 

Yes, the models use the identical textures. They're using the same shader, which is where the sampling states are set. So the same.

If I change the shader to not use textures, the outcome is more or less the same, only slightly better performance which I guess is to be expected.

I'm completely new to using PIX so I'll certainly try that but it'll take a while to familiarise myself with it!

Edited by gchewood


You have a scene S and you draw it all in X calls and Y state changes (which could both be 1) using a static vertex buffer at 5 milliseconds per frame.

You draw the same scene S with the same number of calls (X) and state changes (Y) using a dynamic vertex buffer which is not updated after being set at 10 milliseconds per frame.

 

The difference between a static vertex buffer and the same dynamic vertex buffer in scene S is 5 milliseconds.

 

Case closed.  The end.

 

 

Unless you want to show exactly what flags you are passing when creating the dynamic vertex buffer, which, regardless of how complex your project is, is just one line of code.

Or unless you try PIX and find out some other difference between the 2 render methods.

 

 

L. Spiro


Lol, well everyone else's responses don't seem to match your certainty on that? 

 

I thought I already did show you the code used to create the vertex buffer. Again:

vb=new DynamicVertexBuffer(GraphicsDevice, VertexPositionNormalTexture.VertexDeclaration, vertices.Length*100, BufferUsage.WriteOnly);

and then to set the vertices:

vb.SetData(activeVertices, 0, activeVertices.Length, SetDataOptions.Discard);

And that performs the same no matter what I set for SetDataOptions. That's what you were asking for right? 

So why is a dynamic vertex buffer with just 1 draw call slower than doing like 20 or so from the individual models? That doesn't seem right?

Edited by gchewood



Lol, well everyone else's responses don't seem to match your certainty on that? 


So why is a dynamic vertex buffer with just 1 draw call slower than doing like 20 or so from the individual models? That doesn't seem right?

 

It really should be about as fast, from my experience. But with the information you have given us, the only conclusion is that it is much slower (I think that's what L. Spiro was getting at).

 

No one's going to be able to help you any further at this point, given the info in this thread. Either you'll have to upload a repro to some repository somewhere and hope someone is nice enough to look at it, or you'll need to do more detective work yourself.


Yeah, I'm gonna run through some tutorials with PIX and see if that can help me. I'm also gonna see if I can cut the polycount a bit and then try geometry instancing instead. If I'm really not making any big mistakes in my use of the vertex buffer, it just seems that's not the ideal solution to my problem.

It's that last comparison though that's still making me skeptical. My, admittedly uninformed, intuition doesn't seem to accept that a dynamic vertex buffer which is supposedly created for this very purpose, is slower than drawing each piece independently.


gchewood, why are you creating such a massive vertex buffer?

vb=new DynamicVertexBuffer(GraphicsDevice, VertexPositionNormalTexture.VertexDeclaration, vertices.Length*100, BufferUsage.WriteOnly);

I thought you said your mesh is 10,000 vertices. Doesn't that mean you're creating a 1,000,000-vertex buffer here? I wonder if this can cause the FPS drop.


Hi.

Looks like it's time to see your create mesh function.

 

may be lots of duplicated vertices.

 

no index buffer.


gchewood, why are you creating such a massive vertex buffer?

vb=new DynamicVertexBuffer(GraphicsDevice, VertexPositionNormalTexture.VertexDeclaration, vertices.Length*100, BufferUsage.WriteOnly);

I thought you said your mesh is 10,000 vertices. Doesn't that mean you're creating a 1,000,000-vertex buffer here? I wonder if this can cause the FPS drop.

No, sorry. That's my mistake for not making it clear. The vertex buffer is around 20,000 vertices as the array 'vertices' at that point is of size 200. It's that size as that's around the maximum it will need to be.
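
For scale, the memory involved can be worked out from the vertex layout. VertexPositionNormalTexture holds a position (3 floats), a normal (3 floats) and a texture coordinate (2 floats), i.e. 32 bytes per vertex. A quick sketch (BufferSize is an illustrative name):

```csharp
using System;

public static class BufferSize
{
    // Position (3 floats) + normal (3 floats) + texture coordinate (2 floats).
    public const int VertexPositionNormalTextureBytes = (3 + 3 + 2) * sizeof(float);

    // Total size in bytes of a buffer holding vertexCount vertices.
    public static long Bytes(long vertexCount, int bytesPerVertex)
    {
        return vertexCount * bytesPerVertex;
    }
}
```

20,000 vertices come to 640,000 bytes (about 625 KB), while 1,000,000 vertices would have been 32,000,000 bytes (about 30.5 MB).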

 

 

 

Hi.

Looks like it's time to see your create mesh function.

 

may be lots of duplicated vertices.

 

no index buffer.

Hmmmm, ok. This is a potential problem. Just checked the index buffer for when I'm up to about 10,000 vertices. The index buffer is at about 200,000.

Right.

So that seems problematic. 

Yep, the issue is with the code I'm using to create the base vertex and index arrays. They're being extracted from a .X model. I just checked: the vertices, normals and texture coordinates are all fine, but it's creating a hideous number of indices for some reason. Right, at least I know where the issue is. I'm glad my intuition was right.
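
A quick sanity check on extracted index data can flag this kind of problem. As a rule of thumb, a triangle list over a closed mesh with shared vertices has roughly 2 triangles, so about 6 indices, per vertex; 200,000 indices for 10,000 vertices is 20 per vertex. A minimal sketch of such a check, with IndexCheck as a hypothetical helper:

```csharp
using System;

public static class IndexCheck
{
    // Throws if the indices cannot describe a triangle list over
    // vertexCount vertices.
    public static void Validate(int[] indices, int vertexCount)
    {
        if (indices.Length % 3 != 0)
        {
            throw new ArgumentException("Index count is not a multiple of 3.");
        }
        foreach (int index in indices)
        {
            if (index < 0 || index >= vertexCount)
            {
                throw new ArgumentException("Index " + index + " is out of range.");
            }
        }
    }

    // Average number of indices per vertex; values far above 6 are suspicious.
    public static double IndicesPerVertex(int indexCount, int vertexCount)
    {
        return (double)indexCount / vertexCount;
    }
}
```

Running a check like this once on each base model right after extraction narrows the problem down to the extraction code or the combining code.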

Thanks for all of the help, everyone, given my very vaguely described problem. If anyone has ever extracted the index data from a model in XNA before, please post with any info. Otherwise, I'm sure I'll figure it out.
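
For anyone landing here with the same problem: one possible cause, which this thread never confirms, is copying the model's whole index buffer for every mesh part. As far as I recall, in XNA 4.0 the mesh parts of a model can share one index buffer, and each ModelMeshPart describes its own slice through StartIndex and PrimitiveCount. The sketch below uses plain arrays instead of XNA types so it can run on its own; IndexSlice and its parameters are illustrative stand-ins for those properties.

```csharp
using System;

public static class IndexSlice
{
    // Copies only the indices that belong to one mesh part out of an index
    // array shared by the whole model (assumed to be a triangle list).
    public static int[] Extract(ushort[] sharedIndices, int startIndex, int primitiveCount)
    {
        int count = primitiveCount * 3; // 3 indices per triangle
        int[] result = new int[count];
        for (int i = 0; i < count; i++)
        {
            result[i] = sharedIndices[startIndex + i];
        }
        return result;
    }
}
```

If every part instead received the full shared array, the combined index count would grow with the number of parts while the vertex count stayed correct, which is at least consistent with the symptom described above.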

Thanks

Edited by gchewood
