
## Vertex and Index Buffer Locking


10 replies to this topic

### #1noodleBowl  Members

Posted 20 October 2013 - 12:08 PM

I was cleaning up my dynamic vertex and index buffer code when I started wondering about something.

Am I locking these buffers correctly? This is a little hard for me to explain, so please let me know if anything does not make sense.

Currently I just lock the entire buffer like this:

vBuffer->Lock(0, 0, (void**) &vertices, bufferLockFlag);
iBuffer->Lock(0, 0, (void**) &indices, bufferLockFlag);


But since these are dynamic buffers, wouldn't I want to lock them like this instead?

vBuffer->Lock(vertexDataAmountUsed, vertexBufferSize - vertexDataAmountUsed, (void**) &vertices, bufferLockFlag);
iBuffer->Lock(indexDataAmountUsed, indexBufferSize - indexDataAmountUsed, (void**) &indices, bufferLockFlag);


For example, let's assume the following:

- Each quad has 4 vertices and 6 indices, since we are using an index buffer

- Each quad has a vertex data size of 10

- Each quad has an index data size of 2

- Our vertex and index buffers can each hold 3 quads before they are "full"

- This means our buffer sizes are:

vertexBufferSize: 'Vertex Data size per quad'  *  'number of quads the buffer can hold' = 10 * 3 = 30

indexBufferSize: 'Index Data size per quad'  *  'number of quads the buffer can hold' = 2 * 3 = 6

- We are starting with fresh, empty buffers

vertexDataAmountUsed = 0;
indexDataAmountUsed = 0;

We lock the buffers like so:

vBuffer->Lock(vertexDataAmountUsed, vertexBufferSize - vertexDataAmountUsed, (void**) &vertices, bufferLockFlag);
iBuffer->Lock(indexDataAmountUsed, indexBufferSize - indexDataAmountUsed, (void**) &indices, bufferLockFlag);


This means we are locking the entire buffer for both the index and vertex buffers because:

- vertexDataAmountUsed and indexDataAmountUsed are both 0, which tells the lock call to use an offset of 0 (first param in the lock calls)

- The amount of data to lock (second param in the lock calls) works out to the full size of each buffer:

vertexBufferSize - vertexDataAmountUsed = 30 - 0 = 30

indexBufferSize - indexDataAmountUsed = 6 - 0 = 6

So down the line, let's say we do not need a fresh buffer because we still have room left.

Let's say there are 2 quads' worth of data in each buffer, meaning:

vertexDataAmountUsed = 'Vertex data per quad' * 2 = 10 * 2 = 20

indexDataAmountUsed = 'Index data per quad' * 2 = 2 * 2 = 4

So this means we can hold one more quad before we have to use the DISCARD flag to get a fresh buffer.

This means that when we lock again using

vBuffer->Lock(vertexDataAmountUsed, vertexBufferSize - vertexDataAmountUsed, (void**) &vertices, bufferLockFlag);
iBuffer->Lock(indexDataAmountUsed, indexBufferSize - indexDataAmountUsed, (void**) &indices, bufferLockFlag);


We are locking like so:

- The vertex buffer lock starts at offset 20 (first param in the vertex buffer lock)

- The index buffer lock starts at offset 4 (first param in the index buffer lock)

- For the vertex buffer, we lock only what is still available (second param in the vertex buffer lock). In this case:

vertexBufferSize - vertexDataAmountUsed = 30 - 20 = 10

- For the index buffer, we lock only what is still available (second param in the index buffer lock). In this case:

indexBufferSize - indexDataAmountUsed = 6 - 4 = 2

This means that only the one remaining quad slot in each buffer was locked.
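The arithmetic above can be sketched as a small helper. This is only an illustration: `LockRange` and `remainingRange` are hypothetical names, not part of Direct3D, and the returned values are what would be passed as the first two arguments of `Lock`. In Direct3D 9 those two arguments are byte counts, so the abstract "data sizes" used in this example would be byte sizes in real code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the first two arguments of a Lock() call.
struct LockRange {
    std::size_t offset; // where the locked region starts (OffsetToLock)
    std::size_t size;   // how much is locked from there (SizeToLock)
};

// Returns the range covering everything in the buffer that has not been
// written yet, i.e. from the current write position to the end.
LockRange remainingRange(std::size_t bufferSize, std::size_t amountUsed) {
    assert(amountUsed <= bufferSize); // cannot have used more than exists
    return LockRange{amountUsed, bufferSize - amountUsed};
}
```

With the numbers from the example, `remainingRange(30, 20)` gives offset 20 and size 10, and `remainingRange(6, 4)` gives offset 4 and size 2. A completely full buffer yields a size of 0, which is the point where the caller would switch to a DISCARD lock to get a fresh buffer.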

Now, what am I really asking?

Well, I want to know if what I just described above is correct. Is that how I should be locking dynamic vertex and index buffers?

Can I pass vertexBufferSize - vertexDataAmountUsed to lock my whole buffer, assuming it matches my max vertex buffer size?

I know that passing 0, 0 as the first and second params locks the whole thing, but can this be used as an alternative?

Or should I just stick with locking the entire thing?

Edited by noodleBowl, 20 October 2013 - 12:19 PM.

Just in case I forget to say it, I'm targeting OpenGL ES 2.0

### #2N.I.B.  Members

Posted 21 October 2013 - 01:55 AM

From the API point of view, you are correct. But the OffsetToLock and SizeToLock parameters are hints to the driver, and there's no guarantee that it will actually lock only those parts of the buffer.

From a performance POV, locking the entire buffer with the DISCARD flag is better. The driver will most likely just allocate a new buffer, so it doesn't have to merge the new and old parts of the buffer. It does have the drawback of using more memory, so take care when mapping very large buffers.

The real question is why are you using dynamic VB/IB? Drivers don't really like that...

### #3noodleBowl  Members

Posted 21 October 2013 - 02:25 PM

> From a performance POV, locking the entire buffer with the DISCARD flag is better. The driver will most likely just allocate a new buffer, so it doesn't have to merge the new and old parts of the buffer. It does have the drawback of using more memory, so take care when mapping very large buffers.
>
> The real question is why are you using dynamic VB/IB? Drivers don't really like that...

I'm using dynamic buffers because I'm creating a sprite batcher. Since the data in them is almost always changing, a dynamic buffer should be the way to go.

Your statement actually confuses me, because according to this article by Microsoft, using dynamic buffers is a performance optimization:

http://msdn.microsoft.com/en-us/library/windows/desktop/bb147263%28v=vs.85%29.aspx#Using_Dynamic_Vertex_and_Index_Buffers


### #4N.I.B.  Members

Posted 21 October 2013 - 03:45 PM

- Don't use dynamic VB if you don't need to. This is the most common case.

- Use MAP_DISCARD. Like I said, what really happens is that the driver will allocate a new buffer, so you don't interfere with the current draw and don't need to merge the old copy of the buffer with the new one.

- Use MAP_NOOVERWRITE. This is useful if you have a large buffer but only need to change a small portion of it. This has some overhead, and in some cases can cause the GPU to stall, but in most cases there are no performance implications.
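One common way to combine the two flags is the append-then-discard scheme sketched below. It is a simulation only, so it compiles without the Direct3D headers: `LockMode`, `LockRequest`, and `DynamicBufferCursor` are hypothetical names, and the two modes stand in for `D3DLOCK_NOOVERWRITE` and `D3DLOCK_DISCARD` in D3D9 (or the `D3D11_MAP_WRITE_NO_OVERWRITE` and `D3D11_MAP_WRITE_DISCARD` map types in D3D11). New data is appended after whatever was written earlier, and the buffer is discarded only when the next batch no longer fits.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the real lock/map flags (assumption: illustration only).
enum class LockMode { Discard, NoOverwrite };

// What the caller would pass to the real Lock/Map call.
struct LockRequest {
    std::size_t offset;
    std::size_t size;
    LockMode mode;
};

// Tracks the write position of one dynamic buffer.
class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity)
        : capacity_(capacity), used_(0) {}

    // Reserves `bytes` for the next batch and reports how to lock for it.
    LockRequest reserve(std::size_t bytes) {
        assert(bytes <= capacity_); // a single batch must fit in the buffer
        LockMode mode = LockMode::NoOverwrite;
        if (used_ + bytes > capacity_) {
            // Out of room: restart at the beginning with a fresh buffer.
            used_ = 0;
            mode = LockMode::Discard;
        }
        LockRequest request{used_, bytes, mode};
        used_ += bytes;
        return request;
    }

private:
    std::size_t capacity_;
    std::size_t used_;
};
```

With a capacity of 30 and batches of 10, the first three calls to `reserve(10)` return offsets 0, 10, and 20 with `NoOverwrite`, and the fourth wraps back to offset 0 with `Discard`.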

I used to do driver optimizations. We hated it when games mapped VB/IB, and we disabled some optimizations for dynamic VBs. I only saw one game that used a dynamic IB, and a mere few that used dynamic VBs.

If you are using DX10+, consider using a GS for billboarding, especially if you have a fixed number of sprites. The DX10 SDK has a sample called ParticleGS that implements a particle system using GS, SO and DrawAuto().

Edited by satanir, 21 October 2013 - 03:53 PM.

### #5mhagain  Members

Posted 21 October 2013 - 05:38 PM

<snip>

Sorry, but this doesn't make much sense.  What would you recommend for, say, text rendering, where the text being rendered may change every frame?  Or a dynamic particle system where the number of particles being drawn may change every frame?

For sure static buffers are preferable where possible, and keeping as much of your geometry as possible static is the right thing to do, but there are scenarios where no approach other than dynamic offers itself as a reasonable solution.  Similarly in D3D10+ you simply must use dynamic buffers (or default with UpdateSubresource) for some scenarios.  Issuing what looks like a blanket ban on dynamic buffers "just because" seems to me to be denying the existence of those scenarios.

Edited by mhagain, 21 October 2013 - 05:39 PM.

It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.

### #6hdxpete  Members

Posted 21 October 2013 - 08:41 PM

What I do is create a number of dynamic VBOs equal to the swap frame count. So if I'm double buffered, I create 2. In frame A I write to buffer 0 and in frame B I write to buffer 1, then rinse and repeat. Hopefully the driver won't stall as much, since I'm not reusing the same VBO on the next draw.
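A minimal sketch of that rotation, assuming one buffer per swap frame. `BufferRing` is a hypothetical helper that only tracks which buffer index to write to in the current frame; creating and filling the actual buffers is left out.

```cpp
#include <cassert>
#include <cstddef>

// Cycles through a fixed number of buffers, one per frame.
class BufferRing {
public:
    explicit BufferRing(std::size_t bufferCount)
        : count_(bufferCount), current_(0) {
        assert(bufferCount > 0); // need at least one buffer to rotate through
    }

    // Index of the buffer to write to during this frame.
    std::size_t current() const { return current_; }

    // Call once per frame, after the draw calls for the frame are submitted.
    void advance() { current_ = (current_ + 1) % count_; }

private:
    std::size_t count_;
    std::size_t current_;
};
```

For a double-buffered setup, `BufferRing ring(2)` alternates between index 0 and index 1 on successive calls to `advance()`.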

### #7N.I.B.  Members

Posted 21 October 2013 - 11:19 PM

> What would you recommend for, say, text rendering, where the text being rendered may change every frame? Or a dynamic particle system where the number of particles being drawn may change every frame?

For dynamic particles, use GS/SO/DrawAuto().

For text, instancing will do (even if you don't want to use GS).

> Issuing what looks like a blanket ban on dynamic buffers "just because" seems to me to be denying the existence of those scenarios

I never said not to use them. I said they are more costly than static buffers, so use them with caution, and know that you have alternatives.

Edited by satanir, 21 October 2013 - 11:22 PM.

### #8Tom KQT  Members

Posted 22 October 2013 - 12:36 AM

> For dynamic particles, use GS/SO/DrawAuto().

From the code in the first post, I would guess that noodleBowl is using DX9, so that won't help him too much.

> For text, instancing will do (even if you don't want to use GS).

Don't you still need a dynamic buffer when you do instancing? I mean the per-instance buffer. And as each letter requires just 4 vertices, I don't think instancing would help here. You would be filling the per-instance buffer with as much data as you would the main buffer, wouldn't you?

### #9mhagain  Members

Posted 22 October 2013 - 01:15 AM

> What would you recommend for, say, text rendering, where the text being rendered may change every frame? Or a dynamic particle system where the number of particles being drawn may change every frame?

> For dynamic particles, use GS/SO/DrawAuto().
>
> For text, instancing will do (even if you don't want to use GS).

And how does any of that handle the fact that the text and/or the number of particles may change each frame?  Not forgetting the other points raised above?

The discard/no-overwrite pattern is well-known and has been advised for as long as dynamic buffers have existed in D3D.  Microsoft recommend it, the major GPU vendors recommend it and have even published papers discussing it; this is the one pattern where so much has been written hinting "use this, it's the fast path", that claims against it which only emerge now must be viewed with suspicion.

If we were talking OpenGL and glMapBuffer (not glMapBufferRange) then yes, warnings against it are appropriate, but the D3D buffer locking mechanism has never had those problems when used properly.


### #10N.I.B.  Members

Posted 22 October 2013 - 01:28 AM

> You would be filling the per-instance buffer with as much data as you would the main buffer, wouldn't you?

No, even in DX9 you can save 50% of the bandwidth.
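The post does not say where the 50% figure comes from. As one possible illustration, under assumed layouts that are not from this thread, the sketch below compares the bytes written per glyph when every quad corner is uploaded against the bytes written when only one per-instance record is uploaded. `GlyphVertex`, `GlyphInstance`, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed non-instanced layout: one of these per quad corner.
struct GlyphVertex {
    float x, y; // screen position
    float u, v; // texture coordinate
};

// Assumed instanced layout: one of these per glyph.
struct GlyphInstance {
    float x, y, w, h;     // screen rectangle
    float u0, v0, u1, v1; // texture rectangle
};

static_assert(sizeof(GlyphVertex) == 16, "sketch assumes 4-byte floats");
static_assert(sizeof(GlyphInstance) == 32, "sketch assumes 4-byte floats");

constexpr std::size_t kVerticesPerQuad = 4;

// Bytes uploaded per glyph when all four corners are written each frame.
constexpr std::size_t plainBytesPerGlyph() {
    return kVerticesPerQuad * sizeof(GlyphVertex);
}

// Bytes uploaded per glyph when only the per-instance record is written.
constexpr std::size_t instancedBytesPerGlyph() {
    return sizeof(GlyphInstance);
}
```

Under these particular layouts the instanced path writes 32 bytes per glyph instead of 64. Other layouts would give other ratios.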

> claims against it which only emerge now must be viewed with suspicion

GPUs evolve. APIs evolve. And so techniques evolve.

And none of this discussion makes any sense, because I never said not to use dynamic buffers...

### #11mhagain  Members

Posted 22 October 2013 - 02:43 AM

> No, even in DX9 you can save 50% of the bandwidth.

But bandwidth isn't the primary bottleneck with either text rendering or particles; the bottleneck is fillrate and ROP.  Even with OpenGL's immediate mode there's next to no performance difference.

