CGameProgrammer

Does DISCARD trash the entire buffer, or only the locked portion?

I believe the entire buffer is thrown out. D3DLOCK_NOOVERWRITE might be a more appropriate flag for you to use.

[EDIT] Yeah, here we go:
Quote:
The application overwrites (with a write-only operation) every location within the locked surface. This is a valid option when using dynamic textures, dynamic vertex buffers, and dynamic index buffers. You may not use this option to update a portion of a surface.
For vertex and index buffers, the application discards the entire buffer.

But I am overwriting the portion that I'm locking, so I can't do that. I only update a vertex buffer once per frame of course, but only a subset of it might need to be updated.

If you're right, then I have two options: lock it with no flags, or discard it and update the entire vertex buffer each frame. I'm not sure which method is faster... probably the former?

DISCARD and NOOVERWRITE are useful when the GPU may still be using the buffer that you're writing to. If you've caused a pipeline flush since using the buffer to render (for example, by changing the render target or calling Present), then you can be reasonably assured that the buffer is not in use, and locking without flags is fine.

These flags are for situations where you do a draw call using that buffer and then start writing to the buffer shortly afterwards.

Quote:
Original post by CGameProgrammer
But I am overwriting the portion that I'm locking, so I can't do that?

I'm not sure exactly how NOOVERWRITE works, but I don't think it is as simple as that. It is the way to go if you are trying to append to a large VB where you are essentially promising not to overwrite data that is IN USE. If you're going to use discard, why bother starting at index 256? My impression is that you want to progressively lock with NOOVERWRITE until the VB is full and then use DISCARD.

I know that Discard effectively allocates a brand new buffer so you can write to whichever portion you wish, but that's not my issue. I'm not appending data. I'll give an example of possible usage for a vertex buffer with 10 elements, with one Lock call per frame. Pseudocode:

Frame 1: Lock(vertices #0-9)
Frame 2: Lock(vertices #4-9)
Frame 3: Lock(vertices #3-9)
Frame 4: Lock(vertices #5-9)

Is that understandable? Basically, the entire buffer is written the first time, obviously. The next frame, vertices 0 to 3 were not modified so the Lock can start at index 4. The next frame, vertices 0-2 still weren't modified, etc.

Obviously I could just write the entire buffer each frame with Discard, but I want to optimize it by not locking the vertices that precede the first modified vertex.
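
To make the scenario concrete, here is a minimal sketch of the byte range such a partial lock would cover. It assumes the values are passed as the OffsetToLock and SizeToLock arguments of IDirect3DVertexBuffer9::Lock. The helper name, the struct, and the 32-byte stride in the usage note are hypothetical and only there for illustration; the sketch does not touch Direct3D itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Direct3D: the byte range to lock when
// everything from the first modified vertex to the end of the buffer is
// rewritten.
struct LockRange {
    std::size_t offsetBytes;  // would be passed as OffsetToLock
    std::size_t sizeBytes;    // would be passed as SizeToLock
};

LockRange lockRangeFromFirstModified(std::size_t firstModified,
                                     std::size_t vertexCount,
                                     std::size_t strideBytes) {
    // If nothing was modified there is nothing to lock, so the caller
    // should skip the Lock call instead of asking for an empty range.
    assert(firstModified < vertexCount);
    return { firstModified * strideBytes,
             (vertexCount - firstModified) * strideBytes };
}
```

For the second frame in the example (first modified vertex is #4 out of 10) and an assumed 32-byte vertex, this gives an offset of 128 bytes and a size of 192 bytes.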

Quote:
Original post by CGameProgrammer
Obviously I could just write the entire buffer each frame with Discard, but I want to optimize it by not locking the vertices that precede the first modified vertex.

Just my opinion/guess, but the optimization for dynamic data might not favor that approach. Even if you are rewriting some redundant data each frame, you still might be better off using NOOVERWRITE because it avoids stalls. The idea would be to do something like:

- Allocate a massive dynamic VB (say 20000 vertices), and a corresponding index buffer (if you want even more speed).
- Write dynamic data to the next available vertices using NOOVERWRITE. For instance, on the first frame you use vertices 0-197. Then the next frame you might use 198-304. Etc...
- Render only the relevant vertices.
- After several frames you will fill the vertex buffer, at which point you lock with DISCARD and start again at vertex 0.

This way you get one lock call per frame, but it's a non-GPU-stalling lock call and it is FAST! With the method you are proposing, you could easily cause a stall by attempting to access and modify data that could still be in the rendering pipeline. Perhaps in your case it wouldn't be a problem? I would be curious to see the performance comparison, but I definitely think the speed more than makes up for the fact that some of the data remains static from frame to frame and is thus somewhat redundant. I would guess that even if only a small (<25%) portion of the data were dynamic, it would still be faster.
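
The bookkeeping for the append-then-discard scheme above could be sketched roughly like this. It is only the cursor logic, not real Direct3D code: LockFlag is a stand-in for D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD so the sketch compiles without d3d9.h, and DynamicVbCursor, Reservation, and reserve are hypothetical names. The caller would turn the returned vertex index into a byte offset and pass the matching flag to Lock.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD (assumption: the
// real code would pass the actual D3D flags to IDirect3DVertexBuffer9::Lock).
enum class LockFlag { NoOverwrite, Discard };

struct Reservation {
    std::size_t firstVertex;  // where this frame's vertices start
    LockFlag flag;            // which flag to lock with
};

// Hypothetical helper that tracks the next free vertex in a big dynamic VB.
class DynamicVbCursor {
public:
    explicit DynamicVbCursor(std::size_t capacity) : capacity_(capacity) {}

    Reservation reserve(std::size_t count) {
        assert(count > 0 && count <= capacity_);
        LockFlag flag = LockFlag::NoOverwrite;
        if (next_ + count > capacity_) {
            // Not enough room left: throw the buffer away and restart at 0.
            next_ = 0;
            flag = LockFlag::Discard;
        }
        Reservation r{ next_, flag };
        next_ += count;  // advance past the vertices just handed out
        return r;
    }

private:
    std::size_t capacity_;
    std::size_t next_ = 0;
};
```

With a 20000-vertex buffer, reserving 198 vertices and then 107 hands out ranges starting at 0 and 198, both with the no-overwrite flag; a request that no longer fits wraps back to vertex 0 with the discard flag.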

Either way, good luck... I would be interested to hear what kind of dynamic data you're dealing with. This is NOT 'A simple enough question' as you said, but it is an interesting one, and I'm always interested in new ways to optimize VB performance!
