Using VBOs for dynamic geometry

10 comments, last by PunCrathod 9 years, 10 months ago

Hello,

I'm currently trying to implement realtime CSG. This means that I have a 3D mesh and want to apply boolean operations to it. After each boolean operation, vertices are added to or removed from the mesh. This should happen in realtime, so my vertex data changes every frame. Currently, it is extremely slow, because I'm rendering my mesh in the following way (C# with OpenTK):


GL.Begin(PrimitiveType.Triangles);
foreach (Polygon p in PolygonVector) {
  foreach (Vertex v in p) {
    GL.Normal3(v.Normal.x, v.Normal.y, v.Normal.z);
    GL.Vertex3(v.x, v.y, v.z);
  }
}
GL.End();

So my idea was to use VBOs instead. But this is not easy, because my geometry data is changing every frame and VBOs have a fixed size. So how do I handle this?

Would the best approach be to create a new vertex buffer object every frame from my current mesh?

If someone could point me in the right direction I'd very much appreciate it.


How about a pool of preallocated vertex buffers of fixed sizes, for example 16 KB, 64 KB, 256 KB, etc.? Each frame, you grab the smallest vertex buffer that can hold all your vertex data, upload the data to that buffer, and then draw with it. To get around GPU stalling, you can have multiple buffers of each size (e.g. 10 of each), store them in a linked list, and grab the buffer at the start of the list. At the end of the frame you put it back at the end of the list. The list will naturally be ordered by least recently used, which means you won't reuse the same buffer in consecutive frames, and it more or less guarantees that the frame that last used a buffer has finished rendering by the time you take it again.

If you find you have too much vertex data to fit in the largest buffer, then allocate N buffers that are large enough (perhaps twice as large as the previous largest?) so that the cost of the allocation isn't too problematic.
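The pool described above could be sketched roughly like this. This is C++ rather than the C# from the first post, and it only models the bookkeeping: `PooledBuffer`, `BufferPool`, `acquire` and `release` are made-up names, and the `id` field merely stands in for a real buffer object name. No OpenGL calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <initializer_list>
#include <map>
#include <optional>

// Hypothetical stand-in for a GL buffer name plus its allocated size in bytes.
struct PooledBuffer {
    unsigned id;
    std::size_t capacity;
};

class BufferPool {
public:
    // Preallocate `perClass` buffers for each size class.
    BufferPool(std::initializer_list<std::size_t> sizeClasses, int perClass) {
        assert(perClass > 0);
        unsigned nextId = 1;
        for (std::size_t size : sizeClasses)
            for (int i = 0; i < perClass; ++i)
                free_[size].push_back(PooledBuffer{nextId++, size});
    }

    // Take the least recently used buffer from the smallest class that fits.
    std::optional<PooledBuffer> acquire(std::size_t bytes) {
        auto it = free_.lower_bound(bytes);  // first class with size >= bytes
        while (it != free_.end() && it->second.empty()) ++it;
        if (it == free_.end()) return std::nullopt;  // caller has to grow the pool
        PooledBuffer b = it->second.front();
        it->second.pop_front();
        return b;
    }

    // Hand a buffer back at the end of the frame; it goes to the back of its list.
    void release(PooledBuffer b) { free_[b.capacity].push_back(b); }

private:
    std::map<std::size_t, std::deque<PooledBuffer>> free_;
};
```

Taking from the front and returning to the back is what keeps each list in least-recently-used order. When `acquire` returns nothing, that is the point where the "allocate N larger buffers" step would kick in.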

You could just treat it as if it were a dynamic array.

Start with a fixed size, and when you modify the vertex data, if it's too much for the buffer to handle, orphan it and create a new one with a new size (you might want a "grow strategy", say, twice the previous size, or 1.5 times the previous size, whatever works best).

First few resizes won't be fast but after a while resizing won't happen that often.
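A minimal sketch of such a grow strategy, in C++ and without any GL calls. `grownCapacity` is a hypothetical helper; the idea is that you would only reallocate the buffer when the returned value differs from the current capacity.

```cpp
#include <cassert>
#include <cstddef>

// Multiply `current` by `factor` until it can hold `needed`.
// Returns `current` unchanged if the data already fits
// (a capacity of 0 is treated as 1 so the loop can make progress).
std::size_t grownCapacity(std::size_t current, std::size_t needed, double factor = 2.0) {
    assert(factor > 1.0);
    std::size_t cap = current > 0 ? current : 1;
    while (cap < needed) {
        std::size_t next = static_cast<std::size_t>(cap * factor);
        cap = next > cap ? next : cap + 1;  // guarantee progress for tiny capacities
    }
    return cap;
}
```

With a factor of 2, going from 1024 to 5000 elements takes the capacity straight to 8192, so the next few frames of moderate growth need no reallocation at all.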


"... and VBOs have a fixed size."

Every call to glBufferData() changes the size of the VBO, so you are working on false assumptions.

Double-buffer or triple-buffer the vertex buffers and call glBufferData() every frame to update them.

L. Spiro


If vertex data changes every frame (i.e. you're streaming), you have four options:

1. one buffer, three times the size, persistently mapped (GL_MAP_WRITE_BIT|GL_MAP_PERSISTENT_BIT|GL_MAP_COHERENT_BIT)

2. one buffer, three times the size, mapped with GL_MAP_UNSYNCHRONIZED_BIT

3. one buffer, three times the size, using glBufferSubData

4. one buffer, invalidated using glBufferData(..., 0), and replaced by another one (again, glBufferData)

(1) is the fastest, but requires ARB_buffer_storage and setting fences: you write to one third of the buffer while the second third is in transfer and the last third is being drawn from

(2) avoids a CPU-GPU sync, but causes a client-server thread sync; requires GL 3.x and setting fences

(3) same as (2) but surprisingly faster on most IHVs, no fence needed

(4) compatibility mode, should probably even work with GL 1.5/2.0, somewhat slower but not that bad
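For options 1 to 3, the "three times the size" part boils down to picking a different third of the buffer each frame. A rough C++ sketch of just that offset arithmetic follows; `StreamRing` and its members are invented names, and the actual mapping, upload and fence calls are left out.

```cpp
#include <cassert>
#include <cstddef>

struct StreamRing {
    std::size_t regionBytes;       // bytes reserved for one frame's vertex data
    int regions = 3;               // three regions, as in options 1-3
    unsigned long long frame = 0;  // number of frames recorded so far

    explicit StreamRing(std::size_t bytes) : regionBytes(bytes) {}

    // Size to allocate once, up front.
    std::size_t totalBytes() const { return regionBytes * regions; }

    // Byte offset of the region to write this frame; advances to the next region.
    std::size_t nextWriteOffset() {
        assert(regions > 0);
        std::size_t offset = static_cast<std::size_t>(frame % regions) * regionBytes;
        ++frame;
        return offset;
    }
};
```

The returned offset is what you would pass as the start of the mapped range or of the sub-data upload, and also use as the base offset when drawing that frame.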

Creating a pool and resizing is a good idea. However there is another problem I'm facing when I use VBOs:

I'm currently using a List to store vertex data and render them as described above. This is very comfortable, because I can add/remove vertices from the list every frame. The downside is that rendering a huge amount of vertices takes ages. So VBOs are much faster for rendering, but a lot of flexibility is taken away, because now I need to use arrays to store vertex/index data. With VBOs I have to build a new vertex/index/normal buffer whenever the geometry changes and in my case, this can happen every frame. I know that this is not a VBO/OpenGL problem per se, but maybe someone has good ideas to solve that.

Well, it does not matter a lot. You can keep your list and still use VBOs if you use any form of mapping (it's just 3 more lines of code, and somewhat slower copying the elements into the buffer due to cache misses traversing the list). Only glBuffer(Sub)Data won't work with a list, obviously.

That said, for very small objects such as vertices, a list is almost always a bad choice for a storage container (even though in theory, according to the textbook, it is the "correct one"). Despite anything that big-O might tell you, vectors (arrays) or deques (basically vectors of vectors) will perform equally or better for most operations most of the time. This, unintuitively, includes inserting at random locations, unless the objects are quite large (see for example this guy's benchmarks where list only breaks even for random inserts at an object size of 128 and loses everywhere else).

Why not try a vector (or a deque, if you will) and see how it works out? Notable C++ people nowadays suggest using vector as the default unless there is really a very urgent reason for something different.

And indeed, unless the dataset is huge, there's no issue with a vector for any operation (even random inserts), and it is very cache and copy friendly. A deque is somewhere in the middle and combines the best (or worst, in some cases, e.g. frequent reallocations) of the two. Just give it a try, it doesn't really cost a lot of work.
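To make the "contiguous storage" point concrete, here is a small C++ sketch that packs a polygon soup into one interleaved float array, which is the shape of data a buffer upload wants. The `Vec3`, `Vertex` and `Polygon` types are assumptions loosely modelled on the first post, not the OP's actual classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vertex { Vec3 position; Vec3 normal; };
using Polygon = std::vector<Vertex>;

// Pack position + normal of every vertex into one interleaved float array
// (6 floats per vertex: px, py, pz, nx, ny, nz).
std::vector<float> flatten(const std::vector<Polygon>& polygons) {
    std::size_t count = 0;
    for (const Polygon& p : polygons) count += p.size();

    std::vector<float> out;
    out.reserve(count * 6);  // one allocation instead of repeated growth
    for (const Polygon& p : polygons)
        for (const Vertex& v : p)
            out.insert(out.end(), {v.position.x, v.position.y, v.position.z,
                                   v.normal.x, v.normal.y, v.normal.z});
    assert(out.size() == count * 6);
    return out;
}
```

The result's `data()` pointer and `size() * sizeof(float)` are then all an upload call needs.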

Your code looks like C#, so I'm assuming you haven't looked at the list.ToArray() function, which is almost free, so you can pass list.ToArray() and its size to GL.BufferData. Also, you don't need to generate new buffers. Just update the old ones.

I tried it now and it works. I use list.ToArray() to convert my list to an array.


"To get around GPU stalling, you can have multiple of each size (e.g. 10 of each buffer size), and store them in a linked list and grab the buffer at the start of the linked list. At the end of the frame you can put it back at the end of the list."

But I don't understand what the advantage of multiple VBOs of the same size is. Why does my GPU stall when I use just one VBO?

To make sure I understand you correctly: let's say I allocate 10 vertex buffers and store them in a list. In the first frame, I take the first VBO in the list, store vertex data in it and send it to the graphics card. In the next frame I take the next empty VBO in the list and store my new vertex data in that one. Why shouldn't I just overwrite the first VBO with my new vertex data?

The whole point of using a buffer object in the first place is decoupling the rendering on the GPU from your drawing loop. If you use immediate mode (GL.Begin / End), the server conceptually must wait for you to submit one vertex after another, and it does not know when you'll be done before it sees GL.End(). At that point, it can upload the whole block of vertex data that it has collected and tell the GPU to do something with it. Which means that in the meantime, the GPU is doing nothing, which is not what you want. Ideally, you want the GPU and the CPU to work at the same time.

Similar thing when you draw with a vertex array (client side, not a buffer object). You save some API calls because instead of submitting every vertex one by one, you only submit one array and one draw command. Which is better already, but still the GPU has to wait. You could modify the data in that array at any time, so when is it safe for OpenGL to access this? The only time this is safe is within the draw call. As your thread is executing the draw call, the server knows that it can't execute something different, such as code that modifies the array. So, it has to wait until the draw call before it can make a copy and upload it.

A buffer object is owned by OpenGL. You cannot modify the contents except via the BufferData API or by mapping the buffer object. Which means that OpenGL knows that the buffer's contents are valid at all times. It can therefore upload the buffer without having to wait, and the GPU can start processing it as soon as it's done with whatever it was doing before.

In theory.

In practice, OpenGL must still make sure that "things work correctly", and it must fulfill the guarantees that the API provides. One such guarantee is that you are allowed to load data into a buffer and issue some drawing commands, then load different data into the buffer (while drawing isn't finished yet!) and issue some other drawing commands, and this must work "as expected". Which means no more and no less than this: if you use a single buffer, the server again has to synchronize.

Invalidating the buffer, or using several buffers or buffer sub-regions, removes this need to synchronize. If, for example, you invalidate the buffer object with glBufferData(...,0) then you're telling OpenGL that you are done with this one, and it can do whatever it wants. OpenGL will keep the buffer contents around for as long as it still has unfinished drawing commands that read from it, and then it will throw them away. In the meantime, whenever you talk of that buffer, you are really talking of a new, different one. Which, of course, does not need to be synchronized, since no draw commands depend on it: it's a totally different buffer.
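As an analogy only, here is how that orphaning behaviour can be modelled in plain C++ with reference counting. This is not how any driver is actually implemented; `Buffer`, `PendingDraw`, `orphanAndFill` and `draw` are invented for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using Storage = std::vector<float>;

// The buffer "name" stays the same; the storage behind it can be swapped out.
struct Buffer {
    std::shared_ptr<Storage> storage = std::make_shared<Storage>();

    // Orphan: detach the old storage and point the same buffer at fresh memory.
    void orphanAndFill(const Storage& data) {
        storage = std::make_shared<Storage>(data);
    }
};

// A queued draw keeps its own reference to the storage it was issued with.
struct PendingDraw {
    std::shared_ptr<const Storage> source;
};

PendingDraw draw(const Buffer& b) {
    assert(b.storage != nullptr);
    return PendingDraw{b.storage};
}
```

A draw queued before the orphan still sees the old contents, and the old storage is freed once the last pending draw that references it goes away. Nothing has to wait, which is the whole point.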

Similar stuff with mapping persistent buffer subranges and such, except synchronizing properly (using fences) is your responsibility. In the average case, the fence wait costs nothing, because with 3 buffers (or sub-regions) the GPU has already finished with a region by the time you try to synchronize on it. However, you must still do it to guarantee that everything still works correctly in the worst case.
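The fence bookkeeping can be sketched in C++ with plain frame counters instead of real sync objects. `FencedRing` and its fields are hypothetical, and `gpuCompletedFrame` stands in for whatever a real fence query would report.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct FencedRing {
    std::array<std::uint64_t, 3> fence{};  // frame that last used each region (0 = never)
    std::uint64_t gpuCompletedFrame = 0;   // highest frame the GPU has finished
    std::uint64_t cpuFrame = 0;            // frames recorded so far

    // True if the region about to be written is still in use by the GPU.
    bool wouldStall() const {
        return fence[cpuFrame % 3] > gpuCompletedFrame;
    }

    // Record a frame that writes to and draws from the next region.
    int beginFrame() {
        assert(!wouldStall());  // a real renderer would block on the fence here
        int region = static_cast<int>(cpuFrame % 3);
        ++cpuFrame;
        fence[region] = cpuFrame;  // frames are numbered from 1
        return region;
    }
};
```

As long as the GPU is no more than two frames behind, `wouldStall()` stays false, which matches the "average case" above. The check only matters when the CPU gets three frames ahead.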

