Implementing a fast dynamic vertex/index buffer pool

Started by
2 comments, last by Yourself 7 years, 1 month ago

Hi!

I'm implementing a destructible voxel terrain.

For rendering changeable, dynamic geometry I need to frequently allocate and free dynamic vertex and index buffers.

What is the preferred way to organize VBs/IBs for quick allocations and frees?

(The usual approach is to organize buffers in several bins, round the size of each chunk up to the next power of two and iterate the linked list in the bin. But it degenerates to linear search if many chunks have almost equal size.)

Advertisement

Each bin should contain only the VBs/IBs that are the same size. That way to get/release a bin you just grab/return the last one, which is O(1) or constant time.

What is the preferred way to organize VBs/IBs for quick allocations and frees?

Simple - don't allocate and free stuff ! <g>.

A few possible approaches:

1. a cache. Array of structs. a struct has a static VB pointer, a static IB pointer, numverts, numtris, and any other necessary info such as is_active and age for LRU paging and the chunk location in the game world. VBs and IBs are all the same size, and are "big enough" for the worst case scenario. They get allocated one time at program start, and released one time at program shutdown. when its time to generate a "chunk", you get the index of the first inactive struct, or the least recently used one. you lock the VB and IB and overwrite with new data, set it to active, set the age to zero, and set the chunks location. Unless you change VBs every frame, static buffers should be fast enough. A function that looks up a chunk index given a world location is helpful. Caveman 3.0 uses this method to generate the ground mesh on the fly.

2. same as option 1, but using dynamic buffers, due to frequent updates. Probably use more buffers to insure you have a buffer not being used by the GPU when it comes time to lock. If you don't have extra buffers around, you'll cause a stall.

3. If you can't afford the memory for a real cache, you can allocate and de-allocate exact size buffers on the fly, but it will run slower. If all chunks tend to be the same size, you probably won't save much RAM.

4. linked lists tend to be slow for many/most operations compared to other types of lists.

For Caveman 3.0, i simply draw all possibly visible chunks. I do a trivial cull based on camera direction. For each chunk that passes that, i do a lookup to get its cache index. If the chunk is not in the cache, i generate it right then and there - right in the middle of a begin scene / end scene block. As many as six at once. Then i do a nested loop that is basically:

for each texture in the chunk:

set texture chunk[c].tex[t]

for each mesh that uses that texture:

if mesh passes frustum cull:

set mesh chunk[c].tex[t].mesh[m].VB

set index chunk[c].tex[t].mesh[m].IB

set world mat based on chunk[c].location

DrawIndexdPrimitive DevicePtr chunk[c].tex[t].mesh[m].numverts chunk[c].tex[t].mesh[m].numtris

In my case, a chunk is a list of 25,000 draw calls (max), and the four interleaved meshes that make up the ground mesh for a chunk are just four of those 25,000 possible draw calls. The rest are rocks, plants, trees, etc.

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Depending on the api you use, you should try a single memory buffer combined with a specific allocator.

I recently had great success using this on a buffer of about 512mb and a buddy allocator

That 512 is the highest setting (medium was 256, low 128) and on a higher level I made sure that you never need more memory (aka bias on culling/LODs etc). Cool thing is that you always know the exact amount of memory needed and only have a single allocation at loading time

It does depends on your graphics api as not all of them allow you to use a 'random' memory address as IB/VB/...

This topic is closed to new replies.

Advertisement