Geometry buffer management

Posted by Nairou

This feels like a trivial issue, but I'm a little stuck on memory management of geometry vertex data, and was hoping for some input. I've got my mesh class, which loads an object as a collection of geometry chunks (one material per chunk). For each chunk, there are one or more geometry streams, where each stream is a single type of vertex data (position, normals, texture0, etc.). The streams are stored separately in order to reduce the memory used for materials that don't need every stream loaded. When a geometry stream is needed in order to render an object, it is written into the vertex buffer and set for rendering using SetStreamSource.

There are two problems though. The first is that I'm now having to dynamically manage the data within my vertex buffer, keeping track of what portions are used by which geometry streams. Additionally, if a mesh object is deleted, I have to somehow catch that and clear it from the vertex buffer as well.

The other problem is memory duplication. So far I've been storing the geometry in system memory, and copying it to the vertex buffer when needed. However, while it is in the vertex buffer, it is wasting system memory by existing twice. Which means that I should probably deallocate it from system memory once it has been uploaded to the vertex buffer. But in order to do this, it seems like I would need some sort of synchronization between the mesh objects and the renderer's vertex buffer, so that the mesh knows why its data is being dumped without the whole mesh being deleted, and without it resulting in invalid pointers.

The only alternative I can think of is to promote the geometry streams to full-blown classes and have the renderer ask them to load/unload themselves to/from the vertex buffer, rather than the renderer reading them as just data. But this also starts to sound a bit messy, with unnatural dependencies.

How do other people handle this? I've googled around for some time but haven't found a whole lot on this topic, yet I know everyone has to deal with this sort of thing in one way or another.
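
Roughly, the layout I'm describing looks something like this (names and types are just illustrative, not my actual code):

#include <vector>

// Illustrative sketch only -- one stream per vertex attribute, grouped into chunks by material.
enum StreamType { STREAM_POSITION, STREAM_NORMAL, STREAM_TEXCOORD0 /* ... */ };

struct GeometryStream {
    StreamType        type;    // which vertex attribute this stream holds
    std::vector<char> data;    // raw vertex data, currently kept in system memory
    unsigned          stride;  // bytes per vertex in this stream
};

struct GeometryChunk {
    int                          materialId;  // one material per chunk
    std::vector<GeometryStream>  streams;     // only the streams this material needs
};

struct Mesh {
    std::vector<GeometryChunk> chunks;
};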

Nothing? Am I perhaps going about this completely differently from everyone else? Even if you don't have an answer to my specific question, I'd love to see how you deal with geometry buffer management in your game, as a point of comparison and ideas.

Quote:
Original post by Nairou
There are two problems though. The first is that I'm now having to dynamically manage the data within my vertex buffer, keeping track of what portions are used by which geometry streams. Additionally, if a mesh object is deleted, I have to somehow catch that and clear it from the vertex buffer as well.


Could you explain this a bit more? Maybe give an example?

Quote:
The other problem is memory duplication. So far I've been storing the geometry in system memory, and copying it to the vertex buffer when needed. However, while it is in the vertex buffer, it is wasting system memory by existing twice. Which means that I should probably deallocate it from system memory once it has been uploaded to the vertex buffer.


Sorry if I'm missing something, but why not just stick the data in a vertex buffer in the first place? Why do you ever need it to be in system memory?

Quote:
Original post by Gage64
Quote:
Original post by Nairou
There are two problems though. The first is that I'm now having to dynamically manage the data within my vertex buffer, keeping track of what portions are used by which geometry streams. Additionally, if a mesh object is deleted, I have to somehow catch that and clear it from the vertex buffer as well.


Could you explain this a bit more? Maybe give an example?

Well I'm trying to treat the vertex buffer as more of a cache than a per-object buffer (kinda like what Yann L describes in his buffer management post). I allocate a single large vertex buffer, then treat it as a pool for uploading geometry to the video card. Since I treat each type of vertex data separately (block of vertex positions, block of vertex colors, block of vertex normals, etc.), depending on what data is needed to render an object, there are potentially multiple blocks of data that need to be copied to the vertex buffer.

When I delete the object in memory, once it is no longer needed, all of its geometry will of course need to be deleted, including the data that was copied to the vertex buffer (if it is still there). So I'll need some sort of notification of when an object is deleted so that the vertex buffer can be emptied. The only thing I can think of so far to handle this is to put the geometry struct itself in charge of copying its data to the vertex buffer, so that it can handle the removal itself in its destructor.
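
In other words, something along these lines (just a sketch, the names are made up):

#include <cstddef>
#include <vector>

// One big vertex buffer treated as a pool, plus a table of which byte ranges
// are currently occupied by which geometry stream.
struct PoolBlock {
    std::size_t offset;   // byte offset into the big vertex buffer
    std::size_t size;     // size of the uploaded stream data
    int         streamId; // identifies the geometry stream that owns this range
};

class VertexBufferPool {
public:
    // Copies stream data into a free range; returns false if there is no room.
    bool Upload(int streamId, const void* data, std::size_t size, std::size_t* outOffset);

    // Called when the owning mesh/stream is destroyed so the range can be reused.
    void Release(int streamId)
    {
        for (std::size_t i = 0; i < blocks.size(); )
        {
            if (blocks[i].streamId == streamId)
                blocks.erase(blocks.begin() + i);  // free this range for reuse
            else
                ++i;
        }
    }

private:
    std::vector<PoolBlock> blocks; // occupied ranges, kept sorted by offset
};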

Quote:
Quote:
The other problem is memory duplication. So far I've been storing the geometry in system memory, and copying it to the vertex buffer when needed. However, while it is in the vertex buffer, it is wasting system memory by existing twice. Which means that I should probably deallocate it from system memory once it has been uploaded to the vertex buffer.


Sorry if I'm missing something, but why not just stick the data in a vertex buffer in the first place? Why do you ever need it to be in system memory?

I plan to be dealing with a very large world. My big assumption in all of this is that I will have more geometry loaded in system memory than I will be able to fit in video memory, and so rather than stick everything into individual vertex buffers, I only upload them to a vertex buffer when I'm ready to render them.

Quote:
Original post by Nairou
I plan to be dealing with a very large world. My big assumption in all of this is that I will have more geometry loaded in system memory than I will be able to fit in video memory, and so rather than stick everything into individual vertex buffers, I only upload them to a vertex buffer when I'm ready to render them.


I think D3D handles this for you. From the SDK regarding D3DPOOL_DEFAULT:

Quote:
When creating resources with D3DPOOL_DEFAULT, if video card memory is already committed, managed resources will be evicted to free enough memory to satisfy the request.
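
For example, creating the buffers in the managed pool (just a sketch; device, Vertex, and numVertices are placeholders) lets the runtime keep a system-memory backing copy and page the data in and out of video memory for you:

#include <d3d9.h>

// Sketch only: with D3DPOOL_MANAGED the runtime keeps a system-memory copy and
// pages the buffer into video memory as needed, evicting other managed
// resources when video memory runs out.
IDirect3DVertexBuffer9* vb = NULL;
HRESULT hr = device->CreateVertexBuffer(
    numVertices * sizeof(Vertex),  // size of the buffer in bytes
    D3DUSAGE_WRITEONLY,            // we only ever write to it from the CPU
    0,                             // no FVF; a vertex declaration is used instead
    D3DPOOL_MANAGED,               // let D3D manage system <-> video memory
    &vb,
    NULL);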


Quote:
Original post by Nairou
When I delete the object in memory, once it is no longer needed, all of its geometry will of course need to be deleted, including the data that was copied to the vertex buffer (if it is still there). So I'll need some sort of notification of when an object is deleted so that the vertex buffer can be emptied. The only thing I can think of so far to handle this is to put the geometry struct itself in charge of copying its data to the vertex buffer, so that it can handle the removal itself in its destructor.

If you actually follow Yann's scheme (the one you've cited above), then the LRU handles this problem automatically. However, marking pages as outdated manually when a mesh object is deleted may still have an optimizing effect. That should be no problem, since you have to track which pages hold which geometry anyway.
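
As a rough sketch (names invented), the LRU plus the manual invalidation could look like this:

#include <cstddef>
#include <vector>

// Each page/slot in the big vertex buffer remembers when it was last used.
// Deleting a mesh can simply mark its pages free, so the LRU search is skipped.
struct Page {
    int      streamId;      // -1 when the page is free
    unsigned lastUsedFrame; // updated whenever the page is bound for drawing
};

int FindPageToEvict(const std::vector<Page>& pages)
{
    int best = -1;
    unsigned oldest = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < pages.size(); ++i)
    {
        if (pages[i].streamId < 0)
            return static_cast<int>(i);        // a freed page: reuse immediately
        if (pages[i].lastUsedFrame < oldest)   // otherwise pick the least recently used
        {
            oldest = pages[i].lastUsedFrame;
            best = static_cast<int>(i);
        }
    }
    return best; // -1 only if there are no pages at all
}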

[Edited by - haegarr on July 6, 2008 5:29:00 AM]

I am once again implementing a somewhat different buffer management scheme than Yann L's.

The idea is to couple materials and geometry to reduce material changes and buffer bind operations, so I subdivide render_objects into material_groups. The render_object is just an instance of a mesh/model, and the material_groups represent the tiny submeshes, each with a different material. That's the only way I can think of that allows you to efficiently reduce the overhead of material and VBO changes.

I basically use two allocators, as described in Yann L's post.

Now I associate each slot with (materialid, time) for LRU.
Every time I want to render something, the process looks like this:

1) find all visible render_objects
2) generate lists of material_groups
2a) sort the material_group lists by their assigned VBO slot
3) iterate over all material lists, cache and bind the buffers when needed, and render the material_groups
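
A rough sketch of step 3 (sortedGroups, BindMaterial, and DrawGroup are just made-up names; assuming OpenGL VBOs):

// Walk the sorted groups and only rebind the material or the VBO when it
// actually changes from one group to the next.
int boundMaterial = -1;
unsigned boundVbo = 0;  // 0 = no VBO bound
for (size_t i = 0; i < sortedGroups.size(); ++i)
{
    const material_group* g = sortedGroups[i];
    if (g->materialid != boundMaterial)
    {
        BindMaterial(g->materialid);               // textures, shaders, render state
        boundMaterial = g->materialid;
    }
    if (g->vbo_id != boundVbo)
    {
        glBindBuffer(GL_ARRAY_BUFFER, g->vbo_id);  // bind the slot's VBO
        boundVbo = g->vbo_id;
    }
    DrawGroup(g);                                  // issue the draw call(s) for this group
}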

Cons:
- you update the modelview stack more often, since the mesh is instanced in render_object and you lose the correspondence material_group <-> render_object during generation of the material lists.
- you should consider the world position for sorting as well (early Z!)


Pros:
- You reduce texture and VBO switches to a minimum.


I am currently implementing this scheme, so if you have any suggestions please post them. Once it's done I can probably give some feedback about its efficiency.

Quote:
Original post by Gage64
When creating resources with D3DPOOL_DEFAULT, if video card memory is already committed, managed resources will be evicted to free enough memory to satisfy the request.

True to some extent, except that I'd rather free space within the vertex buffer than free some other random resource. I'm trying to manage the space within an existing vertex buffer rather than allocate a new vertex buffer.

Though that does bring up a good question: what is the performance difference between allocating hundreds of vertex buffers (one per object) and switching buffers to render each object, versus storing them all in one big vertex buffer and making hundreds of SetStreamSource calls to access everything within it? Still, if I keep everything sorted by material (vertex format), then the SetStreamSource calls should be reduced somewhat, and I can make several draw calls without changing anything.
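
For example (just a sketch; bigVertexBuffer, Vertex, and the object structs are made up), each object would be drawn from its own region of the shared buffer via the offset parameter of SetStreamSource:

// One shared buffer, selected per draw call via the OffsetInBytes parameter,
// so the buffer itself never has to change between these draws.
device->SetStreamSource(0, bigVertexBuffer, objectA.byteOffset, sizeof(Vertex));
device->DrawPrimitive(D3DPT_TRIANGLELIST, 0, objectA.triangleCount);

device->SetStreamSource(0, bigVertexBuffer, objectB.byteOffset, sizeof(Vertex));
device->DrawPrimitive(D3DPT_TRIANGLELIST, 0, objectB.triangleCount);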

@haegarr:
That's a very good point, I hadn't thought too much about the LRU part yet. But yeah, it will probably free up space more quickly if I can just mark the geometry in the vertex buffer as no longer needed, so that it can be replaced immediately without an LRU search first.

@Basiror:
What you describe is actually very similar to Yann's method (your material_group and Yann's geometry chunk seem to be the same thing). We all seem to be doing variations on the same concept.

Quote:
Original post by Nairou
True to some extent, except that I'd rather free space within the vertex buffer than free some other random resource. I'm trying to manage the space within an existing vertex buffer rather than allocate a new vertex buffer.


I don't think it will be a random resource. D3D is probably smarter than that and uses a policy (like LRU) for this sort of thing. Also, it might have better knowledge of resource utilization, so it might be in a better position than you to decide what to free. Then again, it might not be as simple as that in more "advanced" situations.

Quote:
Though that does bring up a good question: what is the performance difference between allocating hundreds of vertex buffers (one per object) and switching buffers to render each object, versus storing them all in one big vertex buffer and making hundreds of SetStreamSource calls to access everything within it? Still, if I keep everything sorted by material (vertex format), then the SetStreamSource calls should be reduced somewhat, and I can make several draw calls without changing anything.


According to some stuff I read a long time ago (which I think is still applicable), creating a huge 500MB vertex buffer to store all geometry is a big mistake, because the GPU will not be able to move it back and forth from video memory, so it will always remain in system memory, severely degrading performance.

Creating one vertex buffer for each object will result in many vertex buffer switches, which are relatively expensive (as far as I know).

So, creating reasonably sized vertex buffers to hold several objects is the best choice. It might clash with material sorting, but it will probably still mean fewer VB switches than the one-VB-per-object method.

@Nairou: Yes, that is correct; however, Yann L doesn't take the material id into consideration when choosing a VBO slot to place the geometry inside.

He assumes that the VBO slots are large enough that you statistically don't have to rebind them that often, so he accepts a reduction of material bind operations at the cost of VBO rebinding.
I, however, prefer a technique where I bind materials and VBOs simultaneously, without rebinding during the same frame.

E.g.: if slot1 and slot2 are used by material 1, it is guaranteed that I bind slot1 and slot2 only once this frame.
The trick here is to use many smaller VBOs instead of a few large ones, which are more expensive to move around from system memory to VRAM. This also reduces the time needed to proceed with rendering, since moving a VBO of only half the size costs only half the time, and it offers the nice side effect that you can do this while binding the next materials too.


Another nice effect is that you can reduce my management scheme to Yann L's original idea within a few lines of code; just get rid of the material <-> slot constraint.
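
The per-slot bookkeeping for this is tiny (just a sketch, names invented):

// Each small VBO slot is tied to one material, plus a timestamp for LRU reuse.
struct VboSlot {
    unsigned vbo;            // the buffer object backing this slot
    int      materialId;     // -1 while the slot is unassigned
    unsigned lastUsedFrame;  // for LRU when a slot has to be reclaimed
    unsigned bytesUsed;      // how much of the slot is currently filled
    unsigned capacity;       // fixed slot size in bytes
};
// Ignoring materialId and picking any free slot gives you Yann L's original scheme back.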

@Basiror:
How do you guarantee that you bind the slots only once per frame? Do you do some sort of calculation for how much data needs to be stored in the VBO ahead of time? What if you suddenly have a lot of objects on the screen using the same material, and it won't all fit in your allocated slot for that material?

Well, each render_object is composed of material_groups, and each material_group contains a reference to the VBO you need to bind.

So you can sort your material_groups accordingly.

Just pseudo code

// materialid and vbo_id are assumed to be integer ids here.
// Bucket the material_groups first by material, then by the VBO slot they live in.
std::map<int, std::map<int, std::list<material_group*> > > materiallistsmap;
for (std::list<material_group*>::iterator it = material_groups.begin();
     it != material_groups.end(); ++it)
{
    material_group* m = *it;
    materiallistsmap[m->materialid][m->vbo_id].push_back(m);
}




Your list<material_group*> could once again be a multimap<float, material_group*>, where the float key is the distance from the camera, to make use of early Z.
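
Something along these lines (DistanceToCamera and worldPosition are made-up names):

std::multimap<float, material_group*> depthSorted;
for (std::list<material_group*>::iterator it = groups.begin(); it != groups.end(); ++it)
{
    float d = DistanceToCamera((*it)->worldPosition);  // made-up helper
    depthSorted.insert(std::make_pair(d, *it));        // nearest groups sort first
}
// Iterating depthSorted then visits the groups front to back.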

Another extension would be to partially invalidate the multimap<float, material_group*> each frame, for material_groups that become invisible (are culled away).

I haven't implemented it yet, so it is just an idea, but I think it is worth evaluating its real-time efficiency.
Maybe you need to implement a custom container to speed up insertion and deletion.

There is another optimization I did not mention yet: the render_objects can obviously contain material_groups with the same materials, so it would be straightforward to merge those to reduce function calls.


In my upcoming project this is impossible to do, though, because I plan to implement some very destructible buildings out of premade static_meshes that form a deformation hierarchy. So my mesh contains lots of tiny submeshes, and it is nearly impossible to reduce the function call overhead.
I could rebuild the geometry buffer whenever the deformation state changes, but this might cause lags for larger buildings.

Quote:
Original post by Nairou
@Basiror:
How do you guarantee that you bind the slots only once per frame? Do you do some sort of calculation for how much data needs to be stored in the VBO ahead of time? What if you suddenly have a lot of objects on the screen using the same material, and it won't all fit in your allocated slot for that material?


That's just a matter of how large your slots are and how many slots you are using per material. You may use multiple slots.

It would also be interesting to share slots between consecutive materials.

As Yann L already mentioned, this all depends on your parameterization. Allocate a pool large enough to hold your visible geometry.
Without buffer management you would simply store all the geometry in VBOs and let the driver handle the system <-> VRAM transfer if it doesn't fit into your VRAM. So there is nothing wrong with allocating lots of VBO slots and doing the transfer on your own.

Your aim should be to reduce the buffer binding operations and to avoid thrashing the buffer cache (no repeated rebuilding during the same frame).
Even if you provide infinitely many slots and store all the geometry there, you will still be faster than using per-object VBOs, because you reduce the binding operations tremendously.

