GuyCalledFrank

Streaming mesh data in background


Hi. I want to get rid of loading screens in the game and load everything in the background while moving from one sector/chunk/location/whatever to another. It is not so difficult with textures, because there are not too many texture sizes in use, so I can preallocate some number of 256/512/1024 textures and update them on demand. But the main difficulty is the VB/IB, because mesh sizes can vary a lot.

The obvious way is to divide the world into equal-sized pieces and store them in a VB/IB pool much like the textures, but that is not really a good idea when you have a lot of duplicated objects like streetlights, trashcans, trees, etc.; it is difficult to imagine a game with totally unique geometry. So, somehow I need to manage these duplicated objects with rules like:
- avoid allocating/freeing meshes at runtime;
- do not store one object multiple times;
- do not waste too much VRAM (for example, I could go with pools again and store small objects in a pool limited by a much larger chunk size, wasting memory);
- do not load more than needed (e.g. a new location may share 50% of its objects with the old one; those should not be reloaded)

What is the best way to accomplish this with DX9?
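
For reference, here is roughly what I mean by the texture pool: a minimal sketch assuming D3D9 and DXT1 textures, with names that are purely illustrative and not real engine code:

[code]
#include <d3d9.h>
#include <vector>

// Fixed-size texture pool: everything is created once at startup, then only
// reused and refilled at runtime (e.g. via UpdateTexture from a SYSTEMMEM copy).
struct TexturePool
{
    std::vector<IDirect3DTexture9*> free;   // preallocated, currently unused textures
    UINT size = 0;                          // 256, 512 or 1024

    void Init(IDirect3DDevice9* dev, UINT texSize, UINT count)
    {
        size = texSize;
        for (UINT i = 0; i < count; ++i)
        {
            IDirect3DTexture9* tex = nullptr;
            // Levels = 0 -> full mip chain; D3DPOOL_DEFAULT so it lives in VRAM.
            if (SUCCEEDED(dev->CreateTexture(texSize, texSize, 0, 0, D3DFMT_DXT1,
                                             D3DPOOL_DEFAULT, &tex, nullptr)))
                free.push_back(tex);
        }
    }

    IDirect3DTexture9* Acquire()            // hand one out when a new texture streams in
    {
        IDirect3DTexture9* t = free.back();
        free.pop_back();
        return t;
    }

    void Release(IDirect3DTexture9* t) { free.push_back(t); }
};
[/code]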
---
I know how to load data into RAM in a separate thread; the question is only about updating the VB/IB.

The best solution I can find is to have one huge VB/IB and upload everything into it; when something is deleted, defragment the buffer by re-uploading the existing data to new locations to close the gap left by the deleted object.

---

[quote]The best solution I can find is to have one huge VB/IB and upload everything into it; when something is deleted, defragment the buffer by re-uploading the existing data to new locations to close the gap left by the deleted object.[/quote]

Before managing the memory yourself, I would give the video driver a chance to manage it itself. When uploading a mesh, I would do it over several frames, with each frame allowed to upload only a limited amount of data (e.g. 1000 vertices). This way you have a constant, bounded impact on performance. A mesh should be marked as hidden during the upload to prevent any artifacts or crashes while rendering.
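
As a rough sketch of what I mean, assuming a dynamic VB already created at its full size and a system-memory copy of the vertex data (all names are made up):

[code]
#include <d3d9.h>
#include <cstring>

// One in-flight upload; the streaming thread fills 'src', the render thread
// calls UploadStep() once per frame until it returns true.
struct PendingUpload
{
    IDirect3DVertexBuffer9* vb;     // destination VB, created with D3DUSAGE_DYNAMIC
    const BYTE*             src;    // system-memory vertex data
    UINT                    stride; // size of one vertex in bytes
    UINT                    total;  // total vertex count
    UINT                    done;   // vertices copied so far
};

bool UploadStep(PendingUpload& u, UINT maxVerticesPerFrame = 1000)
{
    UINT count  = u.total - u.done;
    if (count > maxVerticesPerFrame) count = maxVerticesPerFrame;
    UINT offset = u.done * u.stride;
    UINT bytes  = count * u.stride;

    void* dst = nullptr;
    // NOOVERWRITE: we only ever write to a region nothing is drawing from yet,
    // so the driver doesn't have to stall or rename the buffer.
    if (SUCCEEDED(u.vb->Lock(offset, bytes, &dst, D3DLOCK_NOOVERWRITE)))
    {
        std::memcpy(dst, u.src + offset, bytes);
        u.vb->Unlock();
        u.done += count;
    }
    return u.done == u.total;       // keep the mesh hidden until this is true
}
[/code]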

---
[quote]- do not store one object multiple times[/quote]

That part's fairly simple. You can use an asset management system where every asset has a unique identifier (like a file path hashed into an unsigned int), and the asset manager keeps all of the loaded assets in a hash map. Entities should never own a mesh, but keep a pointer / smart pointer / handle to it. When creating an entity or setting its mesh, you ask the mesh manager for the mesh, and the manager finds it in the map and returns its handle. If a resource is not found, you can return an error mesh (such as a cube procedurally created at init, and thus guaranteed to exist) and, at the same time, start loading the requested mesh.

This also works for LOD management. If a high-detail mesh is requested but isn't loaded, return the low-detail version, and when the high-detail one is done loading, switch them (if you use handles this works automagically: the entity doesn't need to change anything because the handle stays the same, but the mesh it is pointing to is now different).
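
In code it could look roughly like this; a bare-bones sketch with made-up names, where the hashing, async load request and handle type are all just illustrative choices:

[code]
#include <memory>
#include <string>
#include <unordered_map>

struct Mesh { unsigned vbOffset = 0, ibOffset = 0, indexCount = 0; /* ... */ };
using MeshHandle = std::shared_ptr<Mesh>;   // entities keep this, never the Mesh itself

class MeshManager
{
public:
    MeshHandle Get(const std::string& path)
    {
        unsigned id = Hash(path);            // file path -> unsigned int
        auto it = meshes.find(id);
        if (it != meshes.end())
            return it->second;

        // Not resident: return a handle that currently points at a copy of the
        // error mesh and start loading. When the load finishes, the Mesh object
        // behind this same handle is overwritten in place, so every entity
        // holding the handle silently switches to the real data.
        MeshHandle handle = std::make_shared<Mesh>(errorMesh);
        meshes[id] = handle;
        RequestAsyncLoad(path, id);          // hypothetical call into the loader thread
        return handle;
    }

private:
    static unsigned Hash(const std::string& s)       // FNV-1a, for example
    {
        unsigned h = 2166136261u;
        for (unsigned char c : s) { h ^= c; h *= 16777619u; }
        return h;
    }
    void RequestAsyncLoad(const std::string&, unsigned) { /* enqueue for the loader */ }

    std::unordered_map<unsigned, MeshHandle> meshes;
    Mesh errorMesh;                          // procedurally built cube, created at init
};
[/code]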

[quote]- do not load more than needed (e.g. a new location may share 50% of its objects with the old one; those should not be reloaded)[/quote]

Also fairly simple. Your asset manager should have some basic sort of reference counting. If you have level A loaded and are about to switch to level B, increase the reference count of all assets that are used by level B, then decrease the reference count of all assets that are used by level A. Then delete assets with a reference count of 0. Assets already loaded will not be unloaded, since their reference count never dropped to 0.
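
A minimal sketch of just that switch order (the AssetManager here is made up and only tracks the counts; real loading/unloading is elided):

[code]
#include <map>
#include <set>
#include <string>

// Hypothetical manager: only the reference counts are shown.
struct AssetManager
{
    std::map<std::string, int> refs;

    void AddRef(const std::string& n)  { if (++refs[n] == 1) { /* load the asset */ } }
    void Release(const std::string& n) { --refs[n]; }

    void PurgeUnreferenced()
    {
        for (auto it = refs.begin(); it != refs.end(); )
        {
            if (it->second <= 0) { /* unload the asset */ it = refs.erase(it); }
            else ++it;
        }
    }
};

// AddRef level B first, then Release level A: assets used by both levels
// never drop to zero, so they are neither unloaded nor reloaded.
void SwitchLevel(AssetManager& assets,
                 const std::set<std::string>& levelA,
                 const std::set<std::string>& levelB)
{
    for (const auto& n : levelB) assets.AddRef(n);
    for (const auto& n : levelA) assets.Release(n);
    assets.PurgeUnreferenced();
}
[/code]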

[quote]so I can preallocate some number of 256/512/1024 textures and update them on demand. But the main difficulty is the VB/IB, because mesh sizes can vary a lot[/quote]

The fact that texture sizes are more regular doesn't help when talking about uploading data to VRAM. There is very little difference between creating a texture in VRAM and then updating it vs creating it when you already have the data. Even if every VB was exactly the same size, creating a VB of the appropriate size and then updating it with the actual data is pretty much the same as doing it all in one go.


Even AGP cards have massive memory bandwidth. Unless you are trying to upload over 100 MB of data in one frame, it's not really a problem.

---
[quote]I would do it over several frames, with each frame allowed to upload only a limited amount of data (e.g. 1000 vertices). This way you have a constant, bounded impact on performance. A mesh should be marked as hidden during the upload to prevent any artifacts or crashes while rendering.[/quote]
totally agreed


[quote]There is very little difference between creating a texture in VRAM and then updating it vs creating it when you already have the data.[/quote]
Creation time includes additional API calls and VRAM block allocation; it is better to avoid them if possible (just like we're trying to avoid dynamic allocation in RAM at runtime).
And if I want to avoid these dynamic allocations, then it is a bit more difficult to follow the first two rules, which would be really simple, as you mentioned, in a create/release-per-mesh scenario.
I think the problem here is not the bandwidth but the additional work needed to find a free block in VRAM; also, chaotic allocation/deallocation may cause fragmentation.

So I came up with a new idea: store one huge VB and IB for all static meshes and treat them as a circular buffer.
If we need to load a new chunk of static meshes, there are two cases (see the sketch after this list):
- the mesh is already in the buffer (found through a std::map or something): just link the RAM representation of the mesh to the appropriate offset in the buffer and increment the reference counter for this mesh;
- the mesh is not in the buffer: advance the buffer's "tail" by the size of the new VB/IB data and stream the data into that block.
If we need to unload an old chunk:
- if no references are linked to this chunk, just advance the "head" of the buffer;
- if there are references, re-upload the shared data at the tail, update the offsets in all references, then advance the head (deleting the chunk).
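
Roughly, the bookkeeping I have in mind; offsets are bytes into one big VB (the IB would work the same way), names are just for illustration, and wrap-around/free-space checks are left out:

[code]
#include <cstdint>
#include <map>
#include <string>

struct MeshBlock { uint32_t offset; uint32_t size; int refCount; };

struct RingVB
{
    uint32_t capacity = 0;              // size of the big VB in bytes
    uint32_t head = 0, tail = 0;        // live data occupies [head, tail) modulo capacity
    std::map<std::string, MeshBlock> resident;   // mesh name -> block inside the VB

    // Returns the offset the mesh lives at, streaming it in at the tail if needed.
    uint32_t Acquire(const std::string& name, const void* data, uint32_t size)
    {
        auto it = resident.find(name);
        if (it != resident.end())       // already in the buffer: just add a reference
        {
            ++it->second.refCount;
            return it->second.offset;
        }

        (void)data;                      // consumed by the real upload below
        uint32_t offset = tail;
        // StreamToVB(offset, data, size);   // Lock(offset, size) + memcpy on the real VB
        tail = (tail + size) % capacity;
        resident[name] = { offset, size, 1 };
        return offset;
    }

    void ReleaseChunk(/* list of mesh names in the chunk */)
    {
        // For each mesh: --refCount. Any block near the head that is still
        // referenced by a newer chunk gets re-uploaded at the tail and its
        // users' offsets patched; only then is 'head' advanced past the chunk.
    }
};
[/code]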

---
[quote]Creation time includes additional API calls and VRAM block allocation; it is better to avoid them if possible (just like we're trying to avoid dynamic allocation in RAM at runtime).[/quote]

Well, there's nothing wrong with dynamic allocation at runtime; what you want to avoid is a lot of dynamic allocation in a tight loop. If you follow a strategy of uploading at most one asset per frame while streaming in the new data, the overhead would be minuscule. And if you are loading the assets from a hard drive, you'd be getting way less than one per frame anyway.
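
For example (the queue and mesh types here are made up): the loader thread parks finished loads in a ready queue, and the render thread picks up at most one per frame:

[code]
#include <mutex>
#include <queue>

struct LoadedMesh { /* system-memory vertex/index data, sizes, target entity, ... */ };

std::queue<LoadedMesh> g_ready;     // filled by the streaming thread
std::mutex             g_readyLock;

// Called once per frame on the render thread.
void PumpUploads(/* IDirect3DDevice9* dev */)
{
    LoadedMesh mesh;
    {
        std::lock_guard<std::mutex> lock(g_readyLock);
        if (g_ready.empty()) return;
        mesh = g_ready.front();
        g_ready.pop();
    }
    // Create (or grab from a pool) the VB/IB and copy the data in. Doing this
    // at most once per frame keeps the allocation cost far too small to notice.
    // CreateAndFill(dev, mesh);
}
[/code]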

[quote]- if there are references, re-upload the shared data at the tail, update the offsets in all references, then advance the head (deleting the chunk).[/quote]
That would mean the VB would have to be much larger than needed. In the worst case, if levels A and B share all of their vertex data, you would need the buffer to be twice as large, even if only for a short amount of time.

[quote]So I came up with a new idea: store one huge VB and IB for all static meshes and treat them as a circular buffer.[/quote]
This is pretty much the same as the following:
At the beginning of a level, load the VBs into VRAM. They should end up contiguous (unless the driver has a reason to make them not, which you don't have control over; this could happen even for a single asset, as a single texture or VB might not be contiguous).
While loading a second level, if a VB is shared, create a new copy in VRAM. You now have one block of contiguous memory for level A and another for level B.
After switching, unload all VBs from the previous level. You now have a large block of unfragmented space.


Unless you have some very specific usage parameters, I'd leave it up to the driver to do all your memory allocation. Each driver has teams of people working on it who get paid a lot of money to solve exactly those problems, and they can do it better than you or I can, simply because they have intimate knowledge of the hardware they are working with :)

I've worked on a simulation that streamed hundreds of GB of data in and out of VRAM, with assets ranging from 1 GB down to 4-vertex planes, and which was required to run for days at a time; with some simple allocation strategies we never once ran into VRAM fragmentation issues.
