RPTD

Memory-Array vs Dynamic-VBO: which is better?

I'm trying to optimize my render code. While doing so I noticed that copying values to the VBO takes quite some time. The VBO is a dynamic one as the mesh bends around (creature). Now I wonder whether I would be quicker using a memory array instead of a VBO. I would also like to keep memory consumption in mind. With higher-resolution meshes the VBO data can quickly explode, eating precious texture memory. Is it worth dropping the VBO in favor of CPU memory, especially if the data changes every frame?
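
To put a rough number on the memory concern, a back-of-the-envelope estimate can help. The sketch below assumes a hypothetical vertex layout (position, normal, tangent and one UV set, all 32-bit floats); the `Vertex` struct and the `vertex_buffer_bytes` helper are illustrative names, not the poster's actual format.

```c
#include <stddef.h>

/* Hypothetical vertex layout: an assumption for illustration only. */
typedef struct {
    float position[3];
    float normal[3];
    float tangent[3];
    float uv[2];
} Vertex;

/* Bytes needed to hold vertex_count vertices of this layout,
   whether they live in a VBO or in a plain CPU-side array. */
size_t vertex_buffer_bytes(size_t vertex_count)
{
    return vertex_count * sizeof(Vertex);
}
```

With 4-byte floats this layout is 44 bytes per vertex, so a 100,000-vertex mesh needs about 4.4 MB for one copy of its vertex data.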

You would probably be better off using VBOs for more static data that's displayed a lot. I would recommend trying both methods and seeing which works best and fastest for you.

Quote:
Original post by RPTD
The VBO is a dynamic one as the mesh bends around ( creature ).
Instead of using a dynamic mesh (that is generated in software), it may be possible to use a static mesh and deform it in hardware with matrices (i.e. hardware skinning / skeletal animation).

A VBO might stand out as the better approach if you go deeper and do stuff like caching or multiple passes per frame. And you shouldn't copy an array of vertices to the VBO, but calculate the data directly into it using memory mapping. Also, create the VBO with write-only access; read-write mode _will_ kill you. And then there are the performance hints, static/dynamic usage, etc.

For single-pass, non-cached, low-poly models, VAs work really well, at least these days.

ch.

@PhilMorton:
The problem is that I use a complex animation system. For example, my dragon player model weighs in at roughly 410 weight matrices (from over 100 bones with vertex bone weights). While I could do a float-texture hack there to calculate the vertices, I am at a complete loss as to what to do about normals and tangents. I have to calculate them all over again from the transformed vertices, as there exists no way to produce a weight matrix for those (a simple example situation shows the impossibility immediately). Hence transforming on the GPU would not be impossible, but it would require heavy tricks with float textures and various GLSL scripts. This approach would cost a huge amount of texture memory, and I don't know if the speed would really catch up in the end.
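
The software path described above might look roughly like the following sketch: blend each rest-pose position by its weighted matrices, then rebuild normals from the transformed triangles. All names (`Vec3`, `BoneMatrix`, `skin_position`, `face_normal_unnormalized`) and the 3x4 row-major matrix layout are assumptions made for illustration; the poster's engine may well precombine weights into one matrix per vertex instead.

```c
typedef struct { float x, y, z; } Vec3;

/* Hypothetical 3x4 row-major matrix: rotation/scale in columns 0..2,
   translation in column 3. */
typedef struct { float m[3][4]; } BoneMatrix;

static Vec3 transform_point(const BoneMatrix *b, Vec3 p)
{
    Vec3 r;
    r.x = b->m[0][0] * p.x + b->m[0][1] * p.y + b->m[0][2] * p.z + b->m[0][3];
    r.y = b->m[1][0] * p.x + b->m[1][1] * p.y + b->m[1][2] * p.z + b->m[1][3];
    r.z = b->m[2][0] * p.x + b->m[2][1] * p.y + b->m[2][2] * p.z + b->m[2][3];
    return r;
}

/* Weighted sum of the rest-pose position transformed by each influencing
   matrix. The weights are expected to sum to 1. */
Vec3 skin_position(Vec3 rest, const BoneMatrix *bones,
                   const int *indices, const float *weights, int count)
{
    Vec3 out = { 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < count; ++i) {
        Vec3 t = transform_point(&bones[indices[i]], rest);
        out.x += weights[i] * t.x;
        out.y += weights[i] * t.y;
        out.z += weights[i] * t.z;
    }
    return out;
}

/* Cross product of two edges of an already-transformed triangle.
   The result still has to be normalized (and usually accumulated
   per vertex) before it is used for lighting. */
Vec3 face_normal_unnormalized(Vec3 a, Vec3 b, Vec3 c)
{
    Vec3 e1 = { b.x - a.x, b.y - a.y, b.z - a.z };
    Vec3 e2 = { c.x - a.x, c.y - a.y, c.z - a.z };
    Vec3 n = { e1.y * e2.z - e1.z * e2.y,
               e1.z * e2.x - e1.x * e2.z,
               e1.x * e2.y - e1.y * e2.x };
    return n;
}
```

Every output of this loop changes each frame, which is why the whole vertex range has to be re-uploaded when it runs on the CPU.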

@christian:
I had heard before that memory mapping is worse than copying an array, so which is true? Furthermore, I am in OpenGL here. I don't know where there would be a "write-only" mode and such things. You can only set STATIC, DYNAMIC or STREAM modes (3 in total).

Quote:
Original post by christian h
A VBO might stand out as the better approach if you go deeper and do stuff like caching or multiple passes per frame. And you shouldn't copy an array of vertices to the VBO, but calculate the data directly into it using memory mapping.
ch.


If you use glBufferSubData, then your latter method is not possible.
I have read that ATI prefers this over glMapBuffer. I don't really know which method is better.

The OP can make a dynamic VBO.

glBindBuffer(...., VBOID);
glBufferData(..., ..., ..., GL_STREAM_DRAW);
or
glBufferData(..., ..., ..., GL_DYNAMIC_DRAW);

STREAM means you will change the data very often: change, draw, change, draw
DYNAMIC means you will change it less often: change, draw, draw, change, draw, draw, draw, draw, draw

But these are only hints to the driver.
For some drivers, STREAM and DYNAMIC may be the same thing.
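
As a rough illustration of the rule of thumb above, the choice could be written down like this. This is only a sketch: the `UsageHint` enum is a local stand-in for GL_STATIC_DRAW, GL_DYNAMIC_DRAW and GL_STREAM_DRAW so it compiles without GL headers, and the thresholds in the hypothetical `choose_usage_hint` are assumptions, not anything a driver guarantees.

```c
/* Local stand-ins for the GL usage enums (assumption: map these to
   GL_STATIC_DRAW, GL_DYNAMIC_DRAW and GL_STREAM_DRAW in real code). */
typedef enum {
    USAGE_STATIC_DRAW,
    USAGE_DYNAMIC_DRAW,
    USAGE_STREAM_DRAW
} UsageHint;

/* Pick a hint from how often the buffer is rewritten versus drawn
   over some observation window. The thresholds are illustrative. */
UsageHint choose_usage_hint(unsigned updates, unsigned draws)
{
    if (updates <= 1)
        return USAGE_STATIC_DRAW;   /* filled once, then only drawn */
    if (draws <= updates)
        return USAGE_STREAM_DRAW;   /* change, draw, change, draw */
    return USAGE_DYNAMIC_DRAW;      /* change, draw, draw, draw, ... */
}
```

A creature mesh that is re-skinned every frame and drawn once per frame would land on the STREAM case here; with several passes per update it would land on DYNAMIC.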

Quote:
Original post by V-man
Quote:
Original post by christian h
A VBO might stand out as the better approach if you go deeper and do stuff like caching or multiple passes per frame. And you shouldn't copy an array of vertices to the VBO, but calculate the data directly into it using memory mapping.
ch.


If you use glBufferSubData, then your latter method is not possible.
I have read that ATI prefers this over glMapBuffer. I don't really know which method is better.


Yes, glBufferSubData is faster on modern hardware with the latest drivers. Mapping is the older method and is generally slower. I have actually seen FPS drop because of mapping VBOs.



So if I have a preallocated VBO, is using glBufferSubData for the entire range faster than glBufferData on a modern system?

Quote:
Original post by RPTD
So if I have a preallocated VBO, is using glBufferSubData for the entire range faster than glBufferData on a modern system?


I will answer your question indirectly, because frankly I am not sure how glBufferSubData and glBufferData are managed internally by the drivers and the GPUs.

1. glBufferData allocates memory every time you call it.

2. glBufferSubData updates a part of the data and does no memory allocation or deallocation, so it should be faster.

Now the results. I made some changes in our engine's renderer, which uses VBOs whenever possible and falls back on vertex arrays if VBO support is absent. I made sure the renderer was using VBOs and then tried replacing glBufferSubData and glBufferData. There was a drop in speed, but the results are far from conclusive. Also, I currently have only a single GPU to test the code on, so I can't really say whether glBufferSubData was of any real value. The other reason may be that the engine batches data aggressively, so no real difference is noticeable. I need to test more with animated meshes and a bunch of other stuff.

Then I tried using vertex arrays instead of VBOs on the same GPU, and speed dropped considerably. Again, this is with the entire engine and not with one particular case like the one you are interested in ("With higher resolution meshes").

What do you mean exactly by "batching" in this context? ( just to see if I have something similar to compare results )
