one glMapBuffer or multiple glBufferSubDataARB's?

Gotcha    122
For my terrain engine I created a VBO for the section of terrain that is visible. I partitioned this VBO into about 400 divisions. Each frame, some of these partitions need to be updated as terrain enters or leaves the viewing frustum. This is typically a very low number per frame, but when the camera spins it can be up to 40 partitions. Which of the following two options would you guess is faster?

Option 1:
glMapBuffer
do all updates
glUnmapBuffer

Option 2:
glBufferSubData
glBufferSubData
glBufferSubData
...

I know Nvidia and ATI recommend glBufferSubData, but does that hold if you need multiple glBufferSubData calls to do the job? Thanks.
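Whichever option turns out faster, the number of updates can often be reduced first: dirty partitions that sit next to each other in the VBO can be uploaded as one range. The following is a minimal sketch in C, assuming equally sized partitions stored contiguously in the buffer (the thread does not state this). `DirtyRange` and `coalesce_dirty` are hypothetical names for illustration, not part of OpenGL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: one contiguous byte range inside the VBO. */
typedef struct {
    size_t offset; /* byte offset of the first dirty partition in the run */
    size_t size;   /* total bytes covered by the run */
} DirtyRange;

/*
 * Merge runs of adjacent dirty partitions into contiguous byte ranges.
 * dirty[i] != 0 means partition i must be re-uploaded this frame.
 * Assumes every partition occupies part_bytes bytes, back to back.
 * Returns the number of ranges written to out (never more than max_out).
 */
size_t coalesce_dirty(const unsigned char *dirty, size_t count,
                      size_t part_bytes,
                      DirtyRange *out, size_t max_out)
{
    size_t n = 0;
    size_t i = 0;

    while (i < count && n < max_out) {
        if (!dirty[i]) {
            i++;
            continue;
        }
        /* Found the start of a run; extend it while partitions stay dirty. */
        size_t start = i;
        while (i < count && dirty[i]) {
            i++;
        }
        out[n].offset = start * part_bytes;
        out[n].size = (i - start) * part_bytes;
        n++;
    }
    return n;
}
```

Each resulting range would then correspond to one glBufferSubData(GL_ARRAY_BUFFER, offset, size, src + offset) call in option 2, or to one memcpy into the pointer returned by glMapBuffer in option 1, where src is assumed to be a byte pointer to the CPU-side copy of the vertex data.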

_the_phantom_    11250
Hmm, interesting question. I guess the answer would be that the glBufferSubData() calls would be better.
The reasoning is that the PDF I got this from is about getting maximum performance, and that is where they recommend it. Even with glMapBuffer you still have the function calls to copy the data, so in fact you've got 2 extra function calls in there (1 to map, 1 to unmap, plus X to copy), and you have the overhead of buffer sync to contend with as well. With the glBufferSubData option you only have X calls into the buffer, and in theory the driver can stream the data into the buffer, reducing or removing sync issues.

rick_appleton    864
I'm hazarding a guess here (I seem to do that often nowadays), but I doubt there would be much difference speed-wise. However, I think glMapBuffer introduces a stall in the GPU, whereas glBufferSubData doesn't, so the latter would probably be preferable, since it lets the CPU and GPU run concurrently.

Can anyone prove me right/wrong here?

schue    176
I found a good description with some answers to the matter here:

http://developer.nvidia.com/attach/6427

_the_phantom_    11250
Yeah, pages 12 and 13 pretty much back up what I said about glBufferSubData() being faster than glMapBuffer().

Interesting document btw, I'll keep a copy of that lurking around as it could be handy.

[edited by - _the_phantom_ on May 25, 2004 12:32:36 PM]

rick_appleton    864
_the_phantom_: I'm not so sure. It says glBufferData is faster than glMapBuffer since it can make a new buffer and drop the old one, so it won't matter if the GPU is using the buffer at that particular moment. glBufferSubData, however, requires a kind of lock on the buffer, and this might have to wait until the GPU is done with the buffer.

So speed-wise the only thing that is certain is that glBufferData is the fastest of the three (glBufferData, glBufferSubData, glMapBuffer), but there is no info on how the other two compare to each other.

Edit: of course that would require you to replace the entire vertex data (with texture coordinates etc) so it might actually be faster to use a slower call, but not have to replace all the data.
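To put rough numbers on that trade-off, here is a small worked sketch in C. The partition counts (400 total, up to 40 dirty) come from the original question; the 4096-byte partition size is purely an assumption for illustration, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes sent when the whole VBO is replaced in one go, e.g. with a single
 * glBufferData(GL_ARRAY_BUFFER, total, data, usage) call.
 */
size_t full_replace_bytes(size_t partitions, size_t part_bytes)
{
    return partitions * part_bytes;
}

/*
 * Bytes sent when only the dirty partitions are updated, e.g. with one
 * glBufferSubData call (or one memcpy into a mapped pointer) per partition.
 */
size_t partial_update_bytes(size_t dirty_partitions, size_t part_bytes)
{
    return dirty_partitions * part_bytes;
}
```

With the assumed 4096-byte partitions, a full replace moves 400 * 4096 = 1,638,400 bytes, while the worst case from the question (40 dirty partitions) moves 163,840 bytes, ten times less. Whether the cheaper call or the smaller transfer wins would still have to be measured on the target driver.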

[edited by - rick_appleton on May 25, 2004 2:38:35 PM]

_the_phantom_    11250
That depends on how the data is dealt with in the driver.
As that document says, the value given to glMapBuffer is just a hint. If your data is in VRAM, the document implies that the driver could well read that data back to AGP or even system RAM; you then write to memory through the address you are given, and when you unmap, that data has to be copied back.
On the other hand, glBufferData and glBufferSubData are ONLY able to write data to the VBO, at which point the driver can control its own sync issues and stream the data into the correct points of the VBO.

I'd have said it would be better to replace all the data, regardless of whether only the vertex position data has changed (assuming an interleaved data format), simply because it could help the driver when it comes to replacing that data (streaming complete data into RAM vs. replacing sections of data in VRAM, etc.), more so if it's in multiples of 32 bytes, as that's what AGP naturally transfers at a time under DMA.
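If updates are to be kept in multiples of 32 bytes as suggested above, an update range can be widened to the nearest boundaries before it is uploaded. A minimal sketch in C follows; the 32-byte figure is the claim made in this post rather than something verified here, and `AlignedRange` and `align_range` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a byte range widened to alignment boundaries. */
typedef struct {
    size_t offset; /* start, rounded down to a multiple of align */
    size_t size;   /* length, so that offset + size is a multiple of align */
} AlignedRange;

/*
 * Widen [offset, offset + size) so that both ends fall on multiples of
 * align. align must be non-zero. The caller is assumed to make sure the
 * widened range still lies inside the buffer and that the source data for
 * the extra bytes is valid.
 */
AlignedRange align_range(size_t offset, size_t size, size_t align)
{
    AlignedRange r;
    size_t start = offset - (offset % align); /* round start down */
    size_t end = offset + size;
    size_t rem = end % align;

    if (rem != 0) {
        end += align - rem;                   /* round end up */
    }
    r.offset = start;
    r.size = end - start;
    return r;
}
```

For example, a 50-byte update at offset 100 would become a 64-byte update at offset 96 with align set to 32.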


"i wonder why i do that... type words which are nuffin like the word i wanted every now and badger"
