Memory-Array vs Dynamic-VBO: which is better?

Started by
27 comments, last by christian h 17 years, 6 months ago
Quote:Original post by RPTD
@phantom:
Is the driver re-allocating the memory (through BufferData) what eats the time? I have now switched to SubData to avoid this reallocation, and it really does reduce the processing time. It just confuses me that it is supposed to be better the other way round.


Ah yes, I should have been clearer. What this gets around is any sync issue that might arise from reusing the same VBO. Without the discard, the system might well have to wait for drawing from the VBO to complete before it can be updated; this causes a stall and wastes time while the system sits around and waits.

glBufferSubData() also suffers from this sync problem, as in this case you are saying 'only change this data, leave the rest intact', which could lead to the driver having to make copies and other such things.

If you are totally replacing a buffer then judging by what the IHVs/driver writers have said the method I outlined should give you the best results.
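
For readers following along, here is a minimal sketch of the discard-then-refill sequence being discussed, as far as it can be reconstructed from this thread: call glBufferData() with a NULL pointer to let the driver drop the old storage, then upload the new data. The function name refill_streaming_vbo, the USE_REAL_GL switch, the GLEW include and the choice of GL_STREAM_DRAW are assumptions made for illustration. The stand-ins in the #else branch are not OpenGL; they only record the call order so the sketch compiles without a GL context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_REAL_GL
#include <GL/glew.h>  /* assumption: GLEW, or any loader exposing the GL 1.5 buffer calls */
#else
/* Dry-run stand-ins (not part of OpenGL): they let the sketch compile and run
   without a GL context and only record the order of the calls. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
typedef ptrdiff_t GLsizeiptr;
typedef ptrdiff_t GLintptr;
#define GL_ARRAY_BUFFER 0x8892
#define GL_STREAM_DRAW  0x88E0
static char gl_call_log[32];  /* 'B' bind, 'O' orphan, 'D' data, 'S' sub-data */

static void log_call(char c)
{
    size_t n = strlen(gl_call_log);
    if (n + 1 < sizeof gl_call_log) {
        gl_call_log[n] = c;
        gl_call_log[n + 1] = '\0';
    }
}
static void glBindBuffer(GLenum target, GLuint buffer)
{
    (void)target; (void)buffer;
    log_call('B');
}
static void glBufferData(GLenum target, GLsizeiptr size, const void *data, GLenum usage)
{
    (void)target; (void)size; (void)usage;
    log_call(data ? 'D' : 'O');
}
static void glBufferSubData(GLenum target, GLintptr offset, GLsizeiptr size, const void *data)
{
    (void)target; (void)offset; (void)size; (void)data;
    log_call('S');
}
#endif

/* Replace the entire contents of a VBO that is rewritten every frame.
   Hypothetical helper; `size` must equal the size the buffer was created with. */
static void refill_streaming_vbo(GLuint vbo, const void *vertices, GLsizeiptr size)
{
    assert(vertices != NULL && size > 0);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    /* Discard: NULL data tells the driver the old contents are no longer needed,
       so it may hand back fresh storage instead of waiting for pending draws. */
    glBufferData(GL_ARRAY_BUFFER, size, NULL, GL_STREAM_DRAW);
    /* Fill the (possibly new) storage with this frame's vertices. */
    glBufferSubData(GL_ARRAY_BUFFER, 0, size, vertices);
}
```

Whether this beats a plain glBufferSubData() on a given driver is something to measure rather than assume, as the later posts in this thread show.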
Quote:Original post by phantom
glBufferSubData() also suffers from this sync problem, as in this case you are saying 'only change this data, leave the rest intact', which could lead to the driver having to make copies and other such things.

If you are totally replacing a buffer then judging by what the IHVs/driver writers have said the method I outlined should give you the best results.


Yes, that is what the docs on NVIDIA's site say too. But for some reason I keep getting more speed with glBufferSubData() than with glBufferData(), and yes, I am passing NULL to glBufferData() to do a 'render and discard' before calling glBufferSubData() to update the buffer.

Oh well, it may be something to do with my particular implementation!!
++ My::Game ++
It is indeed a bit faster than using SubData alone. With SubData I had between 37 and 40 fps. With this version I have between 39 and 42 fps. A difference of 2 fps, but better than nothing.

Is this behaviour stated in the extension specs, or is it just an optimization done by the driver makers? I ask because I could not gather from the extension specs that such a combination of commands yields better performance.

Life's like a Hydra... cut off one problem just to have two more popping out.
Leader and Coder: Project Epsylon | Drag[en]gine Game Engine

Quote:Original post by RPTD
Is this behaviour stated in the extension specs, or is it just an optimization done by the driver makers? I ask because I could not gather from the extension specs that such a combination of commands yields better performance.


I believe it's just something NV and ATI agreed on. It would have been nice for it to be part of the spec, but such is life.
Quote:Original post by _neutrin0_
When the Vertex buffer is small, glMapBuffer might be faster.
Have you tried your method on large chunks of data and big VBOs?

I used a high-poly model, at least a dozen hundred tris, maybe more, so more than 300 KB.

Quote:
The reason I am asking is that the VBO document here (ref pages 12 and 13) says that the value passed to glMapBuffer is just a hint. glMapBuffer will "map" the data into system RAM. In the worst-case scenario, the whole buffer might get mapped to system RAM. For small VBOs it might not matter. If the VBO is large, then there could be a performance issue.


It was faster to map it in write-only mode than in read-write mode, which was really slow, so it really might be mapping it?

Oh wait, this was on Linux; I haven't tried it on Windows though :o.

ch.

[edit]
I meant write-only, not read-only!
[/edit]

[Edited by - christian h on October 12, 2006 11:54:35 AM]
Quote:Original post by christian h
It was faster to map it in read-only mode than read-write which was really slow, so it really might be mapping it?

Oh wait, this was on linux, I haven't tried it in windows though :o.


read-only? You mean write-only?

Also, if you are timing the functions, then be sure to time both the map and unmap functions. If you are using FPS as a benchmark then it's a different story.

Sheesh... I really, really need to see clearer specs on this. glMapBuffer, glBufferData, glBufferSubData. Each one is producing different results for different tests! I think I will leave it at that and get back to this over the weekend. :))
++ My::Game ++
Somehow I am not astonished by this. Some OpenGL specs are rather fuzzy, and most of the time the manufacturers' implementations are just as fuzzy. Furthermore, tainted kernel modules on Linux still tend to be rather shitty in terms of speed, so comparing test values obtained on Linux and Windows does not really work. I have the same problem: what I optimize for one platform does not necessarily optimize for the other. They really should open up their drivers so Linux guys could turn this into something "usable" <.=.<

Life's like a Hydra... cut off one problem just to have two more popping out.
Leader and Coder: Project Epsylon | Drag[en]gine Game Engine

Quote:Original post by christian h
It was faster to map it in read-only mode than read-write which was really slow, so it really might be mapping it?

Oh wait, this was on linux, I haven't tried it in windows though :o.

ch.


Read only means you will read from your VBO. Why would anybody want to read it?
If the VBO happens to be in VRAM, it would need to be copied to AGP memory or RAM.

Yes my friends! GL_WRITE_ONLY is the way!

glBufferSubData might be better if you want to update small regions of a VBO.
I think there is no choice but to benchmark on different hw with different test cases.
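
To make the two options above concrete, here is a rough sketch of a write-only mapping next to a small-range glBufferSubData() update. The helper names write_through_map and update_range, the USE_REAL_GL switch and the GLEW include are assumptions for illustration. The stand-ins in the #else branch are not OpenGL; they fake a 64-byte buffer in system memory so the sketch compiles and runs without a GL context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_REAL_GL
#include <GL/glew.h>  /* assumption: GLEW, or any loader exposing the GL 1.5 buffer calls */
#else
/* Dry-run stand-ins (not part of OpenGL): a fake 64-byte buffer in system RAM. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
typedef unsigned char GLboolean;
typedef ptrdiff_t GLsizeiptr;
typedef ptrdiff_t GLintptr;
#define GL_ARRAY_BUFFER 0x8892
#define GL_WRITE_ONLY   0x88B9
#define GL_TRUE         1
static unsigned char fake_store[64];

static void glBindBuffer(GLenum target, GLuint buffer)
{
    (void)target; (void)buffer;
}
static void *glMapBuffer(GLenum target, GLenum access)
{
    (void)target; (void)access;
    return fake_store;
}
static GLboolean glUnmapBuffer(GLenum target)
{
    (void)target;
    return GL_TRUE;
}
static void glBufferSubData(GLenum target, GLintptr offset, GLsizeiptr size, const void *data)
{
    (void)target;
    assert(offset >= 0 && size >= 0 && (size_t)(offset + size) <= sizeof fake_store);
    memcpy(fake_store + offset, data, (size_t)size);
}
#endif

/* Rewrite the first `size` bytes of the buffer through a write-only mapping.
   The caller must ensure `size` does not exceed the buffer's size.
   Returns 1 on success, 0 if the mapping failed or the store was lost. */
static int write_through_map(GLuint vbo, const void *src, size_t size)
{
    void *dst;
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    dst = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
    if (dst == NULL)
        return 0;            /* mapping failed */
    memcpy(dst, src, size);  /* write only; never read through dst */
    /* glUnmapBuffer returns GL_FALSE if the data store became corrupt while
       mapped; the caller should then upload the data again. */
    return glUnmapBuffer(GL_ARRAY_BUFFER) == GL_TRUE;
}

/* Update a small region of the buffer and leave the rest intact. */
static void update_range(GLuint vbo, size_t offset, const void *src, size_t size)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferSubData(GL_ARRAY_BUFFER, (GLintptr)offset, (GLsizeiptr)size, src);
}
```

Which of the two is faster for a given buffer size is exactly the kind of thing that has to be benchmarked per driver and platform.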
Sig: http://glhlib.sourceforge.net
an open source GLU replacement library. Much more modern than GLU.
float matrix[16], inverse_matrix[16];
glhLoadIdentityf2(matrix);
glhTranslatef2(matrix, 0.0, 0.0, 5.0);
glhRotateAboutXf2(matrix, angleInRadians);
glhScalef2(matrix, 1.0, 1.0, -1.0);
glhQuickInvertMatrixf2(matrix, inverse_matrix);
glUniformMatrix4fv(uniformLocation1, 1, FALSE, matrix);
glUniformMatrix4fv(uniformLocation2, 1, FALSE, inverse_matrix);
Quote:Original post by V-man
Read only means you will read from your VBO. Why would anybody want to read it?
If the VBO happens to be in VRAM, it would need to be copied to AGP memory or RAM.


Sorry, I did mean write-only, and fixed that.

ch.
