OpenGL 2.1 / ES 2 streaming vertex buffer update performance


3 replies to this topic

#1 h3xl3r   Members   -  Reputation: 155

Posted 17 May 2014 - 02:01 AM

Good day!

This is something I've been struggling with for a while now:

I am streaming a lot of dynamic vertex data in an OpenGL 2.1/OpenGL ES 2 compatible renderer that stays away from fixed function completely (even on desktop GL 2.1, where it would still be an option). Since I have CPU-side state management, I can swap on the fly between implementations using either vertex arrays or GPU buffers and profile the separate paths/methods.

All is well, except that I cannot seem to get vertex updates through proper GL buffer objects to be as fast as the vertex array path, even though in theory both should come down to roughly the same work in the driver:

- vertex array:
copy all the data from CPU memory to the driver (the pointer is set in glVertexAttribPointer; the driver reads the data at the draw call)

- vertex buffer:
copy all the data from cpu memory in glBufferSubData

I know this is of course rather dependent on the driver as well, but between a rather low-spec desktop machine and multiple versions of iPads, there is no way I can get the same times. The GPU buffer always loses.

I've been of course looking on the web and I tried everything I saw that I actually have available in the GL2-class spec (double/triple buffer, invalidating with null data first, etc.) to no avail. Also I am soon going to start the GL3/ES3 version which removes the vertex array option altogether, so this is going to have to be solved.
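
For reference, a minimal sketch of the double/triple buffering idea mentioned above: rotate through several buffer objects so the one written this frame is not the one the GPU may still be reading. `BufferRing` and `ring_acquire` are hypothetical names for illustration, and the buffer names are assumed to come from glGenBuffers elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 3  /* triple buffering; 2 would be double buffering */

/* Hypothetical helper type: a fixed ring of buffer object names. */
typedef struct {
    unsigned buffers[RING_SIZE]; /* names assumed to come from glGenBuffers */
    unsigned next;               /* index of the buffer to write this frame */
} BufferRing;

/* Return the buffer to fill this frame and advance the ring, so the
   following frame writes into a different buffer object. */
unsigned ring_acquire(BufferRing *ring)
{
    assert(ring != NULL);
    unsigned name = ring->buffers[ring->next];
    ring->next = (ring->next + 1u) % RING_SIZE;
    return name;
}
```

Each frame would call `ring_acquire`, bind the returned buffer, upload into it and draw from it. Whether this avoids the sync depends on how many frames the driver keeps in flight.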

Any ideas? Appreciate all hints really! Maybe something extension based that isn't too badly supported?

Edit:

Just want to mention I also tried various ways to "Map/MapRange", but none of the fancy new options for these are available in (vanilla) GL2/ES2, and (especially on the pads) they also always came out slower in the profiling. I am obviously causing a sync somewhere, but double or triple buffering and invalidation tricks just don't seem to make a difference, so I am a bit lost for ideas.


Edited by h3xl3r, 20 May 2014 - 09:44 AM.


#2 h3xl3r   Members   -  Reputation: 155

Posted 20 May 2014 - 01:34 AM

I've found this excellent article from "OpenGL Insights" discussing exactly this problem and evaluating all the options:

http://www.seas.upenn.edu/~pcozzi/OpenGLInsights/OpenGLInsights-AsynchronousBufferTransfers.pdf

Seems like in the end there are separate "fast paths" for different vendors (*sigh*).


Edited by h3xl3r, 20 May 2014 - 09:44 AM.


#3 Erik Rufelt   Crossbones+   -  Reputation: 3139

Posted 20 May 2014 - 02:24 AM

I've found it's often faster to use glBufferData to overwrite the entire buffer rather than use glBufferSubData, even if only part of the buffer actually needs to be updated. If you have lots of data that can be updated, divide it into multiple buffers of, for example, 256 vertices each, and try to use a method that updates as few buffers as possible each frame.
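
One way to sketch the "as few buffers as possible" part: track which 256-vertex chunks were touched and re-upload only those, each one in full. Everything named here (`ChunkedVertices`, `mark_dirty`, `flush_dirty`, `count_upload`, `MAX_CHUNKS`) is hypothetical. The upload callback stands in for binding that chunk's buffer object and calling glBufferData on it.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_VERTS 256  /* chunk size suggested above */
#define MAX_CHUNKS  64   /* hypothetical upper bound for this sketch */

typedef struct {
    size_t        chunk_count;
    unsigned char dirty[MAX_CHUNKS]; /* 1 = chunk must be re-uploaded */
} ChunkedVertices;

/* Mark every chunk touched by the vertex range [first, first + count). */
void mark_dirty(ChunkedVertices *cv, size_t first, size_t count)
{
    if (count == 0) return;
    size_t lo = first / CHUNK_VERTS;
    size_t hi = (first + count - 1) / CHUNK_VERTS;
    assert(cv->chunk_count <= MAX_CHUNKS && hi < cv->chunk_count);
    for (size_t i = lo; i <= hi; ++i) cv->dirty[i] = 1;
}

/* Callback that re-uploads one whole chunk. In a real renderer it would
   bind that chunk's buffer object and call glBufferData on it. */
typedef void (*UploadChunkFn)(size_t chunk_index, void *user);

/* Upload all dirty chunks, clear their flags, return how many were uploaded. */
size_t flush_dirty(ChunkedVertices *cv, UploadChunkFn upload, void *user)
{
    size_t uploaded = 0;
    for (size_t i = 0; i < cv->chunk_count; ++i) {
        if (!cv->dirty[i]) continue;
        upload(i, user);
        cv->dirty[i] = 0;
        ++uploaded;
    }
    return uploaded;
}

/* Stand-in callback that only counts calls; user points to a size_t. */
void count_upload(size_t chunk_index, void *user)
{
    (void)chunk_index;
    ++*(size_t *)user;
}
```

Editing vertices 250 to 259, for example, crosses a chunk boundary, so two buffers are re-uploaded that frame and the rest are left alone.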



#4 h3xl3r   Members   -  Reputation: 155

Posted 20 May 2014 - 04:18 AM

Thank you! That's exactly the sort of experience I wanted to hear about!

I am guessing you also mean to combine the glBufferData with "discarding" the "old" buffer before filling it each frame, by buffering NULL data first with the same parameters? That seems to be a sort of "accepted" hint for the vendor driver to hand out a new memory region instead of reusing an "in-flight" buffer and causing a sync.
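
A minimal sketch of that two-step pattern as I understand it. So that it compiles without GL headers or a context, the GL entry points are passed in as function pointers with the same shape as glBindBuffer and glBufferData. `BufferApi`, `stream_vertices` and the `log_*` stand-ins are hypothetical. The two constants repeat the values of GL_ARRAY_BUFFER and GL_STREAM_DRAW.

```c
#include <assert.h>
#include <stddef.h>

/* Values of GL_ARRAY_BUFFER and GL_STREAM_DRAW, repeated here so the
   sketch compiles without the GL headers. */
enum { SKETCH_ARRAY_BUFFER = 0x8892, SKETCH_STREAM_DRAW = 0x88E0 };

/* Hypothetical table of entry points shaped like glBindBuffer and
   glBufferData. A real renderer would point these at the GL functions
   (through thin wrappers if the platform's types differ). */
typedef struct {
    void (*bind_buffer)(unsigned target, unsigned buffer);
    void (*buffer_data)(unsigned target, ptrdiff_t size,
                        const void *data, unsigned usage);
} BufferApi;

/* Orphan the old storage, then upload this frame's vertices. */
void stream_vertices(const BufferApi *gl, unsigned vbo,
                     const void *verts, ptrdiff_t size)
{
    assert(gl != NULL && verts != NULL && size > 0);
    gl->bind_buffer(SKETCH_ARRAY_BUFFER, vbo);
    /* Step 1: same size and usage, NULL data. This is the hint that the
       driver may hand out fresh storage instead of waiting on the old one. */
    gl->buffer_data(SKETCH_ARRAY_BUFFER, size, NULL, SKETCH_STREAM_DRAW);
    /* Step 2: fill the (possibly new) storage with the actual data. */
    gl->buffer_data(SKETCH_ARRAY_BUFFER, size, verts, SKETCH_STREAM_DRAW);
}

/* Stand-in entry points that only record what was requested, so the call
   order can be checked without a GL context. */
typedef struct { int binds, uploads, null_uploads; const void *last_data; } CallLog;
CallLog g_log;

void log_bind(unsigned target, unsigned buffer)
{
    (void)target; (void)buffer;
    g_log.binds++;
}

void log_buffer_data(unsigned target, ptrdiff_t size,
                     const void *data, unsigned usage)
{
    (void)target; (void)size; (void)usage;
    g_log.uploads++;
    if (data == NULL) g_log.null_uploads++;
    g_log.last_data = data;
}
```

Whether the driver really treats the NULL upload as a discard is up to the vendor, so this is worth profiling per platform rather than assuming.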





