OpenGL Check if VBO Upload is Complete

9 comments, last by Matias Goldberg 8 years ago

So can you use a fence to test for the actual status of a BufferSubData call uploading to server? And that works (...) without issuing a draw call against that buffer?

Yes, but no:
1. On the one hand, you can reliably test that the copy has finished by calling glClientWaitSync with GL_SYNC_FLUSH_COMMANDS_BIT set. No need to issue a draw call. An implementation that doesn't behave this way could cause hazards or deadlocks and would therefore be considered broken. However...

2. On the other hand, flushing is something you should avoid unless you're prepared to stall. So we normally query this kind of thing without the flush bit set. Some platforms may have already begun, and eventually finished, the copy by the time we query for the 2nd or 3rd time (since the 1st time we decided to go do something else, like audio processing). Other platforms/drivers may wait forever to even start the upload, because they're waiting for you to issue that glDraw call before deciding that uploading the buffer is worth it. In that case the query will keep returning 'not done yet' until something relevant happens.

So the answer is yes, you can make it work without having to call draw. But no, you should avoid this and hope drivers don't try to get too smart (or profile overly smart drivers).
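To make the two options above concrete, here is a minimal sketch of a poller that first queries without the flush bit and only adds it after a few unsuccessful tries. It is an illustration under assumptions, not a recommended policy: `UploadPoller`, `WaitFn` and `cheapPolls` are hypothetical names, and the wait call is passed in as a callable so the sketch compiles without a GL context. In real code that callable would wrap `glClientWaitSync(fence, flags, 0)` on a fence created with `glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0)` right after the upload.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Values copied from the OpenGL headers so the sketch compiles without them.
constexpr std::uint32_t kAlreadySignaled    = 0x911A; // GL_ALREADY_SIGNALED
constexpr std::uint32_t kTimeoutExpired     = 0x911B; // GL_TIMEOUT_EXPIRED
constexpr std::uint32_t kConditionSatisfied = 0x911C; // GL_CONDITION_SATISFIED
constexpr std::uint32_t kFlushCommandsBit   = 0x0001; // GL_SYNC_FLUSH_COMMANDS_BIT

// Assumption: stands in for glClientWaitSync(fence, flags, timeoutNs) on one fence.
using WaitFn = std::function<std::uint32_t(std::uint32_t flags, std::uint64_t timeoutNs)>;

enum class UploadState { Done, Pending };

// Hypothetical helper: the first `cheapPolls` calls query without the flush bit;
// later calls set it, so commands still sitting in the queue get flushed.
struct UploadPoller {
    WaitFn wait;
    int cheapPolls;
    int attempts = 0;

    UploadPoller(WaitFn w, int n) : wait(std::move(w)), cheapPolls(n) {}

    UploadState poll() {
        assert(wait && "a wait callable must be provided");
        const std::uint32_t flags = (attempts++ < cheapPolls) ? 0u : kFlushCommandsBit;
        const std::uint32_t result = wait(flags, 0); // zero timeout: never block here
        const bool done = (result == kAlreadySignaled || result == kConditionSatisfied);
        return done ? UploadState::Done : UploadState::Pending;
    }
};
```

Call `poll()` once per frame (or whenever you have spare time). How many cheap polls to allow before paying for the flush is a tuning choice you'd want to profile per driver.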

...and that works consistently across platforms...

If you're using fences and unsynchronized access, you're pretty much targeting modern desktop drivers (likely GL 4.x, but it works on GL 3.3 drivers too), whether on Linux or Windows. It works fine there (unless you're using 3-year-old drivers, which had a couple of fence bugs).
Few Android devices support GL_ARB_sync. It's not available on iOS either, afaik. It's available on OSX, but OSX lives in a different world of instability.

Does it work reliably across platforms? Yes (except on OSX, where I don't know). Is it widely available across platforms? No.

If you're using fences and thus targeting modern GL, this brings me to my next point: just don't use BufferSubData. BufferSubData can stall if the driver runs out of internal memory to perform the copy.

Instead, map an unsynchronized/persistently mapped region to use as a stash between CPU<->GPU (i.e. what D3D11 knows as Staging Buffers), and then call glCopyBufferSubData to copy from the GPU stash to the final GPU buffer. It's just as fast, has fewer stall surprises (you **know** when you've run out of stash space, and fences tell you when older stash regions can be reused), and gives you tighter control. You can even perform the CPU -> GPU stash copy in a worker thread, and issue the glCopyBufferSubData call from the main thread to do the GPU stash -> GPU copy.
This is essentially what you would do in D3D11 and D3D12 (except that there the GPU->GPU copy doesn't have to be routed to the main thread).
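The bookkeeping side of that stash can be sketched roughly as follows. This is only an illustration of the idea, assuming the stash is used as a ring: `StashRing`, `allocate`, `seal`, `kNoSpace` and the `IsSignaled` callable are hypothetical names, alignment is ignored, and the GL calls appear only in comments so the block compiles without a GL context. In real code `IsSignaled` would wrap a zero-timeout `glClientWaitSync` on the fence created after the copy.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Hypothetical offset tracker for a persistently mapped stash buffer.
class StashRing {
public:
    using FenceId = std::uint64_t;
    // Assumption: returns true once the fence guarding a region has signaled.
    using IsSignaled = std::function<bool(FenceId)>;

    StashRing(std::size_t capacity, IsSignaled isSignaled)
        : capacity_(capacity), isSignaled_(std::move(isSignaled)) {}

    // Reserve `size` contiguous bytes. Returns the stash offset to memcpy into,
    // or kNoSpace when the stash is full (so you know, instead of stalling).
    std::size_t allocate(std::size_t size) {
        if (size == 0 || size > capacity_) return kNoSpace;
        reclaim();
        if (used_ == 0) head_ = 0;                 // empty ring: restart at the front
        const bool wrap = head_ + size > capacity_;
        const std::size_t pad = wrap ? capacity_ - head_ : 0; // skipped tail bytes
        if (used_ + pad + size > capacity_) return kNoSpace;
        const std::size_t offset = wrap ? 0 : head_;
        head_ = offset + size;
        used_ += pad + size;
        pending_ += pad + size;
        return offset;
    }

    // Call after issuing glCopyBufferSubData for everything allocated since the
    // previous seal, followed by glFenceSync; `fence` identifies that sync object.
    void seal(FenceId fence) {
        if (pending_ == 0) return;
        inFlight_.push_back({fence, pending_});
        pending_ = 0;
    }

private:
    struct Region { FenceId fence; std::size_t bytes; };

    // Free the oldest regions whose fences have signaled (FIFO order).
    void reclaim() {
        while (!inFlight_.empty() && isSignaled_(inFlight_.front().fence)) {
            used_ -= inFlight_.front().bytes;
            inFlight_.pop_front();
        }
    }

    std::size_t capacity_;
    IsSignaled isSignaled_;
    std::size_t head_ = 0;     // next write offset
    std::size_t used_ = 0;     // bytes not yet reclaimed, padding included
    std::size_t pending_ = 0;  // bytes allocated since the last seal
    std::deque<Region> inFlight_;
};
```

When `allocate` returns `kNoSpace` you get to decide what happens: do other work and retry later, grow the stash, or wait on the oldest fence on purpose. That explicit choice is the "tighter control" part.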
