I can reproduce this issue on several machines with a clean, fully updated Windows 10 install (including the 1607 update) and the graphics driver that the OS installed by itself; each machine has an Intel HD Graphics card, either Intel HD Graphics 4600 or Intel HD Graphics 5300.
My rendering schedule is basically the following:
- Activate the appropriate shader program.
- Attach UBOs to the appropriate slots of shader program.
- Activate VAO for model 1.
- Update UBOs with the appropriate parameters (matrices, light parameters).
- Draw call for model 1.
- Activate VAO for model 2.
- Update UBOs with new parameters.
- Draw call for model 2.
- Repeat the last three steps (activate VAO, update UBOs, draw call) for models 3, 4, ..., N-1, N (they all use the same mesh, just with different data in the UBO).
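In GL terms, the schedule above boils down to something like the following per-frame loop (a simplified sketch; the container and field names are placeholders, not my actual code):
// Per-frame sketch of the schedule above (names are placeholders).
glUseProgram(program);
for (GLuint slot = 0; slot < uboCount; ++slot)
    glBindBufferRange(GL_UNIFORM_BUFFER, slot, uboHandles[slot], 0, uboSizes[slot]);
for (size_t m = 0; m < modelCount; ++m)
{
    glBindVertexArray(models[m].vao);
    updateUBOs(models[m]); // map/unmap or glBufferSubData, as in step (c) below
    glDrawArrays(models[m].topology, 0, models[m].vertexCount);
}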
I have different code paths for "EXT_direct_state_access", "ARB_direct_state_access" and the non-DSA approach, but the issue is exactly the same in all of them (on Intel HD Graphics cards with Windows 10, "ARB_direct_state_access" is not actually exposed, so I'm not using it there).
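For reference, the EXT_direct_state_access update path does the same thing without touching the binding point at all (a sketch using the EXT entry points; variable names match the non-DSA code below):
// Update UBO via EXT_direct_state_access -- no bind/unbind of GL_UNIFORM_BUFFER needed.
mappedBits = glMapNamedBufferRangeEXT(bufferHandle, mapOffset, mapSize, GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
std::memcpy(mappedBits, data, mapSize);
glUnmapNamedBufferEXT(bufferHandle);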
Basically, the non-DSA code that exhibits the issue on that configuration is:
// VAO is activated before this.
// (a) create UBO (note: this chunk of code is called at startup, it is not part of rendering loop)
glGenBuffers(1, &bufferHandle);
glGetIntegerv(bufferTargetToBinding(bufferTarget), reinterpret_cast<GLint*>(&previousBinding)); // simulate DSA way
glBindBuffer(bufferTarget, bufferHandle); // bufferTarget is GL_UNIFORM_BUFFER
glBufferStorage(bufferTarget, bufferSize, nullptr, GL_MAP_WRITE_BIT);
glBindBuffer(bufferTarget, previousBinding);
// (b) bind UBO
glBindBufferRange(bufferTarget, bufferChannel, bufferHandle, bufferOffset, bufferSize); // bufferOffset is 0
// (c) update UBO
glGetIntegerv(bufferTargetToBinding(bufferTarget), reinterpret_cast<GLint*>(&previousBinding)); // simulate DSA way
glBindBuffer(bufferTarget, bufferHandle);
mappedBits = glMapBufferRange(bufferTarget, mapOffset, mapSize, GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT); // mapOffset == 0, mapSize == bufferSize
std::memcpy(mappedBits, data, mapSize);
glUnmapBuffer(bufferTarget);
glBindBuffer(bufferTarget, previousBinding);
// (d) draw call
glDrawArrays(topology, baseVertex, vertexCount); // baseVertex is passed as the "first" argument
// (e) Repeat (c) and (d) for other models.
If I use "glBufferSubData" instead of "glMapBufferRange" to update the contents, the issue is less pronounced, but it still exists (some models seem to jump back and forth between the old and new positions specified in the UBO). Note that the issue occurs only on Windows 10 with Intel HD Graphics cards, nowhere else.
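The "glBufferSubData" variant simply replaces step (c) above (note that for this to be valid against immutable storage, the "glBufferStorage" flags must also include GL_DYNAMIC_STORAGE_BIT, which I do in that code path):
// (c) update UBO via glBufferSubData instead of mapping
glGetIntegerv(bufferTargetToBinding(bufferTarget), reinterpret_cast<GLint*>(&previousBinding)); // simulate DSA way
glBindBuffer(bufferTarget, bufferHandle);
glBufferSubData(bufferTarget, 0, bufferSize, data);
glBindBuffer(bufferTarget, previousBinding);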
I have found two workarounds: one is to call "glFinish" right after "glDrawArrays", which seems to fix the problem; the other is to call "glBindBufferBase" to unbind the UBO before updating its contents, then bind it again:
glBindBufferBase(bufferTarget, bufferChannel, 0); // unbind UBO
// Update UBO contents here as in code above, step (c)
glBindBufferRange(bufferTarget, bufferChannel, bufferHandle, bufferOffset, bufferSize); // bind buffer back
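For completeness, the "glFinish" workaround just amounts to this (stalling the CPU until the draw has fully completed before the next UBO update):
// (d) draw call, followed by a full pipeline sync
glDrawArrays(topology, baseVertex, vertexCount);
glFinish(); // wait for the GPU to finish the draw before remapping the UBO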
However, both of these workarounds seem to impact performance. I couldn't find anything in the GL spec saying that buffer objects need to be unbound, or that "glFinish" needs to be called, before updating their contents. So my question is: is the issue I'm experiencing just a driver bug, or do buffer objects really need to be unbound before their contents are updated?
P.S. I'm using very similar code to update VBOs as well, and they exhibit the same issue on Intel HD Graphics cards and Windows 10, albeit to a lesser degree, simply because I don't update them as often.