Hello!
I am trying to find an efficient map-write-discard implementation in OpenGL, but with no luck so far. Following the suggestions on this page, I tried both buffer orphaning (glBufferData with a NULL data pointer) and glMapBufferRange with GL_MAP_INVALIDATE_BUFFER_BIT. Both give depressing performance (approximately 40x slower than a similar implementation in Direct3D). The method should ideally be suitable for the OpenGL 4.2 / GLES 3.0 API, so glBufferStorage is not desirable, though initializing the buffer with glBufferStorage(GL_DYNAMIC_STORAGE_BIT|GL_MAP_WRITE_BIT) makes no difference. As an experiment I also tried mapping with the GL_MAP_UNSYNCHRONIZED_BIT flag, again with no difference. Different usage hints (I normally use GL_DYNAMIC_DRAW) have no effect either.
In my benchmark I render 32K objects with individual draw calls and map/unmap the constant buffer before every call. On my GTX 970, this benchmark runs at ~4 ms/frame in D3D11 mode and ~140 ms/frame in OpenGL.
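To put those frame times in per-draw terms, here is a quick back-of-the-envelope calculation. It assumes "32K" means 32768 draws per frame, and the helper names are made up for illustration only. On these figures it works out to roughly 0.12 us per draw in D3D11 versus about 4.3 us per draw in OpenGL, a ratio of about 35x.

```cpp
#include <cstdio>

// Average frame-time cost per draw call, in microseconds.
// frameMs: measured frame time in milliseconds; drawCalls: draws per frame.
double PerDrawMicroseconds(double frameMs, int drawCalls)
{
    return frameMs * 1000.0 / static_cast<double>(drawCalls);
}

// Prints the per-draw cost for both back ends using the figures quoted above.
void PrintPerDrawCost()
{
    const int kDrawCalls = 32 * 1024; // assumption: "32K" means 32768 draws
    const double d3d11Us = PerDrawMicroseconds(4.0, kDrawCalls);
    const double glUs    = PerDrawMicroseconds(140.0, kDrawCalls);
    std::printf("D3D11: %.2f us/draw, OpenGL: %.2f us/draw, ratio: %.0fx\n",
                d3d11Us, glUs, glUs / d3d11Us);
}
```

This is the whole frame time divided by the draw count, so it includes everything done per object, not just the map/unmap pair.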
Here is what my map/unmap functions look like:
void BufferGLImpl::Map(MAP_TYPE MapType, Uint32 MapFlags, PVoid &pMappedData)
{
    // Map through a copy target: COPY_READ for reads, COPY_WRITE otherwise
    m_uiMapTarget = (MapType == MAP_READ) ? GL_COPY_READ_BUFFER : GL_COPY_WRITE_BUFFER;
    glBindBuffer(m_uiMapTarget, m_GlBuffer);
    GLbitfield Access = 0;
    switch (MapType)
    {
        case MAP_READ:
            Access |= GL_MAP_READ_BIT;
            break;
        case MAP_WRITE:
            Access |= GL_MAP_WRITE_BIT;
            if (MapFlags & MAP_FLAG_DISCARD)
            {
                if (m_bUseMapWriteDiscardBugWA)
                {
                    // Workaround path: orphan the storage by re-specifying it with no data
                    glBufferData(m_uiMapTarget, m_Desc.uiSizeInBytes, nullptr, m_GLUsageHint);
                }
                else
                {
                    // Default path: ask the driver to invalidate the whole buffer on map
                    Access |= GL_MAP_INVALIDATE_BUFFER_BIT;
                }
            }
            if (MapFlags & MAP_FLAG_DO_NOT_SYNCHRONIZE)
            {
                Access |= GL_MAP_UNSYNCHRONIZED_BIT;
            }
            break;
        case MAP_READ_WRITE:
            Access |= GL_MAP_WRITE_BIT | GL_MAP_READ_BIT;
            break;
    }
    // Always map the entire buffer
    pMappedData = glMapBufferRange(m_uiMapTarget, 0, m_Desc.uiSizeInBytes, Access);
    glBindBuffer(m_uiMapTarget, 0);
}
void BufferGLImpl::Unmap()
{
    glBindBuffer(m_uiMapTarget, m_GlBuffer);
    glUnmapBuffer(m_uiMapTarget);
    glBindBuffer(m_uiMapTarget, 0);
    m_uiMapTarget = 0;
}
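For context, the benchmark uses these functions in a map, write, unmap, draw pattern once per object. The sketch below only illustrates that access pattern: FakeBuffer is a CPU-only stand-in (not my BufferGLImpl), ObjectConstants and RenderFrame are hypothetical names, and no GL calls are made.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// CPU-only stand-in for the buffer class (hypothetical, not BufferGLImpl).
// It only tracks calls so the access pattern can be shown without a GL context.
struct FakeBuffer
{
    std::vector<unsigned char> storage;
    int  mapCount = 0;
    bool mapped   = false;

    explicit FakeBuffer(std::size_t size) : storage(size) {}

    // Stands in for Map(MAP_WRITE, MAP_FLAG_DISCARD, ...)
    void* MapWriteDiscard()
    {
        mapped = true;
        ++mapCount;
        return storage.data();
    }

    void Unmap() { mapped = false; }
};

// Hypothetical per-object constant block
struct ObjectConstants
{
    float worldMatrix[16];
};

// One frame of the pattern: the constant buffer is mapped with discard,
// rewritten, and unmapped before every draw. Returns the number of draws.
int RenderFrame(FakeBuffer& cb, const std::vector<ObjectConstants>& objects)
{
    int draws = 0;
    for (const ObjectConstants& obj : objects)
    {
        void* dst = cb.MapWriteDiscard();
        std::memcpy(dst, &obj, sizeof(obj));
        cb.Unmap();
        ++draws; // the real benchmark issues the draw call here
    }
    return draws;
}
```

With 32K objects this means 32K discard-maps of the same small buffer per frame, which is the case I am trying to make fast.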
Am I doing something obviously wrong here? Is there a clear way to tell OpenGL that I want to discard all previous contents of the buffer?