If you have lots of questions, you can just compile and try Ogre 2.1, then dissect its source code to see how we're handling it. It's open source after all.
Doing what I'm saying is not impossible, otherwise we wouldn't be doing it.
To answer The Chubu's question, glDrawElementsInstancedBaseVertexBaseInstance has THREE key parameters:
- baseInstance: With this I can send an arbitrary index as I explained, which I can use to index whatever I want from a constant (UBO) or texture (TBO) buffer. I can even perform multiple indirections (retrieve an index from a UBO using the index from baseInstance).
- baseVertex: With this I can store as many meshes as I want in the same buffer object; and select them individually by providing the offset location to the start of the mesh I want to render. With this, I don't need to alter state at all (unless vertex format changes). The meshes don't even need to be contiguous in memory; they just need to be in the same Buffer Object and aligned to the vertex size.
- indices: With this I can store as many meshes' index data as I want in the same buffer object, and select them individually by providing the offset location to the start of the index data. Remember to keep alignment to 4 bytes. Bonus points: You can keep the vertex and index data in the same buffer object.
The DX11 equivalent of this is DrawIndexedInstanced, and the analogous parameters are StartInstanceLocation, BaseVertexLocation and StartIndexLocation respectively.
We treat all of our draws with these functions.
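To make the three parameters concrete, here is a rough sketch of how they could be derived for one mesh stored in a shared buffer. This is not Ogre's actual code: MeshSlot, DrawParams and makeDrawParams are names I made up for illustration, and the sketch assumes the vertex offset is a multiple of the vertex stride. Note the unit difference: GL's indices parameter is a byte offset passed as a pointer, while DX11's StartIndexLocation is counted in indices.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of where one mesh lives inside a shared pool buffer.
struct MeshSlot {
    std::size_t   vertexByteOffset; // start of the mesh's vertices (multiple of vertexStride)
    std::size_t   vertexStride;     // bytes per vertex
    std::size_t   indexByteOffset;  // start of the mesh's index data
    std::size_t   indexSize;        // 2 for 16-bit indices, 4 for 32-bit
    std::uint32_t indexCount;       // number of indices to draw
};

// Hypothetical bundle of the values handed to the draw call.
struct DrawParams {
    std::uint32_t indexCount;
    std::size_t   glIndicesByteOffset;   // GL: 'indices', a byte offset cast to a pointer
    std::uint32_t d3dStartIndexLocation; // DX11: StartIndexLocation, in index units
    std::int32_t  baseVertex;            // GL basevertex / DX11 BaseVertexLocation
    std::uint32_t baseInstance;          // GL baseinstance / DX11 StartInstanceLocation
};

inline DrawParams makeDrawParams(const MeshSlot& m, std::uint32_t perDrawIndex) {
    DrawParams p{};
    p.indexCount            = m.indexCount;
    p.glIndicesByteOffset   = m.indexByteOffset;
    p.d3dStartIndexLocation = static_cast<std::uint32_t>(m.indexByteOffset / m.indexSize);
    p.baseVertex            = static_cast<std::int32_t>(m.vertexByteOffset / m.vertexStride);
    p.baseInstance          = perDrawIndex; // arbitrary index used to look up per-draw data
    return p;
}

// With a live context and the pool's buffers bound, the calls would look like:
//   glDrawElementsInstancedBaseVertexBaseInstance(
//       GL_TRIANGLES, p.indexCount, GL_UNSIGNED_SHORT,
//       reinterpret_cast<const void*>(p.glIndicesByteOffset),
//       1, p.baseVertex, p.baseInstance);
//   context->DrawIndexedInstanced(
//       p.indexCount, 1, p.d3dStartIndexLocation, p.baseVertex, p.baseInstance);
```

For example, a mesh whose vertices start at byte 3200 with a 32-byte stride gets baseVertex 100, and 16-bit index data starting at byte 1024 gets StartIndexLocation 512.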
The DX11 function works on DX10 hardware just fine. glDrawElementsInstancedBaseVertexBaseInstance was introduced in GL 4.2; however, it is available on GL3 hardware via an extension (ARB_base_instance). The most notable caveat is that OS X doesn't support this extension at the time of writing.
The end result is that we just map the buffer(s) once; write all the data in sequence; bind these buffers and then issue a lot of consecutive glDrawElementsInstancedBaseVertexBaseInstance / DrawIndexedInstanced calls without any other API calls in between.
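The "write everything in sequence" part can be sketched on the CPU side like this. Again, this is only an illustration under my own assumptions, not what Ogre does verbatim: PerDrawData, Renderable, DrawCmd and writeSequential are hypothetical names, and the mapped pointer stands in for whatever mapping the buffer once per frame returns. The slot each object's data lands in becomes the baseInstance of its draw.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-draw payload, e.g. a world matrix.
struct PerDrawData { float worldMatrix[16]; };

// Hypothetical object queued for rendering.
struct Renderable { std::uint32_t meshId; PerDrawData data; };

// Hypothetical recorded draw: which mesh, and which slot holds its data.
struct DrawCmd { std::uint32_t meshId; std::uint32_t baseInstance; };

// Copies each object's data into consecutive slots of the mapped buffer and
// records one draw per object. Stops when the buffer section is full; real
// code would bind the next buffer section and continue from there.
inline std::vector<DrawCmd> writeSequential(const std::vector<Renderable>& items,
                                            PerDrawData* mapped,
                                            std::size_t capacity) {
    std::vector<DrawCmd> cmds;
    cmds.reserve(items.size());
    for (const Renderable& r : items) {
        const std::size_t slot = cmds.size();
        if (slot == capacity)
            break;
        mapped[slot] = r.data;
        cmds.push_back({r.meshId, static_cast<std::uint32_t>(slot)});
    }
    return cmds;
}
```

After this runs, the recorded commands can be issued back to back, since the data they need is already in the buffer.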
We only need to perform additional API calls when:
- We need to bind a different buffer / buffer section (i.e. we've exhausted the 64 KB limit)
- We need to change state (shaders, vertex format, blending modes, rasterizer states; we keep them sorted to reduce this)
- We're using more than one mesh pool (pool = a buffer where we store all our meshes together), and the next mesh is stored in another pool (we sort by pools though, in order to reduce this switching).
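To show why the sorting matters, here is a small sketch that sorts queued draws by state and then by pool, and counts how many times state or pool would have to be switched. The names (QueuedDraw, SubmitStats, submitSorted) and the choice of sort key are assumptions for illustration; the real sort key may well be ordered differently.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical queued draw: ids standing in for real state and pool objects.
struct QueuedDraw { std::uint32_t stateId; std::uint32_t poolId; std::uint32_t meshId; };

struct SubmitStats { int stateChanges = 0; int poolBinds = 0; int draws = 0; };

// Sorts by (state, pool, mesh) and walks the list, counting the extra API
// calls needed: one per state change and one per pool switch. Everything
// else is a plain draw call with no other API calls in between.
inline SubmitStats submitSorted(std::vector<QueuedDraw> draws) {
    std::sort(draws.begin(), draws.end(),
              [](const QueuedDraw& a, const QueuedDraw& b) {
                  return std::tie(a.stateId, a.poolId, a.meshId) <
                         std::tie(b.stateId, b.poolId, b.meshId);
              });

    SubmitStats stats;
    bool first = true;
    std::uint32_t curState = 0;
    std::uint32_t curPool = 0;
    for (const QueuedDraw& d : draws) {
        if (first || d.stateId != curState) {
            ++stats.stateChanges; // would set shaders, blending, etc. here
            curState = d.stateId;
        }
        if (first || d.poolId != curPool) {
            ++stats.poolBinds;    // would bind the other pool's buffers here
            curPool = d.poolId;
        }
        first = false;
        ++stats.draws;            // would issue the draw call here
    }
    return stats;
}
```

With five draws spread over two states and two pools, this ends up with 2 state changes and 3 pool binds instead of potentially switching on every draw.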