And don't forget, when working in batches, to reserve enough space for everything in advance.
Growing a vector past its capacity is potentially a slow operation, since every element has to be moved or copied into new storage, and this is worse when the objects are not trivially movable. Even when they are easily moved, code that triggers multiple reallocations can take a huge performance hit.
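To make that concrete, here is a rough sketch of what I mean. The SpriteVertex layout and the fillBatch helper are made up for illustration (they are not the batch class from earlier in the thread), and four vertices per sprite is just an assumption. The helper counts how often the vector had to reallocate while the batch was being filled:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vertex layout for a 2D sprite (an assumption, not from the thread).
struct SpriteVertex {
    float x, y;  // position
    float u, v;  // texture coordinates
};

// Appends four corner vertices per sprite and returns how many times the
// vector's storage was reallocated while doing so.
std::size_t fillBatch(std::vector<SpriteVertex>& vertices,
                      std::size_t spriteCount,
                      bool reserveFirst) {
    const std::size_t kVerticesPerSprite = 4;

    if (reserveFirst) {
        // One allocation up front, sized for the whole batch.
        vertices.reserve(vertices.size() + spriteCount * kVerticesPerSprite);
    }

    std::size_t reallocations = 0;
    std::size_t lastCapacity = vertices.capacity();

    for (std::size_t i = 0; i < spriteCount; ++i) {
        const float left = static_cast<float>(i);
        for (std::size_t corner = 0; corner < kVerticesPerSprite; ++corner) {
            // Corner bits: bit 0 selects the x side, bit 1 selects the y side.
            const float cx = static_cast<float>(corner & 1u);
            const float cy = static_cast<float>((corner >> 1) & 1u);
            vertices.push_back({left + cx, cy, cx, cy});

            // A capacity change means the storage was reallocated and every
            // existing element was moved or copied.
            if (vertices.capacity() != lastCapacity) {
                ++reallocations;
                lastCapacity = vertices.capacity();
            }
        }
    }
    return reallocations;
}
```

With reserveFirst set to true on an empty vector, the function reports zero reallocations, because push_back never reallocates while the size stays within the reserved capacity. Without the reserve, the count depends on your standard library's growth factor, but it will be at least one.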
See, this is the tiny dilemma that I have. I want to improve my sprite batcher, as it currently just works in immediate mode (no sorting, no support for changing shaders, etc.). But I do not know which is worse:
Creating and allocating additional VBOs on the GPU whenever I do not have a "free" VBO
OR
Creating one large VBO and having something like the batch class above.
Then, when I do not have a "free" (unused) batch, I create a new one and add it to some vector.
It boils down to using up more GPU memory or more system memory.
Keep in mind that I am targeting mobile devices using OpenGL ES 2.0, where I do not have buffer-mapping functions for writing data directly into a VBO.