I see this differently; maybe I've been thinking of this concept wrong, but to me a renderable is an object such as a Quad.
Well, you can see it however you want. What I'm saying is that your current definition, as implied by the sample code you showed and the problems you have with it, is part of what is limiting your design. Sure, you can define a renderable as a single quad, but that's not a very useful abstraction in general.
For an instance of SpriteBatcher, you have:
So here you have a vertex buffer with a particular format. You can describe the format with data as well, as simple as some flags or as complex as a class with offsets and attribute types inside. You have a VAO, buffer objects for the index and vertex buffers, a shader program. All data.
struct RenderCommand { // "Renderable," or "RenderObject," or "DrawCommand," or "Drawable," or "RenderJob," or...
    VertexFormatFlags vertexFormatFlags;
    const void* vertexData;
    const void* indexData;
    int vertexBufferId;
    int indexBufferId;
    int arrayObjectId;
    int programId;
    std::map<std::string, Uniform> programUniforms; // including the world matrix, et cetera
    std::vector<Texture*> textures;
    // ...other stuff...
};
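The "describe the format with data" idea mentioned above could be as simple as the following sketch. Every name here (VertexAttrib, VertexFormat, the example factory function) is purely illustrative, not from any particular engine or API:

```cpp
#include <cstddef>
#include <vector>

// A hypothetical data-only description of one vertex attribute.
struct VertexAttrib {
    int location;      // shader attribute index
    int components;    // e.g. 3 for a vec3 position
    std::size_t offset; // byte offset within one vertex
};

// The whole format is just a stride plus a list of attributes --
// plain data that a single generic code path can consume.
struct VertexFormat {
    std::size_t stride; // bytes per vertex
    std::vector<VertexAttrib> attribs;
};

// Example: position (vec3) + texcoord (vec2), tightly packed floats.
inline VertexFormat makePosUvFormat() {
    VertexFormat fmt;
    fmt.stride = 5 * sizeof(float);
    fmt.attribs.push_back({0, 3, 0});
    fmt.attribs.push_back({1, 2, 3 * sizeof(float)});
    return fmt;
}
```

One generic routine can then walk `attribs` and make the appropriate vertex-attribute setup calls, instead of each system hard-coding its own.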
You can extend this with whatever other data you find yourself needing; often this includes state information, such as whether culling should be enabled. Whatever.
This structure represents a logical draw call: all the data necessary for you to make that call, all the state you must set. The system that consumes a big list of these can use that information to sort the calls appropriately; sort anything flagged as transparent back-to-front, for example, and ensure it all renders after the solid objects. It can sort by shared textures or shared shaders, so as to avoid making excess state changes.

This is both faster and completely sidesteps the problem you initially illustrated, where things go all pear-shaped if you forget to call your "end()" method. You don't need any sort of "end" method any more: the RenderSystem has all the requested draw calls as a big list of these RenderCommands, and all the logic to make sure they're ordered and sequenced properly, along with their appropriate API state changes, lives in there, in one place.
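As a concrete sketch, that sorting can be a single comparison function over the command list. The trimmed-down fields and the particular ordering rules below are my assumptions, just to show the shape of it:

```cpp
#include <algorithm>
#include <vector>

// Trimmed-down command with only the fields the sort cares about.
struct RenderCommand {
    bool  transparent;
    int   programId;
    float depth; // view-space depth, used for back-to-front ordering
};

// Solids first (grouped by shader to minimize program switches),
// then transparents back-to-front.
inline void sortCommands(std::vector<RenderCommand>& cmds) {
    std::sort(cmds.begin(), cmds.end(),
              [](const RenderCommand& a, const RenderCommand& b) {
        if (a.transparent != b.transparent)
            return !a.transparent;        // solids before transparents
        if (a.transparent)
            return a.depth > b.depth;     // back-to-front
        return a.programId < b.programId; // group by shader
    });
}
```

A real implementation would likely pack these criteria into a single integer sort key per command, but the effect is the same.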
Continuing on to your particle emitters: you point out a need for two VAOs and two shaders. You can't bind two shaders at once, so a particle emitter will naturally involve two RenderCommands, one for each VAO/shader pair. One is for the "update particles" call, the other is for the "render particles" call. Here you will need some means of ensuring that those two commands always execute in the right order, and add that to your RenderCommand. One way is to give each command a unique (per-frame) "job ID" and a "dependency job ID," which let the sorting routines in RenderSystem ensure one always executes first, if this isn't already implicit in the sorting based on your shaders. Another is to allow a "bucket" to be specified when you insert a RenderCommand into the RenderSystem, prevent the RenderSystem from reordering the buckets relative to each other (bucket 0 always executes before bucket 1), and simply insert the two RenderCommands into different buckets. Et cetera.
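The bucket approach fits in a few lines. The class shape and method names here are hypothetical, just to make the ordering guarantee concrete:

```cpp
#include <cstddef>
#include <vector>

struct RenderCommand { int programId; };

// Commands in bucket 0 always run before bucket 1, and so on.
// Sorting may reorder *within* a bucket, never across buckets.
class RenderSystem {
public:
    void submit(std::size_t bucket, const RenderCommand& cmd) {
        if (bucket >= buckets_.size()) buckets_.resize(bucket + 1);
        buckets_[bucket].push_back(cmd);
    }

    // Flatten the buckets in order to produce the final call list.
    std::vector<RenderCommand> ordered() const {
        std::vector<RenderCommand> out;
        for (const auto& b : buckets_)
            out.insert(out.end(), b.begin(), b.end());
        return out;
    }

private:
    std::vector<std::vector<RenderCommand>> buckets_;
};
```

The particle emitter would submit its "update" command to an earlier bucket than its "render" command, and the ordering takes care of itself regardless of submission order.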
In the case of the first command, it's configured such that its geometry shader writes to an output buffer, and the ID of that output buffer is used as the ID of an input buffer in the second command.
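In code, that wiring might look like the sketch below. The `feedbackBufferId` field is my invention, standing in for however you name the output buffer (e.g. a transform feedback buffer) on your command:

```cpp
// Chaining the emitter's two commands: the update pass writes into a
// buffer that the render pass then reads. Field names are assumptions
// layered on the RenderCommand idea above.
struct RenderCommand {
    int vertexBufferId   = -1; // input geometry
    int feedbackBufferId = -1; // output buffer, if any
};

struct EmitterCommands {
    RenderCommand update;
    RenderCommand render;
};

inline EmitterCommands makeEmitterCommands(int srcBuffer, int dstBuffer) {
    EmitterCommands ec;
    ec.update.vertexBufferId   = srcBuffer; // read last frame's particles
    ec.update.feedbackBufferId = dstBuffer; // write updated particles here
    ec.render.vertexBufferId   = dstBuffer; // draw what was just written
    return ec;
}
```

In a double-buffered setup you'd swap `srcBuffer` and `dstBuffer` each frame; the commands themselves stay plain data either way.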
That's it. All of the stuff you're doing in these disparate sprite and particle systems is capturable as data, as values of fields on a generic structure (or set of structures) that can be processed generically, avoiding repetition, increasing maintainability, probably improving performance, and eliminating the kind of state-cooperation errors you encounter with a system where every kind of thing manages its own entire render path.
Now your SpriteBatcher still needs to manage a set of vertex buffers, but whenever you ask it for a new sprite, it modifies the buffer and returns a RenderCommand pertaining to that buffer. If it happens to be unable to fit a sprite in the existing buffer, it creates a new one and returns a RenderCommand for that one instead. Either way, you simply take all its render commands and insert them into the RenderSystem. Similarly for the particle system: you ask it for a new particle emitter, it maybe allocates some new buffers or resources if it needs them, but ultimately it just returns two RenderCommands appropriately constructed for insertion into the RenderSystem.
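A minimal sketch of that SpriteBatcher behavior, with the actual buffer work faked out; the fixed capacity, the ID scheme, and six-vertices-per-quad are all assumptions for illustration:

```cpp
struct RenderCommand { int vertexBufferId; int vertexCount; };

// The batcher no longer draws anything; it just maintains buffers and
// hands back RenderCommands describing them.
class SpriteBatcher {
public:
    explicit SpriteBatcher(int capacity) : capacity_(capacity) {}

    // Adds a sprite, "allocating" a fresh buffer when the current one
    // fills up, and returns the command for the affected buffer.
    RenderCommand addSprite() {
        if (spritesInBuffer_ == capacity_) {
            ++currentBuffer_;   // stand-in for creating a new GPU buffer
            spritesInBuffer_ = 0;
        }
        ++spritesInBuffer_;
        return {currentBuffer_, spritesInBuffer_ * 6}; // 6 verts per quad
    }

private:
    int capacity_;
    int currentBuffer_  = 0;
    int spritesInBuffer_ = 0;
};
```

The caller never touches GL state; it just forwards whatever commands the batcher returns into the RenderSystem.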
The idea is simply to take all the low-level OpenGL state-manipulation that you used to be doing in both places, factor it out by representing the differences as data, and move it into a single uniform place. Build on top of that single uniform place for the parts of the thing that need to be specific, such as the fact that certain specific shaders must be used for rendering or updating the particles.