i use the following databases:
meshes
textures
materials
models
animations
animation players
entity types
object types
entities
objects
an entity has a model, a texture set (for different skin tones), an animation player, and a current animation.
an object has a mesh, texture, and material, or a model, an animation player, and a current animation.
models consist of meshes, textures, and materials.
i use a struct called a Zdrawinfo to pass drawing information around. It holds the primitive type (mesh, model, 2d billboard, or 3d billboard), mesh, texture, material, model, animation player, scale, rotation, translation, cull, alphatest, and clamp information, as well as a world matrix for complex transforms.
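As a rough illustration, a struct along these lines could look like the sketch below. The field names, types, the -1 "unused" convention, and the two helper functions are all assumptions made for the example; only the list of things the struct holds comes from the description above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout: names and types are guesses, not the original definition. */
typedef enum { PRIM_MESH, PRIM_MODEL, PRIM_BILLBOARD_2D, PRIM_BILLBOARD_3D } PrimType;

typedef struct {
    PrimType prim;
    int meshID, texID, matID;      /* database indices, -1 = unused (assumed convention) */
    int modelID, aniplayerID;
    float sx, sy, sz;              /* scale */
    float rx, ry, rz;              /* rotation */
    float x, y, z;                 /* translation */
    int cull, alphatest, clamp;
    int has_world;                 /* nonzero if world[] was supplied by the caller */
    float world[16];               /* world matrix for complex transforms */
} Zdrawinfo;

/* Return a drawinfo with neutral defaults: unit scale, no rotation, no IDs set. */
Zdrawinfo zdrawinfo_default(void) {
    Zdrawinfo d;
    memset(&d, 0, sizeof d);
    d.prim = PRIM_MESH;
    d.meshID = d.texID = d.matID = d.modelID = d.aniplayerID = -1;
    d.sx = d.sy = d.sz = 1.0f;
    d.cull = 1;
    return d;
}

/* Copy a template drawinfo and fill in a per-instance position. */
Zdrawinfo zdrawinfo_instance(const Zdrawinfo *tmpl, float x, float y, float z) {
    assert(tmpl != NULL);
    Zdrawinfo d = *tmpl;           /* plain struct copy; the template is untouched */
    d.x = x; d.y = y; d.z = z;
    return d;
}
```

Because the struct holds only plain values and ID numbers, copying it is a simple assignment and the copy can be modified freely without affecting the template.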
in my newest code architecture, object types and individual entities each have a Zdrawinfo with basic drawing information such as primitive type, as well as either mesh, texture, material, scale, cull, and alphatest, or which model to use.
to draw, i make a copy of the Zdrawinfo for an entity or object, fill in the transform and other info, set the model to the current animation frame and texture set (if used), then send it off to the "drawlist". The drawlist calculates the world matrix for the mesh if needed (i.e., a complex transform matrix was not supplied, just scale, rotation, and translation values). Then it adds the Zdrawinfo to a list of textures used in the scene. Each texture in the list has a list of meshes that use it. Models (rigid body) are broken down into their component meshes and added to the drawlist as separate meshes. When it comes time to draw, the drawlist draws all the meshes in order, based on texture. It includes a state manager to filter out redundant state changes.
all the databases are implemented as arrays. all references are done using ID numbers, which are the indices of the database arrays. so the first texture in the texture database has a texID of zero, and so on.
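A minimal sketch of one such array-backed database is shown below. The `Texture` record, the capacity, and the function names are hypothetical; the point is only that an ID is nothing more than an array index.

```c
#include <assert.h>
#include <string.h>

#define MAX_TEXTURES 64   /* illustrative capacity */

/* Hypothetical record; a real one would hold a handle to the loaded texture. */
typedef struct { char name[32]; int width, height; } Texture;

static Texture texdb[MAX_TEXTURES];
static int num_textures = 0;

/* Append a record and return its texID (its array index), or -1 if full. */
int texdb_add(const char *name, int w, int h) {
    assert(name != NULL);
    if (num_textures >= MAX_TEXTURES) return -1;
    Texture *t = &texdb[num_textures];
    strncpy(t->name, name, sizeof t->name - 1);
    t->name[sizeof t->name - 1] = '\0';   /* ensure termination for long names */
    t->width = w;
    t->height = h;
    return num_textures++;
}

/* Look up by ID: a bounds check plus an array index. */
const Texture *texdb_get(int texID) {
    assert(texID >= 0 && texID < num_textures);
    return &texdb[texID];
}
```

With this layout the first texture added gets texID 0, the next gets 1, and so on. Other records store the integer rather than a pointer, so they stay valid even if the array is saved to disk and reloaded.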
originally, i kept all the Zdrawinfos in the drawlist in one big list. When a Zdrawinfo was added, its range to the camera was also calculated. when it came time to draw, the list was sorted on texture, then mesh, then far to near. Later, i developed the 2D list while working on a fast way to draw large chunks of terrain containing thousands of meshes. So i modified the drawlist to use the new 2D list data structure. Right now it's so fast, i don't even bother to sort on mesh, material, or near to far. I don't sort at all, although i may go back and add sort on mesh, then material, then near to far.
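For comparison, the older single-list approach (sort on texture, then mesh, then far to near) could be sketched with the standard `qsort`. The `SortItem` struct and function names are assumptions for the example; `range` stands for the distance to the camera computed when the item was added.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sort key; a real entry would carry the whole drawinfo. */
typedef struct { int texID, meshID; float range; } SortItem;

/* Order by texture, then mesh, then far to near (larger range first). */
static int cmp_items(const void *pa, const void *pb) {
    const SortItem *a = pa;
    const SortItem *b = pb;
    if (a->texID != b->texID)
        return (a->texID > b->texID) - (a->texID < b->texID);
    if (a->meshID != b->meshID)
        return (a->meshID > b->meshID) - (a->meshID < b->meshID);
    return (a->range < b->range) - (a->range > b->range);
}

void drawlist_sort(SortItem *items, size_t n) {
    assert(items != NULL || n == 0);
    if (n > 1) qsort(items, n, sizeof items[0], cmp_items);
}
```

This costs a full sort every frame, which is the work the texture-bucketed 2D list avoids by grouping items as they are added.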
the data in the Zdrawinfos in the drawlist, especially the old single list version that got sorted, is similar to your "rendertokens".
in my case the basic drawing info is stored in the entities database or the object types database, which in your case would be a "drawinfo" component in your entities.
then you would proceed along the same lines that i do. pass your drawinfo with your pointers in it to your renderer, and let it do its thing.
it's possible that using pointers instead of ID numbers and arrays, or some other aspect of C++-specific syntax, is causing your difficulties.
I see questions like this all the time. Sometimes i wonder how anyone gets anything done with such difficulties.
As you've probably surmised by now, i don't have these issues because i don't use C++ syntax for such things.