Some background: right now, my engine uses a loose octree for culling. Naturally, whole octree subhierarchies can be rejected if they are outside the view frustum, but the objects that are found to be visible are treated as individual entities almost to the end of the CPU part of the rendering pipeline. Only when the drawcalls for the visible scene are finally formed are those using the same material, the same vertex buffer, the same set of influencing lights etc. combined into instancing groups, which basically list the GPU state to use plus the transform matrices for all instances.
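For illustration, the late batching step described above could look roughly like the following. This is only a sketch under assumptions, not the engine's actual code: the types `DrawCall`, `StateKey`, `InstancingGroup` and the function `buildInstancingGroups` are hypothetical names, and the "same lights" condition is assumed to be representable as a bitmask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for engine types (assumptions, not a real API).
struct Matrix3x4 { float m[12]; };

struct DrawCall {
    std::uint32_t materialId;      // material / shader state
    std::uint32_t vertexBufferId;  // geometry
    std::uint64_t lightMask;       // assumed: one bit per light affecting the object
    Matrix3x4 worldTransform;
};

// Everything that must match before two drawcalls can share one instanced draw.
struct StateKey {
    std::uint32_t materialId;
    std::uint32_t vertexBufferId;
    std::uint64_t lightMask;
    bool operator<(const StateKey& o) const {
        return std::tie(materialId, vertexBufferId, lightMask) <
               std::tie(o.materialId, o.vertexBufferId, o.lightMask);
    }
};

struct InstancingGroup {
    StateKey state;                      // GPU state shared by all instances
    std::vector<Matrix3x4> transforms;   // one transform per instance
};

// Runs once per frame over the visible drawcalls; the grouping is
// "re-detected" from scratch every time, which is the wasteful part.
std::vector<InstancingGroup> buildInstancingGroups(const std::vector<DrawCall>& visible) {
    std::map<StateKey, std::size_t> indexOf;  // state key -> index into groups
    std::vector<InstancingGroup> groups;
    for (const DrawCall& dc : visible) {
        StateKey key{dc.materialId, dc.vertexBufferId, dc.lightMask};
        auto it = indexOf.find(key);
        if (it == indexOf.end()) {
            it = indexOf.emplace(key, groups.size()).first;
            groups.push_back(InstancingGroup{key, {}});
        }
        assert(it->second < groups.size());
        groups[it->second].transforms.push_back(dc.worldTransform);
    }
    return groups;
}
```

Groups come out in order of first appearance, so any sorting done earlier on the visible list is preserved.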
Using this system, hardware instancing happens automatically when possible, so the user of the engine doesn't have to care about it at all. However, when dealing with high object counts, it is somewhat wasteful to treat the objects as individuals so far into the pipeline, and to "re-detect" each frame that these objects indeed belong to the same instancing group.
I was wondering: is anyone here using some kind of dynamic grouping system that forms groups of objects automatically at runtime? For example, let's say there are 5 tree objects next to each other (with the same material, VB etc.). Their bounding boxes would be combined into a group bounding box, and they would be treated as a group from there onwards, until something changes that invalidates the group (for example, one of them changes LOD, or a dynamic light starts shining on one of them).
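To make the idea concrete, here is a minimal sketch of such a group: a merged bounding box plus the state all members must share. Everything here is an assumption for illustration (`AABB`, `GroupKey`, `Object`, `ObjectGroup`, `makeGroup` and `stillValid` are hypothetical names), and it assumes the members do not move once grouped, so the merged box stays correct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Axis-aligned bounding box (hypothetical helper type).
struct AABB {
    float min[3];
    float max[3];
    // Grow this box so that it also encloses o.
    void merge(const AABB& o) {
        for (int i = 0; i < 3; ++i) {
            min[i] = std::min(min[i], o.min[i]);
            max[i] = std::max(max[i], o.max[i]);
        }
    }
};

// State that must be identical for all members of a group.
struct GroupKey {
    std::uint32_t materialId;
    std::uint32_t vertexBufferId;
    std::uint32_t lodLevel;
    std::uint64_t lightMask;   // assumed: one bit per light affecting the object
    bool operator==(const GroupKey& o) const {
        return materialId == o.materialId && vertexBufferId == o.vertexBufferId &&
               lodLevel == o.lodLevel && lightMask == o.lightMask;
    }
};

struct Object {
    GroupKey key;
    AABB bounds;
};

struct ObjectGroup {
    GroupKey key{};
    AABB bounds{};                        // union of all member boxes
    std::vector<const Object*> members;   // non-owning
};

// Forms a group from objects the caller already found to be close together.
ObjectGroup makeGroup(const std::vector<const Object*>& objs) {
    assert(!objs.empty());
    ObjectGroup g;
    g.key = objs.front()->key;
    g.bounds = objs.front()->bounds;
    for (const Object* o : objs) {
        assert(o->key == g.key);          // only identical state may be grouped
        g.bounds.merge(o->bounds);
        g.members.push_back(o);
    }
    return g;
}

// The recurring cost: a group is valid only while every member still
// matches the key it was formed with (same LOD, same lights, ...).
bool stillValid(const ObjectGroup& g) {
    return std::all_of(g.members.begin(), g.members.end(),
                       [&g](const Object* o) { return o->key == g.key; });
}
```

While a group is valid, culling only needs to test `bounds` once for all members. Note that `stillValid` as written visits every member, so polling it each frame costs about as much as handling the objects individually; a real system would presumably want the objects to notify their group on change instead.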
What I'm afraid of is that the management overhead for the groups (namely, having to check that they remain valid once they're formed) would consume more CPU than just brute-forcing the objects as individuals right to the end of the pipeline.
Any ideas or hints?