So, maxgpgpu, let me get this straight. You have the most powerful hardware on the planet available to you - your GPU - yet you're flat-out refusing to use it for the kind of processing it's best at. You have well-known, tried-and-tested solutions for collision and bboxes that have been proven to work for over a decade, yet you're also flat-out refusing to use them. You have a VBO solution that involves re-uploading verts to arbitrary positions in the buffer in an unpredictable manner - congratulations, you've just re-invented the worst-case scenario for VBO usage - you really should profile for pipeline stalls sometime real soon.
None of this is theoretical fairyland stuff. This is all Real, this is all used in Real programs that Real people use every hour of every day of every week. You, on the other hand, have a whole bunch of theoretical fairyland stuff. It's not the case that you're a visionary whose ideas are too radical for the conservative majority to accept. It is the case that your ideas are insane.
Am I missing anything here?
Look, I don't know everything. I might be making mistakes, maybe serious ones. However, I do have reasons for every choice I make. Good reasons? Hopefully most of the time. But maybe not, and I'm totally open to that; my reason for posting this thread was to scare up ideas that help me brainstorm. I can say one thing: in the usual cases, the engine is plenty fast. As I explained in this thread, for those cases, my approach is inherently GPU limited (or more exactly, neither CPU nor GPU limited).
And hey, I love having the GPU do instancing. I never criticized instancing. I know, because I never would. I also love having the GPU perform bumpmapping and parallax-mapping via relaxed cone-step mapping with self-shadowing to generate tons of fine detail without needing zillions of micro-triangles.
You talk about pipeline stalls like I never thought about that issue (one of many issues that interact, unfortunately). Yet my approach draws many objects with unlimited vertices with a single glDrawElements() call. And *****I***** don't care about pipeline stalls? Meanwhile, the conventional way draws only one object per draw call because it needs to change transformation matrices (and sometimes other state) between objects. To be honest, if I have one worry about my approach, it is that I pay too much attention to avoiding pipeline stalls.
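For anyone following along, here is a rough CPU-side sketch of what "many objects in one glDrawElements() call" can look like. It is not my engine code, just an illustration under assumptions: the names Mesh and build_batch are made up, the per-object transform is reduced to a plain translation (a real engine would apply a full matrix), and the GL calls appear only in comments. The point is that vertices are transformed to world space on the CPU and each object's indices are shifted by the number of vertices already in the shared buffer.

```python
from dataclasses import dataclass


@dataclass
class Mesh:
    # Hypothetical container: local-space vertices, triangle indices,
    # and a translation standing in for the full transformation matrix.
    vertices: list
    indices: list
    offset: tuple


def build_batch(meshes):
    """Merge several meshes into one vertex list and one index list."""
    batch_vertices = []
    batch_indices = []
    for mesh in meshes:
        # Indices of this mesh must point past everything already batched.
        base = len(batch_vertices)
        ox, oy, oz = mesh.offset
        # Transform to world space on the CPU (translation only here).
        for x, y, z in mesh.vertices:
            batch_vertices.append((x + ox, y + oy, z + oz))
        batch_indices.extend(base + i for i in mesh.indices)
    return batch_vertices, batch_indices


triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
a = Mesh(triangle, [0, 1, 2], (0.0, 0.0, 0.0))
b = Mesh(triangle, [0, 1, 2], (10.0, 0.0, 0.0))

vertices, indices = build_batch([a, b])

# Conventional path: set a matrix and issue one draw call per object
# (2 calls here). Batched path: upload `vertices` and `indices` once,
# then issue a single glDrawElements() covering len(indices) indices.
print(len(vertices), indices)
```

The trade-off is the one being argued in this thread: the batched path saves draw calls and state changes, but any object that moves has to have its vertices re-transformed and re-uploaded.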