I am looking for a technique to render my tree models efficiently. The model has 4,000-5,000 vertices, but repeats them for a maximum of 20,000 vertices using model transforms (one for each tree limb). Normally I would use hardware instancing to render these efficiently without lots of separate draw calls. However, I have many trees, and each of these is already hardware instanced (position, rotation). In XNA 4 I cannot combine hardware instances "to multiply them up", as it seems to support only one instance buffer per draw call.
I can copy the model transforms into an array, and then repeat that array for each of the 30-40 visible trees, to form one large dynamic instance buffer. But I wonder whether this is any more efficient than baking the transforms into the model (even though the model would then be 20,000 vertices in size).
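To make the "large dynamic instance buffer" idea concrete, here is a rough CPU-side sketch of what I have in mind. It is plain Python rather than XNA code, and everything in it is an assumption for illustration: the helper names (`translation`, `multiply`, `flatten_instances`), the limb count of 4, the 35 visible trees, and the use of simple translations instead of real limb and tree transforms. It follows the row-vector convention that XNA's `Matrix` uses, so the limb transform is applied first and the tree transform second.

```python
# Sketch only: builds the flattened per-limb, per-tree instance list on the CPU.
# Matrices are 4x4 row-major lists used with row vectors (v * M).

def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def translation(x, y, z):
    m = identity()
    m[3][0], m[3][1], m[3][2] = x, y, z  # translation lives in the last row
    return m

def multiply(a, b):
    # Row-vector convention: a is applied first, then b.
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def flatten_instances(limb_transforms, tree_transforms):
    """Return one world matrix per (tree, limb) pair, in tree-major order."""
    return [multiply(limb, tree)
            for tree in tree_transforms
            for limb in limb_transforms]

# Hypothetical data: 4 limbs per tree, 35 visible trees.
limbs = [translation(0.0, float(i), 0.0) for i in range(4)]
trees = [translation(10.0 * i, 0.0, 0.0) for i in range(35)]
instances = flatten_instances(limbs, trees)

BYTES_PER_MATRIX = 16 * 4  # sixteen 32-bit floats per instance
print(len(instances), "instances,",
      len(instances) * BYTES_PER_MATRIX, "bytes per upload")
```

The resulting list is what would be uploaded to the single instance stream, and it has to be rebuilt whenever the set of visible trees changes.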
With a hardware instancing approach to rendering complex meshes, is it better to bake in the model transforms and use the only available hardware instance stream to repeat the whole mesh, or is it better to use the hardware instance stream to render the one mesh efficiently, and then do repeated render calls for each of the visible trees? (I am using imposters to view large numbers of trees, but I want to push out the transition distance to prevent popping, which is why I am considering rendering more tree meshes than I have done before.)
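For what it is worth, here is my back-of-the-envelope comparison of the options as a small Python sketch. All of the figures are assumptions: a single 5,000-vertex limb mesh repeated 4 times per tree (giving the 20,000 total), and 40 visible trees. If the model really contains several distinct limb meshes, the draw call counts for the instanced-limb options would be multiplied accordingly.

```python
# Rough comparison only; every number here is an assumption for illustration.
VERTS_PER_LIMB_MESH = 5000   # upper end of the base mesh size
LIMBS_PER_TREE = 4           # hypothetical: 4 x 5,000 = 20,000 vertices per tree
VISIBLE_TREES = 40

def summarize(name, draw_calls, instances_per_call, verts_per_instance):
    # Vertices the vertex shader processes per frame for this option.
    verts = draw_calls * instances_per_call * verts_per_instance
    return {"option": name, "draw_calls": draw_calls, "vertices_shaded": verts}

options = [
    summarize("baked mesh, trees instanced",
              1, VISIBLE_TREES, VERTS_PER_LIMB_MESH * LIMBS_PER_TREE),
    summarize("limbs instanced, one draw per tree",
              VISIBLE_TREES, LIMBS_PER_TREE, VERTS_PER_LIMB_MESH),
    summarize("flattened limb-and-tree instance buffer",
              1, LIMBS_PER_TREE * VISIBLE_TREES, VERTS_PER_LIMB_MESH),
]
for row in options:
    print(row)
```

Under these assumptions the vertex shader processes the same number of vertices in every case, so the options seem to differ mainly in draw call count, vertex buffer size, and how much instance data has to be uploaded each frame.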
Any advice greatly appreciated.