I have read several articles about using VBOs and VAOs on today's hardware, but due to mixed information, I'm not sure what the bottom line is. For example, http://www.opengl.org/wiki/VBO_-_more is instructive, but I suspect others may have differing opinions on these matters.
In my game, I usually have models with approximately 300-1000 polygons and 1-3 textures. Each model also has a few lower-detail versions, which don't share the vertex data but usually do share the textures. Additionally, some models (e.g. the player) can have 20k polygons and maybe 5 textures.
In my current implementation, a mesh is divided into groups by texture. Each group has a separate VBO for each attribute (vertex positions, normals, texture coordinates etc.) plus an index buffer. After culling, I have a bunch of mesh groups, which I sort first by texture and then by VBO, and draw with glDrawElements. The vertex data is always static and all animation is done in shaders.
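The sorting step described above can be sketched roughly as follows. This is a simplified illustration under assumptions: `DrawGroup`, its fields, `sortForDrawing` and `countTextureBinds` are hypothetical names, each group is reduced to a single VBO id for brevity, and the actual GL calls are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical record for one visible mesh group (names are illustrative).
struct DrawGroup {
    std::uint32_t texture;       // GL texture object name
    std::uint32_t vertexBuffer;  // GL buffer object name (one per group, for brevity)
    std::uint32_t indexCount;    // number of indices to pass to glDrawElements
};

// Order groups by texture first, then by VBO, so that consecutive draws
// share as much bound state as possible.
inline void sortForDrawing(std::vector<DrawGroup>& groups) {
    std::sort(groups.begin(), groups.end(),
              [](const DrawGroup& a, const DrawGroup& b) {
                  return std::tie(a.texture, a.vertexBuffer) <
                         std::tie(b.texture, b.vertexBuffer);
              });
}

// Count the texture binds a draw loop would issue if it rebinds only when
// the texture differs from the previous group's texture.
inline std::size_t countTextureBinds(const std::vector<DrawGroup>& groups) {
    std::size_t binds = 0;
    for (std::size_t i = 0; i < groups.size(); ++i) {
        if (i == 0 || groups[i].texture != groups[i - 1].texture) {
            ++binds;
        }
    }
    return binds;
}
```

The draw loop then walks the sorted list and only rebinds a texture or buffer when it differs from the previous group's.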
A few questions:
If a group has, say, 4000 polygons (which could happen for the player models), is it advisable to call glDrawElements once for the whole chunk, or should I cut it into pieces? I read about some "cache pressure" kicking in with large chunks, but I don't understand what it means. I have experienced some hiccups with 20k models, but just dividing the draw calls didn't seem to help.
Should I put all vertex data into a single VBO? For each group or for the whole mesh? I read that buffers of 1-4 MiB are preferred on some hardware, so should I go further and implement some sort of general VBO allocator, so that the data of several meshes is pushed into one buffer? For a 4 MiB VBO and 1000-polygon meshes, I can see how this would reduce bindings, if all meshes use the same types of attributes.
Should I use interleaved arrays? I remember that there has been some controversy about this issue. It probably depends on the local coherence of the vertex / index data.
Any other considerations?
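To make the single-VBO and interleaving questions concrete, here is a minimal sketch of what they could look like on the CPU side. Everything in it is an assumption for illustration: `Vertex`, `Range` and `BufferArena` are made-up names, the layout assumes 4-byte floats, and the arena only hands out byte offsets into one large buffer (it never frees and never touches GL).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical interleaved vertex: all attributes of one vertex are stored
// next to each other instead of in separate per-attribute VBOs.
struct Vertex {
    float position[3];
    float normal[3];
    float texCoord[2];
};

// With interleaving, every attribute shares one stride and differs only in
// its byte offset; these are the values glVertexAttribPointer would receive.
constexpr std::size_t kStride = sizeof(Vertex);
constexpr std::size_t kNormalOffset = offsetof(Vertex, normal);
constexpr std::size_t kTexCoordOffset = offsetof(Vertex, texCoord);

// Byte range occupied by one mesh inside the shared buffer.
struct Range {
    std::size_t offset;
    std::size_t size;
};

// Simple bump allocator over a fixed-size buffer (e.g. one 4 MiB VBO).
class BufferArena {
public:
    explicit BufferArena(std::size_t capacity) : capacity_(capacity), used_(0) {}

    // Reserves `size` bytes starting at a multiple of `alignment`.
    // Returns false and leaves `out` untouched if the range does not fit.
    bool allocate(std::size_t size, std::size_t alignment, Range& out) {
        const std::size_t start = (used_ + alignment - 1) / alignment * alignment;
        if (start > capacity_ || size > capacity_ - start) {
            return false;
        }
        used_ = start + size;
        out = Range{start, size};
        return true;
    }

    std::size_t used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t used_;
};
```

Aligning each mesh to `sizeof(Vertex)` keeps `offset / kStride` a whole number, so it could serve as the base vertex for a call such as glDrawElementsBaseVertex, and meshes sharing the same vertex format could then be drawn without rebinding the buffer.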