Thanks for your suggestions. Alas, I'm still unable to find the culprit. Here are my findings:
1) My loading and rendering code for X file meshes appears virtually identical to the example code that comes with the DX8.1 SDK, except for minor, clearly dismissible differences.
2) When I strip my code so that only the mesh is rendered, I get 90 FPS while the DX8.1 example code gets around 1800 FPS.
3) When I render nothing but a blank screen, my FPS shoots up past 7000, telling me that it's the mesh render that is clogging the pipeline.
4) The mesh has 20 materials, one 256x256 texture and 265 vertices, which is reasonable. The DX code handles it just fine. It is interesting, though, that the DX8.1 code runs at 20x the FPS of my code, given the 20 materials (correlation??? Hmmm).
5) Tweaking the code so that all 20 materials point to one texture made no difference.
6) I've checked for the usual blunders. The call to render the mesh is made only once per frame. It calls ID3DXMesh::DrawSubset() 20 times, once for each material, just like the DX8.1 example code.
Given that everything "looks" as it should, I wonder if there is some sort of setup issue that can cause slowness, especially with a mesh that has many materials.
I would gladly be in debt to anyone who can help me.
Value of good ideas: 10 cents per dozen.
Implementation of the good ideas: Priceless.
Proxima Rebellion - A 3D action sim with a hint of strategy