I'm coding for modern CPUs.
But the meshes I'm sending to the graphics card are pretty much direct output from Maya, so I expect they have poor vertex-cache ordering.
I'm hoping to run a vertex-cache-optimization pre-pass on the meshes to gain some performance, as suggested here: http://home.comcast.net/~tom_forsyth/papers/fast_vert_cache_opt.html
Does anyone know the best technique to use on modern cards? Is it still Tom's algorithm? The models I have to render have 1,000,000+ triangles. :(
So I would like the pre-pass to run fast as well as produce good results (like Tom's algorithm).
Does anyone know of a solid, efficient implementation of Tom's algorithm? The ones I'm finding on the internet seem to be from junior/hobby programmers who are not writing the most efficient code.
I used to use Tootle for mesh optimization, which is an older library from ATI. However, they haven't updated it and there's no VS2010 version, so I stopped using it. The old D3DX mesh library can also optimize meshes for you, although I don't know what algorithm it uses.
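Whichever optimizer you end up with, you can compare results without knowing its algorithm by simulating a vertex cache over the index buffer and counting misses per triangle (often called ACMR, average cache miss ratio). Here is a minimal sketch, assuming a plain FIFO cache, which is only a rough model of real hardware. The name `computeAcmr` and the cache size you pass in are my own choices, not from any library:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Simulated vertex-cache misses per triangle for an indexed triangle list.
// Assumption: a simple FIFO cache holding `cacheSize` vertices. Real GPUs
// may behave differently, so treat the result as a relative measure only.
double computeAcmr(const std::vector<std::uint32_t>& indices,
                   std::size_t cacheSize)
{
    const std::size_t triangleCount = indices.size() / 3;
    if (triangleCount == 0) {
        return 0.0;
    }

    std::deque<std::uint32_t> fifo;  // oldest entry at the front
    std::size_t misses = 0;

    for (std::size_t i = 0; i < triangleCount * 3; ++i) {
        const std::uint32_t index = indices[i];
        if (std::find(fifo.begin(), fifo.end(), index) == fifo.end()) {
            // Not in the cache: count a miss and insert the vertex,
            // evicting the oldest entry if the cache is over capacity.
            ++misses;
            fifo.push_back(index);
            if (fifo.size() > cacheSize) {
                fifo.pop_front();
            }
        }
    }
    return static_cast<double>(misses) / static_cast<double>(triangleCount);
}
```

Run it on the index buffer before and after optimizing. The worst case is 3.0 (every vertex misses), and lower is better, so a clear drop suggests the pre-pass is paying off.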
Strictly speaking, an algorithm like TomF's is no longer the absolute best way to optimize for a high-end modern GPU. This is because newer GPUs have multiple hardware units for setting up triangles, which means triangles get processed concurrently rather than in sequential order. But optimizing for this would require specific knowledge of how the hardware splits triangles among those units, so it's probably not practical unless you're targeting a specific GPU or family of GPUs. Either way, you can still get gains out of the older optimization methods, so it's still worth doing.
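To give an idea of what the older methods involve, the core of TomF's algorithm is a per-vertex score: vertices near the front of a simulated LRU cache score higher, and vertices with few remaining triangles get a boost so they are finished off early. The sketch below shows only that scoring step, with the constants as I remember them from the linked paper (check them against the paper before relying on them). The function and constant names are mine, and the greedy loop that picks the best-scoring triangle is not shown:

```cpp
#include <cmath>

// Tuning constants, recalled from the linked paper; names are hypothetical.
constexpr int   kMaxCacheSize      = 32;
constexpr float kCacheDecayPower   = 1.5f;
constexpr float kLastTriScore      = 0.75f;
constexpr float kValenceBoostScale = 2.0f;
constexpr float kValenceBoostPower = 0.5f;

// cachePosition: index in the simulated LRU cache, or -1 if not cached.
// remainingTriangles: triangles using this vertex that are not yet emitted.
float vertexScore(int cachePosition, int remainingTriangles)
{
    if (remainingTriangles <= 0) {
        return -1.0f;  // no triangle needs this vertex any more
    }

    float score = 0.0f;
    if (cachePosition >= 0 && cachePosition < kMaxCacheSize) {
        if (cachePosition < 3) {
            // Vertices of the most recent triangle get a fixed score.
            score = kLastTriScore;
        } else {
            // Score decays from 1 toward 0 with distance from the front.
            const float scaler = 1.0f / (kMaxCacheSize - 3);
            score = 1.0f - (cachePosition - 3) * scaler;
            score = std::pow(score, kCacheDecayPower);
        }
    }

    // Boost vertices that have few triangles left to use them.
    score += kValenceBoostScale *
             std::pow(static_cast<float>(remainingTriangles),
                      -kValenceBoostPower);
    return score;
}
```

A triangle's score is then the sum of its three vertex scores. The optimizer repeatedly emits the highest-scoring triangle and updates the cache and the scores of the affected vertices.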