Starting Triangle Strips
Hello,
My engine currently renders indexed triangle lists with 75,000 triangles at 465FPS. I am going to convert to triangle strips, and I was just wondering what algorithm most people use to generate the strips. I found this article which seems good, so I am reading it now. But I thought I would get some opinions or links to resources if you have them.
Thanks.
Heh, I compiled and ran their NVTriStripTest program, and it runs a little bit FASTER with optimized lists than optimized strips...what's up with that?
A lot of newer video cards are optimized for speed with lists, because lists are generally easier to use than strips. Or at least that's what I've read many times online.
Interesting, so are most engines just using optimized lists then?
I added GenerateStrips() for my mesh's index buffer, and it worked great...added 100-150 FPS. However, it takes FOREVER to optimize large models. It's been working on a 75k face model for about 5 minutes now...
I also noticed it's staying at 50% CPU usage. Is there a way to override this and use closer to 100% so it goes quicker? I imagine this is handled by Windows XP's process priority system. Maybe that isn't something I would want to do anyway?
Thanks for the help!
Finally finished. 75k face model going 578FPS now instead of 465 previously.
How is this speed, considering all I'm doing is rendering a model with no translations or even texturing yet?
I know what you might say...don't be so worried about speed yet, but I figure this is the very foundation of everything to come - if rendering the vertex data isn't as fast as I can get it, then I'll have troubles later, right?
hi,
Just my guess, but since you don't use any texturing, translations, etc., your program is currently limited by the speed at which the triangles are drawn. The GPU uses a cache to access vertices, do the T&L on them, etc. With triangle strips, the vertices are always in the right order (you will never have indices like 1, 100, 50, etc., but always like 0, 1, 2, 3, etc.), so it's optimal in terms of caching. That's probably why you get more FPS with triangle strips.
Again, just a guess, I'm not really good with all the low level stuff ^^
Quote:Original post by paic
With triangle strips, as the vertices are always in the right order (you will never have indices like that : 1, 100, 50, etc. but always like that : 0, 1, 2, 3, etc.) So it's optimal in terms of caching, so that's probably why you have more FPS with triangle strip.
That's not entirely accurate. What you've got is two caches - pretransform and posttransform. The pretransform cache contains data read in from the vertex buffer - it usually contains several vertices because it's more efficient for the card to read a block of several vertices from the buffer in one go, instead of one at a time. The posttransform cache contains vertices that have been run through the vertex shader.
Because of the posttransform cache, the same sequence of triangles in both strip and list form should be roughly equivalent because they refer to the same vertices, meaning that you should get cache hits. However, the tri strip can mean fewer indices (provided the number of degenerates you have isn't too high), which is less work to actually process, and takes less memory.
To make best use of the pretransform cache, you should resequence the vertices in your buffer such that vertices which are referred to sequentially in the index buffer are laid out sequentially in the vertex buffer (and this isn't a particularly easy thing to accomplish as vertices may be referred to multiple times in the index buffer). I've got a feeling NVTriStrip provides a function for this.
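One simple way to approximate this is to lay the vertices out in order of first use by the index buffer. The sketch below assumes a made-up `Vertex` layout and a hypothetical helper name, `reorderByFirstUse`. It is not how NVTriStrip does it, just the general idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout, for illustration only.
struct Vertex {
    float x, y, z;
};

// Returns a copy of `vertices` laid out in the order in which `indices`
// first refers to them, and rewrites `indices` in place to point at the
// new positions. Vertices that are never referenced are dropped.
std::vector<Vertex> reorderByFirstUse(const std::vector<Vertex>& vertices,
                                      std::vector<std::uint32_t>& indices) {
    const std::uint32_t kUnassigned = 0xFFFFFFFFu;
    std::vector<std::uint32_t> remap(vertices.size(), kUnassigned);
    std::vector<Vertex> reordered;
    reordered.reserve(vertices.size());

    for (std::uint32_t& idx : indices) {
        assert(idx < vertices.size());
        if (remap[idx] == kUnassigned) {
            // First time this vertex is used: give it the next free slot.
            remap[idx] = static_cast<std::uint32_t>(reordered.size());
            reordered.push_back(vertices[idx]);
        }
        idx = remap[idx];  // point the index at the vertex's new position
    }
    return reordered;
}
```

A vertex that is referred to several times can only sit in one place, so this only makes the first reference to each vertex sequential. Later references will still jump backwards in the buffer.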
Quote:Original post by superpig
To make best use of the pretransform cache, you should resequence the vertices in your buffer such that vertices which are referred to sequentially in the index buffer are laid out sequentially in the vertex buffer (and this isn't a particularly easy thing to accomplish as vertices may be referred to multiple times in the index buffer). I've got a feeling NVTriStrip provides a function for this.
Yeah, there is a function added to it by the Xbox crew called RemapIndices; however, you have to reorder your vertex buffer yourself. I've already implemented it, and the indices are ordered much more sequentially.
So do you think optimized, reordered indexed VB will be sufficient, or should I go for strips?