# Terrain: Triangles or TriangleStrips?


Hi all, I'm improving my terrain engine. I know that triangle strips are handled faster on the GPU than triangle lists, but I can't represent my terrain with only one index buffer if I use triangle strips: each terrain row needs its own index buffer, and I have to issue a draw call for each row in my program. With triangle lists I can put everything in one vertex/index buffer and draw the whole terrain with a single call. Does anyone have experience with which of the two is faster? I couldn't find anything about this on the web. Thanks.

##### Share on other sites
You can use degenerate triangles that have two vertices with the same index to patch multiple strips together. Be sure to test if you actually get any speed up though. It's probably going to be quite small.
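To make the stitching concrete, here is a minimal Python sketch. It assumes a regular grid whose vertices are stored row-major (index = y * cols + x); the names `grid_strip_indices` and `strip_triangles` are made up for illustration and do not come from any engine or from XNA. Each row strip is joined to the next by repeating two indices, which yields zero-area triangles and keeps the winding order intact.

```python
def grid_strip_indices(cols, rows):
    """Index list for ONE triangle strip covering a cols x rows vertex grid.

    Assumes row-major vertex storage: index = y * cols + x.
    """
    indices = []
    for y in range(rows - 1):
        if y > 0:
            # Repeat the first vertex of this row strip (degenerate link).
            indices.append(y * cols)
        for x in range(cols):
            indices.append(y * cols + x)        # vertex on the upper row
            indices.append((y + 1) * cols + x)  # vertex on the lower row
        if y < rows - 2:
            # Repeat the last vertex of this row strip (degenerate link).
            indices.append((y + 1) * cols + (cols - 1))
    return indices


def strip_triangles(indices):
    """Expand a strip into its non-degenerate triangles."""
    tris = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i:i + 3]
        if a != b and b != c and a != c:
            # Every second triangle of a strip has flipped winding.
            tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris


indices = grid_strip_indices(3, 3)
print(len(indices), len(strip_triangles(indices)))
```

The whole terrain then goes out in a single draw call with one index buffer, at the cost of two extra indices per row join.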

##### Share on other sites
OK, that would be an idea, but I don't like things like degenerate triangles. I'm porting my engine to XNA, and I don't know how XNA handles degenerate triangles.
Thanks for the approach.

##### Share on other sites
Quote:
 Original post by LPVOID_CH
 I know that triangle strips are handled faster on the GPU than triangle lists.
Wrong. The best transform ratio (vertices actually transformed / triangle count) you can get with tri-strips is asymptotically close to 1.00, whereas with tri-lists you can get the ratio down to 0.52.

If it's not obvious by now, this means that you can make the card render 1000 tris with only about 520 vertex shader transforms.
Compare this to 1008 transforms for tri-strips or 3000 transforms for non-indexed tri-lists.

Of course, you must rearrange the vertices so that they stay in the cache as long as possible, so you can reuse as many of them as possible.
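The transform ratios quoted here can be estimated offline with a small cache simulation. The sketch below is only a model: it assumes a plain FIFO post-transform cache keyed by vertex index, and the cache sizes used are illustrative rather than those of any specific GPU. The helper names `acmr` and `grid_triangles` are hypothetical.

```python
from collections import deque


def acmr(triangles, cache_size):
    """Vertex transforms per triangle, assuming a FIFO cache of cache_size."""
    cache = deque(maxlen=cache_size)  # oldest entry drops out when full
    misses = 0
    for tri in triangles:
        for v in tri:
            if v not in cache:  # cache miss: the vertex must be transformed
                misses += 1
                cache.append(v)
    return misses / len(triangles)


def grid_triangles(cols, rows):
    """Indexed triangle list for a cols x rows vertex grid, row by row."""
    tris = []
    for y in range(rows - 1):
        for x in range(cols - 1):
            v = y * cols + x
            tris.append((v, v + cols, v + 1))
            tris.append((v + 1, v + cols, v + cols + 1))
    return tris


tris = grid_triangles(33, 33)
# Non-indexed list: every triangle brings three brand-new vertices.
unshared = [(3 * i, 3 * i + 1, 3 * i + 2) for i in range(len(tris))]

print("non-indexed list      :", acmr(unshared, 16))
print("row order, 16 entries :", acmr(tris, 16))
print("row order, huge cache :", acmr(tris, 4096))
```

In this model the non-indexed list costs 3.0 transforms per triangle, plain row order with a 16-entry cache costs about 1.03 because a 33-vertex row does not fit in the cache, and a cache large enough to hold everything gives about 0.53. The ordering of the triangles is what closes the gap when the cache is small.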

Then again, it's easy to do for terrains, but not so easy to rearrange the vertices of an arbitrary 3D mesh (there are some utilities from card vendors, though). However, these days, when you can easily push up to 10M tris per scene, it's not really needed that much, unless you _REALLY_ want to push as many vertices on screen as possible. With basic LOD, you can easily get a really nice, dense terrain with a long view distance in under 1M tris, which is nothing these days.

You'll hurt performance MUCH more if you start using more vertex streams for your terrain. Try to aim for fewer than 3-4, even at the cost of duplicated data.

##### Share on other sites
Quote:
 Original post by joe1024
 Of course, you must rearrange the vertices so that they stay in the cache as long as possible, so you can reuse as many of them as possible.

For big terrain patches, you could use a Hilbert curve for this.
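As a rough illustration of that idea, the sketch below emits the quads of a square patch in Hilbert-curve order, so that consecutive quads are always neighbours. It assumes a patch of 2^order x 2^order quads with row-major vertex indices; `hilbert_d2xy` follows the well-known iterative distance-to-coordinate conversion, and `hilbert_grid_triangles` is a hypothetical helper name. Whether this beats other orderings depends on the cache size, so it is worth measuring.

```python
def hilbert_d2xy(order, d):
    """Map distance d along a Hilbert curve to (x, y) in a 2**order square."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_grid_triangles(order):
    """Triangle list for a (2**order + 1)^2 vertex grid, quads in Hilbert order."""
    n = 1 << order
    cols = n + 1  # vertices per row
    tris = []
    for d in range(n * n):
        x, y = hilbert_d2xy(order, d)
        v = y * cols + x
        tris.append((v, v + cols, v + 1))
        tris.append((v + 1, v + cols, v + cols + 1))
    return tris


print([hilbert_d2xy(1, d) for d in range(4)])
print(len(hilbert_grid_triangles(3)))
```

The vertex buffer itself stays untouched; only the order of the entries in the index buffer changes.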

##### Share on other sites
Quote:
 Wrong. The best transform ratio (vertices actually transformed / triangle count) you can get with tri-strips is asymptotically close to 1.00, whereas with tri-lists you can get the ratio down to 0.52.

Well, I've always read in books and articles that tri-strips are faster than tri-lists. If this is true, I'll go ahead with tri-lists.
I don't use LOD because the LOD calculation on the CPU is slower than caching whole terrain patches in video memory. My engine is also meant for first-person shooters, so you can't see very far, unlike in a flight sim.

Quote:
 For big terrain patches, you could use a Hilbert curve for this.

##### Share on other sites
Quote:
 Original post by joe1024
 The best transform ratio (vertices actually transformed / triangle count) you can get with tri-strips is asymptotically close to 1.00, whereas with tri-lists you can get the ratio down to 0.52.

How did you get this? I can understand that if the cache is large enough to hold the vertices of one row, it won't make any difference whether you use strips or lists; you get about 0.5. But if the cache is not that big, I'd really like to know how you can easily arrange a list to get a ratio of 0.52, no matter how big the grid.

##### Share on other sites
Quote:
 Original post by LPVOID_CH
 OK, that would be an idea, but I don't like things like degenerate triangles. I'm porting my engine to XNA, and I don't know how XNA handles degenerate triangles. Thanks for the approach.

It handles them very well. I've implemented a terrain engine in XNA that supports triangle lists and triangle strips without much trouble. By the way, I've profiled the engine and haven't found any noticeable difference between the two approaches. But I'm using only two vertex buffers and one index buffer for the whole terrain...

##### Share on other sites
Quote:
 Well, I've always read in books and articles that tri-strips are faster than tri-lists.
Of course. All literature for beginners should say that, because otherwise it would have to explain the details, which would just confuse or scare off beginners. You can't realistically expect to learn the advanced tricks from a place that introduces you to the basic concept of indexing, can you?
Plus, not everybody knows about it. I've personally seen many people who weren't aware of this "trick", although they were quite competent otherwise.

Quote:
 I'm porting my engine to XNA, and I don't know how XNA handles degenerate triangles.
What does that have to do with XNA? You're confusing the API with basic 3D concepts. Check the underlying architecture of Xenos to see the similarities.

Quote:
 Original post by SnotBob
 How did you get this? I can understand that if the cache is large enough to hold the vertices of one row, it won't make any difference whether you use strips or lists; you get about 0.5. But if the cache is not that big, I'd really like to know how you can easily arrange a list to get a ratio of 0.52, no matter how big the grid.
Personally, I've only reached a ratio of 0.56 with my own pattern. Only later did I read a thread here on gamedev where someone showed how to get a ratio of 0.52. However, I must admit I couldn't be bothered to implement his approach because, frankly, whether it is 0.52 or 0.56 doesn't matter at all. It's still roughly a 2x improvement over indexing that ignores the cache. 5% more or less - I couldn't care less.

As to the other part of the question: it doesn't matter whether the row fits in the cache, since my indexing pattern isn't based on rows at all. It has a completely different shape (no, not even a Hilbert curve), but I can easily traverse a whole terrain chunk with it.

I should probably write a paper on it, since I haven't seen it anywhere on the net. But since a 4% faster approach was eventually discovered, I couldn't be bothered to devote time to the paper.

[Edited by - VladR on June 4, 2008 2:19:34 AM]

##### Share on other sites
OK, I think I get it now. I wonder why I haven't come across this before. Even a simple zig-zag pattern will get a ratio of about 0.75. It seems to me that a simple but reasonably good approach is to take a suitably thick strip of Hilbert-curve-like curls and tile the whole grid with those.

##### Share on other sites
Quote:
 Original post by joe1024
 The best transform ratio (vertices actually transformed / triangle count) you can get with tri-strips is asymptotically close to 1.00, whereas with tri-lists you can get the ratio down to 0.52.

That's just plain wrong. You're comparing a naive implementation of tri-strips to a highly optimised version of tri-lists. There's no reason you can't apply to tri-strips many of the techniques that make tri-lists fast, such as ordering triangles to maximise cache use and cache priming. (See figure 3 in the original geometry clipmaps paper for one example.)

With the rate at which GPU performance improves relative to memory, the effort of reading the index list might well become the limiting factor for highly optimised versions of both lists and strips.

##### Share on other sites
If you want to experiment, you could run the OptimizedMesh sample from the DirectX SDK. It lets you try various kinds of batch.

I just tested it on a GeForce 6200, and it achieved ~57M triangles per second with "list" and ~50M tris per second with "single strip" (remember to turn vsync off). I guess there could be some dodgy programming in OptimizedMesh that somehow favours lists, but on the face of it, it seems to be a reasonable test.

I would speculate that having an index array doesn't appreciably slow down modern video cards (as long as it's in video memory) and in fact leads to slightly better performance when used with triangle lists, because there's no need for degenerate triangles. You still need to optimize the mesh in order to take advantage of the vertex cache, of course.

##### Share on other sites
The absolute fastest setup for regular grid terrain is an indexed triangle list or strip (it really doesn't matter) with cache priming. End of story.

##### Share on other sites
Quote:
 Original post by Promit
 The absolute fastest setup for regular grid terrain is an indexed triangle list or strip (it really doesn't matter) with cache priming. End of story.

Absolutely - there are just too many other variables that make any slight differences between them meaningless. I just wanted to correct the impression that strips were automatically much worse than lists.

##### Share on other sites
Quote:
 Original post by Promit
 The absolute fastest setup for regular grid terrain is an indexed triangle list or strip End of story.
But you can't get a better transform ratio than 1.01 (or 1.05, it doesn't matter) with strips, now can you? With ideally sorted indices, your tri-list could give you a ratio of ~0.52.
I don't think the card checks whether each vertex of the current strip is in the cache or not. It's just vertex data to the card.

As for the size of the index buffer: I can see that you could surely find some pathological cases with a very big IB where it could be an issue.

##### Share on other sites
Quote:
 Original post by VladR
 But you can't get a better transform ratio than 1.01 (or 1.05, it doesn't matter) with strips

Depends on how many times the same vertex is used by different triangles. Yes, if each vertex is only used once, then the theoretical lower bound on ACMR (average cache miss ratio) is 1. Whether a vertex has already been transformed and is in the hardware FIFO (with its attributes) is tracked by the index of the vertex, and this can be determined very fast. That is why indexed strips/lists are so useful, and both strips and lists can take advantage of indexed vertices.

At least this is how I understand it, please correct me if I misunderstood it :)

##### Share on other sites
Quote:
Quote:
 Original post by Promit
 The absolute fastest setup for regular grid terrain is an indexed triangle list or strip End of story.
But you can't get a better transform ratio than 1.01 (or 1.05, it doesn't matter) with strips, now can you?

Yes, you can. See Hugues Hoppe's paper for examples. In the case of regular grids for terrains it's even easier - see figure 3 of this paper.
Quote:
 As for the size of the index buffer: I can see that you could surely find some pathological cases with a very big IB where it could be an issue.

Strips have almost a factor-of-three advantage over lists in the number of indices, although that is reduced by degenerate triangles, which are necessary for cache-optimised strips.
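The index-count arithmetic can be checked with a few lines of Python. This is a sketch under assumptions: the strips are laid out as row strips inside vertical bands of `band_quads` quads (a hypothetical layout chosen for illustration, not necessarily the one in the paper), and every join between two strips costs two repeated indices.

```python
def list_index_count(cols, rows):
    """Indices in an indexed triangle list for a cols x rows vertex grid."""
    return 6 * (cols - 1) * (rows - 1)  # two triangles per quad, 3 indices each


def strip_index_count(cols, rows, band_quads=None):
    """Indices in one stitched strip, drawn as row strips inside vertical bands.

    band_quads=None means a single band spanning the full grid width.
    """
    if band_quads is None:
        band_quads = cols - 1
    strips = 0
    indices = 0
    for x0 in range(0, cols - 1, band_quads):
        width = min(band_quads, cols - 1 - x0)     # quads across this band
        strips += rows - 1                         # one row strip per quad row
        indices += (rows - 1) * 2 * (width + 1)    # two indices per vertex column
    return indices + 2 * (strips - 1)              # two repeated indices per join


print(list_index_count(129, 129))
print(strip_index_count(129, 129))
print(strip_index_count(129, 129, band_quads=8))
```

For a 129 x 129 vertex patch this gives 98304 indices for the list, 33278 for full-width strips (a factor of about 2.95) and 40958 for strips in 8-quad bands (a factor of about 2.4), which matches the "almost three, reduced by degenerates" description.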

##### Share on other sites
Quote:
 Original post by dave j
 Strips have almost a factor-of-three advantage over lists in the number of indices, although that is reduced by degenerate triangles, which are necessary for cache-optimised strips.

This is very true. But the size of the index buffer rarely matters and I think most people feel it's much easier to work with lists in their algorithms.

##### Share on other sites
Quote:
 Original post by ndhb
 This is very true. But the size of the index buffer rarely matters and I think most people feel it's much easier to work with lists in their algorithms.

Oh, I agree, most of the time it's more effort than it's worth. But there are some cases where it is relatively easy to define cache-friendly indices for strips: regular grids for terrains, for instance, which is where this thread started.

##### Share on other sites
With the cache priming, each vertex in your terrain will be transformed only once ever. (Ideally. Since the cache has size limits, you'll be forced to repeat some edge vertices, but this is barely anything.) You can't really do any better than that.
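A small simulation shows what priming buys. The sketch below is a model only: it assumes a FIFO post-transform cache, draws the grid in vertical bands, and starts each band with zero-area triangles that touch nothing but the band's top row. The band width is chosen so that one row of band vertices plus one extra entry fits in the cache. The function names are hypothetical.

```python
from collections import deque


def primed_band_stream(cols, rows, band_quads):
    """List of (triangle, is_real) for a grid drawn in primed vertical bands.

    Vertices are row-major: index = y * cols + x.
    """
    stream = []
    for x0 in range(0, cols - 1, band_quads):
        x1 = min(x0 + band_quads, cols - 1)
        # Priming: degenerate triangles that load the band's top row (y = 0).
        for x in range(x0, x1):
            stream.append(((x, x, x + 1), False))
        for y in range(rows - 1):
            for x in range(x0, x1):
                v = y * cols + x
                stream.append(((v, v + cols, v + 1), True))
                stream.append(((v + 1, v + cols, v + cols + 1), True))
    return stream


def transforms_per_triangle(stream, cache_size):
    """Cache misses divided by the number of real (non-degenerate) triangles."""
    cache = deque(maxlen=cache_size)  # FIFO: oldest entry drops out when full
    misses = 0
    real = 0
    for tri, is_real in stream:
        real += is_real
        for v in tri:
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses / real


primed = primed_band_stream(29, 9, band_quads=14)       # bands of 15 vertices
unprimed = [(tri, real) for tri, real in primed if real]
print(transforms_per_triangle(primed, 16))
print(transforms_per_triangle(unprimed, 16))
```

In this model each vertex is transformed once per band that contains it, so only the columns shared by two neighbouring bands are transformed twice. Dropping the priming triangles makes the first row fill the cache with interleaved top and bottom vertices, and the ratio gets worse.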

##### Share on other sites
I think that the confusion here is due to the fact that some people are talking about indexed triangle strips, and others are talking about NON-indexed triangle strips. The 1.0 limit applies for the latter, not the former.
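The two limits can be written down directly. As a sketch, assume a non-indexed strip of $T$ triangles (which submits $T + 2$ vertices) and an indexed grid of $m \times n$ quads in which every vertex is transformed exactly once; the symbols $T$, $m$ and $n$ are introduced here only for illustration.

$$
\frac{\text{transforms}}{\text{triangles}}\bigg|_{\text{non-indexed strip}} = \frac{T + 2}{T} \;\longrightarrow\; 1 \quad (T \to \infty)
$$

$$
\frac{\text{transforms}}{\text{triangles}}\bigg|_{\text{indexed grid}} = \frac{(m + 1)(n + 1)}{2mn} \;\longrightarrow\; \frac{1}{2} \quad (m, n \to \infty)
$$

The second bound holds for indexed strips and indexed lists alike, since both reference the same $(m + 1)(n + 1)$ vertices; how close a real index buffer gets to it depends on the triangle order and the cache size.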
