Performance Test: Indexed Lists or Not?

Started by
15 comments, last by Zaph-0 15 years, 9 months ago
Hi guys,

I am working on a terrain and have been (out of laziness ^_^) using plain triangle lists so far. On the advice of some people I changed them into indexed triangle lists and did some performance checks.

The first thing I noticed was that, with not much to render, the speed of my terrain went down by about 5-10% with indexed lists. After some heavy load tests (focused on vertex transformation) I noticed that the plain triangle lists stayed at about 3500 fps with more than 200 million triangles/sec, while the indexed lists went down to 1900 fps with only about 110 million triangles/sec!

Uhm... aren't indexed lists supposed to be much faster? Or did the hardware makers maybe optimize for simple triangle lists (caching, etc.) so that indexed lists fall behind? Could it be because of the vertex shader code? It might also matter that my vertices take up only 16 bytes each, so there is not *that* much memory bandwidth to save (and everything is in GPU memory anyway).

Anyway, I don't have a real explanation for this, and if anyone knows something I'd really be interested.

PS: I verified it several times; there appear to be no mistakes...
How large are your index buffers in bytes?


The index buffer is about 25k and I am using 16-bit indices.

When using the indexed lists, are you removing all of the duplicate vertices? Otherwise, I could easily see an index buffer with no duplicate indices in it being slower than a plain vertex buffer, since in that case the indices gain you nothing and just use up space.

The vertex cache only kicks in when you use an index buffer (otherwise, it has no way of knowing if 2 vertices are the same or not). But, that'll only help if there are identical vertices to be hit in the cache.
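To make the point about the vertex cache concrete, here is a rough sketch that counts how often the vertex shader would run for a regular grid. It is only a model: the FIFO replacement policy, the cache sizes and the names `grid_indices` and `count_transforms` are assumptions for illustration, and real hardware caches behave differently. A cache size of 0 stands in for a non-indexed list, where every vertex is transformed again.

```python
from collections import deque

def grid_indices(cols, rows):
    """Triangle-list index buffer for a grid of cols x rows quads, filled row by row."""
    stride = cols + 1  # vertices per grid row
    indices = []
    for y in range(rows):
        for x in range(cols):
            v0 = y * stride + x   # top-left corner of the quad
            v1 = v0 + 1           # top-right
            v2 = v0 + stride      # bottom-left
            v3 = v2 + 1           # bottom-right
            indices += [v0, v2, v1, v1, v2, v3]
    return indices

def count_transforms(indices, cache_size):
    """Count vertex shader runs, assuming a FIFO cache of transformed vertices."""
    cache = deque(maxlen=cache_size)  # maxlen=0 never stores anything
    misses = 0
    for i in indices:
        if i not in cache:
            misses += 1       # cache miss: the vertex has to be transformed
            cache.append(i)   # the oldest entry drops out when the cache is full
    return misses

indices = grid_indices(128, 128)
triangles = len(indices) // 3
for size in (0, 16, 32):
    runs = count_transforms(indices, size)
    print(f"cache={size:2d}: {runs} shader runs, {runs / triangles:.2f} per triangle")
```

The number of shader runs per triangle is the figure to watch: the closer the index order keeps reused vertices together, the lower it gets.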
Also, note that certain cards have certain optimum sizes for vertex and index buffers. If you are going over that then you might not be getting the performance boost you expect.
@andur

I completely removed all duplicate vertices, and that reduced the vertex buffer by around 60-70%, so that cannot be the problem.
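For reference, removing duplicates from a plain triangle list can be sketched roughly like this. The helper names `build_indexed_mesh` and `buffer_bytes` are made up for illustration, vertices are assumed to be hashable tuples that compare exactly equal when duplicated, and the byte sizes are the 16-byte vertices and 16-bit indices mentioned in this thread.

```python
VERTEX_BYTES = 16  # vertex size mentioned in the thread
INDEX_BYTES = 2    # 16-bit indices

def build_indexed_mesh(triangle_vertices):
    """Collapse a flat triangle-list vertex array into unique vertices plus indices."""
    unique = []   # deduplicated vertex buffer
    lookup = {}   # vertex -> its position in `unique`
    indices = []
    for v in triangle_vertices:
        if v not in lookup:
            lookup[v] = len(unique)
            unique.append(v)
        indices.append(lookup[v])
    return unique, indices

def buffer_bytes(vertex_count, index_count=0):
    """Total size of vertex buffer plus (optional) index buffer in bytes."""
    return vertex_count * VERTEX_BYTES + index_count * INDEX_BYTES

# Two triangles sharing an edge (one quad), given as a plain triangle list.
quad = [(0, 0), (0, 1), (1, 0), (1, 0), (0, 1), (1, 1)]
unique, indices = build_indexed_mesh(quad)
print(len(quad), "vertices unindexed,", len(unique), "unique vertices,", len(indices), "indices")
print(buffer_bytes(len(quad)), "bytes unindexed vs", buffer_bytes(len(unique), len(indices)), "bytes indexed")
```

With small vertices the memory saving is modest; the larger effect of indexing is usually that shared vertices can be reused instead of transformed again.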

@Dave

I am not certain about the optimum sizes for vertex and index buffers, but I tried changing the sizes somewhat and the results stayed the same. Simple triangle lists are just much faster than indexed triangle lists.

I guess I will have to stay with the simple triangle lists. I think the reason for the speed is that the vertex caching is so efficient that it doesn't just make the indices obsolete but even surpasses them. I still have to make a few test runs on my ATI card and check the differences (only NVIDIA so far), but I think it will probably be the same...

Did you use an optimizer (the one in DX10, the NVIDIA NvTriStrip library, or ATI Tootle) to "reorder faces to increase the cache hit rate of vertex caches" and "rearrange memory layout for vertices based on the final indices to exploit vertex cache prefetch"?


Hi,

No, I didn't use an optimizer, but I presorted the vertices and faces myself and put them in the buffer in a very straightforward order.

I might try it, but so far that appears to be of no use given the huge speed difference I am seeing.

I now render roughly twice as many triangles with an unindexed list, but that means I render about 6 times as many vertices.

You could try to reduce the index buffer size by converting, if possible, the primitive topology from a triangle list to a triangle strip.

Rendering 3000 triangles:
TriangleList  -> index count = 3 * triangles = 9000
TriangleStrip -> index count = 2 + triangles = 3002

You may see a significant performance improvement.
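The arithmetic above can be sketched as follows. Note that 2 + triangles only holds when all triangles form one unbroken strip. For a terrain grid, a common approach (assumed here, not something the thread confirms) is one strip per row of quads, stitched together with two extra indices per join that produce degenerate triangles. The function names are made up for illustration.

```python
def list_index_count(triangles):
    """Indices needed for an indexed triangle list."""
    return 3 * triangles

def strip_index_count(triangles):
    """Indices needed when all triangles form a single unbroken strip."""
    return triangles + 2

def grid_strip_index_count(cols, rows):
    """Indices for a grid of cols x rows quads drawn as one stitched strip.

    Each row of quads is a strip of 2 * (cols + 1) indices; joining two
    consecutive rows repeats one index on each side (2 extra per join).
    """
    per_row = 2 * (cols + 1)
    return rows * per_row + 2 * (rows - 1)

print(list_index_count(3000), strip_index_count(3000))

# Example grid: 128 x 128 quads, two triangles per quad.
cols = rows = 128
print(list_index_count(2 * cols * rows), grid_strip_index_count(cols, rows))
```

A smaller index buffer does not by itself guarantee better vertex cache behaviour, so it is worth measuring both variants.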
Quote: Original post by Zaph-0
I now render roughly twice as many triangles with an unindexed list, but that means I render about 6 times as many vertices.


I was curious about your performance gain.
I tried with an R8G8 triangle vertex buffer, an R8G8 triangle-list vertex buffer + index buffer, and an R8G8 triangle-list optimized vertex buffer + optimized index buffer. The mesh used was a regular 128x128 grid, and the triangle vertex buffer was filled row by row.

Triangles: 48 FPS
Indexed triangle list (no optimization): 130 FPS
Indexed triangle list (optimized): 160 FPS

I got the expected results... Did you do something else to your triangle vertex buffer?

PS: I have an 8800 GT.

