glDrawElements: Are vertices processed multiple times?

Started by
7 comments, last by Lutz 19 years, 11 months ago
I've got a fairly simple question. Say I store 4 vertices in VRAM:

1---2
3---4

and I call glDrawElements with GL_TRIANGLES and the index array (1,3,2,3,2,4) as arguments in order to draw the triangles 1-3-2 and 3-2-4. Does the GPU light vertices 2 and 3 twice? Note that they occur twice in the index array. Or is the GPU clever enough to remember that these vertices were already lit when it processed them the first time?
Most GPUs have a small post-T&L FIFO cache that stores the last few transformed vertices. Its size is usually about 16 or 32 entries (but you have to subtract about 4 because of pipelining). This is why cache optimization has a big impact on performance in transform-limited applications.

You should never let your fears become the boundaries of your dreams.
Thanks, _DarkWIng_.

So if the distance of each double vertex pair in my index array is not greater than 12-28, the vertices will only be transformed once, right?

Hmmm, well, it's a FIFO stack, so you probably can't just fetch the 6th element if the stack depth is 11, can you?

So _DarkWIng_, what advice can you give me?
quote:Original post by Lutz
So if the distance of each double vertex pair in my index array is not greater than 12-28, the vertices will only be transformed once, right?

Right.

quote:Original post by Lutz
Hmmm, well, it's a FIFO stack, so you probably can't just fetch the 6th element if the stack depth is 11, can you?

Well, it's not a stack. It's a cache, probably fully associative (I hope this is the right English word for it). So if a vertex is in the cache it will be used, no matter where in the cache it is. If it's not, the oldest vertex is pushed out and replaced with the new one. So the replacement strategy is FIFO.
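The cache behaviour described above can be modelled in a few lines. This is a minimal sketch, assuming a pure FIFO replacement policy and a fully associative lookup as the post suggests; real hardware may differ, and the function name `count_transforms` is made up for illustration.

```python
from collections import deque

def count_transforms(indices, cache_size):
    """Count how many vertices get transformed under a simple FIFO cache model."""
    # deque(maxlen=n) silently drops the oldest entry when a new one is appended
    cache = deque(maxlen=cache_size)
    transforms = 0
    for idx in indices:
        if idx in cache:
            continue  # hit: reuse the stored result; FIFO order is not refreshed
        transforms += 1  # miss: the vertex has to be transformed and lit
        cache.append(idx)
    return transforms

quad = [1, 3, 2, 3, 2, 4]          # the index array from the original question
print(count_transforms(quad, 16))  # 4: vertices 2 and 3 are reused
print(count_transforms(quad, 1))   # 6: a one-entry cache never helps here
```

Under this model the two shared vertices of the quad are transformed only once as long as the cache holds at least a few entries.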

As for advice: there are some triangle stripifiers that can also do cache optimization, so you might want to find one of those. If nothing else you can write one yourself, as it's not such a big task to code a greedy algorithm to do this. You just have to be careful when selecting the cache size, since it depends on the target platform. GF1-4 have a size of 16 and FX has 32 (I think). I don't know about sizes on ATI cards, but they are probably about the same.
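One possible shape for such a greedy optimizer is sketched below. It is not the poster's optimizer, just an illustration under the same FIFO assumption: at each step it emits the remaining triangle that has the most vertices already in a simulated cache. The names `greedy_reorder` and `fifo_misses` are hypothetical, and the search is quadratic in the number of triangles, so it is only meant for small, static meshes.

```python
from collections import deque

def fifo_misses(triangles, cache_size):
    """Number of vertex transforms for a triangle list under a FIFO cache model."""
    cache = deque(maxlen=cache_size)
    misses = 0
    for tri in triangles:
        for v in tri:
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses

def greedy_reorder(triangles, cache_size):
    """Reorder triangles so each next one reuses as many cached vertices as possible."""
    remaining = list(triangles)
    cache = deque(maxlen=cache_size)
    ordered = []
    while remaining:
        # Score = number of this triangle's vertices currently in the cache.
        # On ties, max() keeps the earliest triangle in the remaining list.
        best = max(remaining, key=lambda t: sum(v in cache for v in t))
        remaining.remove(best)
        ordered.append(best)
        for v in best:
            if v not in cache:
                cache.append(v)
    return ordered

# A short strip-like mesh whose triangles were listed in a scattered order.
shuffled = [(0, 1, 2), (4, 5, 6), (1, 3, 2), (5, 7, 6), (2, 3, 4), (3, 5, 4)]
print(fifo_misses(shuffled, 4))                     # 15
print(fifo_misses(greedy_reorder(shuffled, 4), 4))  # 8
```

With a deliberately tiny cache of 4 entries, the reordered list transforms each of the 8 vertices exactly once, while the scattered order transforms 15.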

OK, thanks for your help.

I need this to render a triangle bintree coming from a longest edge bisection. The mesh can have holes in it (i.e. leaf triangles which are not rendered), so stripification is rather difficult - I would have to break the strip when I arrive at a hole, or I would have to find a strip which goes around the holes. The former is rather inefficient, the latter is not even always possible.

I guess I will just push the triangles in the bintree leaf nodes without stripping (I'll leave my clothes on :-) ). If I imagine this correctly, the distance (in the index array) between the two occurrences of a shared vertex will be below 12 or 28 in maybe 60%-90% of the cases. And that's enough for me.

For example, if a triangle T has two leaf children T1 and T2, it will look like this:

        3
       /|
      / |
     /  |
    4 T2|  <-- T = 1-2-3
   / .  |
  /   . |
 / T1  .|
1-------2


So I just push the nodes in the order 1-2-4-3-4-2 for T1 and T2, so the nodes 2 and 4 will only be lit once.

Also for vertices which are shared by triangles which don't have a common father but maybe a common grandfather or great-grandfather, this should still work.
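The idea of pushing leaf triangles straight from the bintree can be sketched as follows. This is only an illustration under assumptions: the `Node` class and its fields are hypothetical (a real longest-edge-bisection tree carries more state), holes are modelled as leaves with `visible=False`, and the cache is again a plain FIFO.

```python
from collections import deque

class Node:
    """Hypothetical bintree node: one triangle and zero or two children."""
    def __init__(self, tri, children=(), visible=True):
        self.tri = tri            # three vertex indices
        self.children = children  # empty tuple for a leaf
        self.visible = visible    # False marks a hole (leaf that is not rendered)

def leaf_indices(node, out):
    """Depth-first walk appending the indices of every visible leaf triangle."""
    if not node.children:
        if node.visible:
            out.extend(node.tri)
        return out
    for child in node.children:
        leaf_indices(child, out)
    return out

def count_transforms(indices, cache_size):
    """Vertex transforms under a simple FIFO cache model."""
    cache = deque(maxlen=cache_size)
    transforms = 0
    for idx in indices:
        if idx not in cache:
            transforms += 1
            cache.append(idx)
    return transforms

# The example above: T = 1-2-3 split into T1 = 1-2-4 and T2 = 3-4-2.
t = Node((1, 2, 3), children=(Node((1, 2, 4)), Node((3, 4, 2))))
indices = leaf_indices(t, [])
print(indices)                        # [1, 2, 4, 3, 4, 2]
print(count_transforms(indices, 16))  # 4
```

Because siblings are emitted back to back, their shared vertices sit close together in the index array, which is what keeps them inside the cache window.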
If you're worried about this kind of thing, lock the vertex buffer, or use a VBO. Then you can be certain transformation will only happen once per vertex.
quote:Original post by Anonymous Poster
If you're worried about this kind of thing, lock the vertex buffer, or use a VBO. Then you can be certain transformation will only happen once per vertex.

AP, you have no idea what you are talking about. VBO has nothing to do with this and CVAs are outdated.

quote:Original post by Anonymous Poster
...stripification is rather difficult...

I'm not saying you have to use stripification. I'm saying that some tools used for stripification can also produce a cache-optimized triangle list, as the algorithm is very similar. This is a very simple and effective optimization if your mesh (or at least its index buffer) is static.

In general, meshes usually have about a 50% hit ratio. If you get it up to 90%, that's great. My optimizer usually gives me about a 70-90% hit ratio. But this all depends heavily on the type of input mesh.
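The post does not say how the hit ratio is measured. A minimal sketch, assuming it means the fraction of index references served from the cache under a plain FIFO model (the function name `hit_ratio` is made up here):

```python
from collections import deque

def hit_ratio(indices, cache_size):
    """Fraction of index references that find their vertex already in the cache."""
    if not indices:
        return 0.0
    cache = deque(maxlen=cache_size)
    hits = 0
    for idx in indices:
        if idx in cache:
            hits += 1
        else:
            cache.append(idx)
    return hits / len(indices)

# Two triangles sharing an edge: 2 of the 6 references are hits.
print(hit_ratio([1, 3, 2, 3, 2, 4], 16))
# Two triangles sharing nothing: every reference is a miss.
print(hit_ratio([0, 1, 2, 3, 4, 5], 16))
```

Under this definition an isolated pair of triangles can reach at most one third, so the higher figures quoted above would only show up on larger, well-connected meshes.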

AP: I'm already using VBOs. Why are VBO vertices only transformed once? How does it work in the GPU? I mean, it can't transform ALL vertices at the beginning and then use the results if a certain vertex is indexed by glDrawElements, can it?

_DarkWing_: My index buffer is highly non-static (it changes almost every frame). It's for terrain rendering. Triangles keep splitting and merging all the time. But I guess you can view the binary tree structure as an already optimized structure.
Like DarkWIng said, VBO vertices are *NOT* transformed only once. The card reads your vertex index, fetches that vertex from the vertex buffer, transforms and lights it (or runs the vertex program on it), and then sends it on its way to primitive assembly and triangle setup. There is a post-transform vertex cache, however, whose workings DarkWIng has already explained in detail.

Think of the graphics card like a really deep pipeline where vertices fall in one end and pixels fall out the other.
