
VertexCache question:


kRogue
OK, I am considering writing my own generic vertex-cache primer (i.e. reordering triangles to get the most out of a GPU's vertex cache), but now I am wondering how eviction is decided when a cache hit occurs. Is it really simple, dumb FIFO? I.e. regardless of a cache hit, the front vertex of the cache is dropped and the incoming vertex is placed at the back; this method can end up storing the same vertex several times in the cache. Or is it something brighter: if a cache hit occurs, that vertex is moved to the back of the cache and the vertices that were newer than it are all pushed up (so nothing gets dropped on a hit, and the front vertex is only dropped on a miss)? I imagine that it is the second, but having no docs to reference, does anyone know?

_the_phantom_
From my understanding of it, the post-T&L cache is an associative array; each group of vertex data stored has an id number associated with it (which is why you need to use indexed data to get the most from it).

When each vertex arrives to be transformed, its id number is compared with the ids in the cache; if there is a match, the cached data is used; otherwise the data is transformed and the oldest data in the cache is overwritten.

Basically, to aid caching you need to index/order the data in such a way that you get maximal reuse of the data within a small number of vertex submissions.

kRogue
I figured it would be associative. What I am asking is how it decides what to evict from the cache. For example, say the cache holds 6 entries (call them A[0], A[1], ..., A[5]) and label their contents associatively, i.e. by which vertex index is cached. Say the cache is storing:

A[0]=a, A[1]=b, A[2]=c, A[3]=d, A[4]=e, A[5]=f; i.e. A[i] stores the post-T&L value for the vertex at VertexArray[A[i]]

As the GPU streams along the glDrawWhatever(GL_TRIANGLES, stuff) call, the next triangle uses the vertex indices c, d and g.

What are the contents of the cache after the triangle is processed? Is the cache _very_ dumb, simple FIFO, so that the results would be (the front of the cache gets evicted):

-->after c gets entered (notice that c now appears twice in the cache):
A[0]=b, A[1]=c, A[2]=d, A[3]=e, A[4]=f, A[5]=c

-->after d gets entered (notice both c and d are now in the cache twice):
A[0]=c, A[1]=d, A[2]=e, A[3]=f, A[4]=c, A[5]=d

-->after g gets entered:
A[0]=d, A[1]=e, A[2]=f, A[3]=c, A[4]=d, A[5]=g


or is the cache more intelligent and does something like this:

-->after c is fetched: c was already in the cache, so the cache marks c as most recently used:
A[0]=a, A[1]=b, A[2]=d, A[3]=e, A[4]=f, A[5]=c

-->d is also in the cache, so d gets moved to the most-recently-used slot:
A[0]=a, A[1]=b, A[2]=e, A[3]=f, A[4]=c, A[5]=d

-->g is not in the cache, so the cache drops the oldest element, the one stored in A[0] (=a):
A[0]=b, A[1]=e, A[2]=f, A[3]=c, A[4]=d, A[5]=g


or is it something else?

Kalidor
I believe (I'm not 100% sure, but I've read this in several places, including some threads here) that the cache in modern cards is a simple FIFO cache, but it's not so dumb as to include duplicates. So, using your example, when it comes across vertices c and d it will find them in the cache and use those entries, leaving the cache untouched. Then when it comes to vertex g it doesn't find it in the cache, so it drops vertex a and adds vertex g. The cache will be [b, c, d, e, f, g] after this new triangle.

Sneftel
All vertex-cache optimization approaches I've seen (which work, so apparently they got it right) assume an LRU replacement scheme. Thus if your cache has size 3 and you access vertices 1, 2, 1, 3, 4, 5, then vertex 2 will be discarded to make room for vertex 4, and vertex 1 will be discarded to make room for vertex 5. This is pretty easy to implement in hardware, so there's no particular reason to use something stupider.

EDIT: Actually, now that I think about it, I recall Nvidia cards in particular using a FIFO instead of an LRU. That was several years ago, though, so I'm not sure if it's still true.

kRogue
So, if the cache is LRU, then it is functionally the same as the second scheme (i.e. not the uber-dumb FIFO one), but if it is FIFO without duplicates, that will significantly complicate my vertex priming algorithm (which is not even written yet)...

As for checking the source of NvTriStripper... (shudders) it is probably a good place to look, but it is from the age of the GeForce 3, when nVidia was (I think) saying the cache was FIFO... but nowadays, with cards up to the GeForce 8 (and I am targeting GeForce 6 class), I don't know... Also, NvTriStripper tries to make triangle strips and prime the cache at the same time, which is not quite the same as just priming the cache...

kRogue
I beg to differ, at least to some extent... optimizing vertex cache usage can give a significant performance boost, especially with terrains and models with skinning... though right now I have been using NvTriStripper and another tri-stripper out there; the second one wants to make triangle strips only... so... ehh...

Krohm
So I guess your vertex unit has considerably more work to do than my average case (or, by contrast, your FS unit has less).
Could you please quantify the "significant performance boost"? On the order of 100%, 50%, 10%, or what?

