Post-TnL cache size on recent GPUs?

Started by
3 comments, last by paic 15 years, 3 months ago
Hi, I'm trying to find the usual post-TnL cache sizes on recent GPUs (GeForce 88xx and higher), but Google hasn't helped much. I only found a newsletter from NVIDIA that talks about the GeForce 3 and 4. Is there a tool to get that information from my card, or a website where it can be found? Thanks for any answer!
If I understand correctly, the general feeling is that on recent GPUs one shouldn't be tuning for the cache quite so much. Because the unified architecture shares the stream processors, memory and cache for both vertex and fragment shaders, there isn't any good way to state that X amount of cache is used for vertices.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Hi !
But I've been reading the "GPU Programming Guide" (for G80 and higher) provided by NVIDIA, and it still talks about the post-TnL cache and recommends using their optimizing tool, NVTriStrip.

There may not be a fixed post-TnL cache size, but there should at least be some lower and upper bounds? Since it's usually 24 vertices on GeForce 4s (if I remember correctly), I assume it should be around 2 or 4 times more (if not more) on recent cards.

Well, I'll set this aside for the moment, but if anyone has information on the subject, I would really appreciate it :)
You may want to try the ATI Tootle library (2007) instead of NvTriStrip, which is older.

http://developer.amd.com/gpu/tootle/Pages/default.aspx

It can also optimize a mesh to reduce overdraw (which slightly hurts the post-TnL cache hit rate).

On an 8800 GT, using a cache size of 48 gives the best results, but since the DX9 cache hit query doesn't work, I can't confirm it. Maybe it's more, less, or not fixed at all; run some tests to find the value that suits you best.
Thanks, but I don't want to use another library. On top of that, my meshes are procedurally generated, so I don't want to first generate a basic trimesh, then run it through the library, then render. That wouldn't be very efficient :)
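One way to "run some tests" offline, when the hardware query is unavailable, is to simulate a cache and count misses for a given index order. The sketch below is only a model: it assumes a simple FIFO cache of a fixed size, which may not match what a given GPU actually does, and the function names (`count_vertex_transforms`, `acmr`) are made up for illustration. It reports the average cache miss ratio (vertices transformed per triangle).

```python
from collections import deque


def count_vertex_transforms(indices, cache_size):
    """Count cache misses for an index list under an assumed FIFO cache model."""
    # deque(maxlen=N) silently drops the oldest entry once N entries are held,
    # which is exactly FIFO eviction.
    cache = deque(maxlen=cache_size)
    misses = 0
    for index in indices:
        if index not in cache:
            misses += 1          # vertex would have to be transformed again
            cache.append(index)  # a hit does NOT refresh the entry (FIFO, not LRU)
    return misses


def acmr(indices, cache_size):
    """Average cache miss ratio: transformed vertices per triangle."""
    triangle_count = len(indices) // 3
    return count_vertex_transforms(indices, cache_size) / triangle_count


if __name__ == "__main__":
    # Two quads side by side (4 triangles, 6 unique vertices), hand-written.
    quad_strip = [0, 3, 1, 1, 3, 4, 1, 4, 2, 2, 4, 5]
    for size in (1, 4, 24, 48):
        print(size, acmr(quad_strip, size))
```

Running this over a real mesh for several candidate sizes gives a rough way to compare index orderings, but the numbers only mean something to the extent that the FIFO assumption holds; timing the actual draw calls remains the final check.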

Anyway, thanks for the answer. I'm currently implementing the code that generates the trimesh with a variable cache size, so hopefully I will be able to find the value that gives the best results for me. I will probably start with 48, given what you said :)
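For a procedurally generated mesh, the ordering can be made cache-friendly at generation time instead of in a post-process. A minimal sketch, assuming the mesh is a regular grid of quads: emit the triangles in vertical bands so that the vertices of one row are reused by the next row before they fall out of the cache. The function name `banded_grid_indices` and the `band_width` parameter are hypothetical, and the best band width for a given cache size (48 or otherwise) is something to measure, not something this code knows.

```python
def banded_grid_indices(cols, rows, band_width):
    """Triangle-list indices for a cols x rows quad grid, emitted band by band.

    Vertices are assumed to be laid out row-major, (cols + 1) per row.
    band_width is the number of quad columns per band (a tunable guess).
    """
    if band_width < 1:
        raise ValueError("band_width must be at least 1")
    stride = cols + 1  # vertices per row
    indices = []
    for band_start in range(0, cols, band_width):
        band_end = min(band_start + band_width, cols)  # last band may be narrower
        for y in range(rows):
            for x in range(band_start, band_end):
                v0 = y * stride + x   # top-left
                v1 = v0 + 1           # top-right
                v2 = v0 + stride      # bottom-left
                v3 = v2 + 1           # bottom-right
                indices += [v0, v2, v1, v1, v2, v3]
    return indices


if __name__ == "__main__":
    # Same triangles for every band width; only the order changes.
    for width in (8, 16, 24, 46):
        print(width, len(banded_grid_indices(100, 100, width)) // 3)
```

Because only the emission order changes, the band width can be swept at startup (or offline) and the fastest value kept, without touching the rest of the generator.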

