Erik Rufelt

OpenGL Getting Max Tri/sec rate

This topic is 4907 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hi,

I've been experimenting with different ways of drawing triangles, and no matter what I try I get very low tris/sec rates. What's the optimal way to achieve as high a geometry rate as possible?

I've tried both OpenGL and Direct3D 9.0, with about the same results. I have an ATI X800 XT 256 MB graphics card, and I get the best results drawing with vertex buffers in D3D and vertex buffer objects in OpenGL. D3D maxes out at ~90 MTris/sec and OpenGL at ~70 MTris/sec. I've tried larger and smaller vertex buffers and found that this is as high as it gets. I draw without textures or anything else, and the rate is the same if I cull away all tris.

ATI's website says the card should be able to draw something like 500 MTris/sec. I realize this is impossible to reach =) but I'm not even getting 1/5 of it. I've found sites where other people have posted much higher rates, but I have not managed to reach them myself. An example app that came with the DX9 SDK reported about 160 MTris/sec while using optimized meshes.

I've tried all the different settings I could think of, and the data is stored on the card, not in system memory. If I keep it in system memory I get around 10 MTris/sec =)

Are there better ways to draw triangles, or is something wrong with my computer?

Thx,
/Erik

Hello..

There are a lot of factors to consider.
NVIDIA has a nice PDF with a step-by-step guide to getting the highest tris/sec out of your card. ATI should have something similar.

Some pointers:
* Use as few batches as you can (and still try to use shorts for indices ;]).
* Remember to store the indices on the graphics card as well...
* Some vertex formats are more card-friendly than others. For example, one 32-bit int for the color is better than 4 floats...

And then there's the obvious stuff like vertex and fragment programs, textures, and so on.
Try to use just vertices and nothing else (less data == more speed.. usually).
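
To put rough numbers on the vertex-format and index-size pointers, here is a small sketch. The struct and function names are made up for illustration, and it assumes 4-byte floats with no padding. The packed color uses the same A8R8G8B8 byte order as D3D's `D3DCOLOR`.

```cpp
#include <cstddef>
#include <cstdint>

// Position plus a color stored as four floats: 28 bytes per vertex.
struct VertexFloatColor { float x, y, z; float r, g, b, a; };

// Position plus a color packed into one 32-bit integer: 16 bytes per vertex.
struct VertexPackedColor { float x, y, z; std::uint32_t argb; };

static_assert(sizeof(VertexFloatColor) == 28, "assumes 4-byte floats, no padding");
static_assert(sizeof(VertexPackedColor) == 16, "assumes 4-byte floats, no padding");

// Clamps a color channel to [0, 1] and rounds it to an 8-bit value.
std::uint32_t toByte(float c) {
    if (c < 0.0f) c = 0.0f;
    if (c > 1.0f) c = 1.0f;
    return static_cast<std::uint32_t>(c * 255.0f + 0.5f);
}

// Packs four float channels into a single A8R8G8B8 value.
std::uint32_t packArgb(float a, float r, float g, float b) {
    return (toByte(a) << 24) | (toByte(r) << 16) | (toByte(g) << 8) | toByte(b);
}

// Size of an index buffer for an indexed triangle list.
std::size_t indexBufferBytes(std::size_t triangleCount, std::size_t bytesPerIndex) {
    return triangleCount * 3 * bytesPerIndex;
}
```

With these layouts the packed-color vertex is 12 bytes smaller, and 16-bit indices halve the index buffer compared with 32-bit ones, at the cost of limiting a single buffer to 65,536 distinct vertices.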

Good luck!

Quote:
Original post by Erik Rufelt

ATI's website says the card should be able to draw something like 500 MTris/sec. I realize this is impossible to reach =) but I'm not even getting 1/5 of it. I've found sites where other people have posted much higher rates, but I have not managed to reach them myself.
An example app that came with the DX9 SDK reported about 160 MTris/sec while using optimized meshes.


Congratulations, you've been bitten by the PC hardware marketing BS bug. Your card would have no problem writing 500M triangles/second... if your machine had enough bandwidth to send triangles to the GPU that fast. The PC hardware scene is full of meaningless specs that can never be achieved because there is a limiting bottleneck somewhere else in the system. You could probably get a little higher than 90M triangles/second, as evidenced by the DX9 SDK program you mentioned, but give up any hope of ever reaching 500M. It's just not going to happen unless you have a PCI Express card.

Hi,

Thanks for the reply.
I use just one batch, and I've tried drawing it once per frame as well as 10+ times per frame, with the same result.
I have also tried more batches.
I don't have any colors or anything, just the vertices (D3DFVF_XYZ). Adding D3DFVF_DIFFUSE and a color (32-bit int) made the rate drop by 40%, which seems to imply that the problem is memory.
Everything is stored on the card.
I've tried using 16-bit indices as well, but that didn't change anything.
I get 125 MTris/sec now with:
device: D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, D3DCREATE_HARDWARE_VERTEXPROCESSING
vbuf: D3DFVF_XYZ, D3DUSAGE_WRITEONLY, D3DPOOL_DEFAULT
ibuf: D3DFMT_INDEX32, D3DUSAGE_WRITEONLY, D3DPOOL_DEFAULT
SetFVF: D3DFVF_XYZ
SetIndices
SetStreamSource(0, buf, 0, sizeof(D3DXVECTOR3))
and DrawIndexedPrimitive(D3DPT_TRIANGLELIST..);
Is it possible to tell D3D how to store the vertices on the card?
I was thinking perhaps it somehow defaults to double precision or something.

Maybe someone could try my program and see what tri rate you get?
That way I won't be trying to fix something in my program when the problem is really my computer =)
http://gys.mine.nu:180/~rufelt/D3DGeometryRateTest.exe


cwhite: that shouldn't be a problem. Everything is stored on the card, so hardly anything is sent to it per frame.


thx,
/Erik

I get 78M tris per second on my Radeon 9800 Pro with 256 MB of RAM.

Try drawing in wireframe mode; it might cut the fill-rate cost.

Guest Anonymous Poster
It hasn't been mentioned yet, but I think you'll also find that the triangles they claim to render in huge volumes are *single pixel*, i.e. not a true triangle, just one pixel (technically a polygon, just a very small one!). Try that and I bet your throughput will be higher still.

Cunning buggers aren't they?

Guest Anonymous Poster
Have you tried optimizing your vertex buffer?
NVIDIA has made a tool called NvTriStrip, which reorganizes the data into a perfect strip, with the vertex order optimized for memory paging and data access optimized to reduce cache misses.
Also, one primitive type can be faster than another, depending on the type of mesh.
A very smooth mesh is faster to draw than a hard-edged mesh, because more vertices are shared between triangles. When a triangle is drawn, there is a better chance that its vertices have already been transformed and are in the cache. So the ideal case is a very smooth mesh (like a sphere) reorganized by NvTriStrip.
The only way to reach the maximum number of triangles is to have perfect geometry in a perfect strip, but in practice that never happens because it's too restrictive for artists.
I don't think PCI Express will increase the polycount if the data is already in video memory.
I once saw more than 13M triangles per second on a GeForce I, where the maximum limit was 14M. That was a perfect strip case...
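
The cache effect described above can be estimated offline. The following is a minimal sketch that simulates a FIFO post-transform vertex cache over an index list and reports how many vertices would need transforming per triangle. The function names, the FIFO policy, and the cache size are all assumptions for illustration; real hardware differs, and this is not the algorithm the NVIDIA tool uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Counts how many indices miss a FIFO cache of the given size, i.e. how many
// vertices would have to be transformed again.
std::size_t countCacheMisses(const std::vector<std::uint32_t>& indices,
                             std::size_t cacheSize) {
    std::deque<std::uint32_t> fifo;
    std::size_t misses = 0;
    for (std::uint32_t idx : indices) {
        if (std::find(fifo.begin(), fifo.end(), idx) != fifo.end()) {
            continue;  // hit: the transformed vertex is reused
        }
        ++misses;
        fifo.push_back(idx);
        if (fifo.size() > cacheSize) {
            fifo.pop_front();  // evict the oldest entry
        }
    }
    return misses;
}

// Average transformed vertices per triangle (3.0 means no reuse at all).
double averageCacheMissRatio(const std::vector<std::uint32_t>& indices,
                             std::size_t cacheSize) {
    const std::size_t triangles = indices.size() / 3;
    if (triangles == 0) {
        return 0.0;
    }
    return static_cast<double>(countCacheMisses(indices, cacheSize)) /
           static_cast<double>(triangles);
}

// Builds a triangle-list index buffer for a cols x rows grid of quads whose
// (cols + 1) * (rows + 1) vertices are stored row by row.
std::vector<std::uint32_t> makeGridIndices(std::uint32_t cols, std::uint32_t rows) {
    std::vector<std::uint32_t> indices;
    indices.reserve(static_cast<std::size_t>(cols) * rows * 6);
    const std::uint32_t rowStride = cols + 1;
    for (std::uint32_t y = 0; y < rows; ++y) {
        for (std::uint32_t x = 0; x < cols; ++x) {
            const std::uint32_t v0 = y * rowStride + x;
            const std::uint32_t v1 = v0 + 1;
            const std::uint32_t v2 = v0 + rowStride;
            const std::uint32_t v3 = v2 + 1;
            indices.insert(indices.end(), {v0, v2, v1, v1, v2, v3});
        }
    }
    return indices;
}
```

Running `averageCacheMissRatio` on the same mesh with different index orderings gives a rough way to compare them: the closer the ratio gets to the mesh's vertices-per-triangle count, the less transform work is repeated.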

Ronan

Guest Anonymous Poster
Yes, I suspect that the so-called "triangle transformation rate" is actually the vertex transform rate. If you use a triangle strip/fan then the two are almost exactly equivalent because if you don't change render state a vertex is guaranteed to transform identically, and of course triangle sounds more impressive than vertex.
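
The strip arithmetic behind this can be sketched as follows: a strip of N vertices yields N - 2 triangles, so for long strips the vertex count and the triangle count are nearly the same. The function names are hypothetical. The conversion to a triangle list swaps the first two vertices of every odd triangle to keep a consistent winding, which follows the usual strip convention.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of triangles produced by a strip with the given vertex count.
std::size_t trianglesInStrip(std::size_t vertexCount) {
    return vertexCount < 3 ? 0 : vertexCount - 2;
}

// Vertices submitted per triangle; approaches 1.0 as the strip gets longer.
double verticesPerTriangle(std::size_t vertexCount) {
    const std::size_t tris = trianglesInStrip(vertexCount);
    return tris == 0 ? 0.0
                     : static_cast<double>(vertexCount) / static_cast<double>(tris);
}

// Expands strip indices into an equivalent triangle list.
std::vector<std::uint32_t> stripToList(const std::vector<std::uint32_t>& strip) {
    std::vector<std::uint32_t> list;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        if (i % 2 == 0) {
            list.push_back(strip[i]);
            list.push_back(strip[i + 1]);
        } else {
            // Odd triangles swap their first two vertices to preserve winding.
            list.push_back(strip[i + 1]);
            list.push_back(strip[i]);
        }
        list.push_back(strip[i + 2]);
    }
    return list;
}
```

A 1,000-vertex strip gives 998 triangles, about 1.002 vertices per triangle, whereas an unindexed triangle list always needs 3.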

I also have no idea how clipping works in hardware, so it's possible that you'll only get full speed when the clipping cases are trivial (i.e. completely on screen or completely behind a single clipping plane). I don't think that would be much of a dent though because in any normal scene almost all triangles will fall into this category.
