Trying to reach 380 M triangles/s

21 comments, last by sjelkjd 20 years, 9 months ago
Basically, I'm trying to max out my video card. Supposedly a Radeon 9800 Pro can do 380M triangles/s, so I wrote a small demo to try to reach that. The only object I draw is a sphere, tessellated by polar angles. Right now I have it at 40x40, which means that each sphere has 3200 triangles. I draw 200 randomly placed spheres in my scene. I've been trying out several methods, with the following results:

glDrawElements
I created a big vertex array, and then created indices into it. I'm drawing plain GL_TRIANGLES. This gets me 20M triangles/s.

glDrawArrays
I create separate vertex arrays from the indices of the first method, so there is more geometry in this method, and it shows: 3.2M triangles/s.

Display lists
I tried putting both the DrawElements and the DrawArrays calls in a display list. Both get approximately the same rate, 35M triangles/s.

VBOs using DrawArrays
Same as the original DrawArrays, but using static VBOs to transfer my sphere geometry over once. 36.5M triangles/s.

VBOs using DrawElements
By far the fastest. I put the indices into a VBO, and I get 108M triangles/s.

VAO using DrawElements
I can't put the indices in a VAO (or I don't know how), so it's slower than it could be. 48M triangles/s.

I have lighting enabled, and am using 1 light with the standard OpenGL pipeline. It's not fill limited, because I use a very small window size. So how the heck am I supposed to get up to 380M? Do I need larger batches per DrawElements call (currently at 3200)? Do I need to drop the per-object transforms? Or is 380M/s a big lie? If so, what is the best anyone here has gotten?
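For reference, the triangle counts in the post above can be reproduced with a small sketch. This is not the poster's code: it assumes a plain latitude/longitude grid with (stacks+1) x (slices+1) vertices, two triangles per grid cell, and 16-bit indices as would be passed to glDrawElements with GL_TRIANGLES. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, assuming a (stacks+1) x (slices+1) vertex grid. */

/* Number of vertices in the grid. */
size_t sphere_vertex_count(unsigned stacks, unsigned slices)
{
    return (size_t)(stacks + 1) * (slices + 1);
}

/* Two triangles per grid cell. */
size_t sphere_triangle_count(unsigned stacks, unsigned slices)
{
    return (size_t)2 * stacks * slices;
}

/* Fill `out` with a GL_TRIANGLES-style index list and return the number
 * of indices written. `out` must hold 3 * sphere_triangle_count() entries. */
size_t build_sphere_indices(unsigned stacks, unsigned slices,
                            unsigned short *out)
{
    size_t n = 0;
    unsigned row = slices + 1; /* vertices per latitude ring */

    /* 16-bit indices can only address 65536 vertices. */
    assert(sphere_vertex_count(stacks, slices) <= 65536);

    for (unsigned i = 0; i < stacks; ++i) {
        for (unsigned j = 0; j < slices; ++j) {
            unsigned short a = (unsigned short)(i * row + j); /* this ring */
            unsigned short b = (unsigned short)(a + 1);
            unsigned short c = (unsigned short)(a + row);     /* next ring */
            unsigned short d = (unsigned short)(c + 1);
            out[n++] = a; out[n++] = c; out[n++] = b; /* first triangle  */
            out[n++] = b; out[n++] = c; out[n++] = d; /* second triangle */
        }
    }
    return n;
}
```

With stacks = slices = 40 this gives 3200 triangles (9600 indices) per sphere, so 200 spheres come to 640,000 triangles per frame.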
Forgot to mention - I have an Athlon 1.2 GHz, so is it possible that I am CPU limited? I don't do anything besides draw in my loop, currently.
Triangle strips would help improve your speed considerably, since they cut back on geometry processing a lot and don't have to pass around as much memory.


The theoretical limit is also measured on non-lit, untextured, flat-shaded polygons and, being a theoretical limit, can pretty much never be achieved. They base it on the clock speed of the geometry processor, and it doesn't take into account the delays of the AGP bus, bus transfer rates, driver overhead, etc. In a normal scene you will most likely be fillrate limited anyway, so I wouldn't worry about pushing more polygons... if you have over 100 million polygons in a single scene (after occlusion culling, etc.) you either have one hell of a complex scene, or need to rethink your engine's culling design.
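To put a number on the triangle-strip suggestion in this post, here is a rough sketch that only compares index counts. It assumes one strip per latitude band of a grid-tessellated sphere; the helper names are hypothetical, and it says nothing about how much time the card would actually save.

```c
#include <stddef.h>

/* Hypothetical helpers for comparing index counts only. */

/* A triangle list needs 3 indices per triangle. */
size_t list_index_count(size_t triangles)
{
    return 3 * triangles;
}

/* A single strip needs 2 indices to start, then 1 per triangle. */
size_t strip_index_count(size_t triangles)
{
    return triangles == 0 ? 0 : triangles + 2;
}

/* One strip per latitude band; each band holds 2 * slices triangles. */
size_t sphere_strip_index_total(unsigned stacks, unsigned slices)
{
    return (size_t)stacks * strip_index_count((size_t)2 * slices);
}

/* Write the strip indices for band `stack` of a (slices+1)-wide vertex
 * grid, alternating between the upper and lower ring.
 * Returns the number of indices written: 2 * (slices + 1). */
size_t build_band_strip(unsigned stack, unsigned slices, unsigned short *out)
{
    size_t n = 0;
    unsigned row = slices + 1;

    for (unsigned j = 0; j <= slices; ++j) {
        out[n++] = (unsigned short)(stack * row + j);       /* upper ring */
        out[n++] = (unsigned short)((stack + 1) * row + j); /* lower ring */
    }
    return n;
}
```

For a 40x40 sphere this is 40 strips of 82 indices, 3280 in total, against 9600 indices for the same 3200 triangles as a plain list. It also means 40 draw calls per sphere instead of 1 unless the strips are stitched together.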
Oh yeah, and if you could reach that number, it would run at 1 FPS exactly. Assuming that they were correct in that number, of course.
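The relation between throughput and frame rate behind this remark is simple division, sketched below. The function name is made up for illustration; the inputs are the figures quoted in this thread.

```c
#include <assert.h>

/* Hypothetical helper: frames per second for a given triangle throughput
 * (triangles per second) and scene size (triangles per frame). */
double frames_per_second(double tris_per_second, double tris_per_frame)
{
    assert(tris_per_frame > 0.0);
    return tris_per_second / tris_per_frame;
}
```

A scene of 380M triangles at 380M triangles/s gives 1 fps, while a 1M-triangle scene at the same rate gives 380 fps. The original demo (200 spheres x 3200 triangles = 640,000 triangles per frame) at the measured 108M triangles/s works out to 168.75 fps.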
---------------
quote:Original post by Ready4Dis
Triangle strips would help improve your speed considerably, since it cuts back on geometry processing a lot, and doesn't have to pass around as much memory.

Would it? Memory transfer shouldn't be an issue with VBOs, since I use static buffers (they should be created once on the card and never updated). As for geometry processing, with strips you are assured that 2 of the 3 vertices of each triangle are in the vertex cache. That should be the case with plain triangles too, though, since I generate them sequentially.
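The vertex-cache argument here can be checked offline with a small simulation. The sketch below assumes a simple FIFO post-transform cache; the real cache size and replacement policy of the card are not given in this thread, so `cache_size` is purely a parameter to experiment with, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHE 64 /* arbitrary upper bound for this sketch */

/* Count how many indices miss a FIFO cache of `cache_size` entries,
 * i.e. how many vertices would have to be transformed. */
size_t count_cache_misses(const unsigned short *indices, size_t count,
                          unsigned cache_size)
{
    unsigned short cache[MAX_CACHE];
    unsigned used = 0; /* number of valid entries */
    unsigned next = 0; /* slot to overwrite next (the oldest once full) */
    size_t misses = 0;

    assert(cache_size > 0 && cache_size <= MAX_CACHE);

    for (size_t i = 0; i < count; ++i) {
        int hit = 0;
        for (unsigned k = 0; k < used; ++k) {
            if (cache[k] == indices[i]) {
                hit = 1;
                break;
            }
        }
        if (!hit) {
            ++misses;
            cache[next] = indices[i];
            next = (next + 1) % cache_size;
            if (used < cache_size)
                ++used;
        }
    }
    return misses;
}
```

Feeding it a triangle list written in strip order, such as {0,1,2, 2,1,3, 2,3,4, 4,3,5}, with a 4-entry cache gives 6 misses for 4 triangles: 3 for the first triangle and 1 for each one after it. That is the "2 of 3 vertices cached" case. Sharing vertices between neighbouring latitude bands would additionally need a cache large enough to hold a whole ring of vertices.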

quote:
The theoretical limit is also done on non-lit, untextured, flat shaded polygons, and being a theoretical limit, can pretty much never be achieved.

OK, but even with lighting off, no texturing, and flat shading, I only get 125M triangles/s.

quote: In a normal scene, you will most likely be fillrate limited anyways, so I wouldn't worry about pushing more polygons... if you have over 100million polygons in a single scene (after occlusion culling, etc) you either have one hell of a complex scene, or need to re-think your engine culling design.

Yeah, true =) I really wish I could get 380 though, just for the fun of it.

quote:Original post by Joe-Bob
Oh yeah, and if you could reach that number, it would run at 1 FPS exactly.


Why is that? You could have, for instance, a 1M-triangle scene that ran at 380 fps.
Also interesting is that display lists max out at about 38M triangles/s, and are less dependent on the number of spheres and the number of triangles per sphere than the other methods - that is, the triangle rate doesn't deviate from 38M/s.
quote:Original post by sjelkjd

Why is that? You could have, for instance, a 1M tri scene that ran at 380 fps


It's a huge overhead to clear and swap, plus the other stuff you and the driver have to do per frame.
380 Mtri/s is the theoretical value the card can handle; you will never reach that in a real situation.
Make sure the rendered triangles are *really* tiny, otherwise you'll become fillrate limited. Also run your program in fullscreen at a low resolution (640x480x16).

