- Lighting is disabled in all tests.
- Rendering 3000 Balls with 950 triangles each, grouped in one triangle list for every ball.

**********************************************************************
** withOUT GL_NV_vertex_array_range (Simple VA) (800x600x32)
**********************************************************************
================================================
- Minimum Geometry : 2,905,622.25 tris/sec
- Maximum Geometry : 3,595,494.50 tris/sec
- Average Geometry : 3,509,296.25 tris/sec
================================================

**********************************************************************
** with GL_NV_vertex_array_range (2 textures) (800x600x32)
**********************************************************************
================================================
- Minimum Geometry : 7,029,515.00 tris/sec
- Maximum Geometry : 9,769,094.00 tris/sec
- Average Geometry : 9,260,820.00 tris/sec
================================================

**********************************************************************
** with GL_NV_vertex_array_range (1 texture) (800x600x32)
**********************************************************************
================================================
- Minimum Geometry : 8,782,627.00 tris/sec
- Maximum Geometry : 9,797,635.00 tris/sec
- Average Geometry : 9,347,072.00 tris/sec
================================================

**********************************************************************
** with GL_NV_vertex_array_range (No textures) (800x600x32)
**********************************************************************
================================================
- Minimum Geometry : 9,405,980.00 tris/sec
- Maximum Geometry : 9,830,263.00 tris/sec
- Average Geometry : 9,382,089.00 tris/sec
================================================

**********************************************************************
** with GL_NV_vertex_array_range (No textures) (8x8x32) (all enabled)
**********************************************************************
================================================
- Minimum Geometry : 9,450,366.00 tris/sec
- Maximum Geometry : 9,836,973.00 tris/sec
- Average Geometry : 9,388,133.00 tris/sec
================================================

**********************************************************************
** with GL_NV_vertex_array_range (No textures) (8x8x32) (nothing enabled)
**********************************************************************
================================================
- Minimum Geometry : 9,559,716.00 tris/sec
- Maximum Geometry : 10,462,466.00 tris/sec
- Average Geometry : 9,577,451.00 tris/sec
================================================
As you can see, I barely break the 10 Mtris/sec barrier, and only when rendering to an 8x8 window with depth, stencil, color and texture all disabled. I'm not saying the numbers aren't good enough: nVidia's demo on VAR (the wavy thing) gave similar results.
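For anyone who wants to reproduce counters like these, here is a minimal sketch of how min/max/average tris/sec could be derived from per-frame timings. How the original test computed its counters is not shown in this post, so `GeometryStats` and `computeGeometryStats` are hypothetical names, and the averaging method (total triangles over total time) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container for the three counters printed in the results.
struct GeometryStats {
    double minTrisPerSec;
    double maxTrisPerSec;
    double avgTrisPerSec;
};

// trisPerFrame: triangles submitted each frame (3000 * 950 in the test above).
// frameSeconds: measured duration of each frame, in seconds.
GeometryStats computeGeometryStats(double trisPerFrame,
                                   const std::vector<double>& frameSeconds) {
    assert(!frameSeconds.empty());
    assert(frameSeconds.front() > 0.0);
    double minRate = trisPerFrame / frameSeconds.front();
    double maxRate = minRate;
    double totalSeconds = 0.0;
    for (double seconds : frameSeconds) {
        assert(seconds > 0.0);
        const double rate = trisPerFrame / seconds;
        minRate = std::min(minRate, rate);
        maxRate = std::max(maxRate, rate);
        totalSeconds += seconds;
    }
    // Assumed averaging: total triangles divided by total elapsed time.
    const double totalTris =
        trisPerFrame * static_cast<double>(frameSeconds.size());
    return {minRate, maxRate, totalTris / totalSeconds};
}
```

With 3000 * 950 = 2,850,000 triangles per frame, an average of about 9.38 Mtris/sec corresponds to roughly 3.3 frames per second.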
The 3000 spheres are rendered by randomly translating the origin of every sphere, every frame, so that every ball lies inside an imaginary box. The balls are small, because I thought this would minimize the fillrate cost. I placed the camera so that all the balls are visible every frame (the imaginary box is completely inside the frustum). Backface culling is enabled, and there is no hint for volume clipping.
Can you explain the above results? I'm a little confused by them, because there seems to be no big difference between the multitextured and the untextured tests (except for the minimum counter).
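The placement described above can be sketched roughly like this. It is only an illustration, not the code from the test: `Vec3`, `randomBallOrigins`, `allInsideBox`, the box half extent and the ball radius are all assumed names and parameters, and the box is assumed to be a cube centered on the origin.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Vec3 { float x, y, z; };

// Picks a new random origin for every ball so that the whole ball
// (not just its center) stays inside a cube of the given half extent.
std::vector<Vec3> randomBallOrigins(std::size_t count, float boxHalfExtent,
                                    float ballRadius, std::mt19937& rng) {
    assert(boxHalfExtent > ballRadius);
    const float limit = boxHalfExtent - ballRadius;
    std::uniform_real_distribution<float> dist(-limit, limit);
    std::vector<Vec3> origins;
    origins.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const float x = dist(rng);
        const float y = dist(rng);
        const float z = dist(rng);
        origins.push_back({x, y, z});
    }
    return origins;
}

// Sanity check: every origin keeps its ball fully inside the box.
bool allInsideBox(const std::vector<Vec3>& origins, float boxHalfExtent,
                  float ballRadius) {
    const float limit = boxHalfExtent - ballRadius;
    for (const Vec3& o : origins) {
        if (std::fabs(o.x) > limit || std::fabs(o.y) > limit ||
            std::fabs(o.z) > limit) {
            return false;
        }
    }
    return true;
}
```

Calling `randomBallOrigins` once per frame and translating each ball to its origin before drawing would give the "every ball inside the box, every frame" behaviour, provided the camera is placed so the box sits entirely inside the frustum.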
quote:
You might want to consider a different spatial subdivision structure than an octree, in order to avoid the redundancy problem and minimize the required splits.
The reason I started (and stuck) with octrees is how simple they are to create. No tree is that hard to implement, but octrees are the simplest. I wanted octrees for another reason too. As I had read in some occlusion culling papers, their perfect cube node shape is friendlier to those algorithms than an arbitrary BBox. While I was implementing HOMs, I didn't see any advantage of perfect cubes over arbitrary boxes, but I was too focused on occlusion culling and never found the time to change them. Now that my HOM implementation is stuck on the software rasterizer (a really hard part, I must admit, not only from the speed point of view, but also because it is hard to get OpenGL-like precise results), I'm too bored to change them. But occlusion culling is another topic, which deserves plenty of threads and posts of its own!
[off topic]
(Sad memories came to his mind. "What the hell", he thinks, "I'll complete it someday.")
- Reminder (to myself) : Change Octrees.
- Question (to myself) : With what??????
[/off topic]
quote:
Ah, but you didn't mention you have a GF-1... But you didn't really mention how much performance you actually get either. First, get rid of those double triangles you mentioned above, they will suck away your precious fillrate like mad. I'm starting to suspect that you aren't really geometry limited, but fillrate limited. VAR won't help you very much in that case.
Please define performance! Do you mean tris/sec or pixels/sec (how can you measure that? Is the obvious way the way to go?)? I thought FPS was enough as a performance result.
My opinion is that I'm both geometry and fillrate limited. But let's assume that I'm only fillrate limited: what can I do to get past it? Lowering the resolution, smaller textures, texture compression, simpler texture filtering, fewer blended polygons and no multitexturing are some possible solutions, I think. But if I can't apply one of them, because I really want the functionality it gives, then I guess the only solution is a newer card.
Thanks for the support, Yann. I'm really grateful.
I'll now try to eliminate the double-rendered triangles, and I'll be back with the final results.
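On the pixels/sec question, one rough way to reason about it is sketched below. Both functions are hypothetical helpers, the average overdraw factor is something you would have to estimate yourself, and the 10% threshold is an arbitrary assumption. The second function captures the usual test: if shrinking the window makes the frame much faster, fillrate is probably the limit; if it barely changes, something else is.

```cpp
#include <cassert>

// Rough fill estimate: pixels touched per frame times frames per second.
// averageOverdraw is an assumed figure (1.0 = every pixel drawn once).
double estimatePixelsPerSecond(int width, int height,
                               double averageOverdraw,
                               double framesPerSecond) {
    assert(width > 0 && height > 0);
    assert(averageOverdraw >= 0.0 && framesPerSecond >= 0.0);
    return static_cast<double>(width) * height * averageOverdraw *
           framesPerSecond;
}

// Heuristic: compare the same scene at a large and a tiny window.
// The rates can be FPS or tris/sec, as long as both use the same unit.
bool looksFillrateLimited(double rateLargeWindow, double rateSmallWindow,
                          double speedupThreshold = 1.10) {
    assert(rateLargeWindow > 0.0);
    return rateSmallWindow / rateLargeWindow > speedupThreshold;
}
```

Feeding it the untextured averages from the ball test (9,382,089 tris/sec at 800x600 versus 9,388,133 at 8x8) returns false, which suggests that particular test is not fillrate bound. That says nothing about a real scene with large or double-rendered triangles.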
HellRaiZer
PS:
quote:
Well, that is generally what is called profiling: measuring the exact impact of a specific piece of code, without external interference from other code parts. Things like parallel execution pipelines can hide the actual performance hit of a function, because it is delayed/overlapped. What you are talking about is benchmarking, which is basically a speed measure over the entire program. Profiling is micro-benchmarking, on subsystem or even instruction level.
My bad English made me think of benchmarking and profiling as the same thing. Thanks for the clear explanation.
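To make the distinction concrete for myself, here is a small sketch of a scoped timer that accumulates the time spent in one subsystem (profiling), as opposed to timing whole frames (benchmarking). `ScopedTimer` is a hypothetical helper, not something from the engine discussed here. It only measures CPU-side time, so for OpenGL calls, which can be queued and executed later, it would mostly measure submission cost unless the pipeline is synchronized first (for example with `glFinish`).

```cpp
#include <chrono>

// Adds the lifetime of this object, in seconds, to an external accumulator.
class ScopedTimer {
public:
    explicit ScopedTimer(double& accumulatedSeconds)
        : accumulated_(accumulatedSeconds), start_(Clock::now()) {}

    ~ScopedTimer() {
        accumulated_ +=
            std::chrono::duration<double>(Clock::now() - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    using Clock = std::chrono::steady_clock;
    double& accumulated_;
    Clock::time_point start_;
};
```

Wrapping only the culling code in a block with a `ScopedTimer` would give the time spent in culling alone, while the FPS counter keeps reporting the whole program.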