This question has no practical answer, as it depends on too many things.
Just before I came to these forums I was running a test on my own engine for another type of performance issue, which I may post about on my blog soon.
Luckily by default I always print my FPS, triangles-per-second, and milliseconds per frame (whether I am benchmarking or not).
Here are a few actual numbers I just got minutes ago:
322.287908 FPS 595,595,897 Triangles per second. 3.102816 ms per frame.
790.450921 FPS 710,755,419 Triangles per second. 1.265101 ms per frame.
728.731499 FPS 101,038,645 Triangles per second. 1.372248 ms per frame.
709.486340 FPS 72,470,738 Triangles per second. 1.409470 ms per frame.
Clearly the number of triangles drawn per second by itself has absolutely no meaning. At 322 FPS I drew almost 6 times as many triangles per second as when I had 729 FPS.
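To make the comparison concrete, here is a minimal sketch that derives per-frame figures from the numbers logged above. Only the FPS and triangles-per-second values come from the benchmark output; the helper name `per_frame_stats` is made up for illustration.

```python
# Figures copied from the benchmark output above: (FPS, triangles per second).
samples = [
    (322.287908, 595_595_897),
    (790.450921, 710_755_419),
    (728.731499, 101_038_645),
    (709.486340, 72_470_738),
]


def per_frame_stats(fps, tris_per_second):
    """Derive milliseconds per frame and triangles per frame from the two logged rates."""
    ms_per_frame = 1000.0 / fps
    tris_per_frame = tris_per_second / fps
    return ms_per_frame, tris_per_frame


derived = [per_frame_stats(fps, tps) for fps, tps in samples]

for (fps, tps), (ms, tpf) in zip(samples, derived):
    print(f"{fps:10.2f} FPS  {tps:>12,} tris/s  {ms:.6f} ms/frame  {tpf:>12,.0f} tris/frame")

# Triangles-per-second ratio between the 322 FPS run and the 729 FPS run.
ratio = samples[0][1] / samples[2][1]
print(f"ratio: {ratio:.2f}")
```

Per frame, the 322 FPS run works out to roughly 1.85 million triangles against roughly 139 thousand for the 729 FPS run, yet its frame takes only a little over twice as long. Triangle count alone clearly does not set the frame time.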
for example a one million triangle mesh (a very detailed plant, bunny or something) (a one million triangle mesh is high but I think it is reasonable) should take an array of floats
weighing 36 MB - how many milliseconds will it take
to flush it?
That depends. How expensive is the vertex shader? How much screenspace do these vertices take, multiplied by how expensive the pixel shader is?
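For reference, the 36 MB figure in the question matches a non-indexed triangle list that stores only a 3-float position per vertex. The sketch below shows that arithmetic; the function name and the 8-floats-per-vertex layout (position, normal, UV) are assumptions for illustration, not something stated in the question.

```python
def unindexed_mesh_bytes(triangle_count, floats_per_vertex=3, bytes_per_float=4):
    """Size of a non-indexed triangle list: three vertices stored per triangle."""
    return triangle_count * 3 * floats_per_vertex * bytes_per_float


# Position-only vertices (x, y, z), as the question appears to assume.
position_only = unindexed_mesh_bytes(1_000_000)
print(position_only / 1_000_000, "MB")  # 36.0 MB

# Hypothetical richer layout: position (3) + normal (3) + UV (2) = 8 floats.
with_normals_uvs = unindexed_mesh_bytes(1_000_000, floats_per_vertex=8)
print(with_normals_uvs / 1_000_000, "MB")  # 96.0 MB
```

The buffer size is only the input to the first stage; it says nothing about how long the later stages take.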
Basically your question has no meaning because you specifically asked how many triangles can be rasterized per second, and the raw number of triangles to process is only part of the full rasterization process.
- First vertices need to be submitted. Here is one variable factor. Rendering 72,000,000 triangles once is faster than rendering 9,000,000 triangles 8 times. So how many triangles you can fit into a single render call is one variable.
- Then they need to be transformed. This clearly depends on the number of cycles in your vertex shader (among other things, one of which I plan to post about on my blog soon). For 2D pre-normalized objects, the shader could be as simple as a single copy, while most 3D vertices need at least a matrix multiply.
- Then they need to cover some number of pixels on the screen. A single triangle drawn to cover a large part of the screen will be slower to rasterize than the exact same triangle drawn almost edge-on to the camera. The fewer pixels that actually need to be drawn to the screen, the faster the rasterization.
- And pixel-shader complexity determines the cost of fill-rate (the actual process of determining the color of each pixel and outputting it). If you plan to have any meaningful result you will have lighting calculations etc.
- Overdraw is another variable. If a bunch of triangles keep getting drawn on top of each other they are consuming fill-rate time, whereas if they are being drawn behind previously drawn triangles they can be rejected early via early Z tests.
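The factors listed above can be put into a deliberately crude toy model to show how two frames with the same triangle count can differ wildly in cost. Everything here is an assumption for illustration: the `Workload` fields, the additive formula, and all the constants are made up. The units are arbitrary, and real GPUs overlap these stages, so this is not a predictor of real timings.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    triangles: int        # triangles submitted per frame
    draw_calls: int       # number of render calls used to submit them
    vertex_cost: float    # relative vertex-shader cost per vertex
    pixels_covered: int   # screen pixels touched by the geometry
    pixel_cost: float     # relative pixel-shader cost per shaded pixel
    overdraw: float       # average number of times each covered pixel is shaded


def relative_frame_cost(w, call_overhead=1000.0):
    """Toy additive cost in arbitrary units (hypothetical, not measured)."""
    submit = w.draw_calls * call_overhead
    vertex = w.triangles * 3 * w.vertex_cost
    pixel = w.pixels_covered * w.overdraw * w.pixel_cost
    return submit + vertex + pixel


# Same triangle count, very different everything else.
cheap = Workload(1_000_000, draw_calls=1, vertex_cost=1.0,
                 pixels_covered=50_000, pixel_cost=1.0, overdraw=1.0)
heavy = Workload(1_000_000, draw_calls=8, vertex_cost=4.0,
                 pixels_covered=2_000_000, pixel_cost=20.0, overdraw=3.0)

cheap_cost = relative_frame_cost(cheap)
heavy_cost = relative_frame_cost(heavy)
print(cheap_cost, heavy_cost, heavy_cost / cheap_cost)
```

With these invented numbers the second workload costs dozens of times more than the first despite submitting exactly one million triangles in both cases, and nearly all of the difference comes from the pixel side.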
I mean working example, I know it differs but take some
example and you could measure it
The numbers I posted are real examples. Different objects with different complexities. Some made of many textures, some with just a few, some with normal mapping, some not, some heavy on transparency, etc.
Now you know that in the real-case examples, triangles-per-second is meaningless.
Of course you can test just the vertex-processing stage by creating a test case in which a vertex buffer contains millions of the same triangle, pre-normalized so that no math happens in the vertex shader and small enough to not fill a single pixel on the screen, but you are asking about real-world examples, and that is just not a real-world case.
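As a rough sketch of what such a synthetic test buffer could look like, the snippet below builds a flat float array holding many copies of one sub-pixel triangle, already in normalized device coordinates. The function name, the 1920x1080 target, and the placement inside a pixel are assumptions; it also assumes pixels are sampled at their centers with no multisampling. Uploading and drawing the buffer with an actual graphics API is left out.

```python
from array import array


def tiny_triangle_buffer(triangle_count, screen_width=1920, screen_height=1080):
    """Build a flat float buffer with `triangle_count` copies of one sub-pixel triangle.

    Positions are already in normalized device coordinates (-1..1), so a
    pass-through vertex shader could copy them without doing any math.
    """
    # Size of one pixel in normalized device coordinates.
    px_w = 2.0 / screen_width
    px_h = 2.0 / screen_height
    # Keep the triangle in the lower-left part of the corner pixel so it
    # never reaches that pixel's center sample (assumed at 0.5 of a pixel).
    x0 = -1.0 + 0.1 * px_w
    y0 = -1.0 + 0.1 * px_h
    triangle = array("f", [
        x0,               y0,               0.0,
        x0 + 0.25 * px_w, y0,               0.0,
        x0,               y0 + 0.25 * px_h, 0.0,
    ])
    # Repeating the array duplicates the same 9 floats for every triangle.
    return triangle * triangle_count


buf = tiny_triangle_buffer(1_000_000)
print(len(buf), "floats,", len(buf) * buf.itemsize, "bytes")
```

Drawing a buffer like this would measure little beyond vertex submission and processing, which is exactly why it tells you nothing about a real scene.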
L. Spiro