triangles per second

Started by
11 comments, last by _the_phantom_ 10 years, 4 months ago

Could someone give me reliable information on how many triangles per second (or per millisecond) today's graphics APIs can rasterize to the screen?

I am not after a theoretical value but a simple practical one. For example, given a plain array of one million triangles describing some mesh, how long would it take to rasterize?

Also, does anybody know how many triangles per millisecond software rasterization can do?


Today that really depends on your shaders, the size of the triangles, the amount of overdraw, etc.

Today's graphics APIs don't really care how many triangles you use. Triangle data is stored in buffers on the GPU and the API only sends a buffer ID to the GPU, so from the API's point of view it doesn't matter whether it is 1 triangle or 1 billion triangles. (The APIs care about things like draw calls, state changes, etc.)

A modern GPU can handle several billion triangles per second in theory. In practice a few hundred million per second shouldn't be a problem in the normal case.

I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!


Are you sure it is more than 100 million triangles? It should be possible to test how many.

I mean a working example. I know it differs, but you could take some example and measure it.

For example, a one-million-triangle mesh (a very detailed plant, bunny or something; one million triangles is high, but I think it is reasonable) should take an array of floats weighing 36 MB (1,000,000 triangles × 3 vertices × 3 floats × 4 bytes). How many milliseconds would it take to flush it?

I am quite sure it should take far less than one second, but I do not know much more. Maybe it would be comparable to RAM read/write speed, which in practice I think is somewhat more than 1 GB/s (I think a few GB/s, but I am not sure of the exact values. Does anybody know? Again, I am asking about practical values, not inflated ones.)

So assuming 1 MB/ms, flushing it through the pipeline should take "about" 36 milliseconds (or a few times worse?). But I would like to get some 'field' tests and results here (and I cannot measure it myself right now).
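The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the copy time under the poster's own assumptions (three position floats per vertex, no index buffer, a flat transfer rate); the bandwidth figures are illustrative guesses, not measurements, and the result says nothing about shader or fill cost.

```python
# Back-of-envelope check of the 36 MB / 36 ms estimate.
# All constants are assumptions taken from the post, not measured values.
TRIANGLES = 1_000_000
VERTS_PER_TRI = 3      # unindexed triangle list
FLOATS_PER_VERT = 3    # x, y, z only; no normals or texture coordinates
BYTES_PER_FLOAT = 4

mesh_bytes = TRIANGLES * VERTS_PER_TRI * FLOATS_PER_VERT * BYTES_PER_FLOAT


def transfer_ms(num_bytes, bytes_per_second):
    """Time in milliseconds to move num_bytes at a flat transfer rate."""
    return num_bytes / bytes_per_second * 1000.0


print(mesh_bytes / 1e6, "MB")
for gb_per_s in (1, 2, 4):  # hypothetical bandwidths
    print(gb_per_s, "GB/s ->", transfer_ms(mesh_bytes, gb_per_s * 1e9), "ms")
```

At 1 GB/s (which is 1 MB/ms) the 36 MB array takes 36 ms to copy, matching the figure in the post; a faster bus scales the time down proportionally.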

This question has absolutely no practical answer, as it depends on too many things.
Just before I came to these forums I was running a test on my own engine for another type of performance issue, which I may be posting about on my blog soon.

Luckily, by default I always print my FPS, triangles per second, and milliseconds per frame (whether I am benchmarking or not).


Here are a few actual numbers I just got minutes ago:

322.287908 FPS 595,595,897 Triangles per second. 3.102816 ms per frame.
790.450921 FPS 710,755,419 Triangles per second. 1.265101 ms per frame.
728.731499 FPS 101,038,645 Triangles per second. 1.372248 ms per frame.
709.486340 FPS 72,470,738 Triangles per second. 1.409470 ms per frame.


Clearly the number of triangles drawn per second by itself has absolutely no meaning. At 322 FPS I drew almost 6 times as many triangles as when I had 729 FPS.
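One way to read the four log lines is to divide triangles per second by FPS, which gives the triangles submitted per frame. The sketch below does only that arithmetic on the posted numbers; the variable and function names are mine, and nothing here is taken from the engine itself.

```python
# (FPS, triangles per second) pairs copied from the log lines above.
samples = [
    (322.287908, 595_595_897),
    (790.450921, 710_755_419),
    (728.731499, 101_038_645),
    (709.486340, 72_470_738),
]


def triangles_per_frame(fps, tris_per_second):
    """Triangles submitted in one frame, derived from the two logged rates."""
    return tris_per_second / fps


per_frame = [round(triangles_per_frame(fps, tps)) for fps, tps in samples]

for (fps, tps), tpf in zip(samples, per_frame):
    print(f"{fps:8.1f} FPS  {tps:>12,} tris/s  {tpf:>10,} tris/frame")
```

The per-frame counts range from roughly 0.1 million to roughly 1.8 million triangles, while the frame times stay between about 1.3 ms and 3.1 ms, so the frame time clearly does not scale in proportion to the triangle count.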


For example, a one-million-triangle mesh (a very detailed plant, bunny or something; one million triangles is high, but I think it is reasonable) should take an array of floats weighing 36 MB. How many milliseconds would it take to flush it?

That depends. How expensive is the vertex shader? How much screenspace do these vertices take, multiplied by how expensive the pixel shader is?

Basically your question has no meaning because you specifically asked how many triangles can be rasterized per second, and the raw number of triangles to process is only part of the full rasterization process.

  • First vertices need to be submitted. Here is one variable factor. Rendering 72,000,000 triangles once is faster than rendering 9,000,000 triangles 8 times. So how many triangles you can fit into a single render call is one variable.
  • Then they need to be transformed. This clearly depends on the number of cycles (among something else I plan to post on my blog soon) in your vertex shader. For 2D pre-normalized objects, the shader could be as simple as a single copy, while most 3D vertices need at least a matrix multiply.
  • Then they need to cover some number of pixels on the screen. A single triangle drawn to cover a large part of the screen will be slower to rasterize than the exact same triangle drawn almost edge-on to the camera. The fewer pixels that actually need to be drawn to the screen, the faster the rasterization.
  • And pixel-shader complexity determines the cost of fill-rate (the actual process of determining the color of each pixel and outputting it). If you plan to have any meaningful result you will have lighting calculations etc.
  • Overdraw is another variable. If a bunch of triangles keep getting drawn on top of each other they consume fill-rate time, whereas if they are drawn behind previously drawn triangles they can be rejected early via early Z tests.
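The factors in the list above can be put into a toy cost model to show why the triangle count alone predicts little. Everything in this sketch is hypothetical: the `Workload` fields, the `CALL_OVERHEAD` constant and the cost units are invented for illustration, and a real GPU overlaps these stages rather than adding them up.

```python
from dataclasses import dataclass

# Hypothetical fixed cost of one draw call, in arbitrary cost units.
CALL_OVERHEAD = 1000.0


@dataclass
class Workload:
    draw_calls: int       # how many render calls submit the geometry
    vertices: int         # vertices run through the vertex shader
    vs_cost: float        # arbitrary cost units per vertex
    covered_pixels: int   # screen pixels covered, before overdraw
    overdraw: float       # average number of times each covered pixel is shaded
    ps_cost: float        # arbitrary cost units per shaded pixel


def frame_cost(w):
    """Sum of submission, vertex and pixel cost (a deliberate oversimplification)."""
    submit = w.draw_calls * CALL_OVERHEAD
    vertex = w.vertices * w.vs_cost
    pixel = w.covered_pixels * w.overdraw * w.ps_cost
    return submit + vertex + pixel


# Same vertex count, very different made-up conditions.
tiny_cheap = Workload(1, 3_000_000, 1.0, 100_000, 1.0, 1.0)
big_heavy = Workload(8, 3_000_000, 4.0, 2_000_000, 3.0, 10.0)

print(frame_cost(tiny_cheap))
print(frame_cost(big_heavy))
```

Both workloads push the same number of vertices, yet the second costs over twenty times more in this model because of shader cost, screen coverage and overdraw.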

I mean a working example. I know it differs, but you could take some example and measure it.

The numbers I posted are real examples. Different objects with different complexities. Some made of many textures, some with just a few, some with normal mapping, some not, some heavy on transparency, etc.
Now you know that in the real-case examples, triangles-per-second is meaningless.



Of course you can test just the vertex-processing stage by creating a test case in which a vertex buffer contains millions of the same triangle, pre-normalized so that no math happens in the vertex shader and small enough to not fill a single pixel on the screen, but you are asking about real-world examples, and that is just not a real-world case.


L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

It is not meaningless. The examples you gave are exactly the interesting thing I am asking about.

322.287908 FPS 595,595,897 Triangles per second. 3.102816 ms per frame.
790.450921 FPS 710,755,419 Triangles per second. 1.265101 ms per frame.
728.731499 FPS 101,038,645 Triangles per second. 1.372248 ms per frame.
709.486340 FPS 72,470,738 Triangles per second. 1.409470 ms per frame.

These values of 500 to 700 million seem high to me; 70 to 100 million is more what I would have expected.

Does your scene have about 2M triangles? Why do you think you got such speedups and slowdowns? I would like to see results in milliseconds with a heavier load of triangles, especially with a mesh contained within the screen so that not too much frustum clipping would occur.

Triangles per second don't mean jack squat in today's world.
Back in 1997, when GPUs struggled to push 400,000 triangles per second and were scanline rasterizers, it was a valid unit of measure.

Today's pipelines are too complex to be measured by such a unit, and GPUs are no longer scanline rasterizers. A triangle covering 2 unaligned pixels may take more time than a triangle that covers 4 aligned pixels.

Furthermore, it's not even accurate. GPUs process vertices and pixels. 4 vertices can be used to draw 2 triangles, or 4. A cube may be drawn using 36 vertices, 24 vertices, or just 8.

And they may all yield different performance because different units in the GPU are being used, or the load balancing is different.
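The three cube layouts mentioned above can be built explicitly to see where 36, 24 and 8 come from. This is a sketch in plain Python lists rather than any graphics API; the corner ordering and face table are my own choices, and winding order is ignored.

```python
from itertools import product

# 8 unique corners of a unit cube; index = 4*x + 2*y + z.
corners = list(product((0.0, 1.0), repeat=3))

# 6 faces, each a loop of 4 corner indices (winding order not considered).
faces = [
    (0, 1, 3, 2),  # x = 0
    (4, 6, 7, 5),  # x = 1
    (0, 4, 5, 1),  # y = 0
    (2, 3, 7, 6),  # y = 1
    (0, 2, 6, 4),  # z = 0
    (1, 5, 7, 3),  # z = 1
]

# Split every quad into two triangles: 12 triangles, 36 indices.
triangles = []
for a, b, c, d in faces:
    triangles += [(a, b, c), (a, c, d)]
index_buffer = [i for tri in triangles for i in tri]

# Layout 1: no index buffer, every triangle carries its own 3 vertices.
unindexed = [corners[i] for i in index_buffer]

# Layout 2: indexed, but corners are duplicated per face so each copy
# can carry that face's normal or texture coordinates.
per_face = [(corners[i], face_id) for face_id, quad in enumerate(faces) for i in quad]

# Layout 3: indexed with fully shared corners (positions only).
shared = corners

print(len(triangles), len(unindexed), len(per_face), len(shared))
```

All three layouts draw the same 12 triangles, but the vertex shader and vertex fetch see 36, 24 or 8 distinct vertices, which is one reason a triangle count does not pin down the work done.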

Why do you think you got such speedups and slowdowns?

Because I am drawing different models, or looking at the same model from a different direction, etc.
One model (a car) has 300,000 triangles, drawn in multiple passes per frame (1 ambient, 1 shadow generation, 1 lighting pass), adding up to about 710,000,000 triangles per second at 790 FPS (300,000 × 3 × 790 ≈ 711 million).

On the other hand, one of the models is just a little guy with perhaps 15,000 total triangles. No matter how furiously I draw him (number of lights, shadows, etc.), I only get 72,470,738 triangles per second, and cannot possibly catch up to the car with 300,000 triangles rendered only 3 times per frame.


So I am trying to say that triangles-per-second is a useless measurement.


L. Spiro


But this is not. I want to know example values. Generalising the outcome could be nonsense, but a concrete measured test is not.

I didn't understand the above about triangles. I would be interested in a simple single pass with flat shading. Could you say how many triangles your scene has and what frame time it takes to flush it? 300,000 triangles in 3 ms?


Example values are meaningless because they ONLY tell you the outcome for that frame and that combination of vertices, shaders, camera position etc. and nothing more.

AMD and NV don't even list triangles/sec for their latest cards because it really is a meaningless metric to even consider these days due to how the GPUs are built.

The question is impossible to answer in a meaningful way. A few points:

  • Any figure given is not very authoritative, since the performance difference between a mobile card, an entry-level card, or a two-year-old enthusiast card and the latest enthusiast card may very well be 5-20x, depending on what cards you compare and what you measure.
  • Depending on the complexity of the vertex shader and other details (geometry shader, tessellation), the same card may be able to process 100 or 1000 times as many triangles as in a different setup.
  • A "typical" 1080p screen has a little over 2 million pixels. Drawing 100 million triangles per frame means you have roughly 50x overdraw (best possible case, with pixel-sized triangles -- otherwise you have a lot more). Since GPUs are able to process more triangles than there are pixels on the screen, it is somewhat useless to worry about how many they can process.
  • Many triangles are not necessarily better than fewer triangles. Processing more triangles quickly results in diminishing returns. Pixel-sized triangles are a crazy abuse of the pixel shading pipeline due to 2x2 quad processing, and there is no visible improvement (but there may be aliasing effects!).
  • ALU >> TEX >> BW >> ROP. You are nowadays almost always ROP-bound. If you aren't, you are bandwidth-bound. The ALU/ROP ratio is about 10x higher nowadays than it was a decade ago [source].
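The 1080p figures in the list above are easy to reproduce. The sketch below assumes 100 million pixel-sized triangles in a single frame spread evenly over the screen, and models 2x2 quad shading in the simplest possible way (one covered pixel still occupies a full quad of four shader lanes); real hardware behaviour is more involved.

```python
WIDTH, HEIGHT = 1920, 1080
pixels = WIDTH * HEIGHT            # 2,073,600 pixels on a 1080p screen

triangles = 100_000_000            # triangles in one frame (assumption)
tris_per_pixel = triangles / pixels

# Simplified 2x2 quad model: a triangle covering a single pixel still
# launches a whole quad, so only 1 of 4 shader lanes does useful work.
LANES_PER_QUAD = 4
covered_lanes = 1
quad_efficiency = covered_lanes / LANES_PER_QUAD

print(f"{pixels:,} pixels")
print(f"{tris_per_pixel:.1f} triangles per pixel")
print(f"{quad_efficiency:.0%} useful pixel-shader lanes")
```

Under these assumptions each pixel is covered by about 48 triangles, in line with the "roughly 50x" figure, and three quarters of the pixel-shader work goes to helper lanes that produce no visible output.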

