Average pixels per triangle?

Michael Thompson · 2004-07-17T06:20:17

Hi, I am trying to find out what the average number of pixels per triangle is for benchmarking purposes. I realize that this is a rather arbitrary question, as it depends on the scene, resolution, etc. I guess what I am seeking is what has been considered an average pixel count when others have benchmarked their performance, so I have something concrete to compare to. I've googled for this information for about an hour, both on gamedev and web-wide. Can anyone give me an answer or a reasonable estimate? My software renderer currently draws 88k flat-shaded triangles per second in 16-bit color with vertices (10, 10)-(128, 12)-(12, 128). I will be implementing a new edge-walking algorithm, which should push it over 100k, possibly as high as 120k.
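For reference, the pixel count of the test triangle quoted above can be estimated from its geometric area, and the triangle rate then converts to a fill rate. This is only a rough sketch: the function names are made up for illustration, and the number of pixels a real rasterizer touches differs slightly from the geometric area depending on its fill rule.

```c
#include <stdlib.h>

/* Twice the signed area of triangle (x0,y0)-(x1,y1)-(x2,y2),
   taken from the 2D cross product of two of its edges. */
long tri_area2(long x0, long y0, long x1, long y1, long x2, long y2)
{
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0);
}

/* Approximate pixel coverage: the geometric area in pixel units.
   (Assumption: coverage is close to area for large triangles.) */
double tri_pixels(long x0, long y0, long x1, long y1, long x2, long y2)
{
    return labs(tri_area2(x0, y0, x1, y1, x2, y2)) / 2.0;
}

/* Fill rate in pixels per second for a given triangle rate. */
double fill_rate(double tris_per_sec, double pixels_per_tri)
{
    return tris_per_sec * pixels_per_tri;
}
```

For (10, 10)-(128, 12)-(12, 128) the cross product is 118*118 - 2*2 = 13920, so the triangle covers roughly 6960 pixels, and 88k such triangles per second corresponds to about 612 million pixels per second.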


Started by Ravyne July 06, 2004 11:49 PM

11 comments, last by Charles B 19 years, 9 months ago

Charles B

863

July 16, 2004 04:03 AM

Yes, a 2X speedup with just one of the 8 elements that can significantly improve performance. ;) You see, in the end 2^8 = 256. There is surely still a good margin. 6 mega triangles already looks promising. You are in the range of a Dreamcast, not bad at all. I suppose that if your edge walking were in assembly too, with the span rendering inlined (a trapezoid routine replaces the span filler), you would gain another 2X for small triangles. Function call overhead costs a lot on small loops.

Can you explain concisely what the buggy case is? Is it related to clipping? Or is it related to the way you compute the starting x? That is a typical source of graphical bugs (missing pixels on edges, etc.). You must be really precise about how you take the fractional part into account, how you jump to the exact next line under the vertex, and how you deal with fractional precision; also how you change the xstart when a new left edge begins, and so on.
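To make the point about the fractional part concrete, here is a minimal sketch of edge prestepping in 16.16 fixed point. It is not Ravyne's code: the names `fixed`, `first_scanline` and `prestep_x` are hypothetical, and it assumes non-negative screen coordinates with pixels sampled at integer y values.

```c
#include <stdint.h>

#define FP_SHIFT 16
#define FP_ONE   (1 << FP_SHIFT)

typedef int32_t fixed;  /* 16.16 fixed point (assumed format) */

/* First integer scanline at or below a fixed-point y, i.e. ceil(y).
   Assumes y >= 0, so the right shift is well defined. */
int first_scanline(fixed y)
{
    return (int)((y + FP_ONE - 1) >> FP_SHIFT);
}

/* x of the edge on that first scanline: start at the vertex (x0, y0)
   and advance along the slope dxdy by the fractional distance between
   the vertex and the scanline, instead of starting at x0 directly. */
fixed prestep_x(fixed x0, fixed y0, fixed dxdy)
{
    fixed dy = ((fixed)first_scanline(y0) << FP_SHIFT) - y0; /* in [0, 1) */
    return x0 + (fixed)(((int64_t)dxdy * dy) >> FP_SHIFT);
}
```

With a vertex at (5.0, 10.25) and a slope of 2.0 pixels per line, the first scanline is 11 and the edge starts at x = 6.5 rather than 5.0. Skipping this correction is one common way to end up with missing or extra pixels along shared edges.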

But be sure it's worth the pain for you to keep improving your code, because reaching the 'perfect code' for triangle rendering is a nearly bottomless job.

"Coding math tricks in asm is more fun than Java"

Ravyne

14,306

Author

July 16, 2004 02:56 PM

Yes, I'm sure asm would benefit the edge walking, although I don't know if I want to go quite that far. I'll be able to recycle this code for next year's CG class, but they discourage using lots of asm. I suppose I'll look into it for my own benefit, though. The span filler is small and inlined, so there shouldn't be a function call. I suppose I should examine the compiler's output just to be sure.

The failing cases seem to have to do with the order of the vertices. It should be a matter of ensuring order at the top of the function or reworking the algo to accept unordered vertices. I have a feeling that the second option will be faster and eliminate some conditional jumps. In either case, the inner loop would remain the same.

I have also confirmed my suspicions about why 16-bit color is slower than 32-bit in triangle rendering. It was indeed an alignment issue. Other functions did not exhibit this behavior, as they were always aligned. Aligning to 32- or 64-bit boundaries should increase that performance by around 2-3 times. That would put 16-bit triangles in the neighborhood of 12-18M. In any case, I plan to switch to an MMX span filler in the near future.

As for whether this continued improvement is worth it, I believe it is for my own education. Unfortunately, software rendering is not a very viable skill these days. With all the new engines coming out with (sometimes requiring) pixel-shader support, software becomes unusable even as a fall-back renderer. UT2k4 contains a software mode written by Michael Abrash, which actually pieces together hand-coded ASM fragments as features are turned on or off, and even he states that PS support could not be done to satisfactory quality and speed. Maybe this could change when multi-core CPUs/SMP systems become common: one core (or CPU) could simply do rasterization, leaving the other to perform geometry calculations (building a display list for the rasterizing core/CPU), gameplay, etc. Anyhow, I'm wandering off topic.

I believe it will be worthwhile when I one day get an interview and the man behind the desk says "We were very impressed with your use of Direct3D in your demos," to which I can reply "Actually, that's all my own software rendering routines." Hopefully sealing the deal :D I suppose they'll know when they look through the source, but I'm allowed to have occasional delusions of grandeur, right?

Thanks again for your input, Charles.

throw table_exception("(? ???)? ? ???");

Charles B

863

July 17, 2004 06:20 AM

Quote:Original post by Ravyne
The failing cases seem to have to do with the order of the vertices. It should be a matter of ensuring order at the top of the function or reworking the algo to accept unordered vertices. I have a feeling that the second option will be faster and eliminate some conditional jumps. In either case, the inner loop would remain the same.

I'd start by handling:
- backface culling: always
- CCW winding: always
Then you can start making your API compatible with the OpenGL philosophy and let the user choose each option.

So this leaves you with a cross product (if the verts are already projected) and a sign evaluation. If the sign is positive, your rasterizer won't have to cope with inversely ordered triangles. Even so, spans of negative length should render nothing: for(x=a; x<b; x++) renders nothing when a>=b. I don't remember any difficulty with this problem, unless you want to accept any kind of winding order. In that case you are probably better off swapping vertices 2 and 3 at the top of the function, after the backfacing test.
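A minimal sketch of that test and of the self-protecting span loop might look like this. The `Vec2i` type and the function names are invented for illustration, and which sign counts as front-facing is an assumption here, since it depends on your winding convention and on whether y points up or down on screen.

```c
typedef struct { int x, y; } Vec2i;  /* projected screen-space vertex */

/* z component of the cross product (b - a) x (c - a). */
long edge_cross(Vec2i a, Vec2i b, Vec2i c)
{
    return (long)(b.x - a.x) * (c.y - a.y)
         - (long)(c.x - a.x) * (b.y - a.y);
}

/* Assumption: a positive cross product means front-facing.
   Zero (a degenerate triangle) is rejected as well. */
int is_front_facing(Vec2i a, Vec2i b, Vec2i c)
{
    return edge_cross(a, b, c) > 0;
}

/* Fill the half-open span [a, b) on one row and return the number of
   pixels written. When a >= b the loop body never runs, so an empty
   or "negative length" span draws nothing. */
int fill_span(unsigned short *row, int a, int b, unsigned short color)
{
    int n = 0;
    for (int x = a; x < b; x++) {
        row[x] = color;
        n++;
    }
    return n;
}
```

Swapping vertices 2 and 3 flips the sign of `edge_cross`, which is why a single swap after the test is enough to normalize the winding if you want to accept both orders.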

Quote:
It was indeed an alignment issue.

Sure, I forgot to answer this question. Misalignment and even 16-bit accesses are well-known performance killers. I remember it was often faster to write 2 bytes on the first Pentiums. The best approach is to group 16-bit pixels two by two (OR, shift) and write 32 bits of aligned data at once. The same goes for 64-bit data if you use MMX. This means you have to deal with some nasty loop preambles and postambles. Then the idea of the masks comes back again. The loop is complete ;)
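As a rough illustration of the two-by-two grouping for a flat-shaded span, here is a sketch in plain C. The name `fill_span16` is hypothetical, and `memcpy` is used for the 32-bit store to stay within C's aliasing rules, on the assumption that the compiler turns it into a single aligned store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `count` 16-bit pixels starting at dst with `color`. */
void fill_span16(uint16_t *dst, size_t count, uint16_t color)
{
    /* Preamble: if dst is not 4-byte aligned, write one lone pixel. */
    if (count > 0 && ((uintptr_t)dst & 3u) != 0) {
        *dst++ = color;
        count--;
    }

    /* Body: pack two pixels with shift and OR, then write 32 bits at
       once. Both halves are equal, so byte order does not matter. */
    uint32_t pair = ((uint32_t)color << 16) | color;
    while (count >= 2) {
        memcpy(dst, &pair, sizeof pair);
        dst += 2;
        count -= 2;
    }

    /* Postamble: one trailing pixel if the remaining count was odd. */
    if (count > 0) {
        *dst = color;
    }
}
```

An MMX version would follow the same shape with 64-bit stores, which makes the preamble and postamble up to three pixels each instead of one.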

Quote:
Anyhow I'm wandering off topic.

Software rendering is fundamental to understanding 3D hardware in far more depth. Having tackled the performance issues in software rendering gives you an intuitive understanding of how the undocumented parts of the hardware may work, and how to help them speed up. I also consider that, after the dull years of the first hardware, the development of shaders tends to let coders reappropriate the lower levels of rendering code. So it's not as obsolete a concern as it seems. Maybe in the longer term, GPUs will enable even more layers of coding, possibly letting a coder define a completely customized rendering pipeline, for instance rendering complex shapes without tessellation. An example is direct NURBS rasterization. The architecture of the PS2, for instance, went in this direction.

Quote:
I suppose they'll know when they look through the source, but I'm allowed to have occasional delusions of grandeur, right?

Sure, it's one way to go. The other is hard work and producing things. It's all a question of balance, with some clear objectives in mind.

"Coding math tricks in asm is more fun than Java"


This topic is closed to new replies.
