Quote:Original post by C0D1F1ED
Anyway you assumed that there is no depth complexity in the scene.
True, high overdraw can really kill performance for a software renderer. If you have control over the application side as well, make sure you render front-to-back as much as possible.
I actually have quite a lot of control in this case. There will be no simple way to render raw triangles (nothing like immediate mode in OpenGL); the smallest entity will be a 'mesh'. Each mesh will have its own bounding box, so before sending vertices down the pipeline I will test the meshes against the frustum and sort them front-to-back once 'mesh assembling' is finished.
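Roughly what I have in mind for the per-mesh step, as a minimal sketch rather than final code. The names (`Mesh`, `Plane`, `cullAndSort`) are made up for illustration, and I am assuming view-space boxes with the camera looking down +z, so a smaller z means nearer:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Plane stored so that n.p + d >= 0 holds for points on the inside.
struct Plane { Vec3 n; float d; };

struct Mesh {
    Vec3 boxMin, boxMax;  // axis-aligned bounding box (assumed view space)
    int  id;              // hypothetical handle to the vertex data
};

// True if the whole box lies on the outside of the plane.
inline bool boxOutside(const Plane& p, const Vec3& mn, const Vec3& mx) {
    // Pick the box corner that is farthest along the plane normal;
    // if even that corner is outside, the whole box is.
    const Vec3 v { p.n.x >= 0.0f ? mx.x : mn.x,
                   p.n.y >= 0.0f ? mx.y : mn.y,
                   p.n.z >= 0.0f ? mx.z : mn.z };
    return p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f;
}

// Returns the meshes whose boxes are not fully outside any frustum plane,
// ordered nearest first (front-to-back) by the near face of the box.
std::vector<Mesh> cullAndSort(const std::vector<Mesh>& meshes,
                              const std::array<Plane, 6>& frustum) {
    std::vector<Mesh> visible;
    for (const Mesh& m : meshes) {
        assert(m.boxMin.x <= m.boxMax.x &&
               m.boxMin.y <= m.boxMax.y &&
               m.boxMin.z <= m.boxMax.z);
        bool culled = false;
        for (const Plane& p : frustum) {
            if (boxOutside(p, m.boxMin, m.boxMax)) { culled = true; break; }
        }
        if (!culled) visible.push_back(m);
    }
    std::sort(visible.begin(), visible.end(),
              [](const Mesh& a, const Mesh& b) { return a.boxMin.z < b.boxMin.z; });
    return visible;
}
```

The test is conservative: a box near a frustum corner can pass even though it is actually outside, which only costs some wasted work later. Sorting by the near face of the box is just one possible key; sorting by box centre would also do.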
Quote:Normally you can keep z interpolation separate from the rest, so you can use floating-point there and fixed-point for the rest. It really depends on the precision you want.
The reason I chose a fixed-point z-buffer is that you cannot freely mix floating-point operations with MMX (at least that's what I heard). I've read somewhere that there is a clock penalty when switching from MMX to floating point, because the MMX registers are aliased onto the x87 FPU registers and an EMMS is needed in between. SSE/SSE2 have their own XMM registers, so they should not have this problem, but how about coprocessor (x87) routines?
Quote:For a z-buffer 16-bit integer is enough, but you need to be quite careful there is no unnecessary precision loss. With floating-point it's a whole lot more straightforward and no slower in practice.
How about a 32-bit z-buffer? Maybe it's a better choice on a 32-bit architecture?
What do you think of a hierarchical z-buffer, C0D1F1ED? I am not sure that scan-converting bounding boxes would pay off for scenes of low complexity, but a pyramidal z-buffer seems interesting (probably better CPU cache behaviour).
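To make clear what I mean by a pyramidal z-buffer, here is a minimal sketch. It assumes a square, power-of-two, 16-bit depth buffer where a larger value means farther away; `ZLevel`, `buildPyramid` and `isOccluded` are names I made up. Each coarser level stores the farthest depth of the 2x2 block beneath it, so a rectangle whose nearest depth is behind every texel it touches must be hidden:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One pyramid level; z holds size*size depths, larger = farther (assumption).
struct ZLevel {
    int size;                      // width == height, power of two
    std::vector<std::uint16_t> z;
};

// levels[0] is the full-resolution buffer, levels.back() is 1x1.
std::vector<ZLevel> buildPyramid(const std::vector<std::uint16_t>& base, int size) {
    assert(size > 0 && (size & (size - 1)) == 0);
    assert(base.size() == static_cast<std::size_t>(size) * size);
    std::vector<ZLevel> levels;
    levels.push_back(ZLevel{size, base});
    while (levels.back().size > 1) {
        const int half = levels.back().size / 2;
        ZLevel dst{half, std::vector<std::uint16_t>(static_cast<std::size_t>(half) * half)};
        const ZLevel& src = levels.back();  // only used before the push_back below
        for (int y = 0; y < half; ++y) {
            for (int x = 0; x < half; ++x) {
                const int i = (2 * y) * src.size + 2 * x;
                // Keep the farthest depth of the 2x2 block.
                dst.z[y * half + x] = std::max({src.z[i], src.z[i + 1],
                                                src.z[i + src.size],
                                                src.z[i + src.size + 1]});
            }
        }
        levels.push_back(std::move(dst));
    }
    return levels;
}

// Conservative test for the inclusive pixel rectangle [x0,x1] x [y0,y1]
// (level-0 coordinates) whose nearest depth is zNear.
// Returns true only if the rectangle is certainly hidden.
bool isOccluded(const std::vector<ZLevel>& levels,
                int x0, int y0, int x1, int y1, std::uint16_t zNear) {
    // Climb until the rectangle spans at most 2x2 texels (or the top is reached).
    int l = 0;
    const int top = static_cast<int>(levels.size()) - 1;
    while (l < top && ((x1 >> l) - (x0 >> l) > 1 || (y1 >> l) - (y0 >> l) > 1)) ++l;
    const ZLevel& lv = levels[l];
    for (int y = y0 >> l; y <= (y1 >> l); ++y)
        for (int x = x0 >> l; x <= (x1 >> l); ++x)
            if (zNear <= lv.z[y * lv.size + x]) return false;  // might be visible
    return true;
}
```

This rebuilds the whole pyramid from the base level, which is only for illustration. In a real renderer the coarse levels would have to be updated incrementally as pixels are written, and that cost is exactly what I am unsure about for simple scenes.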
Thank you