Jason Z


Software Rasterizer On Hold
I have pretty much hit a limit with optimization on the rasterizer. I made quite a few improvements to memory access, redundant function calls, and overall program flow. The net result when rendering a single bilinear-filtered textured triangle at 640x480 is approximately 22-24 fps when the triangle covers roughly 1/3 of the screen. VTune tells me that my latest hotspot is the bilinear sample function, and there is not a whole lot going on in there other than looking up a few texels in memory and blending them together.
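For reference, a bilinear sample boils down to something like the sketch below. This is not the actual function from the rasterizer, just a minimal illustration under some assumptions: the `Texture` struct, the packed 32-bit texel layout, the clamp-to-edge addressing, and the names `lerpPacked` and `sampleBilinear` are all made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical texture type: packed 32-bit texels, four 8-bit channels each.
struct Texture {
    int width;
    int height;
    std::vector<std::uint32_t> texels; // row-major, width * height entries
};

// Blend the four 8-bit channels of two packed texels.
// t is a fixed-point weight in [0, 256]: 0 returns a, 256 returns b.
static std::uint32_t lerpPacked(std::uint32_t a, std::uint32_t b, std::uint32_t t) {
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        std::uint32_t ca = (a >> shift) & 0xFF;
        std::uint32_t cb = (b >> shift) & 0xFF;
        std::uint32_t c = (ca * (256 - t) + cb * t) >> 8;
        result |= c << shift;
    }
    return result;
}

// Sample at texel-space coordinates (u, v), clamping at the texture edges.
std::uint32_t sampleBilinear(const Texture& tex, float u, float v) {
    assert(tex.width > 0 && tex.height > 0);
    u = std::min(std::max(u, 0.0f), float(tex.width - 1));
    v = std::min(std::max(v, 0.0f), float(tex.height - 1));

    // Integer texel coordinates of the 2x2 neighbourhood.
    int x0 = int(u);
    int y0 = int(v);
    int x1 = std::min(x0 + 1, tex.width - 1);
    int y1 = std::min(y0 + 1, tex.height - 1);

    // Fractional parts as 8-bit fixed-point weights.
    std::uint32_t fx = std::uint32_t((u - float(x0)) * 256.0f);
    std::uint32_t fy = std::uint32_t((v - float(y0)) * 256.0f);

    // Four memory lookups, then three blends.
    const std::uint32_t* row0 = &tex.texels[y0 * tex.width];
    const std::uint32_t* row1 = &tex.texels[y1 * tex.width];
    std::uint32_t top = lerpPacked(row0[x0], row0[x1], fx);
    std::uint32_t bottom = lerpPacked(row1[x0], row1[x1], fx);
    return lerpPacked(top, bottom, fy);
}
```

Even in this simple form, every pixel costs four texel reads and twelve channel blends, which is consistent with the sample function showing up as the hotspot.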

I would like to eventually perform some SSE2 optimizations, in particular on the rasterizer. This would be my first venture into such a beast, and would be a great learning experience. I have wanted to add some form of SIMD support to my vector/matrix operations for a while now anyway.

However, the software rendering project is going to be ramping down. I have a couple of other projects that I have been kicking around in my head that are just about ready to start, and my free time is very limited right now. I should get some news in the next couple of days regarding a couple of proposals that I submitted last month, so my free time schedule is going to be either completely empty or a little more open in the coming days.

I still haven't decided if I will attempt to write a software rendering article or not. It would be fun I'm sure, but I don't know if I can add enough material to make it worthwhile. We'll have to see how that goes - usually if I am going to write it, inspiration will strike me and I'll just start writing. We'll see how it all works out...
Recommended Comments

You can do better than that. Mine runs faster than that.

Somehow, I got the impression that you're re-calculating the pixel pointer for every pixel you draw. Don't!

Forget about filtering your textures, it's just too damn much of a performance hit.

Switch to affine texture mapping when the triangle has a constant z.

Map the texture in little affine chunks.
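The pointer and affine-chunk tips above can be sketched roughly as follows. This is only an illustration, not code from either rasterizer: the `SpanSetup` struct, the `drawSpanChunked` name, the 16-pixel chunk size, the point sampling, and the clamp addressing are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical span inputs: u/z, v/z and 1/z at the left end of the span,
// plus their per-pixel steps along the scanline.
struct SpanSetup {
    float uOverZ, vOverZ, oneOverZ;
    float dUOverZ, dVOverZ, dOneOverZ;
};

// Draw `count` point-sampled pixels starting at `dest`.
// The perspective divide happens once per chunk; inside a chunk the
// texture coordinates are stepped linearly (affine mapping).
void drawSpanChunked(std::uint32_t* dest, int count, SpanSetup s,
                     const std::uint32_t* texels, int texWidth, int texHeight) {
    const int kChunk = 16; // assumed chunk length
    assert(texWidth > 0 && texHeight > 0);

    // Perspective-correct coordinates at the start of the first chunk.
    float z = 1.0f / s.oneOverZ;
    float u0 = s.uOverZ * z;
    float v0 = s.vOverZ * z;

    while (count > 0) {
        int n = std::min(count, kChunk);

        // Jump the interpolants to the end of the chunk and divide once.
        s.uOverZ += s.dUOverZ * float(n);
        s.vOverZ += s.dVOverZ * float(n);
        s.oneOverZ += s.dOneOverZ * float(n);
        z = 1.0f / s.oneOverZ;
        float u1 = s.uOverZ * z;
        float v1 = s.vOverZ * z;

        // Affine steps inside the chunk.
        float du = (u1 - u0) / float(n);
        float dv = (v1 - v0) / float(n);
        float u = u0;
        float v = v0;
        for (int i = 0; i < n; ++i) {
            int tx = std::min(std::max(int(u), 0), texWidth - 1);
            int ty = std::min(std::max(int(v), 0), texHeight - 1);
            // The destination pointer is stepped, never recomputed from (x, y).
            *dest++ = texels[ty * texWidth + tx];
            u += du;
            v += dv;
        }

        u0 = u1;
        v0 = v1;
        count -= n;
    }
}
```

The caller computes `dest` once per scanline (for example `framebuffer + y * pitch + xStart`) and the inner loop only increments it. When the triangle has constant z, `dOneOverZ` is zero and every chunk reduces to plain affine mapping, so the per-chunk divide could be skipped entirely.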

Shame to hear it's on hold, but so long as it doesn't transition to 'cancelled' [wink]

I'd have thought the optimization side of things would be fascinating - if I actually knew what this thing called "spare time" was then I'd be all over it. There can only be a few classes of relevant applications that tax the entire machine like a software rasterizer does - highly complex, high arithmetic, high bandwidth, massively parallel and almost perfectly suited to SIMD extensions [grin]

But anyway, best of luck with your proposals!


