In some respects, it is nice to be free of the constraints of the hardware rendering systems. For example, I implemented a floating point z-buffer (just because I could [grin]) which couldn't be done in hardware unless you have a brand new $400 video card. Of course, there is that small performance penalty that has to be paid for using software instead of hardware...
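For readers who have not written one, a floating point z-buffer in software can be sketched roughly as below. This is a minimal illustration, not the renderer's actual code: the `DepthBuffer` class, its method names, and the "smaller z is closer" convention are all assumptions made for the example.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical depth buffer; names and conventions are illustrative only.
// Assumes smaller z means closer to the camera.
class DepthBuffer {
public:
    DepthBuffer(std::size_t width, std::size_t height)
        : width_(width),
          depth_(width * height, std::numeric_limits<float>::infinity()) {}

    // Reset every pixel to "infinitely far away" before drawing a frame.
    void clear() {
        depth_.assign(depth_.size(), std::numeric_limits<float>::infinity());
    }

    // Returns true (and stores z) only if the fragment is closer than
    // whatever has already been written at (x, y).
    bool testAndWrite(std::size_t x, std::size_t y, float z) {
        float& stored = depth_[y * width_ + x];
        if (z < stored) {
            stored = z;
            return true;
        }
        return false;
    }

private:
    std::size_t width_;
    std::vector<float> depth_;
};
```

The rasterizer would call `testAndWrite` once per candidate pixel and only run the pixel shader when it returns true. Storing plain `float` values is what makes this "free" in software: there is no fixed-point depth format to fit into.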
Anyway, I also added a back face culling processor to the pipeline. The next component will be a clipper for triangles that intersect the viewing frustum. Once that is done, the traditional pipeline will be complete (more or less - I could add alpha blending, but I have read/write access to the framebuffer in the pixel shader, so it isn't really necessary). Then it will be on to the bigger and better specialty processors.
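As a rough sketch of what a back face culling stage does, one common approach tests the sign of the triangle's area after projection to screen space. The `Vec2` type, the function names, and the winding convention (counter-clockwise is front-facing, with y pointing up) are assumptions for this example and may not match the pipeline described here.

```cpp
// Hypothetical 2D point in screen space, after projection.
struct Vec2 {
    float x;
    float y;
};

// Twice the signed area of triangle (a, b, c).
// Positive when the vertices run counter-clockwise with y pointing up.
inline float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Assumes counter-clockwise triangles are front-facing.
// Zero-area (degenerate) triangles are culled as well.
inline bool isBackFacing(Vec2 a, Vec2 b, Vec2 c) {
    return signedArea2(a, b, c) <= 0.0f;
}
```

If the renderer uses a y-down screen space or clockwise front faces, the comparison simply flips. Running this before the rasterizer discards roughly half the triangles of a closed mesh.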
More on that to come. In the meantime, here is a screenshot from the z-buffer doing its thing:
Try enormous. Software rasterizers are more for the cool factor now, and they are awesome, but you aren't going to run a 'modern' game on one.
Also, that zig-zag pattern in your texture mapping is probably caused by sub-pixel inaccuracy. You should google it.
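One common form of the sub-pixel fix, assuming a scanline-style rasterizer that samples at pixel centers, is to "prestep" each interpolated value from the exact edge position to the first pixel center actually drawn. The sketch below is only an illustration of that idea: the `Span` struct, `makeSpan`, and its parameters are hypothetical, and the renderer in the post may be structured differently.

```cpp
#include <cmath>

// Hypothetical description of one horizontal span of pixels.
struct Span {
    int xStart;    // first pixel covered (inclusive)
    int xEnd;      // one past the last pixel covered (exclusive)
    float uStart;  // texture coordinate at the center of pixel xStart
};

// xLeft, xRight: exact (fractional) x positions where the scanline crosses
//                the triangle's left and right edges.
// uLeft:         texture coordinate at xLeft.
// dudx:          change in u per pixel along the scanline.
inline Span makeSpan(float xLeft, float xRight, float uLeft, float dudx) {
    // A pixel is covered when its center (x + 0.5) lies in [xLeft, xRight).
    int xStart = static_cast<int>(std::ceil(xLeft - 0.5f));
    int xEnd = static_cast<int>(std::ceil(xRight - 0.5f));

    // Distance from the true edge to the first sampled pixel center.
    float prestep = (static_cast<float>(xStart) + 0.5f) - xLeft;

    return Span{xStart, xEnd, uLeft + prestep * dudx};
}
```

Without the prestep, `uLeft` would be used as if the edge sat exactly on a pixel center, and the error changes from one scanline to the next, which tends to show up as a sawtooth in the texture. The same correction applies vertically when stepping the edges from the top vertex to the first scanline.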