Optimizing a software rasterizer?

Started by
17 comments, last by MarkS 14 years, 1 month ago
I know that this question is extremely general, but that is actually what I'm after. I've written a software rasterizer. It was strangely easy, but, of course, the frame rate sucks. Drawing two large perspective-correct, texture-mapped (no filtering) triangles in a 1024 x 768 window gives me a frame rate of about 30. If I do bilinear filtering, the frame rate drops to around 3.

I'm not doing much right now to optimize the code. Obvious things like loop unrolling and assembly language aside, what is typically done to optimize a software rasterizer? Also, is there a way to blit more than one 32-bit pixel at a time?

One of the biggest performance hits is texture access. How is this typically optimized? I've tried interpolating the texture coordinates after perspective division, but that leaves me with linear (affine) texture mapping.

Sorry, I'm at work and the code is on my home computer. I'll post some code for more direct critiques when I get home.
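
A note on the perspective division point above: one commonly used middle ground, which is not something from the original code, is to do the exact divide only every few pixels and interpolate linearly in between. The sketch below assumes the rasterizer already steps u/w, v/w and 1/w linearly along a scanline. `span_texcoords`, `SPAN_STEP` and all parameter names are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical tuning value: pixels that share one perspective divide. */
#define SPAN_STEP 16

/* Writes texture coordinates for `count` pixels of one scanline.
   uw, vw, iw are u/w, v/w and 1/w at the first pixel;
   d_uw, d_vw, d_iw are their per-pixel steps (all assumed inputs). */
void span_texcoords(float uw, float vw, float iw,
                    float d_uw, float d_vw, float d_iw,
                    size_t count, float *u_out, float *v_out)
{
    size_t x = 0;
    float u0 = uw / iw;               /* exact u, v at the span start */
    float v0 = vw / iw;

    while (x < count) {
        size_t n = (count - x < SPAN_STEP) ? (count - x) : SPAN_STEP;
        float fn = (float)n;

        /* Exact u, v at the end of this sub-span: one divide pair per n pixels. */
        float uw1 = uw + d_uw * fn;
        float vw1 = vw + d_vw * fn;
        float iw1 = iw + d_iw * fn;
        float u1 = uw1 / iw1;
        float v1 = vw1 / iw1;

        /* Affine (linear) interpolation inside the sub-span. */
        float du = (u1 - u0) / fn;
        float dv = (v1 - v0) / fn;
        float u = u0, v = v0;
        for (size_t i = 0; i < n; ++i) {
            u_out[x + i] = u;
            v_out[x + i] = v;
            u += du;
            v += dv;
        }

        x += n;
        uw = uw1; vw = vw1; iw = iw1;
        u0 = u1;  v0 = v1;
    }
}
```

The result is exact at every `SPAN_STEP`-th pixel and affine between those points, so a smaller step trades speed for accuracy.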

No, I am not a professional programmer. I'm just a hobbyist having fun...

Quote:Original post by maspeir
One of the biggest performance hits is texture access. How is this typically optimized?


By building specialized fast hardware for exactly this purpose. Really. Larrabee, which is essentially a complete software renderer, couldn't get by without hardware texture samplers.
Quote:Original post by maspeir
I'm not doing much right now to optimize the code. Obvious things like loop unrolling and assembly language aside, what is typically done to optimize a software rasterizer?


If you want to optimize, you'll need to profile to find out where the bottlenecks are in your code. Coupling that with a deep understanding of the hardware you're running on will help you play nice with it, so you're not stalling while waiting for memory or otherwise wasting computing resources on unnecessary work.
;) OK, now let's assume that this is done just for fun and my own purposes and won't be released. I want it fast enough to be able to display simple scenes, maybe on the order of the original Doom at 30 FPS or so in a 1024x768 window.

My original purpose in making this was educational (I've never made one before). However, now that it works, I want to see how fast I can make it.


It doesn't matter if it's being released or not. Optimization is the same either way.
I apologize. That came across as rather dismissive. You threw me off with the specialized hardware comment.


ASM is really the way to go. Not only will using the SSE instruction set give you access to more ops/cycle, but it also lets you really fine-tune memory access. Pre-loading data, cache hints, etc. can all go a long way toward speeding things up.
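
As a rough illustration of the SSE point, and of the earlier question about writing more than one 32-bit pixel at a time, here is a minimal sketch that uses SSE2 compiler intrinsics from C instead of hand-written assembly. `fill_span32` is a hypothetical helper, not part of any code posted in this thread. It falls back to a plain loop when the compiler does not define `__SSE2__`.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Fills `count` 32-bit pixels starting at `dst` with `color`,
   four pixels per store where SSE2 is available. */
void fill_span32(uint32_t *dst, size_t count, uint32_t color)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Same colour replicated into all four 32-bit lanes. */
    __m128i c4 = _mm_set1_epi32((int)color);
    for (; i + 4 <= count; i += 4) {
        /* Unaligned 16-byte store: writes pixels i..i+3 at once. */
        _mm_storeu_si128((__m128i *)(dst + i), c4);
    }
#endif
    /* Scalar tail, or the whole span when SSE2 is not available. */
    for (; i < count; ++i) {
        dst[i] = color;
    }
}
```

A solid fill is the simplest case. Textured spans need a gather of four texels first, so the gain there depends much more on memory access than on the store itself.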

Also, this might be of interest to you:
http://www.radgametools.com/pixomain.htm

Michael Abrash has been in the industry a long time and has written a lot of articles (just Google him, you'll get all sorts of stuff). Also, they have demos on that site, so you can get an idea of what sort of speed you can get if you're really good ;)
Quote:Original post by maspeir
I apologize. That came across as rather dismissive. You threw me off with the specialized hardware comment.


I only meant that the way people have optimized software renderers in the past is by inventing the GPU.
The wait for memory is usually the performance killer when it comes to texturing with a software rasteriser. You can check by giving it very small textures.
If this is the case, there are two common ways to optimise. One is to swizzle the texture addressing, the other is mipmapping.
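
For the swizzling option, one possible layout is Morton (Z-order) addressing, where the bits of the two texel coordinates are interleaved so that texels close together in 2D also sit close together in memory. The post does not say which swizzle pattern is meant, so treat this as one assumed variant. `part1by1` and `morton_index` are illustrative names, and the texture is assumed to be square with a power-of-two size of at most 65536 texels per side.

```c
#include <stdint.h>

/* Spreads the low 16 bits of x apart, leaving a zero bit between each. */
static uint32_t part1by1(uint32_t x)
{
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

/* Index of texel (u, v) in a texture stored in Morton order:
   u occupies the even bits and v the odd bits. */
uint32_t morton_index(uint32_t u, uint32_t v)
{
    return part1by1(u) | (part1by1(v) << 1);
}
```

The texture has to be reordered once at load time with the same function. After that, a fetch becomes `texels[morton_index(u, v)]` instead of `texels[v * width + u]`.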
Quote:Original post by maspeir
If I do bilinear filtering, the frame rate drops to around 3.

You might be interested in texture coord space dither. It was used in the original Unreal and provides a faster bilinear filtering mechanism at the expense of quality.
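
A minimal sketch of the idea, as I understand it: instead of blending four texels per pixel, each screen pixel nudges its texture coordinates by a small offset chosen from a 2x2 pattern and then does a single point sample. Neighbouring pixels land on different texels near texel boundaries, which the eye averages out. The offset values, the 16.16 fixed-point format and the name `sample_dithered` are assumptions for illustration, not the exact kernel from Unreal.

```c
#include <stdint.h>

#define FP_SHIFT 16
#define FP_ONE   (1 << FP_SHIFT)

/* Assumed 2x2 offset kernel in 16.16 fixed point, indexed [y & 1][x & 1]. */
static const int32_t dither_u[2][2] = {
    { FP_ONE / 4,     FP_ONE / 2 },
    { 3 * FP_ONE / 4, 0          }
};
static const int32_t dither_v[2][2] = {
    { 0,          3 * FP_ONE / 4 },
    { FP_ONE / 2, FP_ONE / 4     }
};

/* Point-samples a square texture of side (1 << size_log2) texels.
   u, v are 16.16 fixed-point texel coordinates; x, y are screen coordinates.
   Coordinates wrap at the texture edge. */
uint32_t sample_dithered(const uint32_t *texels, unsigned size_log2,
                         int32_t u, int32_t v, unsigned x, unsigned y)
{
    uint32_t mask = (1u << size_log2) - 1u;
    uint32_t tu = ((uint32_t)(u + dither_u[y & 1u][x & 1u]) >> FP_SHIFT) & mask;
    uint32_t tv = ((uint32_t)(v + dither_v[y & 1u][x & 1u]) >> FP_SHIFT) & mask;
    return texels[(tv << size_log2) + tu];
}
```

The cost per pixel is two adds and one texel fetch, compared with four fetches and several multiplies for true bilinear filtering.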

