MarkS

Optimizing a software rasterizer?

This topic is 3021 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I know that this question is extremely general, but that is actually what I'm after. I've written a software rasterizer. It was strangely easy, but, of course, the frame rate sucks. Drawing two large perspective-correct, texture-mapped (no filtering) triangles in a 1024 x 768 window gives me a frame rate of about 30. If I do bilinear filtering, the frame rate drops to around 3.

I'm not doing much right now to optimize the code. Obvious things like loop unrolling and assembly language aside, what is typically done to optimize a software rasterizer? Also, is there a way to blit more than one 32-bit pixel at a time? One of the biggest performance hits is texture access. How is this typically optimized? I've tried interpolating the texture coordinates after the perspective division, but that leaves me with linear (affine) texture mapping.

Sorry, I'm at work and the code is on my home computer. I'll post some code for more direct critiques when I get home.
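For reference, the difference between affine and perspective-correct mapping can be sketched in a few lines. This is only an illustration under stated assumptions, not the poster's code: `Vertex`, `UV`, `mix`, `affine_uv`, and `perspective_uv` are hypothetical names, and `w` is assumed to be the positive clip-space w at each end of a span. The perspective-correct version interpolates u/w, v/w, and 1/w linearly, then divides per pixel, and that per-pixel divide is the cost being asked about.

```cpp
#include <cassert>

// Hypothetical span endpoint: texture coordinates plus clip-space w.
struct Vertex { float u, v, w; };
struct UV { float u, v; };

// Plain linear interpolation between a and b.
static float mix(float a, float b, float t) { return a + (b - a) * t; }

// Affine mapping: interpolate u and v directly in screen space.
// Cheap, but textures appear to "swim" on surfaces seen at an angle.
UV affine_uv(const Vertex& a, const Vertex& b, float t) {
    return { mix(a.u, b.u, t), mix(a.v, b.v, t) };
}

// Perspective-correct mapping: u/w, v/w and 1/w are linear in screen
// space, so interpolate those and recover u and v with one divide.
UV perspective_uv(const Vertex& a, const Vertex& b, float t) {
    assert(a.w > 0.0f && b.w > 0.0f);
    const float inv_w0 = 1.0f / a.w;
    const float inv_w1 = 1.0f / b.w;
    const float inv_w = mix(inv_w0, inv_w1, t);
    const float u_over_w = mix(a.u * inv_w0, b.u * inv_w1, t);
    const float v_over_w = mix(a.v * inv_w0, b.v * inv_w1, t);
    return { u_over_w / inv_w, v_over_w / inv_w };
}
```

Halfway along a span whose far end is three times as deep as its near end, the two functions disagree noticeably, which is the distortion that affine mapping produces.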

Quote:
Original post by maspeir
One of the biggest performance hits is texture access. How is this typically optimized?


By building specialized fast hardware for exactly this purpose. Really. Larrabee, which is essentially a complete software renderer, couldn't get by without hardware texture samplers.

Quote:
Original post by maspeir
I'm not doing much right now to optimize the code. Obvious things like loop unrolling and assembly language aside, what is typically done to optimize a software rasterizer?


If you want to optimize, you'll need to profile to find out where the bottlenecks in your code are. Coupling that with a deep understanding of the hardware you're running on will help you play nice with it, so you're not stalling while waiting for memory or otherwise wasting computing resources on unnecessary work.
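A real profiler will tell you far more, but as a crude first step you can time individual stages yourself. The sketch below is an assumption-laden example, not a recommendation of a particular tool: `average_ms` is a hypothetical helper that runs a callable several times and reports the average wall-clock time per call using `std::chrono::steady_clock`.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: average wall-clock milliseconds per call of `work`.
// Repeating the call smooths out timer resolution and one-off cache misses.
template <typename Work>
double average_ms(Work&& work, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Wrapping, say, the span-filling loop and the texture fetch in separate calls gives a rough idea of which one dominates before you reach for assembly.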

[wink] OK, now let's assume that this is done just for fun and my own purposes and won't be released. I want it fast enough to be able to display simple scenes, maybe on the order of the original Doom at 30 FPS or so in a 1024x768 window.

My original purpose in making this was educational (I've never made one before). However, now that it works, I want to see how fast I can make it.

I apologize. That came across as rather dismissive. You threw me off with the specialized hardware comment.

ASM is really the way to go. Not only will using the SSE instruction set give you access to more ops/cycle, but it also lets you really fine-tune memory access. Pre-loading data, cache hints, etc. can all go a long way toward speeding things up.
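You don't have to drop all the way to assembly to try this. As a minimal sketch, assuming a compiler that defines `__SSE2__` and provides `<emmintrin.h>`, the SSE2 intrinsics `_mm_set1_epi32` and `_mm_storeu_si128` write four 32-bit pixels per store. `fill_span` is a hypothetical helper that fills a run of pixels with one color; on other targets it falls back to the plain scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: write `count` copies of `color` starting at `dst`.
void fill_span(std::uint32_t* dst, std::size_t count, std::uint32_t color) {
    assert(dst != nullptr || count == 0);
    std::size_t i = 0;
#if defined(__SSE2__)
    // Broadcast the color into all four 32-bit lanes of a 128-bit register.
    const __m128i packed = _mm_set1_epi32(static_cast<int>(color));
    // Unaligned 128-bit stores: four pixels per iteration.
    for (; i + 4 <= count; i += 4) {
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), packed);
    }
#endif
    // Scalar tail (or the whole span when SSE2 is unavailable).
    for (; i < count; ++i) {
        dst[i] = color;
    }
}
```

A solid fill is the simplest case. A textured span needs the four texels gathered first, so the gain there depends on how the texture fetch itself is organized.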

Also, this might be of interest to you:
http://www.radgametools.com/pixomain.htm

Michael Abrash, who worked on it, has been in the industry a long time and has written a lot of articles (just Google him, you'll get all sorts of stuff). They also have demos on that site, so you can get an idea of what sort of speed you can get if you're really good ;)

Quote:
Original post by maspeir
I apologize. That came across as rather dismissive. You threw me off with the specialized hardware comment.


I only meant that the way people have optimized software renderers in the past is by inventing the GPU.

The wait for memory is usually the performance killer when it comes to texturing with a software rasteriser. You can check by giving it very small textures: if they fit in the cache and the frame rate jumps, memory is your bottleneck.
If this is the case, there are two common ways to optimise. One is to swizzle the texture addressing, the other is mipmapping.
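One common form of swizzling is Morton (Z-order) addressing, where the bits of x and y are interleaved so that texels close together in 2D are also close together in memory. The sketch below assumes a square, power-of-two texture of 32-bit texels; `part1by1`, `morton_index`, and `swizzle_texture` are hypothetical names, and other layouts (such as small tiles) are equally valid choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Spread the low 16 bits of x so there is a zero bit between each of them.
std::uint32_t part1by1(std::uint32_t x) {
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

// Interleave the bits of x (even positions) and y (odd positions).
std::uint32_t morton_index(std::uint32_t x, std::uint32_t y) {
    return part1by1(x) | (part1by1(y) << 1);
}

// Reorder a row-major size x size texture into Morton order, once at load time.
std::vector<std::uint32_t> swizzle_texture(const std::vector<std::uint32_t>& linear,
                                           std::uint32_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);  // power of two
    assert(size <= 4096);                           // keeps indices well within 32 bits
    assert(linear.size() == static_cast<std::size_t>(size) * size);
    std::vector<std::uint32_t> swizzled(linear.size());
    for (std::uint32_t y = 0; y < size; ++y) {
        for (std::uint32_t x = 0; x < size; ++x) {
            swizzled[morton_index(x, y)] = linear[static_cast<std::size_t>(y) * size + x];
        }
    }
    return swizzled;
}
```

At render time the sampler then reads `swizzled[morton_index(tx, ty)]` instead of `linear[ty * size + tx]`, so stepping vertically through the texture no longer jumps a full row ahead in memory.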

Quote:
Original post by maspeir
If I do bilinear filtering, the frame rate drops to around 3.

You might be interested in texture coordinate space dithering. It was used in the original Unreal and approximates bilinear filtering much more cheaply, at the expense of quality.
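The idea, roughly, is to nudge the texture coordinates by a small offset that depends on the screen pixel's position in a 2x2 pattern, then take a single point sample. Neighbouring pixels land on different texels near texel borders, and the eye blends them. The sketch below is a hedged illustration: `sample_dithered` and `kDither` are hypothetical names, `u` and `v` are assumed to be in texel units on a square power-of-two texture, and the offset table is the 2x2 kernel commonly quoted for this technique, so treat the exact values as an assumption rather than Unreal's verified source.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct DitherOffset { float du, dv; };

// Assumed 2x2 kernel: rows indexed by (screen_y & 1), columns by (screen_x & 1).
static const DitherOffset kDither[2][2] = {
    {{0.25f, 0.00f}, {0.50f, 0.75f}},
    {{0.75f, 0.50f}, {0.00f, 0.25f}},
};

// One texel fetch per pixel instead of the four that bilinear filtering needs.
std::uint32_t sample_dithered(const std::uint32_t* texels, int size,
                              float u, float v, int screen_x, int screen_y) {
    assert(texels != nullptr);
    assert(size > 0 && (size & (size - 1)) == 0);  // power of two, so we can mask
    const DitherOffset& d = kDither[screen_y & 1][screen_x & 1];
    // Offset in texture space, truncate to a texel, wrap with a mask.
    const int tx = static_cast<int>(std::floor(u + d.du)) & (size - 1);
    const int ty = static_cast<int>(std::floor(v + d.dv)) & (size - 1);
    return texels[ty * size + tx];
}
```

Because the offsets depend only on the low bits of the screen coordinates, they can be folded into the interpolator's starting values per scanline, so the inner loop stays as cheap as plain point sampling.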
