Software rasterizer and float/int color

12 comments, last by rarelam 12 years, 11 months ago
For a 3D software rasterizer nowadays, is there much of a speed difference between storing colors as floats and storing them as integers? Pseudo-code implementations of the two color classes are below. The float one is easier to implement and use, but I don't know how much slower it would be.


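class Color3f
{
public:
    float r, g, b;

    Color3f( float r_, float g_, float b_ ) : r(r_), g(g_), b(b_) {}

    // Component-wise multiply (modulate) with another color.
    Color3f mix( const Color3f& c1 ) const
    {
        return Color3f( r*c1.r, g*c1.g, b*c1.b );
    }

    // Component-wise add; the result may exceed 1.0 and is not clamped.
    Color3f add( const Color3f& c1 ) const
    {
        return Color3f( r+c1.r, g+c1.g, b+c1.b );
    }
};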

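class Color3i
{
public:
    int rgba; // packed as 0x00RRGGBB

    explicit Color3i( int packed ) : rgba(packed) {}

    int getRed() const   { return (0xFF0000 & rgba) >> 16; }
    int getGreen() const { return (0x00FF00 & rgba) >> 8; }
    int getBlue() const  { return (0x0000FF & rgba) >> 0; }

    // Component-wise multiply. Dividing by 256 (rather than 255) is a cheap
    // approximation: white * white gives 254, not 255.
    Color3i mix( const Color3i& c1 ) const
    {
        return Color3i( ( getRed()   * c1.getRed()   / 256 ) << 16 |
                        ( getGreen() * c1.getGreen() / 256 ) << 8  |
                        ( getBlue()  * c1.getBlue()  / 256 ) << 0 );
    }

    // An add() function would have to check for overflow/underflow. Omitted here.
};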
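For reference, the add() that the integer class leaves out could look roughly like the sketch below. It assumes the same 0x00RRGGBB packing as Color3i; the free function addSaturated is a hypothetical name, not something from the original post. Since the channels are unsigned, addition can only overflow, so each channel is clamped to 255.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: add two packed 0x00RRGGBB colors channel by channel,
// clamping each channel to 255 instead of letting it spill into its neighbor.
inline std::uint32_t addSaturated(std::uint32_t a, std::uint32_t b)
{
    std::uint32_t result = 0;
    for (int shift = 0; shift <= 16; shift += 8) {
        const std::uint32_t ca = (a >> shift) & 0xFFu;   // channel of a
        const std::uint32_t cb = (b >> shift) & 0xFFu;   // channel of b
        const std::uint32_t sum = std::min<std::uint32_t>(ca + cb, 255u);
        result |= sum << shift;
    }
    return result;
}
```

This is the extra branching (or min) per channel that the float version avoids until the final write to the backbuffer.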
I'd just use floats. Easier to handle things like HDR I'd imagine.

...

I get that that was probably pseudo-code, but you might want to store things in arrays. Like uint8_t rgba[4]; or float rgba[4]; and plan on using SSE intrinsics a lot if you're worried about speed.
I would use integers unless you need the extra precision of floats. One of the largest bottlenecks in a software rasterizer is memory bandwidth, and using floats instead of bytes takes 4x the bandwidth (even 16 bits per channel would be an advantage over floats). Integers do make the code more complicated to read; one thing you can do is use floats for your calculations and just pack to ints for storing.
storage using as small data as possible, computations using float (SIMDfied)
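A minimal sketch of the "compute in float, store packed" idea from the two replies above. The ColorF struct, the function names, and the 0xAARRGGBB layout are assumptions made for illustration; nothing here is taken from an actual renderer in this thread.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical working type: channels as floats in [0, 1].
struct ColorF { float r, g, b, a; };

// Expand a packed 0xAARRGGBB pixel to floats for computation.
inline ColorF unpackRGBA8(std::uint32_t c)
{
    const float s = 1.0f / 255.0f;
    return { ((c >> 16) & 0xFFu) * s,
             ((c >>  8) & 0xFFu) * s,
             ( c        & 0xFFu) * s,
             ((c >> 24) & 0xFFu) * s };
}

// Clamp to [0, 1], round to the nearest byte and pack for storage.
inline std::uint32_t packRGBA8(const ColorF& c)
{
    auto toByte = [](float v) -> std::uint32_t {
        v = std::min(std::max(v, 0.0f), 1.0f);
        return static_cast<std::uint32_t>(std::lround(v * 255.0f));
    };
    return (toByte(c.a) << 24) | (toByte(c.r) << 16) |
           (toByte(c.g) <<  8) |  toByte(c.b);
}
```

With this split, textures and the framebuffer stay at 4 bytes per pixel, and only the values currently being shaded are held as floats.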
I've found integers the simplest, but I don't think there should be any real speed problems. The only thing I can think of is that if I used floats for color, I'd have to convert back to int before finally drawing to my backbuffer. There are some things floats might slightly simplify for me, but whatever. My renderer is super simple though, and I don't do anything fancy in the slightest.

I just do something like this.
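// u8 and u32 are assumed to be typedefs for std::uint8_t and std::uint32_t.
struct Color
{
    Color() : c32(0) { }
    Color(u32 c) : c32(c) { }
    Color(u8 r, u8 g, u8 b) : alpha(255), red(r), green(g), blue(b) { }
    Color(u8 a, u8 r, u8 g, u8 b) : alpha(a), red(r), green(g), blue(b) { }

    union
    {
        // Anonymous struct: a common compiler extension, not standard C++.
        // Field order matches 0xAARRGGBB on a little-endian machine.
        struct
        {
            u8 blue;
            u8 green;
            u8 red;
            u8 alpha;
        };

        u8 c8[4];
        u32 c32;
    };
};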


My surfaces are aligned and make use of SSE. Works for me.

[quote name='rarelam' timestamp='1306222851' post='4815004']
I would use integers unless you need the extra precision of floats. One of the largest bottlenecks in a software rasterizer is memory bandwidth, and using floats instead of bytes takes 4x the bandwidth (even 16 bits per channel would be an advantage over floats). Integers do make the code more complicated to read; one thing you can do is use floats for your calculations and just pack to ints for storing.
[/quote]

I don't get the part about memory bandwidth. Do you mean things like whether texture data is in cache? That is, if texture color is int rather than float, you get more data loaded in cache?
If I limit my scene to have one .3ds file (size 400k-2000k) and twelve 512x512 images, would memory bandwidth issues be insignificant?

At what size of memory will this matter?

Yes, using smaller data means you can fit more in the CPU's cache at once.

Moving data from RAM into the cache is the most expensive operation you can perform (10-1000 times slower than mathematical operations).

Your L1 cache is probably about 32KB, and your L2 cache around 2MB. If you're operating on more data than that and you're trying to optimise for performance, then you really want to think about how much data you're transferring between RAM and cache.
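To put numbers on that for the scene described earlier (twelve 512x512 textures), here is a small back-of-the-envelope sketch. It assumes four channels per texel and counts only the base mip level; textureBytes and the constant names are made up for this example.

```cpp
#include <cstddef>

// Bytes needed for `count` square textures of `size` x `size` texels
// (base mip level only), at `bytesPerTexel` bytes each.
constexpr std::size_t textureBytes(std::size_t count, std::size_t size,
                                   std::size_t bytesPerTexel)
{
    return count * size * size * bytesPerTexel;
}

constexpr std::size_t kMiB = 1024 * 1024;

// Twelve 512x512 RGBA textures.
constexpr std::size_t kPacked = textureBytes(12, 512, 4);   // 8 bits per channel
constexpr std::size_t kFloat  = textureBytes(12, 512, 16);  // 32-bit float per channel

static_assert(kPacked == 12 * kMiB, "12 MiB with packed 8-bit RGBA");
static_assert(kFloat  == 48 * kMiB, "48 MiB with float RGBA");
```

Either way the textures alone are several times larger than a 2MB L2 cache, so the real question is how much of that data each rendered frame actually touches; the packed format simply moves a quarter of the bytes for the same texels.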

[quote name='rarelam' timestamp='1306222851' post='4815004']
I would use integers unless you need the extra precision of floats. One of the largest bottlenecks in a software rasterizer is memory bandwidth, and using floats instead of bytes takes 4x the bandwidth (even 16 bits per channel would be an advantage over floats). Integers do make the code more complicated to read; one thing you can do is use floats for your calculations and just pack to ints for storing.



I don't get the part about memory bandwidth. Do you mean things like whether texture data is in cache? That is, if texture color is int rather than float, you get more data loaded in cache?
If I limit my scene to have one .3ds file (size 400k-2000k) and twelve 512x512 images, would memory bandwidth issues be insignificant?

At what size of memory will this matter?

[/quote]

Not just texture data but also writing to the framebuffer: the less reading from and writing to memory, the better.

The size of the textures does not have much impact if you are using mip mapping. The thing that will make a difference is how you store the data: if it is bytes rather than floats, you will need to read a lot less from main memory.

I would have thought it quite unlikely that you would fit all your data in cache at one time, but that is kind of irrelevant, as you only need to have the data you are about to access in cache. The smaller the data you are using, the less data needs to be prefetched.

[quote name='Krypt0n' timestamp='1306234232' post='4815060']
storage using as small data as possible, computations using float (SIMDfied)
[/quote]

In my scene, typically I have cube maps, so I use reflection vectors to index into the cube map. If the cube map stores integer RGBA values, do you think converting each RGBA into a float4 is worth it? That is essentially four extra multiplications by 1/255.0 each time a texel is read.


This makes me want to convert everything to 16:16 fixed point math. But, as with the int/float dilemma for color, I ask myself whether it is worth doing.
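For anyone weighing the same idea, this is roughly what 16:16 fixed point arithmetic involves. It is a sketch only: the Fixed alias and helper names are invented here, and the multiply relies on right-shifting a negative 64-bit value being an arithmetic shift, which is implementation-defined before C++20 but is what mainstream compilers do.

```cpp
#include <cstdint>

using Fixed = std::int32_t;           // 16.16: high 16 bits integer, low 16 bits fraction
constexpr int   kFracBits = 16;
constexpr Fixed kOne      = 1 << kFracBits;

constexpr Fixed toFixed(float v) { return static_cast<Fixed>(v * kOne); }
constexpr float toFloat(Fixed v) { return static_cast<float>(v) / kOne; }

// Multiply needs a 64-bit intermediate, then a shift to drop the extra fraction bits.
constexpr Fixed mulFixed(Fixed a, Fixed b)
{
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * b) >> kFracBits);
}

// Divide pre-scales the numerator so the quotient keeps 16 fraction bits.
constexpr Fixed divFixed(Fixed a, Fixed b)
{
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * kOne) / b);
}
```

Add and subtract are plain integer operations, but every multiply and divide picks up a widening step and a shift, and range or overflow has to be tracked by hand.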





[quote]
Yes, using smaller data means you can fit more in the CPU's cache at once.

Moving data from RAM into the cache is the most expensive operation you can perform (10-1000 times slower than mathematical operations).

Your L1 cache is probably about 32KB, and your L2 cache around 2MB. If you're operating on more data than that and you're trying to optimise for performance, then you really want to think about how much data you're transferring between RAM and cache.
[/quote]


Would there be a worthwhile speed increase if we restricted uncompressed texture/model data to less than 2MB?

If the rasterizer were written in Java or some other managed language where you don't know what's going on with memory, perhaps this suggestion is even less workable?



[quote name='Krypt0n' timestamp='1306234232' post='4815060']
storage using as small data as possible, computations using float (SIMDfied)

In my scene, typically I have cube maps, so I use reflection vectors to index into the cube map. If the cube map stores integer RGBA values, do you think converting each RGBA into a float4 is worth it? That is essentially four extra multiplications by 1/255.0 each time a texel is read.
[/quote]
Nobody forces you to multiply by 1/255.0; I don't see the use of scaling your values by a constant.

You need floats if you want to interpolate, e.g., vertex colors perspective-correctly; doing so with integers might be a headache. You might also want gamma-correct blending etc.; with integers you'd lose quite some performance.
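As an illustration of the perspective-correct case, here is a minimal sketch of interpolating a vertex color along an edge. Color3 and lerpPerspective are hypothetical names, w0 and w1 are assumed to be the positive clip-space w values of the two vertices, and t is assumed to be measured in screen space.

```cpp
struct Color3 { float r, g, b; };

// Perspective-correct interpolation of a color between two vertices.
// c0, c1: vertex colors; w0, w1: clip-space w of each vertex (assumed > 0);
// t: screen-space interpolation parameter, 0 at vertex 0 and 1 at vertex 1.
inline Color3 lerpPerspective(const Color3& c0, float w0,
                              const Color3& c1, float w1, float t)
{
    // c/w and 1/w vary linearly in screen space, so interpolate those ...
    const float a    = (1.0f - t) / w0;   // weight of vertex 0, already divided by w
    const float b    = t / w1;            // weight of vertex 1, already divided by w
    const float invW = a + b;             // interpolated 1/w
    // ... then divide by the interpolated 1/w to recover the attribute.
    return { (c0.r * a + c1.r * b) / invW,
             (c0.g * a + c1.g * b) / invW,
             (c0.b * a + c1.b * b) / invW };
}
```

The reciprocals and the per-pixel divide are one-liners in float; doing the same in integers means choosing a fixed point format with enough range for 1/w.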


In addition, loading 32 bits and converting to 4 floats (SIMD) isn't really more work than working with fixed point (from a performance point of view). But as soon as you do some math (interpolation, filtering, blending), you can use simple SIMD instructions, while with fixed point you'll probably do slow mul/div + shifts on every color channel.

That's why I say: work with float (SIMDfied), and keep your data as small as possible (for cache and memory bandwidth).
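One possible shape for that "load 32 bits, convert to 4 floats" step, sketched with SSE2 intrinsics (so x86/x86-64 only). The function names are invented for this example; the floats are kept in the 0..255 range as suggested above, and the only scaling is the single 1/255 needed after multiplying two colors together.

```cpp
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Load one packed 8-bit-per-channel pixel and widen it to four floats in [0, 255].
inline __m128 loadPixel(std::uint32_t packed)
{
    const __m128i zero   = _mm_setzero_si128();
    const __m128i bytes  = _mm_cvtsi32_si128(static_cast<int>(packed));
    const __m128i words  = _mm_unpacklo_epi8(bytes, zero);    // 8-bit  -> 16-bit
    const __m128i dwords = _mm_unpacklo_epi16(words, zero);   // 16-bit -> 32-bit
    return _mm_cvtepi32_ps(dwords);
}

// Round four floats to integers, saturate to [0, 255] and pack back into 32 bits.
inline std::uint32_t storePixel(__m128 color)
{
    const __m128i dwords = _mm_cvtps_epi32(color);            // float -> int32, rounded
    const __m128i words  = _mm_packs_epi32(dwords, dwords);   // signed saturate to 16-bit
    const __m128i bytes  = _mm_packus_epi16(words, words);    // unsigned saturate to 8-bit
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(bytes));
}

// Modulate (channel-wise multiply) two packed pixels using float math.
inline std::uint32_t modulate(std::uint32_t a, std::uint32_t b)
{
    const __m128 scale   = _mm_set1_ps(1.0f / 255.0f);
    const __m128 product = _mm_mul_ps(_mm_mul_ps(loadPixel(a), loadPixel(b)), scale);
    return storePixel(product);
}
```

All four channels go through the same handful of instructions, and the saturating packs give clamping on store for free.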
