Ilici

Byte arithmetic operations for color calculations


To save memory in my color class, I'm using unsigned chars to represent the color channels. That works well for storage, but I run into a lot of trouble when it comes to calculations.

First: addition wraps around past the byte's maximum value of 255. Can anyone help me with a function that adds and saturates the color? How about subtraction?

Second: can I do fast multiplications by a float value (from 0.0 to 1.0) without having to convert the byte to float? And how do I round the result? For example, 64 x 0.7 = 44.8 should give 45, but the conversion back to byte truncates it to 44.

You can use a fixed-point representation instead of floating point, which avoids the byte-to-float conversion, but you'll lose some accuracy.
#include <algorithm>  // std::min, std::max
#include <iostream>

typedef unsigned char byte;

// A factor in [0.0, 1.0] stored as a byte in [0, 255].
class Fixed
{

public:

    Fixed(float value)
    {
        // Clamp to [0, 1], scale to [0, 255] and round to nearest.
        value_ = static_cast<byte>(255 * std::min(std::max(0.0f, value), 1.0f) + 0.5f);
    }

    byte mult(byte value) const
    {
        // Adding 127 before dividing by 255 rounds to nearest.
        // The largest intermediate is 255 * 255 + 127, which fits in 16 bits.
        unsigned short result = (((unsigned short)(value_)) * value) + 127;
        return result / 255;
    }

private:

    byte value_;

};

byte saturatedAdd(byte a, byte b)
{
    // The sum wrapped around if it is smaller than either operand.
    byte result = a + b;
    if (result < a || result < b)
    {
        return 255;
    }
    return result;
}

byte saturatedSubtract(byte a, byte b)
{
    // The difference wrapped around if it is larger than the minuend.
    byte result = a - b;
    if (result > a)
    {
        return 0;
    }
    return result;
}

byte multiplyByteFloat(byte a, float b)
{
    // Adding 0.5 before the truncating conversion rounds to nearest.
    return static_cast<byte>((a * std::min(std::max(0.0f, b), 1.0f)) + 0.5f);
}

byte multiplyByteFixed(byte a, Fixed b)
{
    return b.mult(a);
}

int main()
{
    std::cout << int(saturatedAdd(35, 79)) << '\n';
    std::cout << int(saturatedAdd(255, 255)) << '\n';
    std::cout << int(saturatedAdd(1, 255)) << '\n';
    std::cout << int(saturatedAdd(134, 207)) << '\n';
    std::cout << int(saturatedSubtract(79, 35)) << '\n';
    std::cout << int(saturatedSubtract(255, 255)) << '\n';
    std::cout << int(saturatedSubtract(0, 255)) << '\n';
    std::cout << int(saturatedSubtract(1, 255)) << '\n';
    std::cout << int(saturatedSubtract(134, 207)) << '\n';
    std::cout << int(multiplyByteFloat(35, 0.5)) << '\n';
    std::cout << int(multiplyByteFloat(255, 1)) << '\n';
    std::cout << int(multiplyByteFloat(255, 0.55)) << '\n';
    std::cout << int(multiplyByteFloat(0, 1)) << '\n';
    std::cout << int(multiplyByteFixed(35, 0.5)) << '\n';
    std::cout << int(multiplyByteFixed(255, 1)) << '\n';
    std::cout << int(multiplyByteFixed(255, 0.55)) << '\n';
    std::cout << int(multiplyByteFixed(0, 1)) << '\n';
}

Obviously the use of the Fixed datatype in this example offers no benefit, since a new Fixed object is created from a float each time, but if you created one Fixed object and used it multiple times you'd see a (small) benefit.
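To illustrate the "create once, use many times" point, here is a minimal sketch that converts the float factor a single time and then applies it to a whole buffer with integer math only. The names toFixedFactor and scaleChannels are made up for this example and are not part of the code above; the buffer is assumed to be a std::vector of channel bytes.

```cpp
#include <cstddef>
#include <vector>

typedef unsigned char byte;

// Hypothetical helper: convert a float factor to an 8-bit fixed-point
// factor in [0, 255], clamping to [0.0, 1.0] and rounding to nearest.
inline byte toFixedFactor(float factor)
{
    if (factor < 0.0f) factor = 0.0f;
    if (factor > 1.0f) factor = 1.0f;
    return static_cast<byte>(factor * 255.0f + 0.5f);
}

// Hypothetical helper: scale every channel in place by one precomputed
// factor, so the float-to-fixed conversion happens once per buffer.
inline void scaleChannels(std::vector<byte>& channels, byte fixedFactor)
{
    for (std::size_t i = 0; i < channels.size(); ++i)
    {
        // Integer-only multiply; +127 rounds the division by 255 to nearest.
        channels[i] = static_cast<byte>((channels[i] * fixedFactor + 127) / 255);
    }
}
```

Calling scaleChannels(pixels, toFixedFactor(0.5f)) would pay for the float conversion once, however many bytes pixels holds.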

Enigma

Take my advice: Ditch the floats. Instead of passing in floats between zero and 1, pass in integers (or bytes) from 0 to 255. And I don't mean just converting it before calling the function. I mean actually use entirely integral data types (fixed point) instead of any real numbers.
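As a rough sketch of what an all-integer interface could look like, the two functions below take the factor as a byte where 255 stands for 1.0. Both names are assumptions made for this example. The first divides by 255 and rounds to nearest; the second trades accuracy for speed by replacing the division with a shift, which truncates.

```cpp
typedef unsigned char byte;

// Hypothetical helper: multiply a channel by a factor given as a byte,
// where 255 stands for 1.0. Adding 127 before dividing rounds to nearest.
inline byte scaleByByte(byte channel, byte factor)
{
    return static_cast<byte>((channel * factor + 127) / 255);
}

// Hypothetical helper: cheaper variant that replaces the division by 255
// with a shift by 8. Using factor + 1 maps 255 to 256, so a full-strength
// factor leaves the channel unchanged, but results are truncated, not rounded.
inline byte scaleByByteShift(byte channel, byte factor)
{
    return static_cast<byte>((channel * (factor + 1)) >> 8);
}
```

With the factor 179 (roughly 0.7), scaleByByte(64, 179) computes (64 * 179 + 127) / 255, which rounds instead of truncating.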

Here are a few optimised C routines I've used in the past:

#include <cstdint>

// Stand-ins for the Windows typedefs, so the code compiles elsewhere.
typedef std::uint32_t UINT32;
typedef std::int32_t INT32;

//(be warned, these are potentially unportable so I'd double
// check that they work properly on any other compiler)
// Branch-free clamps: -(int)cond is all ones when cond is true, zero otherwise.
inline int clipMin(int value, const int minVal = 0) {
    return (minVal & (-(int)(value < minVal))) | (value & (-(int)!(value < minVal)));
}
inline int clipMax(int value, const int maxVal = 255) {
    return (maxVal & (-(int)(value > maxVal))) | (value & (-(int)!(value > maxVal)));
}

// color c1 += c2 * alpha
inline UINT32 AdditiveBlend32(unsigned long c1, unsigned long c2, unsigned long alpha) {
    INT32 a1 = (alpha)+1; // 1..256, so alpha = 255 means full strength
    return (clipMax(((c1>>16)&0xFF) + ((((c2>>16)&0xFF)*a1)>>8), 0xFF)<<16)
         + (clipMax(((c1>> 8)&0xFF) + ((((c2>> 8)&0xFF)*a1)>>8), 0xFF)<< 8)
         + (clipMax(((c1    )&0xFF) + ((((c2    )&0xFF)*a1)>>8), 0xFF)    );
}
// color c1 -= c2 * alpha
inline UINT32 SubtractiveBlend32(unsigned long c1, unsigned long c2, unsigned long alpha) {
    INT32 a1 = (alpha)+1; // 1..256, so alpha = 255 means full strength
    return (clipMin(((c1>>16)&0xFF) - ((((c2>>16)&0xFF)*a1)>>8), 0)<<16)
         + (clipMin(((c1>> 8)&0xFF) - ((((c2>> 8)&0xFF)*a1)>>8), 0)<< 8)
         + (clipMin(((c1    )&0xFF) - ((((c2    )&0xFF)*a1)>>8), 0)    );
}
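Since the routines above come with a portability warning, it may help to have a plain version to compare them against. The sketch below is my own rewrite, not the poster's code: it does the same per-channel math in signed ints and clamps with std::min and std::max instead of bit masks. The function names are made up, and alpha is assumed to be in the range 0 to 255.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical reference version: c1 += c2 * alpha per channel, clamped
// at 255. Assumes alpha is in [0, 255]; the top 8 bits of the result are zero.
inline std::uint32_t additiveBlendRef(std::uint32_t c1, std::uint32_t c2, std::uint32_t alpha)
{
    const int a1 = static_cast<int>(alpha) + 1;  // 1..256
    std::uint32_t out = 0;
    for (int shift = 16; shift >= 0; shift -= 8)
    {
        const int dst = static_cast<int>((c1 >> shift) & 0xFF);
        const int src = static_cast<int>((c2 >> shift) & 0xFF);
        const int sum = std::min(dst + ((src * a1) >> 8), 255);
        out |= static_cast<std::uint32_t>(sum) << shift;
    }
    return out;
}

// Hypothetical reference version: c1 -= c2 * alpha per channel, clamped at 0.
inline std::uint32_t subtractiveBlendRef(std::uint32_t c1, std::uint32_t c2, std::uint32_t alpha)
{
    const int a1 = static_cast<int>(alpha) + 1;  // 1..256
    std::uint32_t out = 0;
    for (int shift = 16; shift >= 0; shift -= 8)
    {
        const int dst = static_cast<int>((c1 >> shift) & 0xFF);
        const int src = static_cast<int>((c2 >> shift) & 0xFF);
        const int diff = std::max(dst - ((src * a1) >> 8), 0);
        out |= static_cast<std::uint32_t>(diff) << shift;
    }
    return out;
}
```

Running both versions over a handful of colors and alpha values on a new compiler is a cheap way to confirm that the branch-free clamps still behave as intended there.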

