curious case of integer overflow?

9 comments, last by welOhim 9 years ago

On Android, I am capturing images and doing some (bitmap) analysis.

I scan the image (iterating pixel by pixel) and examine each pixel's colour.

I needed to find the average colour (primary school maths), but something wasn't right.

Initially I had pondered: as I sum up the colours (a typical value is -8287599), for a sub-patch of 400x300 the sum could be as big as -994511880000 or much higher. These values are larger than an integer can handle, so I was expecting an overflow error when I ran it.

But I didn't get any exception; instead the values were plain wrong. You will only know they are wrong if you examine them very closely. These values could be negative or positive: black-ish pixels tend to be large negative integers while white-ish pixels tend to be large positive integers.

To pin down the wrongness, I took a shot with the camera in a pitch-dark environment (covering the camera lens or placing it in a black box).

Then I wrote some very simple code to get the largest pixel colour, the smallest pixel colour and the average pixel colour.


int biggestDark2 = -990466330, biggestWhite2 = 990466330;
float avr = 0.0f;
...
if (copyOfBM != null) {
    int y = 0, x = 0, sum = 0, cnt = 0;

    for (y = 0; y < rHeight; y++) {
        for (x = 0; x < rWidth; x++) {
            // Only pixels inside the sub-patch [lW, uW) x [lH, uH) are counted.
            if ((x >= lW) && (x < uW) && (y >= lH) && (y < uH)) {
                // Read the packed colour once instead of on every comparison.
                int pixel = copyOfBM.getPixel(x, y);

                if (biggestDark2 < pixel) {
                    biggestDark2 = pixel;
                }
                if (biggestWhite2 > pixel) {
                    biggestWhite2 = pixel;
                }
                // Sums the raw packed colour into an int (the step in question).
                sum = sum + pixel;
                cnt = cnt + 1;
            }
        }
    }
    avr = (float) sum / (float) cnt;

    Log.v(tag, "avr             " + avr);
    Log.v(tag, "biggestWhite2    " + biggestWhite2 + "  *=**=*   biggestDark2   " + biggestDark2);
    Log.v(tag, "    ");
}

The output confirmed that the average was completely wrong:

--------- beginning of /dev/log/system
V/DEAS (18171): avr 9671.571
V/DEAS (18171): biggestWhite2 -16777216 *=**=* biggestDark2 -16645630
V/DEAS (18171):
This doesn't make any sense, since the lowest and highest values are large negatives but the average is a large positive (which shouldn't exist).
Was this caused by integer overflow?
How do I avoid this overflow problem (if it is overflow) so as to obtain the correct average colour?
Thanks

I don't see your "sum" variable initialized...

What are you trying to achieve? I hope you aren't trying to analyse an RGB color by summing up its 24-bit integer representation (or even 32-bit RGBA). If you want something like the luminance of an RGB color, then use something like this:

Lum = 0.2126 R + 0.7152 G + 0.0722 B

In code (Android's getPixel packs the color as ARGB, with alpha in the top byte):


float sum = 0.0f;
float min_lum = 1.0f;
float max_lum = 0.0f;

...

// getPixel returns a packed 0xAARRGGBB int, despite the variable name
int RGBA_encoded_color = copyOfBM.getPixel(x,y);
// scale each 8-bit channel to the range [0, 1]
float red = (float)( (RGBA_encoded_color >> 16) & 0xff ) / 255.0f;
float green = (float)( (RGBA_encoded_color >> 8) & 0xff ) / 255.0f;
float blue = (float)( RGBA_encoded_color & 0xff ) / 255.0f;

float lum = 0.2126f*red + 0.7152f*green + 0.0722f*blue;
sum += lum;
min_lum = Math.min(lum,min_lum);
max_lum = Math.max(lum,max_lum);

If you want something like the average color, then you need to analyse each color channel separately.
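
As a rough illustration of the per-channel approach, here is a minimal sketch in plain Java. It assumes the pixels have already been copied into an `int[]` in the packed 0xAARRGGBB layout that Android's `getPixel` returns; the class and method names are made up for this example and are not from the original post.

```java
/** Sketch: average colour of packed ARGB pixels, one accumulator per channel. */
final class ChannelAverage {

    /**
     * Averages each colour channel separately and repacks the result as an
     * opaque ARGB value. Assumes the 0xAARRGGBB layout; returns 0 for an
     * empty array.
     */
    static int averageColor(int[] argbPixels) {
        if (argbPixels.length == 0) {
            return 0;
        }
        // long accumulators: each term is at most 255, so these cannot
        // overflow for any array Java can allocate.
        long sumR = 0, sumG = 0, sumB = 0;
        for (int p : argbPixels) {
            sumR += (p >> 16) & 0xff;
            sumG += (p >> 8) & 0xff;
            sumB += p & 0xff;
        }
        int n = argbPixels.length;
        int r = (int) (sumR / n);
        int g = (int) (sumG / n);
        int b = (int) (sumB / n);
        return 0xff000000 | (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        // The two extreme values from the log output in the question.
        int[] patch = { -16777216, -16645630 };
        System.out.printf("average = 0x%08x%n", averageColor(patch));
    }
}
```

The result is itself a valid colour, which is not true of the arithmetic mean of the packed integers.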



Makes very good sense. I will try this, thanx

Just to be clear, in C (and C++) the result of integer overflow is undefined. If you are relying on undefined behaviour for the correct operation of your software, you're going to run into trouble sooner or later. Your expectations in this regard are irrelevant.

More helpfully, I don't even understand how integer arithmetic relates to colours. If you're trying to find the simple arithmetic mean of a colour value without breaking it down into appropriate colour channel, you should at least be using an unsigned integer since most colour encodings are unsigned positive values.

Unsigned integer overflow is strictly defined by C (and C++). The overflow "wraps", as in performs a mod(UINT_MAX + 1) on the result.

Stephen M. Webb
Professional Free Software Developer


int RGBA_encoded_color = copyOfBM.getPixel(x,y); float red = (float)( (RGBA_encoded_color >> 16) & 0xff ) / 255.0f;

The result of right-shifting an int is not defined in C (or C++). Be careful about relying on undefined behaviour. The result of right-shifting an unsigned int is strictly defined by the standard, you should probably prefer that.

Stephen M. Webb
Professional Free Software Developer

int RGBA_encoded_color = copyOfBM.getPixel(x,y); float red = (float)( (RGBA_encoded_color >> 16) & 0xff ) / 255.0f;

The result of right-shifting an int is not defined in C (or C++). Be careful about relying on undefined behaviour. The result of right-shifting an unsigned int is strictly defined by the standard, you should probably prefer that.
Good advice about using unsigned there, but just to be pedantic... :lol:
The signed version is *implementation-defined* behaviour, not *undefined* behaviour, which means the spec is replaced by whatever your compiler documentation says.
In practice, every compiler will almost certainly define it as an arithmetic shift / sign-extending shift.
This is such an expected behaviour that a hell of a lot of software relies on it, such that a compiler is simply broken if it doesn't follow suit. Doing signed/unsigned casts is the de facto standard way of selecting between logical and arithmetic shift instructions in C/C++, despite not being in the core spec.

If you arrogantly expect this of your compiler as I've just done, then signed/unsigned will work the same in this particular example, as the high (potentially sign-extended) bits are masked out by the AND anyway.
Nonetheless, an unsigned int is much more clear, yes :)
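
For comparison, Java (which the Android snippet at the top of the thread appears to be written in) has no unsigned int, but it spells the two shifts as separate operators: `>>` is arithmetic and `>>>` is logical. The sketch below, with illustrative names only, shows that the two differ on a negative pixel value until the `& 0xff` mask is applied.

```java
/** Sketch: arithmetic (>>) versus logical (>>>) right shift on a packed pixel. */
final class ShiftDemo {

    /** Top byte via arithmetic shift; the mask discards the sign-extended bits. */
    static int topByteArithmetic(int packed) {
        return (packed >> 24) & 0xff;
    }

    /** Top byte via logical shift; zeros are shifted in, the mask is redundant. */
    static int topByteLogical(int packed) {
        return (packed >>> 24) & 0xff;
    }

    public static void main(String[] args) {
        int pixel = -8287599; // the "typical value" quoted in the question

        System.out.println(pixel >> 24);   // -1: the sign bit is copied in
        System.out.println(pixel >>> 24);  // 255: zeros are shifted in
        System.out.println(topByteArithmetic(pixel) == topByteLogical(pixel)); // true
    }
}
```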

Just to be clear, in C (and C++) the result of integer overflow is undefined. If you are relying on undefined behaviour for the correct operation of your software, you're going to run into trouble sooner or later. Your expectations in this regard are irrelevant.

More helpfully, I don't even understand how integer arithmetic relates to colours. If you're trying to find the simple arithmetic mean of a colour value without breaking it down into appropriate colour channel, you should at least be using an unsigned integer since most colour encodings are unsigned positive values.

Unsigned integer overflow is strictly defined by C (and C++). The overflow "wraps", as in performs a mod(UINT_MAX + 1) on the result.

-

If it makes any difference, the code is in Java not c++

Second, if aBitmapObject.getPixel( x, y ) returns a negative value, who am I to decide to change it to an arbitrary unsigned value which the compiler would then interpret as a different colour? That would be a meaningless value.

Which is why I strongly believe Ashaman73's code should be an ideal solution, as it avoids overflow while still maintaining the relative colour space mapping as in the compiler's interpretation. It is merely scaled down.


If it makes any difference, the code is in Java not c++
Yes, it does. In Java all of this is defined: overflow will wrap around, integer fields get auto-initialized to 0 (local variables must be assigned before use), you don't have unsigned types (except for 'char' technically, but whatever), and so on.
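
To make the wrap-around concrete, here is a small sketch using the figures from the question (120000 pixels of value -8287599). The helper names are invented for illustration. It contrasts a silently wrapping `int` sum, a `long` sum, and `Math.addExact`, which throws `ArithmeticException` instead of wrapping.

```java
/** Sketch: how an int accumulator wraps silently while long and addExact do not. */
final class OverflowDemo {

    /** Sums value count times in an int; wraps around without any exception. */
    static int sumAsInt(int value, int count) {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += value;
        }
        return sum;
    }

    /** Same sum in a long, which has ample range for this workload. */
    static long sumAsLong(int value, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += value;
        }
        return sum;
    }

    /** Same sum in an int, but throws ArithmeticException on overflow. */
    static int sumChecked(int value, int count) {
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum = Math.addExact(sum, value);
        }
        return sum;
    }

    public static void main(String[] args) {
        int pixels = 400 * 300;
        System.out.println("int sum:  " + sumAsInt(-8287599, pixels));
        System.out.println("long sum: " + sumAsLong(-8287599, pixels));
        try {
            sumChecked(-8287599, pixels);
        } catch (ArithmeticException e) {
            System.out.println("checked sum overflowed: " + e.getMessage());
        }
    }
}
```

A `long` accumulator removes the overflow, but the mean of packed colour values is still not a meaningful colour, which is the point made elsewhere in the thread.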

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

In trying to understand it, this stands out:


Initially I had pondered: as I sum up the colours (a typical value is -8287599), for a sub-patch of 400x300 the sum could be as big as -994511880000 or much higher. These values are larger than an integer can handle, so I was expecting an overflow error when I ran it.

which makes absolutely no sense whatsoever.

As Ashaman points out, you need to deal with color channels or with other scalar values.

At first you seemed to understand this, but then you wrote:


Second, if aBitmapObject.getPixel( x, y ) returns a negative value, who am I to decide to change it to an arbitrary unsigned value which the compiler would then interpret as a different colour? That would be a meaningless value. Which is why I strongly believe Ashaman73's code should be an ideal solution, as it avoids overflow while still maintaining the relative colour space mapping as in the compiler's interpretation. It is merely scaled down.

Which seems to suggest that you have no clue about how all this works.

Colors are three or four values, Red, Green, Blue, and optionally Alpha, generally encoded in 24 or 32 bits, generally as 8 bits per color channel.

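When you talk about a "negative" value, that is because the first byte in the encoding is 128 or larger, which sets the sign bit of the 32-bit integer. (With Android's ARGB packing that first byte is alpha, so every opaque pixel reads as negative.) The specific value of that byte has essentially nothing to do with the "black-ish" or "white-ish" aspect of a single color, only with the intensity of a single channel.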
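
The values quoted earlier in the thread can be unpacked to show this. The sketch below assumes Android's packed 0xAARRGGBB layout; `unpack` is a hypothetical helper written for this example. Under that assumption, -8287599 is 0xFF818A91 (an opaque mid-grey), -16777216 is opaque black, and -16645630 is opaque (2, 2, 2).

```java
import java.util.Arrays;

/** Sketch: splitting a packed ARGB int into its four 8-bit channels. */
final class PixelDecode {

    /** Returns {alpha, red, green, blue}, each in 0..255, from 0xAARRGGBB. */
    static int[] unpack(int argb) {
        return new int[] {
            (argb >>> 24) & 0xff,
            (argb >> 16) & 0xff,
            (argb >> 8) & 0xff,
            argb & 0xff
        };
    }

    public static void main(String[] args) {
        // Values taken from the question and its log output.
        int[] samples = { -8287599, -16777216, -16645630 };
        for (int s : samples) {
            System.out.printf("%d = 0x%08x -> %s%n", s, s, Arrays.toString(unpack(s)));
        }
    }
}
```

All three are negative only because the alpha byte is 255; the sign says nothing about how dark or bright the pixel is.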

Attempting to treat the encoded RGB triple or RGBA quad as an integer value makes no sense whatsoever.

There are many excellent books on the subject. When I search in Amazon for "digital image processing Java" I find 47 books, 17 rated 4+ stars, with several top-rated books published in recent years. Many of those are textbooks. Image processing is usually an optional course in the 3rd or 4th year of computer science.

Or if you prefer online versions, Google for the same terms, "Java digital image processing tutorial" pulls up several useful links on the subject. They likely won't go into the same depth, but they will hopefully cover enough that you won't be completely ignorant of how to do this.

I recommend reading up whatever of those are available to you, preferably the better books since they tend to explain things more in depth.

Finally, note that image processing is very math intensive. It is usually reserved for the 4th year of college because of the math involved. Convolution kernels and color spaces rely on matrix transforms and dimensional operations. Transformations and signal decomposition rely on integrals and Eigenfunctions. In other words, a solid understanding of both calculus and linear algebra.

So in an attempt to provide a good answer to your question: probably the closest thing to what you describe is an image histogram. Building a histogram is much more than taking the "average color" (whatever that is supposed to mean), and the process of building it will be different depending on if you are looking at a grayscale image, or if you are working in color. In color you need to figure out your color space and color model (such as working in HSL or HSV or frequency space or something else entirely). Once you have generated an image histogram that represents your image, you can use the histogram to find the mean ("average") intensity.

Doing that work is going to require a bit of processing into histogram buckets and then processing on those buckets, but if you use Ashaman's formula above you should be able to convert to pixel luminance to build your histogram, relying on that formula rather than actual knowledge of a color space.
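
A minimal sketch of that idea follows, assuming packed 0xAARRGGBB pixels in an `int[]`, 256 integer buckets, and the luminance weights given earlier in the thread. The class and method names are illustrative, and rounding luminance to the nearest bucket is a simplification chosen for this example.

```java
/** Sketch: 256-bucket luminance histogram and the mean intensity derived from it. */
final class LuminanceHistogram {

    /** Luminance of one packed ARGB pixel, rounded to an integer in 0..255. */
    static int luminance(int argb) {
        int r = (argb >> 16) & 0xff;
        int g = (argb >> 8) & 0xff;
        int b = argb & 0xff;
        long lum = Math.round(0.2126 * r + 0.7152 * g + 0.0722 * b);
        return (int) Math.min(255L, lum); // clamp against rounding at the top end
    }

    /** Counts how many pixels fall into each luminance bucket. */
    static long[] histogram(int[] argbPixels) {
        long[] bins = new long[256];
        for (int p : argbPixels) {
            bins[luminance(p)]++;
        }
        return bins;
    }

    /** Mean luminance from the histogram; 0.0 if the histogram is empty. */
    static double mean(long[] bins) {
        long count = 0;
        long weighted = 0;
        for (int i = 0; i < bins.length; i++) {
            count += bins[i];
            weighted += i * bins[i];
        }
        return count == 0 ? 0.0 : (double) weighted / count;
    }

    public static void main(String[] args) {
        int[] patch = { 0xff000000, 0xffffffff, 0xffffffff };
        System.out.println("mean luminance = " + mean(histogram(patch)));
    }
}
```

The same histogram also gives the darkest and brightest occupied buckets, which covers the min/max part of the original question.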
