# Explaining Image Downsample

Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

25 replies to this topic

### #1Ahmed Egyptian  Members   -  Reputation: 137

Posted 13 March 2013 - 03:33 PM

Hi All,

I tried to figure out what this function does, but I couldn't, even after working through it on paper. Would someone explain it to me? An explanation with ASCII diagrams or some images would be much appreciated.

    static void inline resizeRow(uint32_t *dst, uint32_t *src, uint32_t pixelsPerRow)
    {
        uint8_t * pSrc8 = (uint8_t *)src;
        uint8_t * pDest8 = (uint8_t *)dst;
        int stride = pixelsPerRow * sizeof(uint32_t);
        int x;
        int r, g, b, a;

        for (x=0; x<pixelsPerRow; x++)
        {
            r = pSrc8[0] + pSrc8[4] + pSrc8[stride+0] + pSrc8[stride+4];
            g = pSrc8[1] + pSrc8[5] + pSrc8[stride+1] + pSrc8[stride+5];
            b = pSrc8[2] + pSrc8[6] + pSrc8[stride+2] + pSrc8[stride+6];
            a = pSrc8[3] + pSrc8[7] + pSrc8[stride+3] + pSrc8[stride+7];
            pDest8[0] = (uint8_t)((r + 2)/4); // average with rounding
            pDest8[1] = (uint8_t)((g + 2)/4);
            pDest8[2] = (uint8_t)((b + 2)/4);
            pDest8[3] = (uint8_t)((a + 2)/4);
            pSrc8 += 8; // skip forward 2 source pixels
            pDest8 += 4; // skip forward 1 destination pixel
        }
    }


Edited by Ahmed Egyptian, 13 March 2013 - 03:35 PM.

### #2Paradigm Shifter  Crossbones+   -  Reputation: 4746

Posted 13 March 2013 - 03:40 PM

It's halving the size of 2 rows of the source image: it adds together the r, g, b and a components of each 2x2 pixel square in the 2 rows of the source image, adds 0.5 to each average as well (so adding 2 to the total), then divides by 4 to get the destination colour.

Looks like the for loop should go to pixelsPerRow - 1 though, otherwise it will read pixels off the right hand side on the last iteration.

Presumably the function is called in a loop for every other row of the source image.

"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

### #3ApochPiQ  Moderators   -  Reputation: 11933

Posted 13 March 2013 - 03:44 PM

Downsampling is basically just taking an image at a higher resolution and redrawing it at a lower resolution.

For example, suppose I have a 10x10 pixel image, and downsample it to 5x5. There are a number of ways to do this. First, I could simply skip every other pixel:

Source image: 0 1 2 3 4 5 6 7 8 9

Destination image: 0 2 4 6 8

And then I skip a row, and repeat this process on the next row.

This is going to make things look a little bad, though. A better approach is to average the pixels:

Source image:

A B
C D

Destination pixel at 0,0 is (A+B+C+D) / 4

It looks like your code is doing the second method.

Maker of Machinery

### #4Paradigm Shifter  Crossbones+   -  Reputation: 4746

Posted 13 March 2013 - 03:47 PM

I think the loop should probably go from 0 to pixelsPerRow / 2 or else x should increment by 2 every iteration...
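
One way to read that fix, as a rough sketch rather than the original author's code: assume `pixelsPerRow` was meant as the width of the source row, and loop over half as many destination pixels. The name `resizeRowHalf` and the explicit `srcStrideBytes` parameter are hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Averages each 2x2 block of two adjacent source rows into one destination
 * pixel. Assumes 4 bytes per pixel and that srcWidth counts SOURCE pixels. */
static void resizeRowHalf(uint8_t *dst, const uint8_t *src,
                          size_t srcWidth, size_t srcStrideBytes)
{
    assert(srcWidth % 2 == 0);              /* sketch handles even widths only */
    assert(srcStrideBytes >= srcWidth * 4); /* second row starts after the first */

    for (size_t x = 0; x < srcWidth / 2; x++) {   /* one step per DESTINATION pixel */
        for (size_t c = 0; c < 4; c++) {          /* the four channels of a pixel */
            unsigned sum = src[c] + src[4 + c]
                         + src[srcStrideBytes + c] + src[srcStrideBytes + 4 + c];
            dst[c] = (uint8_t)((sum + 2) / 4);    /* average with rounding */
        }
        src += 8;  /* skip forward 2 source pixels */
        dst += 4;  /* skip forward 1 destination pixel */
    }
}
```

With this bound, the last iteration reads source pixels `srcWidth - 2` and `srcWidth - 1`, so nothing past the right-hand edge is touched.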

### #5Ahmed Egyptian  Members   -  Reputation: 137

Posted 13 March 2013 - 03:59 PM

Thanks for your answer. I made an imaginary bitmap on paper, assuming 1 pixel is 3 bytes (RGB), and added the values up, but how do I get 4*4 pixels from two rows?

Would someone please show me some pictures?

Also, regarding these lines:

    pDest8[0] = (uint8_t)((r + 2)/4); // average with rounding
    pDest8[1] = (uint8_t)((g + 2)/4);
    pDest8[2] = (uint8_t)((b + 2)/4);
    pDest8[3] = (uint8_t)((a + 2)/4);
    pSrc8 += 8; // skip forward 2 source pixels
    pDest8 += 4; // skip forward 1 destination pixel

The code skips 4 bytes in the destination, but on the second iteration it writes to indices 0, 1, 2, 3 of the destination again. Shouldn't that be 4, 5, 6, 7?

### #6Paradigm Shifter  Crossbones+   -  Reputation: 4746

Posted 13 March 2013 - 04:06 PM

That's because it's a pointer, not an array. pSrc[offset] is just shorthand for *(pSrc + offset), and the pointer itself is advanced each iteration (by 4 in this example). So if you have

    char myCharArray[10000] = { 0 }; /* some data */
    char* pSrc = myCharArray;

    pSrc[0] = 10;
    pSrc += 4;
    pSrc[0] = 11;

that's the same as writing 10 to myCharArray[0] and 11 to myCharArray[4].

EDIT: Oops, mixed up order of pSrc and pBase in first attempt.

EDIT2: Changed pBase into myCharArray... easier to understand...

Edited by Paradigm Shifter, 13 March 2013 - 04:10 PM.

### #7Ahmed Egyptian  Members   -  Reputation: 137

Posted 13 March 2013 - 04:21 PM

Thanks for your explanation. I have worked a lot in C and C++ and I never knew that! I'm going to review pointers now.

If it's OK with you, would you please draw a simple bitmap so I can follow the algorithm?

I would also like to modify it so that it works for any scale, not just a factor of two.

### #8Paradigm Shifter  Crossbones+   -  Reputation: 4746

Posted 13 March 2013 - 04:36 PM

I'm terrible at drawing ;)

It's not easy to make it work for scales other than a simple division of the width either: you need a different way of weighting the pixel values, since the sample points you take are no longer centred on the pixels of the source image. It's best to use a graphics library to do the downsizing for you (if you need to do it at runtime) or, if you just need to resize a lot of images once, an image filtering program (or something like Photoshop).

### #9Ahmed Egyptian  Members   -  Reputation: 137

Posted 13 March 2013 - 04:51 PM

I have used the OpenCV function cv::resize and it is slow on ARM. That's why I want to write my own and convert it to ARM NEON.

### #10Ahmed Egyptian  Members   -  Reputation: 137

Posted 13 March 2013 - 05:20 PM

What is meant by stride + 0 and stride + 1?

### #11Paradigm Shifter  Crossbones+   -  Reputation: 4746

Posted 13 March 2013 - 05:28 PM

That's the pixel on the next row down. Stride is the width of the image times the size of each pixel (4 bytes).

So if the width is 100, the stride is 400 bytes, and

pSrc8[0] is the first byte of the top left pixel of the 2x2 square being considered, while pSrc8[0 + stride] = pSrc8[400] is the first byte of the pixel on the next row down of the source image.
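
The addressing described above can be sketched as a small helper. The function name `byteOffset` is hypothetical, and it assumes 4-byte pixels with the stride given in bytes, as in the code from the first post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of channel c of the pixel at column x, row y, for 4-byte
 * pixels and a row stride measured in bytes. */
static size_t byteOffset(size_t x, size_t y, size_t c, size_t strideBytes)
{
    assert(c < 4);                       /* channels 0..3 within one pixel */
    return y * strideBytes + x * 4 + c;  /* rows first, then pixels, then channel */
}
```

For a width of 100 (stride 400), the first channel of the top left 2x2 square sits at offsets 0, 4, 400 and 404, which are exactly the indices `[0]`, `[4]`, `[stride+0]` and `[stride+4]` used in the original function.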

### #12Khatharr  Crossbones+   -  Reputation: 2589

Posted 13 March 2013 - 05:30 PM

Stride is used to pad the width of an image so that its width in memory aligns to a power of two, which makes the pixels easier to reference with bit-math.

Sometimes stride is the 'true length' of a row, sometimes it's the 'added length' of a row.  For instance, if you have a texture with width 100 there's a good chance that it's stored as a width 128 texture. The stride is either 128 or 28, depending on what API you're working with. When the image is rendered only the 100 pixel section gets drawn, but having the width stored as 128 means that you can use bit-math to select a row very quickly in the hardware.

Edit - Ah, yeah. In some cases the stride is stored in byte length rather than pixel length as well. It should be documented somewhere in what you're working with, but basically pixel + stride + 1 means the pixel that's one row down and one column to the right from where you're at. In this case it's clearly defined as 'int stride = pixelsPerRow * sizeof(uint32_t);' in the code itself.

As a term, whenever you see 'stride' you should just realize that it's dealing with the 'real length' of a row rather than the apparent length.

Edited by Khatharr, 13 March 2013 - 05:33 PM.

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

### #13Paradigm Shifter  Crossbones+   -  Reputation: 4746

Posted 13 March 2013 - 05:36 PM

Note that the code posted doesn't actually calculate the stride width, it just multiplies the number of pixels per row by 4, so any calculation based on a stride that isn't the width must be done outside the function shown in the first post.

Not all strides are powers of 2; it depends on the graphics file format (BMP files, for example, pad each row to the next multiple of 4 bytes).
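
As an illustration of the BMP case, here is a minimal sketch of the usual row-size calculation for uncompressed bitmaps. The function name `bmpStrideBytes` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per row for an uncompressed BMP: the bits needed for one row of
 * pixels, rounded up to the next multiple of 32 bits (4 bytes). */
static uint32_t bmpStrideBytes(uint32_t widthPixels, uint32_t bitsPerPixel)
{
    assert(bitsPerPixel > 0);
    return ((widthPixels * bitsPerPixel + 31u) / 32u) * 4u;
}
```

A 100 pixel wide 32-bit image gives 400 bytes (no padding), while a 101 pixel wide 24-bit image needs 303 bytes of pixel data and is padded to 304.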

### #14Ahmed Egyptian  Members   -  Reputation: 137

Posted 13 March 2013 - 06:12 PM

Thanks for the replies.

So the evaluation of the statement r = pSrc8[0] + pSrc8[4] + pSrc8[stride+0] + pSrc8[stride+4];

is: r = first red byte + red byte of the 4th pixel + red byte at row 800 + red byte 404 at row 800?

Edited by Ahmed Egyptian, 13 March 2013 - 06:48 PM.

### #15Khatharr  Crossbones+   -  Reputation: 2589

Posted 13 March 2013 - 08:35 PM

It's 'red byte I'm on' + 'red byte to my right' + 'red byte below me' + 'red byte below me and to my right'.

Adding the stride moves you down one row.

If you think of a 10x10 grid, the stride is 10. If the grid is represented as an array of 100 bytes, then index 4 is the 5th cell in the top row. Index 4 + stride is index 14, which is the 5th cell in the second row.

One way to simplify this kind of process is to use a union to represent a color value:

    union uColor {
        uint32_t u32;
        struct {
            unsigned char alpha;
            unsigned char green;
            unsigned char blue;
            unsigned char red;
        };
    };


You can create a color:

uColor col;

then reference the uint value:

col.u32

or a specific channel:

col.red

You cast an array of texels as uColor* then use the union to easily access the channels without trying to work with two index scales between uint and uchar.

### #16iMalc  Crossbones+   -  Reputation: 2136

Posted 14 March 2013 - 01:43 AM

To downsize an image with an even higher quality result, i.e. more true to the original image in terms of overall brightness levels, you can perform dithering as well.
E.g. take the red colour channel values of four pixels. Let's say they are 100, 101, 102, and 103.
Sum them up and you get 406.
Divide by 4 and you get 101, remainder 2.
Because the remainder is 2, the rounding (adding two before dividing) produces a final result of 102. The problem is that the actual correct answer is 101.5.

Big deal, right? Nobody cares! You can't represent the 0.5 anyway, and rounding to the nearest integer beats always rounding down or always rounding up!
That's all largely true, and the person adding 2 at least cares enough to recognise the problem, but do they know the best solution?
Consider what happens if the majority of the remainders cause rounding to go up, e.g. if many of the four-pixel blocks leave a remainder of 2, as in the example above. Overall the image brightness has changed. Yes, it's barely noticeable in most cases, and 99% of people won't care about it, but it's there.
Wouldn't it be better if we took half of those pixels and rounded them down instead? Then the overall brightness would be the same.
Better still, we could add 0, 1, 2, or 3 before dividing by 4, each a quarter of the time, which helps for the cases where the remainder is odd.

This tends to be done in one of several ways:
In a patterned approach - pattern dithering.
In a randomised way, adding anything from 0-3 randomly before the division - random dithering.
Or, where the error term is accumulated as we travel across the pixels in each row - Floyd-Steinberg style dithering.
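
A minimal sketch of the patterned variant, under the assumption that the bias is chosen from a fixed 2x2 table indexed by the destination pixel position. The function name `averageWithPatternDither` and the particular table layout are illustrative choices, not something prescribed above.

```c
#include <assert.h>
#include <stdint.h>

/* Divides a four-pixel channel sum by 4. Instead of always adding 2 before
 * the division, it adds 0, 1, 2 or 3 depending on the destination pixel
 * position, so each bias is used for a quarter of the pixels. */
static uint8_t averageWithPatternDither(unsigned sum, unsigned x, unsigned y)
{
    /* Hypothetical 2x2 bias pattern; any arrangement of 0..3 would do. */
    static const unsigned bias[2][2] = { { 0, 2 },
                                         { 3, 1 } };
    assert(sum <= 4u * 255u);   /* sum of four 8-bit channel values */
    return (uint8_t)((sum + bias[y & 1u][x & 1u]) / 4u);
}
```

For the sum of 406 from the example, the four positions of the pattern give 101, 102, 102 and 101, which average to the exact 101.5.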

These are all techniques that subtly increase the quality of the resulting image. This certainly isn't so important in real-time rendering, and it is less important the higher the bit depth. But when working with, say, 256-colour images in an image editing program, this stuff really makes a difference. Most people will be using an image processing application that probably uses one of the above techniques anyway, so it's all done for you. If you're the one writing such an application, you might need to know this stuff in order to produce images of the same quality as other applications. I just thought I'd share it anyway.

Edited by iMalc, 14 March 2013 - 01:47 AM.

"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms

### #17Ahmed Egyptian  Members   -  Reputation: 137

Posted 14 March 2013 - 02:44 AM

Thanks for all the input.

What should be added to the above function to downsample to any size ?

### #18iMalc  Crossbones+   -  Reputation: 2136

Posted 14 March 2013 - 12:37 PM

> Thanks for all the input.
>
> What should be added to the above function to downsample to any size ?

Ah, that's quite a different task.
As you know, the above code only downsamples by a factor of exactly two in each dimension (four source pixels per destination pixel). To downsample by a different amount involves picking a strategy such as "bilinear interpolation" or "bicubic interpolation".
Perhaps try looking up those terms.
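
To give a rough idea of what bilinear interpolation involves, here is a sketch that samples a single 8-bit channel at a fractional position. It assumes a tightly packed one-byte-per-pixel image (stride equal to width); the function name `sampleBilinear` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Samples one 8-bit channel at the fractional position (fx, fy) by blending
 * the four surrounding samples according to their distance. */
static uint8_t sampleBilinear(const uint8_t *img, size_t width, size_t height,
                              float fx, float fy)
{
    assert(width > 0 && height > 0);
    assert(fx >= 0.0f && fy >= 0.0f);
    assert(fx <= (float)(width - 1) && fy <= (float)(height - 1));

    size_t x0 = (size_t)fx, y0 = (size_t)fy;
    size_t x1 = (x0 + 1 < width)  ? x0 + 1 : x0;   /* clamp at the right edge */
    size_t y1 = (y0 + 1 < height) ? y0 + 1 : y0;   /* clamp at the bottom edge */
    float tx = fx - (float)x0;                     /* horizontal blend weight */
    float ty = fy - (float)y0;                     /* vertical blend weight */

    float top    = img[y0 * width + x0] * (1.0f - tx) + img[y0 * width + x1] * tx;
    float bottom = img[y1 * width + x0] * (1.0f - tx) + img[y1 * width + x1] * tx;
    return (uint8_t)(top * (1.0f - ty) + bottom * ty + 0.5f);  /* round to nearest */
}
```

A resize loop would call this once per destination pixel and channel, mapping destination coordinates back into the source. When shrinking by much more than a factor of two, a single bilinear sample ignores most of the source pixels, so some averaging is usually still needed.
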

### #19rozz666  Members   -  Reputation: 491

Posted 14 March 2013 - 02:17 PM

> One way to simplify this kind of process is to use a union to represent a color value:
>
>     union uColor {
>         uint32_t u32;
>         struct {
>             unsigned char alpha;
>             unsigned char green;
>             unsigned char blue;
>             unsigned char red;
>         };
>     };
>
> You can create a color:
>
>     uColor col;
>
> then reference the uint value:
>
>     col.u32
>
> or a specific channel:
>
>     col.red
>
> You cast an array of texels as uColor* then use the union to easily access the channels without trying to work with two index scales between uint and uchar.

Actually, that's undefined behaviour in C++. You are not allowed to read from a union member other than the one that was last written. Casting a pointer to unsigned char to a pointer to a structure is also undefined behaviour, since C++ gives no guarantee that such a cast is valid (e.g. due to alignment).
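
One way to get named channels without a union or a pointer cast is to copy the bytes out one at a time. This is only a sketch of an alternative, not something proposed in the thread: the `Rgba8` type, the `loadPixel` name and the r, g, b, a byte order are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type; the r, g, b, a byte order is an assumption. */
typedef struct {
    uint8_t r, g, b, a;
} Rgba8;

/* Reads one 4-byte pixel byte by byte. Reading memory through an unsigned
 * char pointer (uint8_t on mainstream platforms) is permitted, so neither
 * a union nor a struct pointer cast is needed. */
static Rgba8 loadPixel(const uint8_t *bytes)
{
    assert(bytes != NULL);
    Rgba8 px = { bytes[0], bytes[1], bytes[2], bytes[3] };
    return px;
}
```

Because each channel is read individually, alignment and union punning rules do not come into play; the cost is that the byte order is fixed in the code rather than implied by the platform.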

Edited by rozz666, 14 March 2013 - 02:20 PM.

### #20Khatharr  Crossbones+   -  Reputation: 2589

Posted 14 March 2013 - 02:45 PM

> ...
>
> Actually, that's undefined behaviour. You are not allowed to read from a member that wasn't written to directly. Also, casting a pointer to unsigned char to a structure is also undefined behaviour, since there is no guarantee by C++ that such cast is valid (e.g. due to alignment).

It's not defined by C++. It is defined by the compiler. I've never come across a compiler that had a problem doing it correctly, even without pragmas. Endianness may be a concern if you port it, but that's easily solved. In short, if you're iterating through uints and casting them as byte arrays then you're already in the 'grey area', even though it's something that has to be done all the time. May as well make it easier to work with.

I should probably have used uint8_t there, though, since I used uint32_t.

Edited by Khatharr, 14 March 2013 - 02:48 PM.
