Efficient way to find same pixel?


Hi,

Let's say this is to be done: given an image with 32-bit RGBA pixels, I wish to list all the unique pixels, such that:

R:G:B:A = 5:230:5:90 = occurs 5x in image
R:G:B:A = 2:42:5:10 = occurs 2x in image

and so on...

I could go and compare each pixel one by one through a loop (comparing the first pixel with the rest of the image, then comparing the 2nd pixel with the rest of the image, and so on). But that would be slow? Is there a faster way? Or is there a method/tool already that does this?

Thanks!

1) Create a 4 dimensional array of integers, indexed by r, g, b and a.
2) Zero all entries
3) For each pixel, increment the correct array entry
4) Win (if you have enough memory)
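
For illustration, a minimal sketch of that counting-table idea, assuming 8-bit channels, packed 32-bit RGBA pixels, and a 64-bit build on a machine that can actually commit the roughly 16 GB (the function name and parameters are just placeholders for the example):

#include <cstddef>
#include <vector>

void countWithTable(const unsigned int *pixels, std::size_t pixelCount)
{
    // one 32-bit counter per possible RGBA value: 2^32 * 4 bytes = 16 GB, zero-initialised
    std::vector<unsigned int> counts(std::size_t(1) << 32, 0u);

    for (std::size_t i = 0; i < pixelCount; ++i)
        ++counts[pixels[i]];            // the packed RGBA value itself is the index

    // counts[c] now holds how many times colour c occurs;
    // the non-zero entries are the unique pixels
}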

Quote:
Original post by helloworld123
I could go and compare each pixel one by one through a loop (comparing first pixel with the rest of the image, then comparing 2nd pixel with the rest of the image, and so on).

But that would be slow?

is there a faster way? Or is there a method / tool already that does this?

That can be performed much more efficiently if you sort the image's pixel values. Assuming img is a pointer to the image (if it's already a 32-bit type pointer, the cast is unnecessary).

unsigned int *img32bit = reinterpret_cast<unsigned int *>(img);
std::sort(img32bit, img32bit+width*height);

Now all you have to do is find adjacent pixel values with the same color, instead of searching the whole image for the same color.
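
For completeness, reading the counts off the sorted buffer could then look something like this, continuing the snippet above (img32bit, width and height are assumed from it, and the printf needs <cstdio>):

std::size_t n = std::size_t(width) * height;
for (std::size_t i = 0; i < n; )
{
    std::size_t runStart = i;
    while (i < n && img32bit[i] == img32bit[runStart])
        ++i;                                   // walk to the end of the run of equal values
    // img32bit[runStart] occurs (i - runStart) times in the image
    printf("color 0x%08X occurs %u times\n",
           img32bit[runStart], unsigned(i - runStart));
}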

Quote:
Original post by pbryant
1) Create a 4 dimensional array of integers, indexed by r, g, b and a.
2) Zero all entries
3) For each pixel, increment the correct array entry
4) Win (if you have enough memory)

And how much memory do you think such an array would require, even for a moderate color space such as 8 bits per component?

Quote:
Original post by Brother Bob
And how much memory do you think would such an array require, even for a moderate color space such as 8-bits per component?


16 GB, but that is the speed/memory trade-off (which, incidentally, is the point I was trying to make).

Depending on your environment, you might not be able to allocate such a big array, so it would have to be split up. However, testing with a 1024x1024 RGB image gets the counts 33x faster than the required std::sort call alone (which essentially just does exactly what he stated in the original post - a compare of each pixel to every other pixel in the image).

What if, instead of an int, you stored a bool value, so that when the value is true you know there's at least 1 pixel of that color? It would work when you only need to know whether a color occurs, not how many times it occurs.

Quote:
Original post by Eigen
What if instead of int you'd store a bool value, so when the value is true, you know there's at least 1 pixel of that color. It would work when you only need to know whether color is unique not how many times it occurs.


Yeah, if you just need to know whether each color occurs then you can shave it down to 4 GB, assuming your compiler has 1-byte bools.

Quote:
Original post by pbryant
(which essentially just does exactly what he stated in the original post - a compare of each pixel to every other pixel in the image).

No, that's not what the sort method does. It doesn't compare each pixel with every other pixel of the image; it compares values, and only as much as is needed to determine how many there are. Sorting only rearranges the order so that you can make efficient assumptions about the order of equal pixels that you cannot make about the original image.

Quote:
Original post by pbryant
Quote:
Original post by Eigen
What if instead of int you'd store a bool value, so when the value is true, you know there's at least 1 pixel of that color. It would work when you only need to know whether color is unique not how many times it occurs.


Yeah, if you just need to know if each color occurs then you can shave it down to 4gb, assuming your compiler has 1 byte bools.

If the task is only to find duplicate values, not how many of them (and that is, indeed, how he formulated it), you can even cut it down to 4 Gbits, or 512 MB. (edit: Scrap that; you need 2 bits to determine the second time a color is seen, so 1 GB.)
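
For what it's worth, that two-bit variant could be sketched like this, assuming packed 32-bit pixels, a 64-bit build, and that roughly 1 GB can be spared (the function name is made up for the example): one flag per colour for "seen once" and one for "seen again".

#include <cstddef>
#include <vector>

void flagDuplicateColours(const unsigned int *pixels, std::size_t pixelCount)
{
    std::vector<bool> seen(std::size_t(1) << 32, false);       // bit-packed: 2^32 bits = 512 MB
    std::vector<bool> duplicate(std::size_t(1) << 32, false);  // another 512 MB

    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        const unsigned int c = pixels[i];
        if (seen[c])
            duplicate[c] = true;   // second (or later) occurrence
        else
            seen[c] = true;        // first occurrence
    }
    // colours with duplicate[c] == true appear more than once in the image
}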

What are you going to do with such a dataset?

Would it instead suffice that, when a user provides a given color, the app reports the count of pixels with that particular color? That way, you wouldn't have to store anything but the actual image.
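
A minimal sketch of that query-on-demand idea (the function name is just a placeholder):

#include <cstddef>

std::size_t countColour(const unsigned int *pixels, std::size_t pixelCount, unsigned int colour)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < pixelCount; ++i)
        if (pixels[i] == colour)       // one linear pass per query, no extra storage
            ++n;
    return n;
}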

Quote:
Original post by Brother Bob
No, that's not what the sort method does. It doesn't compare with every pixel of the image, it compares values, and only as much as is needed to determine how many there are. Sorting only rearranges the order so you can make efficient assumptions about the order of equal pixels that you cannot do about the original image.


~Essentially~

I realize that it doesn't actually compare each pixel with all the rest.
std::sort certainly doesn't determine "how many there are".

The point remains that massive gains can be made in terms of speed (far and above sorting) if you are willing to sacrifice memory.

Edit: In fact, you could probably just use a comparatively small hash table rather than a huge array, and still come out ahead by a factor of 30 over sorting.

512 MB with one bit per color would be enough to check for duplication (but helloworld123 wants the count, so it's probably really 16 GB).

Which version is faster depends on the image size: the bit table has constant cost per pixel, O(1), while the sorting is probably O(n log n).

Sometimes you can reduce the color range. For example, if you want to generate a 256-color palette, you'd probably be fine with 6 bits per channel, because you'll quantize the image anyway. 2^(3*6) is just 262144 entries; even using int32 counters, you'll end up with 1 MB.
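
A sketch of that reduced-precision histogram, assuming 8-bit channels packed as 0xAABBGGRR and keeping the top 6 bits of each of R, G and B (alpha ignored, as in the palette example; the function name is made up):

#include <cstddef>
#include <vector>

void quantisedHistogram(const unsigned int *pixels, std::size_t pixelCount)
{
    std::vector<unsigned int> counts(1u << 18, 0);   // 2^(3*6) = 262144 entries, ~1 MB

    for (std::size_t i = 0; i < pixelCount; ++i)
    {
        const unsigned int p = pixels[i];
        const unsigned int r = (p >>  2) & 0x3F;     // top 6 bits of the 8-bit red channel
        const unsigned int g = (p >> 10) & 0x3F;     // top 6 bits of green
        const unsigned int b = (p >> 18) & 0x3F;     // top 6 bits of blue
        ++counts[(r << 12) | (g << 6) | b];
    }
    // counts[] is now a 6-bit-per-channel colour histogram, ready for palette generation
}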


So I think both versions are valid; the better choice depends on the particular case:
- bits per channel
- image resolution
- checking for duplication vs. counting

Quote:
Original post by Krypt0n
which version is faster depends on the image size, the bit table has contant cost regarding the amount of pixel O(1) while the sorting is probably O(n log n).
A better solution is to iterate over every pixel in the image, and insert the pixel into a std::tr1::unordered_multiset (i.e. a hash table) of colours.

Insertion into an unordered_set is amortized constant time, so the total cost is O(N) with respect to the number of pixels, in *both* time and memory.

By contrast, pbryant's solution is O(N) in terms of time, but that is dwarfed by a massive constant cost in memory, and Brother Bob's solution is O(N log N) in terms of time, with (almost) no cost in memory.
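
As a sketch of that hash-table count, here using an unordered_map keyed on the packed 32-bit colour rather than a multiset, since that yields the counts directly (on pre-C++11 compilers the container lives in std::tr1 / <tr1/unordered_map>):

#include <cstddef>
#include <unordered_map>   // std::tr1::unordered_map on older compilers

void countWithHashTable(const unsigned int *pixels, std::size_t pixelCount)
{
    std::unordered_map<unsigned int, unsigned int> counts;

    for (std::size_t i = 0; i < pixelCount; ++i)
        ++counts[pixels[i]];           // amortised O(1) insert/increment per pixel

    // each (colour, count) pair in 'counts' is one unique pixel value and its frequency
}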

Could it be possible to use some kind of different approach with pixel shaders?

I'm thinking of something like this:

Take the first color in question (R:G:B:A = 5:230:5:90, as stated in the original post) and render a quad with a color mask for that color. Use clip (HLSL/Cg) to discard any pixels that do not match that color; for those that do match, output pure white (1.0, 1.0, 1.0, 1.0). The resulting image should be a black background with white pixels wherever the image matches 5:230:5:90. Downsample that to one pixel and read back its color value. The brightness of that pixel should contain the percentage of occurrence of the color in question. You could probably use the four channels of the output color to store other information, or for greater precision.
You could also test 4 colors in one pass, outputting "white" values in the 4 channels of the output. Then the 4 components of the downsampled image would have values corresponding to the frequency of occurrence of the 4 tested colors. Should be fast enough. Don't know how precise it would be. Maybe good enough for this particular problem.
Is this going to work? Are there similar methods like this (using shaders) to approach such a problem?

Quote:
Original post by swiftcoder
Quote:
Original post by Krypt0n
which version is faster depends on the image size, the bit table has contant cost regarding the amount of pixel O(1) while the sorting is probably O(n log n).
A better solution is to iterate over every pixel in the image, and insert the pixel into a std::tr1::unordered_multiset (i.e. a hash table) of colours.

Insertion into an unordered_set is amortized constant time, so the total cost is O(N) with respect to the number of pixels, in *both* time and memory.

By contrast, pbryant's solution is O(N) in terms of time, but that is dwarfed by a massive constant cost in memory, and Brother Bob's solution is O(N log N) in terms of time, with (almost) no cost in memory.

A hash table for ints boils down to a std::vector that is indexed with a modulo (the compiler will optimize that if the bucket count is a power of two) and quite a few buckets.

unordered_multiset (actually, why not use unordered_set?) is a nice hash-table implementation, but which version is faster depends purely on the distribution of pixels. If you have bad luck (and that's not impossible in a picture) and your colors all fall near offset + n*hash_table_size (histograms usually show that), you'll get a linear time increase, O(N*N) overall (for reference, http://msdn.microsoft.com/en-us/library/bb982739.aspx: "In the worst case, when all of the elements are in one bucket, the number of operations is proportional to the number of elements in the sequence (linear time)").
Even in the best case, every lookup won't be faster than the bit test (it will have more indirections and more cache thrashing due to bigger structures).


And it results in a slightly more complicated trade-off question:
- bits per channel
- image resolution
- checking for duplication vs. counting
- pixel distribution

From that point of view, I'd prefer the quicksort over the unordered_set.

And my point of view is that which version ends up faster is really random, depending purely on the input.

Another idea, without shaders: use a fullscreen quad with the texture that contains the colors. Use alpha test to render only the color you are currently counting pixels of. Issue an occlusion query and it will return the number of pixels of that color.

Scrap that, and the previous one. I should have read the whole thread before posting. I misunderstood the whole point :)

[Edited by - solenoidz on February 23, 2010 9:52:44 AM]

Quote:
Original post by Krypt0n
A hash table for int will result in a std::vector that indexes with modulo (the compiler will optimize that if po2) and quite some buckets.
It is a hash table containing colours, not integers: a colour may be represented as a single integer (i.e. a paletted texture), 4 floating-point values (i.e. a high-dynamic-range floating-point texture), or anything in between. While the hash function you describe is one possibility, it isn't the only possibility.
Quote:
unordered_multiset (actually, why not using unordered_set?)
Because the OP requested the count of each unique pixel, not just the existence of unique pixels.
Quote:
Even in best case, every lookup won't be faster than the bit-test (it will have more indirections and more cache trashing due to bigger structures).
Let's leave the bit-table out of the question, because it plainly doesn't scale: 8 bits per channel requires a 16 GB table, which is out of reach of most consumer hardware, and 32 bits per channel would require a roughly 10^30 GB table.
Quote:
From that point of view, I'd prefer the quicksort over the unordered_set.
I agree, but it is worth evaluating the input images to see which would perform better.



Another interesting variation would be working on RLE-encoded images - for some types of images, particularly those containing few colours (which is very bad for the hash-map approach), this would represent a considerable performance improvement.
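
A sketch of that RLE variation, assuming the runs have already been decoded into (colour, length) pairs (the Run struct and function name here are hypothetical):

#include <cstddef>
#include <unordered_map>
#include <vector>

struct Run { unsigned int colour; unsigned int length; };   // hypothetical decoded RLE run

void countFromRuns(const std::vector<Run> &runs)
{
    std::unordered_map<unsigned int, std::size_t> counts;

    for (std::size_t i = 0; i < runs.size(); ++i)
        counts[runs[i].colour] += runs[i].length;   // one table update per run, not per pixel
}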

Quote:
Original post by helloworld123

Hi,

Let's say this is to be done:


Given an image with 32bit RGBA pixels, I wish to list down all the unique pixels, such that:

R:G:B:A = 5:230:5:90 = occurs 5x in image
R:G:B:A = 2: 42:5:10 = occurs 2x in image

and so on...



I could go and compare each pixel one by one through a loop (comparing first pixel with the rest of the image, then comparing 2nd pixel with the rest of the image, and so on).

But that would be slow?

is there a faster way? Or is there a method / tool already that does this?

Thanks!



struct PIXELS
{
    int color;
    int count;
}; // 8 bytes

void buildPixelSet(UINT *image, UINT width, UINT height)
{
    // worst case scenario (every pixel unique): 1920*1080*8 = 16,588,800 bytes =~ 16.6 megabytes,
    // which is too big for the stack, so allocate the table on the heap
    PIXELS *foundPixels = new PIXELS[width * height];
    ZeroMemory(foundPixels, sizeof(PIXELS) * width * height);

    bool newpixel;
    int foundPixelCount = 0;

    for (UINT i = 0; i < width * height; i++)
    {
        newpixel = true;
        for (int j = 0; j < foundPixelCount; j++)
        {
            if (image[i] == foundPixels[j].color)
            {
                foundPixels[j].count += 1;
                newpixel = false;
                break;
            }
        }
        if (newpixel == true)
        {
            foundPixels[foundPixelCount].color = image[i];
            foundPixels[foundPixelCount].count++;
            foundPixelCount++;
        }
    }

    delete[] foundPixels;
}
// may need debugging, but you get the point...right? and you don't have to compare against the
// entire picture - only the unique pixels you have already found.



std::map<unsigned int, int> pixelsMap;
unsigned int * img = reinterpret_cast<unsigned int*>(originalImgData);
int len = w * h;
for (int i = 0; i < len; ++i)
{
    unsigned int & pixel = img[i];
    if (pixelsMap.find(pixel) == pixelsMap.end())
    {
        pixelsMap[pixel] = 0;
    }
    ++pixelsMap[pixel];
}

That would be my solution.

Quote:
Original post by CPPNick
*** Source Snippet Removed ***
This is the same as my hash-table approach, but implemented as a linear search instead of an amortised constant-time hash lookup. This raises you all the way up to O(N^2) time and O(N) storage with respect to the number of pixels.
Quote:
Original post by Daivuk
*** Source Snippet Removed ***
That would be my solution.
And you have implemented exactly the same thing using a std::map, which is implemented as a balanced tree with insertion/lookup in logarithmic time, so you are at O(N log N) time, with a worst case of O(N) storage.



I am not picking on your implementations just because I feel like it - there is an important lesson at work here. It is generally a very poor idea to go ahead and implement the first solution that comes into your head; take a couple of minutes before writing any code to think of an efficient approach to the given problem [smile]

In practice, Brother Bob's sort is still the best way to approach this, although my hash table will be faster in some cases.

Quote:
Original post by swiftcoder
In practice, Brother Bob's sort is still the best way to approach this


Not true. Using std::sort could only possibly be considered the best way if the images are small. In such a case, the time to process the data is trivial anyway.

Using a custom hash map (which I mentioned in my 4th post) will be much faster for large images. Since speed is the issue here, we should assume the images the OP is using are large. In such a case, it makes sense to sacrifice memory for speed.

The hash map is the best option, unless you can afford the 16 GB of memory.

The 16 GB RAM solution is not the right answer. Even if you have enough memory, that's not going to be efficient. The linear trawl through those 16 GB to harvest the counts at the end is probably going to be slower than the std::sort of even a very large image. Cache! The memory/time trade-off is usually a false dichotomy, given the hierarchy of storage speeds.

map(String key, Image i):
    for each pixel p in i:
        EmitIntermediate(p, "1");

reduce(String key, Iterator values):
    int result = 0;
    for each color in values:
        result += ParseInt(color);
    Emit(AsString(result));

Supports petabyte-sized inputs.

Quote:
Original post by Melekor
Just take Brother Bob's solution, but use Radix Sort (or some other non-comparison-based method) instead of std::sort. That's O(n) complexity.


Radix sort is slower than std::sort (a heavily optimized quicksort).

See http://twistaz.net/sorts/

Quote:
Original post by pbryant
Quote:
Original post by Melekor
Just take Brother Bob's solution, but use Radix Sort (or some other non-comparison-based method) instead of std::sort. That's O(n) complexity.


Radix sort is slower than std::sort (a heavily optimized quicksort).

See http://twistaz.net/sorts/


This is untrue. For random 32-bit integers, the only reason radix would be slower is with a poor implementation. But don't take my word for it. I have some optimized radix code lying around, let me just whip up a small benchmark to prove my point.

OK, here are the results:

Quote:
Total time elapsed to sort 64 MB of random 32-bit integers 5 times:

std::sort: 3.71s
RadixSort: 1.50s

Conclusion: RadixSort is the winner (2.48x as fast as std::sort)


And the source code:

#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <algorithm>

typedef unsigned int uint;
typedef unsigned int uint32;
typedef const char *cpointer;

#include <windows.h>

// radix sort code based on
// http://www.stereopsis.com/radix.html

#define _0(x) (x & 0x7FF)
#define _1(x) (x >> 11 & 0x7FF)
#define _2(x) (x >> 22)

static void RadixSort11(uint32 *data, uint32 *sorted, uint32 count)
{
    unsigned int i;

    // 3 histograms on the stack:
    const uint32 kHist = 2048;
    uint32 b0[kHist * 3];
    uint32 *b1 = b0 + kHist;
    uint32 *b2 = b1 + kHist;

    memset(b0, 0, sizeof(b0));

    // 1. parallel histogramming pass
    //
    for (i = 0; i < count; i++)
    {
        uint32 x = data[i];
        b0[_0(x)] ++;
        b1[_1(x)] ++;
        b2[_2(x)] ++;
    }

    // 2. Sum the histograms -- each histogram entry records the number of values preceding itself.
    {
        uint32 sum0 = 0, sum1 = 0, sum2 = 0;
        uint32 tsum;
        for (i = 0; i < kHist; i++)
        {
            tsum = b0[i] + sum0;
            b0[i] = sum0 - 1;
            sum0 = tsum;

            tsum = b1[i] + sum1;
            b1[i] = sum1 - 1;
            sum1 = tsum;

            tsum = b2[i] + sum2;
            b2[i] = sum2 - 1;
            sum2 = tsum;
        }
    }

    // byte 0: read/write histogram, write out flipped
    for (i = 0; i < count; i++)
    {
        uint32 x = data[i];
        uint32 pos = _0(x);
        sorted[++b0[pos]] = x;
    }

    // byte 1: read/write histogram, copy
    // sorted -> data
    for (i = 0; i < count; i++)
    {
        uint32 x = sorted[i];
        uint32 pos = _1(x);
        data[++b1[pos]] = x;
    }

    // byte 2: read/write histogram, copy
    // data -> sorted
    for (i = 0; i < count; i++)
    {
        uint32 x = data[i];
        uint32 pos = _2(x);
        sorted[++b2[pos]] = x;
    }
}

void CheckSorted(const char *method, uint32 *data, uint32 count)
{
    for (uint i = 1; i < count; i++)
    {
        if (data[i-1] > data[i])
        {
            printf("%s is incorrect!", method);
            exit(0);
            return;
        }
    }
}

struct Timer
{
    LARGE_INTEGER t0;
    double elapsed;

    Timer() : elapsed(0) {}

    void start()
    {
        QueryPerformanceCounter(&t0);
    }

    void stop()
    {
        LARGE_INTEGER freq, t1;
        QueryPerformanceCounter(&t1);
        QueryPerformanceFrequency(&freq);
        elapsed += double(t1.QuadPart - t0.QuadPart) / freq.QuadPart;
    }
};

// Simple test of radix
int main(int argc, char* argv[])
{
    // Make sure background processes don't mess with our test results.
    SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS);

    uint32 i;

    const uint32 count = 1<<24; // 64 megabytes worth of 32-bit integers.
    const uint32 trials = 5;

    uint32 *data = new uint32[count];
    uint32 *a = new uint32[count];
    uint32 *b = new uint32[count];

    enum
    {
        STDSORT,
        RADIXSORT
    };

    Timer timers[2];

    for (uint32 trial = 0; trial < trials; trial++)
    {
        // generate a random array
        for (i = 0; i < count; i++)
        {
            data[i] = rand() ^ (rand()<<15) ^ (rand()<<30);
        }

        // test both methods on the array
        for (uint test = 0; test < 2; ++test)
        {
            memcpy(a, data, count*sizeof(uint32));

            if (test == STDSORT)
            {
                timers[STDSORT].start();
                std::sort(a, a+count);
                timers[STDSORT].stop();
                CheckSorted("std::sort", a, count);
            }
            else
            {
                timers[RADIXSORT].start();
                RadixSort11(a, b, count);
                timers[RADIXSORT].stop();
                CheckSorted("RadixSort", b, count);
            }
        }
    }

    printf(
        "Total time elapsed to sort %d MB of random 32-bit integers %d times:\n\n"
        "std::sort:\t%.2lfs\n"
        "RadixSort:\t%.2lfs\n\n",
        count*sizeof(uint32)/(1024*1024), trials, timers[STDSORT].elapsed, timers[RADIXSORT].elapsed);

    if (timers[STDSORT].elapsed < timers[RADIXSORT].elapsed)
    {
        printf("Conclusion: std::sort is the winner (%.2lfx as fast as RadixSort)\n",
               timers[RADIXSORT].elapsed/timers[STDSORT].elapsed);
    }
    else if (timers[STDSORT].elapsed > timers[RADIXSORT].elapsed)
    {
        printf("Conclusion: RadixSort is the winner (%.2lfx as fast as std::sort)\n",
               timers[STDSORT].elapsed/timers[RADIXSORT].elapsed);
    }
    else
    {
        printf("It's a tie! That's quite a coincidence.\n");
    }

    return 0;
}

