Efficient way to find same pixel?

helloworld123 · 2010-02-27T11:26:42

Hi, Let's say this is to be done: Given an image with 32bit RGBA pixels, I wish to list down all the unique pixels, such that: R:G:B:A = 5:230:5:90 = occurs 5x in image R:G:B:A = 2: 42:5:10 = occurs 2x in image and so on... I could go and compare each pixel one by one through a loop (comparing first pixel with the rest of the image, then comparing 2nd pixel with the rest of the image, and so on). But that would be slow? is there a faster way? Or is there a method / tool already that does this? Thanks!

Graphics and GPU Programming Programming

Started by helloworld123 February 23, 2010 12:46 AM

25 comments, last by Melekor 14 years, 1 month ago

Melekor

379

February 25, 2010 09:06 AM

Just take Brother Bob's solution, but use Radix Sort (or some other non-comparison-based method) instead of std::sort. That's O(n) complexity.

Antheus

2,410

February 25, 2010 09:29 AM

map(String key, Image i): for each pixel p in i:   EmitIntermediate(p, "1"); reduce(String key, Iterator values):int result = 0;for each color in values:  result += ParseInt(color);Emit(AsString(result));

Supports petabyte-sized inputs.

pbryant

100

February 26, 2010 11:20 PM

Quote:Original post by Melekor
Just take Brother Bob's solution, but use Radix Sort (or some other non-comparison-based method) instead of std::sort. That's O(n) complexity.

Radix sort is slower than std::sort (a heavily optimized quicksort).

See http://twistaz.net/sorts/

Melekor

379

February 27, 2010 10:33 AM

Quote:Original post by pbryant
Quote:Original post by Melekor
Just take Brother Bob's solution, but use Radix Sort (or some other non-comparison-based method) instead of std::sort. That's O(n) complexity.

Radix sort is slower than std::sort (a heavily optimized quicksort).

See http://twistaz.net/sorts/

This is untrue. For random 32-bit integers, the only reason radix would be slower is with a poor implementation. But don't take my word for it. I have some optimized radix code lying around, let me just whip up a small benchmark to prove my point.

Melekor

379

February 27, 2010 11:09 AM

OK, here are the results:

Quote:Total time elapsed to sort 64 MB of random 32-bit integers 5 times:

std::sort: 3.71s
RadixSort: 1.50s

Conclusion: RadixSort is the winner (2.48x as fast as std::sort)

And the source code:

#include <cstdlib>#include <cstring>#include <cstdio>#include <algorithm>typedef unsigned int uint;typedef unsigned int uint32;typedef const char *cpointer;#include <windows.h>// radix sort code based on// http://www.stereopsis.com/radix.html#define _0(x)	(x & 0x7FF)#define _1(x)	(x >> 11 & 0x7FF)#define _2(x)	(x >> 22)static void RadixSort11(uint32 *data, uint32 *sorted, uint32 count){	unsigned int i;	// 3 histograms on the stack:	const uint32 kHist = 2048;	uint32 b0[kHist * 3];	uint32 *b1 = b0 + kHist;	uint32 *b2 = b1 + kHist;	memset(b0, 0, sizeof(b0));	// 1.  parallel histogramming pass	//	for (i = 0; i < count; i++)	{		uint32 x = data;		b0[_0(x)] ++;		b1[_1(x)] ++;		b2[_2(x)] ++;	}	// 2.  Sum the histograms -- each histogram entry records the number of values preceding itself.	{		uint32 sum0 = 0, sum1 = 0, sum2 = 0;		uint32 tsum;		for (i = 0; i < kHist; i++)		{			tsum = b0 + sum0;			b0 = sum0 - 1;			sum0 = tsum;			tsum = b1 + sum1;			b1 = sum1 - 1;			sum1 = tsum;			tsum = b2 + sum2;			b2 = sum2 - 1;			sum2 = tsum;		}	}	// byte 0: read/write histogram, write out flipped	for (i = 0; i < count; i++)	{		uint32 x = data;		uint32 pos = _0(x);		sorted[++b0[pos]] = x;	}	// byte 1: read/write histogram, copy	//   sorted -> data	for (i = 0; i < count; i++)	{		uint32 x = sorted;		uint32 pos = _1(x);		data[++b1[pos]] = x;	}	// byte 2: read/write histogram, copy	//   data -> sorted	for (i = 0; i < count; i++)	{		uint32 x = data;		uint32 pos = _2(x);		sorted[++b2[pos]] = x;	}}void CheckSorted(const char *method, uint32 *data, uint32 count){	for(uint i = 1; i < count; i++)	{		if(data[i-1] > data)		{			printf("%s is incorrect!", method);			exit(0);			return;		}	}}struct Timer{	LARGE_INTEGER t0;	double elapsed;	Timer() : elapsed(0) {}	void start()	{		QueryPerformanceCounter(&t0);	}	void stop()	{		LARGE_INTEGER freq, t1;		QueryPerformanceCounter(&t1);		QueryPerformanceFrequency(&freq);		elapsed += double(t1.QuadPart - t0.QuadPart) / freq.QuadPart;	}};// Simple test of radixint main(int argc, char* argv[]){	// Make sure background processes don't mess with our test results.	SetPriorityClass(GetCurrentProcess(), ABOVE_NORMAL_PRIORITY_CLASS);	uint32 i;	const uint32 count = 1<<24; // 64 megabytes worth of 32-bit integers.	const uint32 trials = 5;	uint32 *data = new uint32[count];	uint32 *a = new uint32[count];	uint32 *b = new uint32[count];	enum	{		STDSORT,		RADIXSORT	};	Timer timers[2];	for (i = 0; i < trials; i++)	{		// generate a random array		for (i = 0; i < count; i++)		{			data = rand() ^ (rand()<<15) ^ (rand()<<30);		}		// test both methods on the array		for(uint test = 0; test < 2; ++test)		{			memcpy(a, data, count*sizeof(uint32));			if(test == STDSORT)			{				timers[STDSORT].start();				std::sort(a, a+count);				timers[STDSORT].stop();				CheckSorted("std::sort", a, count);			}			else			{				timers[RADIXSORT].start();				RadixSort11(a, b, count);				timers[RADIXSORT].stop();				CheckSorted("RadixSort", b, count);			}		}	}	printf(		"Total time elapsed to sort %d MB of random 32-bit integers %d times:\n\n"		"std::sort:\t%.2lfs\n"		"RadixSort:\t%.2lfs\n\n",		count*sizeof(uint32)/(1024*1024), trials, timers[STDSORT].elapsed, timers[RADIXSORT].elapsed);	if(timers[STDSORT].elapsed < timers[RADIXSORT].elapsed)	{		printf("Conclusion: std::sort is the winner (%.2lfx as fast as RadixSort)\n",			timers[RADIXSORT].elapsed/timers[STDSORT].elapsed);	}	else if(timers[STDSORT].elapsed > timers[RADIXSORT].elapsed)	{		printf("Conclusion: RadixSort is the winner (%.2lfx as fast as std::sort)\n",			timers[STDSORT].elapsed/timers[RADIXSORT].elapsed);	}	else	{		printf("It's a tie! That's quite a coincidence.\n");	}	return 0;}

swiftcoder

18,997

February 27, 2010 11:09 AM

Quote:Original post by Melekor
Quote:Original post by pbryant
Radix sort is slower than std::sort (a heavily optimized quicksort).
This is untrue. For random 32-bit integers, the only reason radix would be slower is with a poor implementation.

The typical std::sort implementation is quicksort [with an average case of O(N log N)], and fallbacks to heap sort for bad cases, yielding a worst case of O(N log N) as well.

By contrast, radix sort is always O(k N), where k is the number of bits in the key.

Therefore Radix sort performs better than std::sort then when log(N) is greater than k. For 8-bit RGBA pixel data, k=32, so we need log(N) > 32. For a given base, b, than means we need N > b^32, and for pretty much any value of b, that is a darn big number.

Correct me if I am reasoning incorrectly, but that makes std::sort faster for most any reasonable image sizes.

Tristam MacDonald. Ex-BigTech Software Engineer. Future farmer. [https://trist.am]

Melekor

379

February 27, 2010 11:26 AM

Quote:Original post by swiftcoder
The typical std::sort implementation is quicksort [with an average case of O(N log N)], and fallbacks to heap sort for bad cases, yielding a worst case of O(N log N) as well.

By contrast, radix sort is always O(k N), where k is the number of bits in the key.

Therefore Radix sort performs better than std::sort then when log(N) is greater than k. For 8-bit RGBA pixel data, k=32, so we need log(N) > 32. For a given base, b, than means we need N > b^32, and for pretty much any value of b, that is a darn big number.

Correct me if I am reasoning incorrectly, but that makes std::sort faster for most any reasonable image sizes.

Well, strictly speaking radix sort performs better when c * log(N) > k, for some unknown constant c. But let's ignore that for now.

The most important thing is that k in O(k N) isn't the number of bits, it's the number of digits. We can split the keys into any number of digits we think would be most advantageous. For example with 32 bit integers, we can write them in base 2048 (11 bits for each digit) so that k = 3. (This is what is implemented in the above code.)

Now radix sort is starting to look much better isn't it?

Efficient way to find same pixel?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Efficient way to find same pixel?

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines