nearest neighbour 1-bit bitmap

Started by
17 comments, last by AndersOredsson 13 years ago
Hi, long time since I used these forums :-) I basically learned to code here, about 8 years ago! Even lost my old user uncutno.

Question:

I'm doing a nearest neighbor search with a couple of thousand 16x16 1-bit black and white images. I pack the 256 bits into 32 bytes, and use a superfast bit-comparator to count the number of matching bits, and that way i find the closest match.

Without any optimization I can do a search with 2000 instances within the performance requirements.

My first optimization was to create a distance map from every instance to each other, a NxN array. While searching trough the set, i keep track of the closest instance found until now, and then for the rest of the instances I can check in the distance map if i possibly can be closer to that instance, before actually checking the real distance. This makes the search run in about 66% of the time.

I have also tried a lot of other approaches, but with no success.

The problem is that any k-tree or any stuff like that does not work, since its basically 256 dimensions with only two position each; no way to "split" the dimension.

So, any idea on how to do this hierarchal? I have the fealing that i should be able to do this in about 10% of a linear search, and i have all the preprocessing time in the world :-)
Advertisement
How many images would you like to be able to compare against, and how much time do you have to do it? On roughly what hardware?
While I wait for your answers, here's an idea: If you coarsen the image to a 4x4 image, where each pixel is now a count of how many bits were set in its box, you know that the L1 distance between the coarse images is less or equal than the Hamming distance between the original 16x16 images. You can then use some partitioning structure on this lower-dimensional space and prune branches when you can prove that the distance will be larger than one you've already found.

Do you think this might work with your images?
It sounds like spatial partitioning trees can still work here. You'd just need to consider planes that aren't aligned with your axes. You could use a KD tree with another basis, like the Hadamard Basis . Or you could just use a general BSP tree.

Considering general BSP trees: A reasonable algorithm for building these would be to recursively split each data set with a plane through its mean with normal along its principal axis. A yet-better one would be to use k-means with k=2 to determine your splitting planes. If you look at the independent component analysis (ICA) literature, you might also find other ideas to consider.

Whatever splitting method you choose, for a general BSP-tree approach, you'd then at runtime be doing roughly O(log(N)) inner products (each of which is O(n^2)) (if you have N images, each of which is nxn), so you'd have O(n^2 log(N)) complexity, vs. the O(n^2 N) complexity you have now. That said, you'd be doing multiplications instead of bitwise comparisons, so the constants would be worse; there might be a point in the tree beyond which it's cheaper to brute-force.

Anyway I'm speculating at this point. Have you looked at e.g. this?

Ah hah! YES. Check out p. 52 of this pdf. I like this. Here, halfspaces are encoded by pairs of points (called "pivot pairs"): If you're closer to the first point, you're in the first halfspace; if you're closer to the second, you're in the second halfspace. What's nice about this is that (1) it allows you to stick with your fast bitwise stuff, without requiring more general arithmetic (no MULs!), and (2) there are only finitely-many planes to consider. Let's see... well, without even coming up with something cleverer, you can brute-force construct a near-optimal tree (by choosing the pair of pivots that most nearly divide your points in half at each splitting) in O(N^3) time...

There's other stuff out there; e.g. R-trees, M-trees, but I don't really know them well. They're all various kinds of bounding volume hierarchies.

Hope that's somewhat helpful.
WOW! Guys, thanks!

Still the same great responses I remember from my old days here :-) Have to look into these forums more often; even if my fixed-pipeline-graphics-knowledge is somehow outdated :-) I also know alot about AI, but more of the theoretical "not usable in realtime games"-type stuff, then the "how do I make my soldiers cooperate as a realistic squad"-stuff.

alvaro#1: Currently i do about 10 lookups in a set of 2600 images, 10 times per sec = 100 checks in a 2600 set. I do it on an iPhone 3GS arm 32-bit processor. This is brute force check all strategy, so my feeling is that I should be able to do about 100 check is a 15000 set with optimizations. If i could do 200-300 check against a set of about 5000, I would be more then happy.

alvaro#2: Good idea; rank them along number of pixels in square, and then divide on that. Should give me the possibility to exclude a lot of the set + do some space splitting. I have a little pixel-lab where I try out stuff and measure improvement. I promise to post results if I try this out; should be interesting.

Emergent: Ah, thats how you do it :-) you don't need to split along axes. I did a lot of 3D BSP stuff back in the days, so I totally get this now (I sort of fell of the space-stuff when going from 3 to 256 dimensions :-) ). As I read trough your answer, I totally follow you. The two point axis makes perfectly sense regarding my bitstuff. I could even do a sqrt table lookup for distance, since square distance is int 0 - 256. Also thanks for a hint on how to find those points, the perfect divider-algo should be pretty simple to create.

Just to let you know:

Since last time, i optimized my distance-map system, and I optimized my bitstuff. Even if I hate to do low level optimizations; I managed to run in 50% time.

Distance map optimization: For all instances in set, I know the distance to all other instances. If my test is X away from instance 1, and instance 1 is Y away from instance 2, I don't need to do a full distance-test for instance 2 if Y > X*2.

Bitflick optimization: My old version had each case packed into bytes, so my 256 bits was inside 32 bytes. For each of those I did a bit-compare-count. My new version packs the bits into 8 32-bit ints. That way I get away with much less instructions, since I only need to do most of them 8 times (instead of 32).

I will implement the BSP solution and measure efficiency. Since the penalty for increasing the set is lower if I do BSP, I would actually be happy if i could match the current speed. Hopefully I _should_ be able to do it in 5-10% time :-)




Here is my superfast distance2 code. Found it on the web (as the fastest possible ), and updated it from 8-bits to 32-bits.


int distance2 = 0;
for(int i=0;i<8;i++){
uint32_t value;
value = A->data32 ^ B->data32;
while (value) {
distance2++;
value = (value - 1) & value;
}
}
return distance2;


The code tag is killing me today :-)
I thought of giving you some advice about the low-level comparison of images, but I took your description ("superfast bit-comparator") at face value.

Is your code anything like this?
int popcount(unsigned x) {
x -= (x >> 1) & 0x55555555u;
x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
x = (x + (x >> 4)) & 0x0f0f0f0fu;
x += x >> 8;
x += x >> 16;
return x & 0x3f;
}

int hamming_distance(const char c1[32], const char c2[32]) {
const unsigned *u1 = reinterpret_cast<unsigned const *>(c1);
const unsigned *u2 = reinterpret_cast<unsigned const *>(c2);

int result = 0;

for (int i=0; i<8; ++i)
result += popcount(u1 ^ u2);

return result;
}


EDIT: I didn't see your previous post. This method should be faster, unless your images are very very similar.
Emergent: The pdf, page 52 is perfect! Exactly what I need :-) Thanks again.
alvaro: Yes, but I have no idea of why the popcount return matching bits. I notice that you don't have a loop, so ill check that out :-) Thanks!
This thread is so cool that Ill post back every experiment + every result that i do!

Could i get it down to 5%? ( given that most interesting cases actually have a pretty close neighbor ? )

This topic is closed to new replies.

Advertisement