Image Comparison

Started by
4 comments, last by TheUnbeliever 16 years, 10 months ago
Hi all, I'm new to image manipulation. I've done some programming in the past in Java, XML, etc. I'm trying to write an algorithm for image comparison in real time. Any ideas on where to start? Are there any decent tutorials I can refer to? I want the algorithm to be able to search a local drive of images and find exact similarities with other images. Regards,
Machine vision, statistics, heuristics and artificial intelligence would be the topics to look into. Many of these also require a somewhat advanced mathematical background.

The only tutorial-level documents you are likely to find will simply use pixel-wise comparison to determine the RMS deviation between two images, and will be unable to compare images of different sizes.
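That pixel-wise approach is easy to sketch. Here is a minimal example in Python, assuming both images are already the same size and are given as flat lists of grayscale values (a simplification for illustration):

```python
import math

def rms_deviation(img_a, img_b):
    """Root-mean-square difference between two same-sized
    grayscale images, given as flat lists of 0-255 pixel values."""
    if len(img_a) != len(img_b):
        raise ValueError("pixel-wise RMS requires images of equal size")
    squared = sum((a - b) ** 2 for a, b in zip(img_a, img_b))
    return math.sqrt(squared / len(img_a))

# Two tiny 2x2 "images": identical images give zero deviation.
a = [10, 20, 30, 40]
print(rms_deviation(a, a))                  # → 0.0
print(rms_deviation(a, [12, 18, 30, 40]))   # small, non-zero deviation
```

Note that this fails completely the moment the two images differ in size, scale, or alignment, which is exactly why the topics above matter.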

The main problem here comes from "exact similarities". While somewhat trivial for a human, expressing similarity is a very big problem for a computer.

Also, what does real-time mean? From a video camera live feed? Difficult.

Trying to build face recognition software or another pattern-matching system isn't an easy task. 98% of the application is trivial. Unfortunately, the remaining 2% comes down to actually comparing the images, which is still a very active research topic.
Antheus speaks the truth. Tell us what is required of your image comparison routines and we'll be able to give more specific advice. I hope you're not scared of a little mathematics.

Most of the work of general image comparison comes down to pattern-matching and majorisation representation. These both hinge on spectral decompositions of image matrices, so that only the salient information is passed to the image-distance metric (so you are comparing the general form of the images and not the background noise). Fortunately for you, the first part of this process is exactly image compression, so you immediately have a whole bunch of resources at your disposal.

A fairly simple, but general system would do something like this:

1. Determine the colour-gamuts of the two images and transform them both to a 'canonical domain', so that the two images look like they came from the same palette.
2. Perform a generalised Q-transform (it is closely related to the Fourier transform) on the two images so that a linear comparison can detect matching pieces. For each suitably similar pair of pieces:
3. Scale the two images onto yet another canonical domain, so that the pieces now have the same dimensions, colours and gamma.
4. Filter the two images to remove noise and invisible high-frequency data. A simple JPEG compression would do the trick, but there are faster methods.
5. Compare the images pixel-wise, perhaps by a (colour-component-weighted) RMS difference in the YCC colour-space.

Any number of these steps may be simplified or discarded depending on the nature of the source images. In particular, you'd like to do away with the pattern-matching if possible, as it's computationally costly and rather error-prone.
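Steps 3–5 above can be sketched in plain Python. This is only an illustration: the RGB-to-YCbCr conversion below uses the standard BT.601 coefficients, but the per-component weights are made-up placeholders, not canonical values:

```python
import math

def rgb_to_ycc(pixel):
    """ITU-R BT.601 RGB -> YCbCr conversion for one (r, g, b) pixel."""
    r, g, b = pixel
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)

def weighted_rms(img_a, img_b, weights=(0.6, 0.2, 0.2)):
    """Colour-component-weighted RMS difference in YCC space.
    Images are equal-length lists of (r, g, b) tuples; the default
    weights (favouring luma) are illustrative, not canonical."""
    total = 0.0
    for pa, pb in zip(img_a, img_b):
        ya, cba, cra = rgb_to_ycc(pa)
        yb, cbb, crb = rgb_to_ycc(pb)
        total += (weights[0] * (ya - yb) ** 2
                  + weights[1] * (cba - cbb) ** 2
                  + weights[2] * (cra - crb) ** 2)
    return math.sqrt(total / len(img_a))
```

With images of different sizes you would first rescale both onto the same canonical dimensions (step 3) before calling the comparison.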

Admiral
Ring3 Circus - Diary of a programmer, journal of a hacker.
explaining a bit further!!

here is what I have in mind

1- upload an image to an interface
2- perform a search based on the uploaded image. This will result in all the images in the repository being compared to the primary image.
3- return matching images.
These steps would form the core functionality. I would then want to (if possible) add features such as returning images higher/lower than a certain resolution, quality, or type (GIF, JPEG, etc.) and a percentage of similarity (50% similar, 90% similar, etc.)

The idea I have (although in its infancy, and wide open to ideas that would make my life easier) is to use something like image slicing: keep slicing an image into equal squares on the basis of N^2.
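For what it's worth, the slicing idea can be sketched roughly like this: split the image into 2^n × 2^n equal tiles and keep each tile's mean as a cheap coarse signature, comparing those before any expensive pixel-wise check. The tile count and the choice of per-tile means are assumptions for illustration only:

```python
def tile_means(pixels, size, n):
    """Split a size x size grayscale image (flat list, row-major)
    into 2**n x 2**n equal tiles and return each tile's mean -- a
    coarse signature for cheap first-pass comparison.
    Assumes size is divisible by 2**n."""
    tiles = 2 ** n
    step = size // tiles
    means = []
    for ty in range(tiles):
        for tx in range(tiles):
            total = 0
            for y in range(ty * step, (ty + 1) * step):
                for x in range(tx * step, (tx + 1) * step):
                    total += pixels[y * size + x]
            means.append(total / (step * step))
    return means
```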

Not sure if I am making any sense.

Don't know what I really meant by real-time (maybe it was just to emphasize the fact that I want a very efficient algorithm) - not like the Windows search, which takes years to return results

I appreciate your help!
Quote:Original post by sshah
explaining a bit further!!

here is what I have in mind

1- upload an image to an interface
2- perform a search based on the uploaded image. This will result in all the images in the repository being compared to the primary image.
3- return matching images.
These steps would form the core functionality. I would then want to (if possible) add features such as returning images higher/lower than a certain resolution, quality, or type (GIF, JPEG, etc.) and a percentage of similarity (50% similar, 90% similar, etc.)



Heh, all easy. Except for 2).

What is "similarity"?

Does a deviation in brightness mean two images are similar? Or are you interested in facial features, where an old B&W photograph can be 100% identical to an HD DVD video capture?

Forget formats, forget UI, even language and the rest.

This is typically done by transforming the image into some context-dependent vector space. How this transformation is performed depends on the context. You then use a library that contains pre-calculated vectors for various criteria and compare those to the picture you're interested in.

The most trivial transformation that yields somewhat practical results is to compare resolution-independent images (scaled to a pre-determined size) with a pixel-wise RMS comparison, preferably in YUV format to emphasize the features - which also allows you to normalize the brightness of the Y channel.

So, simply put:
- Resize image to (w, h)
- Convert to YUV
- Perform normalization on Y channel
- Do a pixel-wise comparison between this image and every image in the library, calculating the RMS deviation.
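The first two normalization steps of that recipe might be sketched like this in plain Python (the nearest-neighbour resampling and the mean-to-128 brightness normalization are illustrative choices, not the only options):

```python
def resize_nearest(pixels, w, h, new_w, new_h):
    """Nearest-neighbour resize of a flat, row-major grayscale image,
    so every library image ends up at the same (new_w, new_h)."""
    out = []
    for y in range(new_h):
        src_y = y * h // new_h
        for x in range(new_w):
            src_x = x * w // new_w
            out.append(pixels[src_y * w + src_x])
    return out

def normalize_y(y_channel):
    """Shift the Y (luma) channel so its mean is 128, making the
    later pixel-wise comparison insensitive to overall brightness."""
    mean = sum(y_channel) / len(y_channel)
    return [v + (128 - mean) for v in y_channel]
```

Once both images have passed through these two steps, a pixel-wise RMS comparison applies directly.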

The other is to extract features. There are several image classification algorithms where you pre-process the image, searching for edges, corners, or other distinct elements.

Then you compare those with the same types of elements stored in the library. After you find suitable candidates, you can resize/align the images to fit each other and perform a more complex pixel-wise comparison.

The second method is considerably better, since it will, in many cases, be insensitive to rotation, scaling, even shear and other transforms, plus it allows you to search for parts of an image only.
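As a very rough sketch of the feature-extraction idea, here is a Sobel gradient-magnitude pass in plain Python. Real systems use far more robust detectors (corners, SIFT-style keypoints, etc.), so treat this purely as an illustration of turning raw pixels into an edge map:

```python
import math

def sobel_magnitude(pixels, w, h):
    """Gradient magnitude via the Sobel operator -- a crude 'edge
    feature' map. Border pixels are skipped for simplicity."""
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            p = lambda dx, dy: pixels[(y + dy) * w + (x + dx)]
            # Horizontal and vertical Sobel responses.
            gx = (p(1, -1) + 2 * p(1, 0) + p(1, 1)
                  - p(-1, -1) - 2 * p(-1, 0) - p(-1, 1))
            gy = (p(-1, 1) + 2 * p(0, 1) + p(1, 1)
                  - p(-1, -1) - 2 * p(0, -1) - p(1, -1))
            out.append(math.hypot(gx, gy))
    return out
```

Two images could then be compared by matching their edge maps (or histograms of them) rather than raw pixels, which is what buys the partial invariance described above.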

But all this is just high-level talk. There are so many details here that are highly dependent on the purpose of your application.

Keep in mind that simply offsetting an image by 1 pixel can cause simple algorithms to fail completely.

But the key to this problem is determining "similarity". The word has no meaning to a computer; the key is devising an algorithm that will express it.
As a demonstration of how much you shouldn't expect, I'll give you a few links.

An aviation-fan photo site has a 'similarity search' which uses an algorithm similar to what you seem to be looking for to find similar images (using the data in the photograph rather than meta-data -- although such a search is also available).

I took this photo some years ago. Clicking to find similar photos yields these results -- some of which are good, some of which aren't. However, neither of the following images (which are virtually identical to the human eye) show up high in the results: One, two.
[TheUnbeliever]

This topic is closed to new replies.
