Covariance of an image

Posted by smc


I am working on calculating the Mahalanobis distance between two images, but I am a little confused about how to go about calculating the covariance matrix. Here is what I have so far. For simplicity, let the image have a scalar color component (grayscale).

Let A be an MxN matrix [X_1, X_2, ..., X_N]. It is my understanding that X_1 ... X_N are columns out of the image. If that is the case, then I can find the covariance matrix as follows: let D be the MxN matrix in mean-deviation form, i.e. each column is X_k - m, where m is the mean of the columns. The covariance matrix should then be

S = (1 / (N - 1)) * D * D^T

Most of the papers I have read calculate this a bit differently, and my linear algebra book works in the spectral domain using 3 spectral components per pixel. With all the variations I am getting a little confused.
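For reference, here is a minimal numpy sketch of the column-as-sample computation I describe above (numpy is just my choice for illustration; the function name and the toy image are made up):

import numpy as np

def column_covariance(img):
    # Treat each of the N columns as one M-dimensional sample (X_1 ... X_N).
    N = img.shape[1]
    m = img.mean(axis=1, keepdims=True)   # mean vector of the columns (M x 1)
    D = img - m                           # mean-deviation form
    return (D @ D.T) / (N - 1)            # M x M covariance matrix

img = np.random.rand(8, 16)               # toy 8 x 16 "image"
S = column_covariance(img)
print(S.shape)                            # (8, 8)

If I have this right, it should agree with np.cov(img), since np.cov by default treats rows as variables and columns as observations.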

It's unusual to squeeze a 2D greyscale image into a 2D data matrix, column by column. More common is to make the image into a vector. I've never seen the Mahalanobis distance applied directly to raster images, though, so perhaps they're doing something special there.

Suppose I make a 1x(MN) vector out of the image. It would appear the resulting covariance matrix would be 1x1.

To be more specific, I am trying to implement the distance calculation described in this paper, page 1, section 2.2.



Quote:
Original post by smc
Suppose I make a 1x(MN) vector out of the image. It would appear the resulting covariance matrix would be 1x1.
It looks like that paper is calculating the covariance over a library of (equally-sized) images. So your dataset is (MN)xP, where P is the number of pictures, and the covariance is MNxMN.

EDIT: No, wait a sec. That paper isn't finding covariance over pixels, but converting images to "representative vectors" and working with those. Comparing images pixel-for-pixel is not very effective, since small registration differences can lead to large apparent differences.
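In other words, once you have one feature vector per image, the covariance is estimated over the library of vectors and the Mahalanobis distance is taken between two of them. A rough numpy sketch, with made-up dimensions (P = 50 vectors of d = 10 features each):

import numpy as np

def mahalanobis(x, y, library):
    # Covariance estimated from the library (one d-dimensional sample per row).
    S = np.cov(library, rowvar=False)   # d x d covariance matrix
    S_inv = np.linalg.pinv(S)           # pseudo-inverse, in case S is singular
    diff = x - y
    return float(np.sqrt(diff @ S_inv @ diff))

rng = np.random.default_rng(0)
library = rng.normal(size=(50, 10))     # hypothetical feature vectors
print(mahalanobis(library[0], library[1], library))

Using the pseudo-inverse is just a guard for small libraries, where the sample covariance can be singular; with enough samples np.linalg.inv works as well.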

What are the "representative vectors"? The only thing holding me up is the representation of the image by the vectors X and Y. Since the covariance matrix is being calculated using the set of vectors X_i, I am led to believe they are partitioning the image into column vectors. If the image were in RGB format I could view each pixel as a vector and proceed from there (which I have seen examples of). However, with grayscale images I believe I would need to work with the MxN matrix itself.

In any case thanks for taking the time to reply.

Quote:
Original post by smc
What are the "representative vectors"?

The result of converting the image into a more semantically meaningful form.

For instance, suppose you were comparing photos of faces, taken in various environments, and that you decided to do pixel-by-pixel comparison. What you'd find is that factors such as the color of the background and the position of the lighting would be far more important than the identity of the person in the photo. A picture with the face taking up the bottom 2/3 of the image would be classified as entirely unlike a picture with the face taking up the top 2/3 of the image, because the pixels corresponding to the eyes in one picture would correspond to the cheeks in the other. On the other hand, if you preprocessed the image to extract, say, the ratio of the width of the mouth to the width of the nose, the color of the eyes specifically, and other such registration-independent features, and then stuffed those various measures into a vector for that image, your classifier would get a lot better.

That's specific to faces, of course. There are various methods of extracting useful features from more general classes of images, though I'm personally not very familiar with them. It looks like they're using wavelet transforms as their feature extraction method.
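As a toy illustration of the general idea (not the specific transform from the paper), here's a sketch using the PyWavelets package: decompose the image and use the energy of each subband as a feature.

import numpy as np
import pywt  # PyWavelets

def wavelet_features(img, wavelet="haar", level=2):
    # Multi-level 2D wavelet decomposition, summarized by subband energies.
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    feats = [np.mean(np.square(coeffs[0]))]      # approximation energy
    for (cH, cV, cD) in coeffs[1:]:              # detail subbands per level
        feats += [np.mean(np.square(c)) for c in (cH, cV, cD)]
    return np.asarray(feats)

img = np.random.rand(64, 64)                     # placeholder image
print(wavelet_features(img))                     # 1 + 3*level = 7 features

Subband energy is a crude summary; the point is just that the feature vector is fixed-length and far less sensitive to exact pixel alignment than the raw image.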

In machine learning, it is standard to refer to a sample as x and to a particular feature in that sample as x_i.

Thanks.

Another paper makes use of these feature extractions. This is a bit more involved than I had anticipated.
