Image (Pre-)Processing for Face Detection

Started by
12 comments, last by willh 14 years, 1 month ago
Hey, I'm currently working on a face detection system in C#, and right now I'm focusing on finding the optimal image pre-processing techniques. Obviously I want the algorithm to be as efficient and accurate as possible. At the moment I'm working on the efficiency side, and since image pre-processing can vastly speed up the process as a whole, I'm setting aside quite a lot of time for it.

Anyway, I've so far integrated histogram equalisation, but it's taking about 1.4 seconds (according to the C# Stopwatch class; not sure how accurate that is) for a 1600x1200 JPG image. For a large set of images this is quite a long time, and I want to cut it down, perhaps by resampling or a faster method of histogram equalisation. What would be a good resampling method? I've looked into it and the Lanczos method seems the best, but also quite slow? Bilinear looks to be a good choice though? Tell me if I'm wrong :) For a 120x80 image it took 42 milliseconds.

So what other processing techniques might be worth looking into? I know a lot of it is how the decision-making process will work, but I'm looking for general ideas. I've looked into segmentation and thresholding so far; they seem like a good way to get rid of background noise?
Quote:Original post by Side Winder
Hey, I'm currently working on a face detection system in C#, and right now I'm focusing on finding the optimal image pre-processing techniques. Obviously I want the algorithm to be as efficient and accurate as possible. At the moment I'm working on the efficiency side, and since image pre-processing can vastly speed up the process as a whole, I'm setting aside quite a lot of time for it.

Anyway, I've so far integrated histogram equalisation, but it's taking about 1.4 seconds (according to the C# Stopwatch class; not sure how accurate that is) for a 1600x1200 JPG image. For a large set of images this is quite a long time, and I want to cut it down, perhaps by resampling or a faster method of histogram equalisation. What would be a good resampling method? I've looked into it and the Lanczos method seems the best, but also quite slow? Bilinear looks to be a good choice though? Tell me if I'm wrong :) For a 120x80 image it took 42 milliseconds.

So what other processing techniques might be worth looking into? I know a lot of it is how the decision-making process will work, but I'm looking for general ideas. I've looked into segmentation and thresholding so far; they seem like a good way to get rid of background noise?


Are you resizing the image just to feed it to a recognition engine? If so, I'd suggest dispensing with anything fancy and simply averaging the high-resolution pixels to yield the low-resolution pixels. I'd also try applying any luminance adjustments (equalization, etc.) to the low-resolution version of the image. Also, consider doing something simpler than histogram equalization, like linear contrast stretching (to a fixed range) or even standardizing by simply subtracting the mean and dividing by the standard deviation.
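As a rough sketch of that suggestion (the class and method names here are illustrative, and it assumes the image has already been converted to a grayscale double[,] with values in 0..255, and that the dimensions divide evenly by the shrink factor), box-averaging plus standardization might look like:

```csharp
using System;

static class Preprocess
{
    // Box-average downsampling: each output pixel is the mean of a
    // factor x factor block of input pixels. Assumes the source
    // dimensions are divisible by factor.
    public static double[,] BoxDownsample(double[,] src, int factor)
    {
        int h = src.GetLength(0) / factor;
        int w = src.GetLength(1) / factor;
        var dst = new double[h, w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
            {
                double sum = 0;
                for (int dy = 0; dy < factor; dy++)
                    for (int dx = 0; dx < factor; dx++)
                        sum += src[y * factor + dy, x * factor + dx];
                dst[y, x] = sum / (factor * factor); // mean of the block
            }
        return dst;
    }

    // Standardize in place: subtract the mean and divide by the
    // standard deviation, giving zero mean and unit variance.
    public static void Standardize(double[,] img)
    {
        double n = img.Length, mean = 0;
        foreach (double v in img) mean += v;
        mean /= n;

        double var = 0;
        foreach (double v in img) var += (v - mean) * (v - mean);
        double sd = Math.Sqrt(var / n);
        if (sd < 1e-9) return; // flat image: nothing to rescale

        for (int y = 0; y < img.GetLength(0); y++)
            for (int x = 0; x < img.GetLength(1); x++)
                img[y, x] = (img[y, x] - mean) / sd;
    }
}
```

Standardizing to zero mean and unit variance is much cheaper than full equalization and plays well with learned classifiers, since the model doesn't have to compensate for overall brightness and contrast.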

Hi SideWinder.

Are you building a face detector or augmenting an existing one?

Why are you preprocessing the images?

I am doing the same thing right now, also in C#, using the technique described by Viola and Jones.
It's hard to recommend any pre-processing techniques without knowing how you intend to do your face detection. Resampling, for example, can only reduce the amount of information, and is likely to act as a low-pass filter, which may help or hinder your chosen technique. Equalization just re-maps the same information; it doesn't remove any noise, and it doesn't find any shape or depth cues.

Two classes of techniques I've seen work well for face detection are sigma-shape detection and Support-Vector Machine classifiers. With any kind of good hardware, face detection should be possible in real time: less than 5 ms per image.

I advise not getting caught up in early optimization: get your method working, THEN find what's taking too long and fix it. Why waste time optimizing a histogram equalization you might never use?
Quote:Original post by Steadtler
I advise not getting caught up in early optimization: get your method working, THEN find what's taking too long and fix it. Why waste time optimizing a histogram equalization you might never use?


Steadtler beat me to the punch; I was waiting until you responded to my questions to see what you were doing first. :) So basically: what he said.

From my own experience (I've built other face detectors before), I've always found preprocessing to be not so great. It can definitely improve the true positive rate, but it also has an annoying tendency to increase the false positive rate: it can make an obviously non-face image look very face-like. :)

However: if you're doing face recognition (determining the identity of the face you've detected), then the literature seems to indicate that there are some very good reasons to preprocess your training images. I can't speak from first-hand experience, though.



I see. Yeah, I'm using neural networks. What's the problem with histogram equalisation? Is it just that it's slower than other methods and yields similar results? Either way, I'll heed your advice and work on getting the method fully working before thinking about optimising.
Quote:Original post by Side Winder
I see. Yeah, I'm using neural networks. What's the problem with histogram equalisation? Is it just that it's slower than other methods and yields similar results? Either way, I'll heed your advice and work on getting the method fully working before thinking about optimising.


What result do you expect to get out of a histogram equalization in the context of a computer vision program?
Well, I was under the impression that it helps with distinguishing features after greyscaling. Makes the greys more varied so when the data is put into the ANN it's got a broader range of input?
Quote:Original post by Side Winder
Well, I was under the impression that it helps with distinguishing features after greyscaling. Makes the greys more varied so when the data is put into the ANN it's got a broader range of input?


Having lots of face samples under different lighting conditions will give your ANN a broad range of inputs. It's not uncommon to use several thousand positive training samples.

Just out of curiosity: why are you using an ANN?



Quote:Original post by Side Winder
Well, I was under the impression that it helps with distinguishing features after greyscaling. Makes the greys more varied so when the data is put into the ANN it's got a broader range of input?


Actually, I think equalization might help, but because it standardizes the images for lighting. In other words, faces under different lighting should occupy a smaller portion of the input space.
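For what it's worth, global histogram equalisation itself is only two linear passes over the pixels, so the 1.4 seconds reported above is more likely per-pixel Bitmap.GetPixel/SetPixel overhead than the algorithm itself (locking the bitmap bits into a flat byte[] first is the usual fix). A minimal sketch on an 8-bit grayscale buffer, with illustrative names:

```csharp
using System;

static class Equalize
{
    // Global histogram equalisation on an 8-bit grayscale buffer
    // (one byte per pixel). Two passes over the data: build the
    // cumulative distribution, then remap every pixel through it.
    public static void EqualizeHistogram(byte[] pixels)
    {
        var hist = new int[256];
        foreach (byte p in pixels) hist[p]++;

        var cdf = new int[256]; // cumulative distribution of gray levels
        int running = 0;
        for (int i = 0; i < 256; i++) { running += hist[i]; cdf[i] = running; }

        int cdfMin = 0;         // smallest nonzero CDF value
        for (int i = 0; i < 256; i++)
            if (cdf[i] > 0) { cdfMin = cdf[i]; break; }

        int n = pixels.Length;
        if (n == 0 || n == cdfMin) return; // empty or single gray level

        var map = new byte[256]; // old level -> equalised level
        for (int i = 0; i < 256; i++)
            map[i] = (byte)((cdf[i] - cdfMin) * 255L / (n - cdfMin));

        for (int i = 0; i < n; i++) pixels[i] = map[pixels[i]];
    }
}
```

Even at 1600x1200 that is under two million array reads per pass, which should finish in a few milliseconds on ordinary hardware, nowhere near 1.4 seconds.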

