There are two images

One is the background image; the other is a photo of a person against the same background, at the same size. What I want to do is remove the second image's background and extract only the person's silhouette. The common method is to subtract the first image from the second, but my problem is that when the color of the person's clothing is similar to the background, the result of the subtraction is awful: I cannot get the whole person's outline. Does anyone have a good idea for removing the background? Thank you in advance.

A: 

Post the photo on Craigslist and tell them that you'll pay $5 for someone to do it.

Guaranteed you'll get hits in minutes.

Justin
Never. I am a poor man; $5 can feed me for a week.
carl
A: 

Instead of a straight subtraction, you could step through both images, pixel by pixel, and only "subtract" the pixels which are exactly the same. That of course won't account for minor variances in colors, though.
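A minimal sketch of that idea with NumPy and OpenCV (file names are placeholders; this assumes the two images are perfectly aligned and the same size):

```python
import cv2
import numpy as np

# Hypothetical file names: a background-only shot and the same scene with the person.
background = cv2.imread("background.png")
frame = cv2.imread("person.png")

# Zero out every pixel that is exactly the same in both images; keep the rest.
same = np.all(frame == background, axis=2)
result = frame.copy()
result[same] = 0

cv2.imwrite("foreground.png", result)
```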

themarshal
This may be a problem, because the lighting conditions are not stable.
carl
-1: This is exactly the same thing as subtracting everything and then setting a threshold of 0. Also, even if the lighting were static, there is no guarantee that the pixels in two subsequent frames are exactly the same.
kigurai
+1  A: 

One technique that I think is common is to use a mixture model. Grab a number of background frames and for each pixel build a mixture model for its color.

When you apply a frame with the person in it you will get some probability that the color is foreground or background, given the probability densities in the mixture model for each pixel.

After you have P(pixel is foreground) and P(pixel is background) you could just threshold the probability images.
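If you'd rather not build the mixture model by hand, OpenCV ships a per-pixel Gaussian mixture background subtractor that does essentially this; a minimal sketch, assuming you have a series of background-only frames (file names are placeholders):

```python
import cv2

# Per-pixel mixture-of-Gaussians background model (MOG2).
subtractor = cv2.createBackgroundSubtractorMOG2(
    history=50, varThreshold=16, detectShadows=False)

# Build the per-pixel color model from background-only frames.
for i in range(50):
    bg = cv2.imread(f"background_{i:03d}.png")
    subtractor.apply(bg)

# Apply the frame with the person; learningRate=0 freezes the model.
# The mask is 255 where a pixel is judged foreground, 0 where it fits
# the background densities.
frame = cv2.imread("person.png")
mask = subtractor.apply(frame, learningRate=0)
cv2.imwrite("mask.png", mask)
```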

Another possibility is to use the probabilities as inputs to some more clever segmentation algorithm. One example is graph cuts, which I have noticed works quite well.
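For the graph-cut route, OpenCV's GrabCut is a readily available implementation; a sketch that seeds it from the thresholded background difference (the threshold and iteration count are guesses to tune):

```python
import cv2
import numpy as np

background = cv2.imread("background.png")   # placeholder file names
frame = cv2.imread("person.png")

# Seed labels from the background difference: "probably foreground" where
# the difference is large, "probably background" elsewhere.
diff = cv2.absdiff(frame, background).max(axis=2)
mask = np.where(diff > 30, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)

bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)
cv2.grabCut(frame, mask, None, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_MASK)

# Keep the pixels that ended up labeled (probably) foreground.
person = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
cv2.imwrite("person_mask.png", person)
```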

However, if the person is wearing clothes that are visually indistinguishable from the background, obviously none of the methods described above will work. You'd either have to get another sensor (like IR or UV) or have a quite elaborate "person model" which could "add" the legs in the right position if it finds what it thinks is a torso and head.

Good luck with the project!

kigurai
@kigurai: Good solution. If I may add to it, a good algorithm could be:
nav
+3  A: 

If you have a good estimate of the image background, subtracting it from the image with the person is a good first step. But it is only the first step. After that, you have to segment the image, i.e. you have to partition the image into "background" and "foreground" pixels, with constraints like these:

  1. In the foreground areas, the average difference from the background image should be high.
  2. In the background areas, the average difference from the background image should be low.
  3. The areas should be smooth; outline length and curvature should be minimal.
  4. The borders of the areas should have high contrast in the source image.

If you are mathematically inclined, these constraints can be modeled perfectly with the Mumford-Shah functional. See here for more information.
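For reference, a common form of the functional for an image f is E(u, C) = ∫_Ω (u − f)² dx + μ ∫_{Ω∖C} |∇u|² dx + ν · length(C), where u is a piecewise-smooth approximation of f and C is the edge set. The first term rewards fidelity to the data (constraints 1 and 2), the second rewards smoothness away from the edges (constraint 3), and the third penalizes outline length.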

But you can probably adapt other segmentation algorithms to the problem.

If you want a fast and simple (but not perfect) version, you could try this:

  • Subtract the two images.
  • Find the largest connected "blob" of pixels whose background-foreground difference is greater than some threshold. This is a first rough estimate of the "person area" in the foreground image, but the segmentation does not yet meet criteria 3 and 4 above.
  • Find the outline of the largest blob. (EDIT: Note that you don't have to start at the outline. You can also start with a larger polygon, as the steps will automatically shrink it to the optimal position.)
  • Now go through each point in the outline and smooth the outline: for each point, find the position that minimizes the expression c1*L - c2*G, where L is the length of the outline polygon if the point were moved there and G is the gradient at the location the point would be moved to; c1 and c2 are constants that control the process. Move the point to that position. This has the effect of smoothing the contour polygon in areas of low gradient in the source image, while keeping it tied to high gradients (i.e. the visible borders of the person). You can try different expressions for L and G; for example, L could take the length and curvature into account, and G could also take the gradient in the background and subtracted images into account. (A rough sketch of this step appears below.)
  • You probably will have to re-normalize the outline polygon, i.e. make sure that the points on the outline are spaced regularly. Either that, or make sure that the distances between the points stay regular in the step before ("geodesic snakes").
  • Repeat the last two steps until convergence.

You now have an outline polygon that touches the visible person-background border and continues smoothly where the border is not visible or has low contrast. Look up "Snakes" (e.g. here) for more information.
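A rough sketch of the greedy smoothing step from the list above, assuming the outline is an N×2 integer array of (y, x) points and grad_mag is a precomputed gradient-magnitude image (the constants and the search radius are placeholders to tune):

```python
import numpy as np

def snake_step(points, grad_mag, c1=1.0, c2=5.0, radius=2):
    """One greedy pass: move each outline point to the candidate position
    in a small window that minimizes c1*L - c2*G, where L is the length of
    the two outline segments touching the point and G is the gradient
    magnitude at the candidate position."""
    n = len(points)
    new_points = points.copy()
    for i in range(n):
        prev_pt = new_points[(i - 1) % n]
        next_pt = new_points[(i + 1) % n]
        y0, x0 = new_points[i]
        best_cost, best = np.inf, new_points[i].copy()
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = y0 + dy, x0 + dx
                if not (0 <= y < grad_mag.shape[0] and 0 <= x < grad_mag.shape[1]):
                    continue
                # L: length of the outline around this point if it moved here.
                L = np.hypot(*(prev_pt - (y, x))) + np.hypot(*((y, x) - next_pt))
                cost = c1 * L - c2 * grad_mag[y, x]
                if cost < best_cost:
                    best_cost, best = cost, np.array([y, x])
        new_points[i] = best
    return new_points
```

Repeating snake_step (together with the re-normalization pass) until the points stop moving gives the converged outline.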

nikie
Thank you, seriously! You are very good and professional.
carl
+1  A: 

Low-pass filter (blur) the images before you subtract them. Then use that difference signal as a mask to select the pixels of interest. A wide-enough filter will ignore the too-small (high-frequency) features that end up carving out "awful" regions inside your object of interest. It'll also reduce the highlighting of pixel-level noise and misalignment (the highest-frequency information).

In addition, if you have more than two frames, introducing some temporal hysteresis will let you form more stable regions of interest.
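A minimal sketch of the blur-then-subtract mask (the kernel size and threshold are placeholders to tune):

```python
import cv2
import numpy as np

background = cv2.imread("background.png")   # placeholder file names
frame = cv2.imread("person.png")

# Low-pass filter both images so pixel-level noise and slight misalignment
# (the highest-frequency information) don't dominate the difference.
bg_blur = cv2.GaussianBlur(background, (21, 21), 0)
fr_blur = cv2.GaussianBlur(frame, (21, 21), 0)

# Use the difference signal as a mask selecting the pixels of interest.
diff = cv2.absdiff(fr_blur, bg_blur).max(axis=2)
mask = np.where(diff > 25, 255, 0).astype(np.uint8)

foreground = cv2.bitwise_and(frame, frame, mask=mask)
cv2.imwrite("foreground.png", foreground)
```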

Liudvikas Bukys
A: 

Background vs. foreground detection is very subjective; the application scenario defines what counts as background and what counts as foreground. In the application you describe, I take it you are implicitly saying that the person is the foreground. Under that assumption, what you seek is a person detection algorithm. A possible solution is:

  1. Run a Haar feature detector + boosted cascade of weak classifiers (see the OpenCV wiki for details)
  2. Compute inter-frame motion (differences)
  3. If there is a positive face detection for a frame, cluster the motion pixels around the face (kNN algorithm)

Voilà... you should have a simple person detector.
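A sketch of steps 1 and 2 with OpenCV (the frontal-face Haar cascade ships with OpenCV; the kNN clustering of step 3 is only hinted at here, and the file names and threshold are placeholders):

```python
import cv2

# Step 1: Haar feature detector + boosted cascade, using the pretrained
# frontal-face model bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

faces = cascade.detectMultiScale(curr, scaleFactor=1.1, minNeighbors=5)

# Step 2: inter-frame motion as a simple absolute difference.
motion = cv2.absdiff(curr, prev)
moving = motion > 20  # hypothetical motion threshold

# Step 3 would cluster the moving pixels around each positive detection,
# e.g. a kNN-style assignment of motion pixels to the nearest face center.
for (x, y, w, h) in faces:
    count = int(moving[y:y + h, x:x + w].sum())
    print(f"face at ({x}, {y}): {count} moving pixels inside the detection box")
```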

nav