views: 5444
answers: 3

Hi, I have been working on this for a while and have yet to find a good solution.

I am reading a video frame by frame, using background subtraction to identify the regions where there is movement, and calling cvFindContours() to get the rectangular boundaries of the moving objects.

To keep the program simple, assume there can be only two humans.

These objects move in such a way that they can overlap, turn, and move away from each other at certain intervals.

How can I label these two humans correctly?

cvFindContours() can return the boundaries in an arbitrary order across Frame1, Frame2, Frame3, ..., FrameN.

Initially I can compare the centroids of the bounding rectangles to label each human correctly. But once the humans overlap and then move apart, this approach fails.
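For the frames where the blobs stay separate, the centroid comparison can be sketched like this (a minimal pure-Python illustration, not actual OpenCV code; rectangles are assumed to be (x, y, w, h) tuples and the names are hypothetical):

```python
import math

def centroid(rect):
    """Centre of an (x, y, w, h) bounding rectangle."""
    x, y, w, h = rect
    return (x + w / 2.0, y + h / 2.0)

def match_by_centroid(prev, curr):
    """Give each current rectangle the label of the nearest previous
    centroid.  prev is a dict {label: rect}; curr is the list of
    rectangles returned (in arbitrary order) for the new frame."""
    labels = {}
    taken = set()
    for label, old in prev.items():
        ox, oy = centroid(old)
        best, best_d = None, float("inf")
        for i, rect in enumerate(curr):
            if i in taken:
                continue
            cx, cy = centroid(rect)
            d = math.hypot(cx - ox, cy - oy)
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            labels[label] = curr[best]
            taken.add(best)
    return labels

prev = {"A": (10, 10, 20, 40), "B": (100, 10, 20, 40)}
curr = [(102, 12, 20, 40), (12, 11, 20, 40)]  # arbitrary contour order
print(match_by_centroid(prev, curr))
# {'A': (12, 11, 20, 40), 'B': (102, 12, 20, 40)}
```

This greedy nearest-neighbour matching works while the two blobs are far apart; it is exactly the step that breaks down once the blobs merge.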

I tried keeping track of the pixel colours of the original objects, but the humans are fairly similar and certain areas have similar colours (hands, legs, hair), so that is not good enough.

I was considering using image statistics such as:

CountNonZero(), SumPixels(), Mean(), Mean_StdDev(), MinMaxLoc(), Norm()

to uniquely distinguish the two objects. I believe that would be a better approach.
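The statistics idea can be sketched like this (a hypothetical pure-Python stand-in for cvMean_StdDev(); the pixel values and helper names are invented for illustration):

```python
import math

def mean_stddev(pixels):
    """Plain-Python stand-in for cvMean_StdDev(): mean and standard
    deviation of a flat list of grey-level values from one blob."""
    m = sum(pixels) / len(pixels)
    var = sum((p - m) ** 2 for p in pixels) / len(pixels)
    return (m, math.sqrt(var))

def closer_label(sig, sig_a, sig_b):
    """Return 'A' or 'B' depending on which stored (mean, stddev)
    signature the new blob's statistics are nearer to."""
    da = math.hypot(sig[0] - sig_a[0], sig[1] - sig_a[1])
    db = math.hypot(sig[0] - sig_b[0], sig[1] - sig_b[1])
    return "A" if da <= db else "B"

# Hypothetical grey values sampled from each person before the overlap:
person_a = [30, 32, 35, 33, 31, 34]       # dark clothing, low spread
person_b = [120, 180, 90, 200, 150, 110]  # patterned clothing, high spread
sig_a, sig_b = mean_stddev(person_a), mean_stddev(person_b)

# A blob re-detected after the overlap:
unknown = [29, 33, 36, 30, 32, 35]
print(closer_label(mean_stddev(unknown), sig_a, sig_b))  # A
```

This only disambiguates well if the two people's statistics actually differ; with similar clothing the signatures will be close and the decision unreliable.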

Any suggestions/comments? Thank you.

+1  A: 

You could try to remember one corner of each blob's bounding rectangle (the top left, for example). Then, when you process the next frame, compare the distances of the new corners to the ones previously saved. This is of course not a perfect solution:

  • If both blobs cross paths at some point, the outcome is ambiguous.
  • If both blobs move too fast, this can also lead to unwanted results.
Stefan Schmidt
+1  A: 

Sounds difficult, especially if there is a lot of noise in the video.

Perhaps start by identifying the different cases in which the two humans can interact. Some examples:

  1. Two humans meet, then either reverse course or continue on their heading
  2. Two humans meet, then only one human reverses course or continues on their heading
  3. Two humans meet, then one human remains and the other travels in a direction 'normal' to the camera's view, i.e., away from or toward the camera

Computer Vision textbooks could help in determining other cases.

Consider measuring all of the functions you listed for every frame of the video and graphing the results. From that, determine whether something like the standard deviation of pixel colour within a bounding box can be matched to each human after they cross paths.

Chris Cameron
+4  A: 

This is a difficult problem and any solution will not be perfect. Computer vision is jokingly known as an "AI-complete" discipline: if you solve computer vision, you have solved all of artificial intelligence.

Background subtraction can be a good way of detecting objects. If you need to improve the background subtraction results, you might consider using an MRF. Presumably, you can tell when there is a single object and when the two blobs have merged, based on the size of the blob. If the trajectories don't change quickly during the times the blobs are merged, you can do Kalman tracking and use some heuristics to disambiguate the blobs afterwards.
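A toy sketch of the constant-velocity idea behind that kind of tracking (a real Kalman filter also maintains covariances and blends prediction with noisy measurements; all names and numbers here are hypothetical):

```python
import math

def predict_through_merge(track, n_frames):
    """Constant-velocity extrapolation of a track's centroid for the
    frames during which the blobs are merged and cannot be observed.
    `track` is a list of (x, y) centroids from before the merge."""
    (x1, y1), (x2, y2) = track[-2], track[-1]
    vx, vy = x2 - x1, y2 - y1  # velocity estimated from the last two frames
    return [(x2 + vx * k, y2 + vy * k) for k in range(1, n_frames + 1)]

def relabel_after_merge(pred_a, pred_b, blob1, blob2):
    """When two blobs reappear, assign labels by whichever pairing of
    predictions to observations gives the smaller total distance."""
    d = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    straight = d(pred_a, blob1) + d(pred_b, blob2)
    swapped = d(pred_a, blob2) + d(pred_b, blob1)
    return ("A", "B") if straight <= swapped else ("B", "A")

track_a = [(30, 50), (40, 50)]  # A moving right at 10 px/frame
track_b = [(70, 52), (60, 52)]  # B moving left at 10 px/frame
pa = predict_through_merge(track_a, 3)[-1]  # predicted (70, 50)
pb = predict_through_merge(track_b, 3)[-1]  # predicted (30, 52)

# Blobs re-detected (in arbitrary order) three frames later:
print(relabel_after_merge(pa, pb, (68, 51), (32, 51)))  # ('A', 'B')
```

The heuristic assumes the trajectories do not change much while the blobs are merged, which is exactly the condition stated above.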

Even though the colors are similar between the two objects, you might consider trying to use a mean shift tracker. It's possible that you may need to do some particle filtering to keep track of multiple hypotheses about who is who.
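The core iteration of mean shift is small enough to sketch (a hypothetical pure-Python illustration; a real tracker such as cvMeanShift runs this on the colour-histogram back-projection of each frame, and this sketch does no bounds checking, so the window is assumed to stay inside the map):

```python
def mean_shift(weights, window, iters=10):
    """Repeatedly move an (x, y, w, h) window to the weighted centroid
    of the scores inside it.  `weights` is a 2D list where
    weights[y][x] says how strongly pixel (x, y) matches the target's
    colour histogram (the back-projection)."""
    x, y, w, h = window
    for _ in range(iters):
        sx = sy = total = 0.0
        for j in range(y, y + h):
            for i in range(x, x + w):
                wt = weights[j][i]
                sx += wt * i
                sy += wt * j
                total += wt
        if total == 0:
            break  # window lost the target entirely
        # Re-centre the window on the weighted centroid.
        nx = int(sx / total - w / 2.0 + 0.5)
        ny = int(sy / total - h / 2.0 + 0.5)
        if (nx, ny) == (x, y):
            break  # converged
        x, y = nx, ny
    return (x, y, w, h)

# A 10x10 back-projection with a strong response at (4..6, 4..6):
weights = [[0.0] * 10 for _ in range(10)]
for j in range(4, 7):
    for i in range(4, 7):
        weights[j][i] = 1.0

print(mean_shift(weights, (2, 2, 4, 4)))  # (3, 3, 4, 4) -- centred on the mode
```

Because the window slides toward the densest matching region, it can stay locked on a person even as the blob deforms, provided the two people's colour histograms differ enough.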

There are also some even more complicated techniques called layered tracking. There is some more recent work by Jojic and Frey, by Winn, by Zhou and Tao, and by others. Most of these techniques come with very strong assumptions and/or take a lot of work to implement correctly.

If you're interested in this topic in general, I highly recommend taking a computer vision course and/or reading a textbook such as Ponce and Forsyth's.

Mr Fooz