How hard would it be to take an image of a predefined object and develop an algorithm that cuts just that object out of a photo whose background varies in complexity?

Further to this, the photo's object (say a house, car, or dog, but always of one type) would need to be transformed into a 3D render. I know there are 3D rendering engines available (at a cost, free, or with some clause), but for this to work the object (subject) would need to be measured in all sorts of ways - e.g. if it is a person, we need to measure height, the curvature of the shoulders, the radius of the face, the length of each finger, etc.

How feasible would solving this problem be? Does anyone know any good links specialising in this research area? I've seen open-source solutions to this problem, which leaves me with the question of how easy it is to measure the object while tracing around it to crop it out.

Thanks

+2  A: 

It sounds like you want to do several things, all in the domain of computer vision.

  1. Object recognition (i.e. find the predefined object)
  2. 3D reconstruction (make the 3D model from the image)
  3. Image segmentation (cut out just the object you care about from the background)

I've ranked them in order of easiest to hardest (according to my limited understanding). All together I would say it is a very complicated problem. I would look at the following Wikipedia links for more information:

Computer Vision Overview (Wikipedia)

The Eight Point Algorithm (for 3d reconstruction)

Image Segmentation

Carlos Rendon
Image segmentation is not difficult, if you can assume what you're segmenting obeys certain rules. The example I remember from school was grains of rice being photographed as they proceeded along a conveyor belt. Through segmentation, their size and quality could be measured.
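In that kind of constrained setting the whole pipeline can be a threshold followed by connected-component labelling. A toy pure-Python sketch, assuming bright grains on a dark belt (the data and names are illustrative):

```python
def label_regions(image, threshold):
    """Segment bright objects from a dark background: threshold the
    pixels, then flood-fill 4-connected components and return the
    size (pixel count) of each region found."""
    h, w = len(image), len(image[0])
    fg = [[pix > threshold for pix in row] for row in image]
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if fg[y][x] and not seen[y][x]:
                stack, size = [(y, x)], 0
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    size += 1
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and fg[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sizes
```

The region sizes are exactly the kind of per-grain measurement the conveyor-belt example needs.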
Charlie Salts
I agree that segmentation can be simple in a highly constrained environment such as a factory with very homogeneous objects. But in general the problem of segmenting objects is difficult.
Carlos Rendon
A: 

Assuming it's possible at all, it would be extremely difficult, especially with only one image of the object. The reconstruction algorithm has to guess at the depth and distances of objects.

What you describe sounds very similar to Microsoft PhotoSynth.

tsilb
A: 

It's certainly possible. I've seen solutions that recognise similar images (several solutions in different languages), and cutting them out can be done with AI (I've seen solutions for that too). The hardest part would be getting measurements, because real-world objects simply aren't made up of a series of simple shapes like lines, squares, etc.

I guess this would itself require AI. The system would take a measurement such as the wheelbase (in my context, the subject is a car), but it needs to be clever enough to take that measurement without running into the wheels of the car.

I can use more than one image per object (car), and because of the complexity I don't mind making some compromises, such as marking where the measurements are taken. For a non-modified car these measurements would be the same anyway. Of course, it will be interesting to try this without user input, from a learning point of view.
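For the marked-measurements idea, the vision problem reduces to scaling a pixel distance by a known reference length. A toy sketch (the point pairs and the reference length are made up for illustration):

```python
import math

def measure(p1, p2, ref_p1, ref_p2, ref_length_m):
    """Convert the pixel distance between two marked points (p1, p2)
    into metres, using a second pair of marked points whose real-world
    separation ref_length_m is known (e.g. a standard wheelbase)."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    scale = ref_length_m / dist(ref_p1, ref_p2)  # metres per pixel
    return dist(p1, p2) * scale
```

This assumes both point pairs lie roughly in the same plane parallel to the camera; perspective distortion would otherwise need correcting first.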

dotnetdev
A: 

You're right, this is an extremely hard set of problems, particularly that of inferring 3D information from a 2D image. Only a very limited understanding exists of how our visual system extrapolates 3D information from 2D images. One such approach is known as "Shape from Shading", and the linked Google search shows how much (and consequently how little) we know.

Rob

RobS
A: 

Something I am confused about.

Essentially I want to take a 2D image (a typical example: http://benmartin3d.com/WIP/Project1/image1small.jpg), which is easier than a complex photo containing multiple objects, etc.

But effectively I want to turn that into a 3D image, so wouldn't what I want to do involve building a 3D rendering/modelling engine?

Furthermore, at the link I have provided, the image goes into 3ds Max, a few properties are set, and a render is made.

dotnetdev
A: 

This is a very difficult task. The hardest part is not recognising or segmenting the object from the image, but rather inferring the 3-D geometry of the object from the 2-D image. You will have more success if you can use a stereoscopic camera (or a laser scanner, if you have access to one ;).

For the case of 2-D images, try googling for "shape-from-shading". This is a method for inferring 3-D shape from a 2-D image. It does make assumptions about illumination conditions and surface properties (BRDF and geometry) that may fail in many cases, but if you are using it for only a predefined class of objects (e.g. human faces) it can work reasonably well.
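To see why a second viewpoint helps, here is a toy block-matching sketch along a single scanline: the disparity (horizontal shift of a feature between the left and right views) is inversely proportional to depth. Purely illustrative, with made-up synthetic data; real stereo pipelines also rectify the images first:

```python
def disparity_1d(left, right, max_disp, win=1):
    """Toy stereo matching on one scanline: for each pixel of the left
    row, find the shift d into the right row that minimises the sum of
    absolute differences over a small window.  Depth is proportional
    to (baseline * focal length) / disparity."""
    n = len(left)
    disp = []
    for x in range(n):
        best_d, best_cost = 0, float('inf')
        for d in range(min(max_disp, x) + 1):
            cost = 0
            for k in range(-win, win + 1):
                xl, xr = x + k, x - d + k
                if 0 <= xl < n and 0 <= xr < n:
                    cost += abs(left[xl] - right[xr])
            if cost < best_cost:
                best_cost, best_d = cost, d
        disp.append(best_d)
    return disp
```

Textureless regions match ambiguously at many shifts, which is one reason real stereo (and 3D reconstruction generally) is hard.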

A: 

I'm still confused.

I can feed a 2D image (like that blueprint) to a 3D rendering app and get a render, which is exactly what I need.

So it seems that, to do what I am trying to do, I pretty much need to write a 3D modelling and rendering system.

Or am I wrong?

BTW I did google shape from shading. :)

dotnetdev
You're looking for a CAD system. Your original question is quite confusing, which is why you've been getting the answers you have. And all the answers you keep posting should be comments on, or edits to, your question.
RobS