I have an image, taken from a live webcam, and I want to be able to detect a specific object in the image and extract that portion of it to do some further processing.

Specifically, the image would be of a game board, let's say for the purposes of this question that it's a Sudoku game board. Here is a sample image.

My initial approach was to look for contrasting areas and work it out from there, but I seem to end up with a lot of potential edges (many erroneous) and no real clue as to how to work out which ones are the ones I actually want!

Are there any algorithms, libraries, code samples, or even just bright ideas out there, as to how I would go about finding and extracting the relevant part of the image?

+2  A: 

You need to apply filter operations and masks to the image.

I don't think there is a simple way to just fetch an object from an image. You need to use edge-detection algorithms and clipping, and set criteria for what counts as a valid object.

You can also use image thresholding to detect the object. Here is an article and algorithm from Stanford University.
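As a rough illustration of the thresholding idea, here is a minimal sketch using Otsu's method. This is a swapped-in, well-known technique and not the Stanford algorithm linked above. It assumes an 8-bit grayscale image stored as a list of rows, and it assumes the object (the grid lines and digits) is darker than the background; the function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates dark from bright pixels.

    Otsu's method: try every threshold and keep the one that maximises the
    between-class variance of the two resulting groups.
    """
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue            # no pixels at or below t yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break               # every pixel is already in the dark class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, threshold):
    """Mark pixels at or below the threshold as 1 (assumed: object is dark)."""
    return [[1 if value <= threshold else 0 for value in row] for row in pixels]
```

A single global threshold like this tends to struggle with uneven webcam lighting, so in practice a locally adaptive threshold may be needed.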

You may want to look at the image processing libraries below.

  1. Filters API for C, C++, C#, Visual Basic .NET, Delphi, Python
  2. http://www.catenary.com/
  3. CImg, which is richer than the libraries above, although it is written in C++
Sun
+1  A: 

One of the (I guess many possible) approaches:

  1. Find a filter that "gets/calculates" straight lines (edges, etc.) from a given image.

  2. Now you have the collection (array) of all the lines (xStart,yStart & xEnd,yEnd). You can easily calculate all the line-lengths from the coordinates.

  3. Now, assuming you can always (!) expect one biggest square or rectangle inside the image, it should be fairly easy to find the Sudoku rectangle, calculate its region, and crop it from the image for further processing.
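Steps 2 and 3 could be sketched roughly as follows. This assumes some line filter has already produced segments as ((xStart, yStart), (xEnd, yEnd)) pairs and that the longest lines really do belong to the board outline. The function names and the `keep_ratio` parameter are made up for illustration.

```python
import math


def segment_length(segment):
    """Euclidean length of a ((x0, y0), (x1, y1)) segment."""
    (x0, y0), (x1, y1) = segment
    return math.hypot(x1 - x0, y1 - y0)


def board_bounding_box(segments, keep_ratio=0.8):
    """Bounding box (x_min, y_min, x_max, y_max) of the longest segments.

    Segments shorter than keep_ratio times the longest one are treated as
    noise. keep_ratio is an illustrative guess, not a tuned value.
    """
    lengths = [segment_length(s) for s in segments]
    longest = max(lengths)
    kept = [s for s, length in zip(segments, lengths)
            if length >= keep_ratio * longest]
    xs = [point[0] for s in kept for point in s]
    ys = [point[1] for s in kept for point in s]
    return min(xs), min(ys), max(xs), max(ys)


def crop(pixels, box):
    """Cut the box (inclusive coordinates) out of a list-of-rows image."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max + 1] for row in pixels[y_min:y_max + 1]]
```

Note that a long line outside the board, such as a table edge, would break this, which is exactly where the "always expect one biggest rectangle" assumption matters.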

EDIT: Solving/programming this kind of problem is always challenging BUT really interesting at the same time :).

+3  A: 

Use the free AForge.NET image processing library for this. There's a ton of cool stuff to play with.

Mladen Prajdic
Just because someone gives you a hammer doesn't mean you can build a house. Having tools is helpful, but you have to know what you are going to do with the tool.
kigurai
A: 

You could try first to find the bold line intersections and use them as registration marks.

This would be a good start because:

  • They're pretty uniformly shaped
  • You know how many there are
  • You know where (roughly) they should be in relation to each other
  • Can tolerate scale variations

So:

  1. Apply an edge filter
  2. Scan a mask* of what the ideal + should look like across the image, recording all that are a good match
  3. Choose the set that matches your expectations best, according to location relative to one another
  4. You now also know where the numbers should be, so you can easily extract them.

* A more sophisticated solution would be to use a Neural Net instead of a mask to recognise the intersections. This might be worth it since you're probably going to use one for the OCR of the numbers.
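To make step 2 a bit more concrete, here is a minimal sketch of scanning a "+" mask across an image. It assumes the edge filter and a threshold have already produced a binary image (1 = line pixel) in which the grid lines are about one pixel wide. The names, the mask size, and the `min_score` cut-off are all illustrative assumptions.

```python
def plus_mask(size):
    """size x size binary mask with a one-pixel-wide '+' through the centre."""
    centre = size // 2
    return [[1 if (r == centre or c == centre) else 0 for c in range(size)]
            for r in range(size)]


def match_score(image, mask, top, left):
    """Fraction of mask cells that agree with the image patch at (top, left)."""
    n = len(mask)
    agree = sum(1 for r in range(n) for c in range(n)
                if image[top + r][left + c] == mask[r][c])
    return agree / (n * n)


def find_intersections(image, size=5, min_score=0.95):
    """Return (row, col) centres of every window that closely matches the '+'."""
    mask = plus_mask(size)
    height, width = len(image), len(image[0])
    half = size // 2
    hits = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            if match_score(image, mask, top, left) >= min_score:
                hits.append((top + half, left + half))
    return hits
```

On a real webcam frame the lines will be thicker and slightly rotated, so you would probably need a softer score (for example normalised correlation on grayscale values) and several mask sizes. Step 3, choosing the set of hits that fits the expected layout, is not covered here.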

pufferfish
+1 for idea, -1 for suggesting a NN
kigurai
Images are noisy, and neural nets handle noisy data well. That's why they're used in OCR. That said, I prefer your Harris detector suggestion... maybe as an input to a NN? ;-)
pufferfish
+2  A: 

This was featured in a Coding4Fun blog entry that you might find helpful. This also means a second vote for the AForge library, since the author uses it in the example.

chsh
+1  A: 

You might try using the Hough Transform.
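For anyone unfamiliar with it, here is a bare-bones sketch of the line-detecting Hough Transform. It assumes you already have a list of (x, y) edge pixels and uses the usual parameterisation rho = x*cos(theta) + y*sin(theta). The function names are hypothetical, and a real library implementation will be far faster.

```python
import math


def hough_accumulate(points, n_theta=180):
    """Vote in (theta, rho) space for every line through every edge point.

    Returns a dict mapping (theta_index, rho) to a vote count, where
    theta = pi * theta_index / n_theta and rho is rounded to whole pixels.
    """
    # Precompute the trigonometry once per angle step.
    trig = [(math.cos(math.pi * i / n_theta), math.sin(math.pi * i / n_theta))
            for i in range(n_theta)]
    votes = {}
    for x, y in points:
        for index, (cos_t, sin_t) in enumerate(trig):
            rho = int(round(x * cos_t + y * sin_t))
            key = (index, rho)
            votes[key] = votes.get(key, 0) + 1
    return votes


def strongest_lines(votes, count):
    """Return the (theta_index, rho) cells with the most votes."""
    return sorted(votes, key=votes.get, reverse=True)[:count]
```

Neighbouring cells often collect nearly as many votes as the true peak, so some form of non-maximum suppression is usually needed before treating the strongest cells as distinct lines. For a Sudoku grid you would then look for two groups of roughly parallel lines about 90 degrees apart.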

plinth
+1  A: 

I would start by using a corner detector (the Harris detector works nicely) to find the intersections and corners of the Sudoku grid.
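A minimal sketch of the Harris response, to show what the detector computes. It assumes a grayscale image stored as a list of rows of floats and uses a plain box window instead of the Gaussian weighting most implementations use. The constant k = 0.04 is a commonly quoted default, and the function names are hypothetical.

```python
def harris_response(img, k=0.04, radius=1):
    """Harris corner response R = det(M) - k * trace(M)^2 for each pixel.

    M is the structure tensor: products of image gradients summed over a
    (2 * radius + 1) square window. R is large and positive at corners,
    negative along straight edges, and near zero in flat regions.
    """
    height, width = len(img), len(img[0])
    # Central-difference gradients; border pixels keep a gradient of zero.
    ix = [[0.0] * width for _ in range(height)]
    iy = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            sxx = syy = sxy = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


def strongest_corners(resp, count):
    """Return the (row, col) positions with the highest response."""
    scored = [(resp[y][x], (y, x))
              for y in range(len(resp)) for x in range(len(resp[0]))]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [position for _, position in scored[:count]]
```

On a real image you would also suppress non-maximal neighbours so that each grid intersection yields a single point.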

Then I would use those points to do an image rectification to transform the image to have the grid as rectangular as possible. Now you should have no trouble finding each square to do OCR.

Image rectification is not simple and entails quite a lot of math.

Be prepared to do some reading :)
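To give a flavour of that math, here is a minimal sketch of one common way to do the rectification: estimate a projective transform (homography) from the four outer corners of the grid to the corners of an upright square. It assumes the four corners have already been found and matched in order, and the function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography(src, dst):
    """3x3 matrix (bottom-right entry fixed to 1) mapping 4 src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a single (x, y) point through the homography."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

To actually warp the image you would normally compute the transform in the other direction (square to photo) and, for each pixel of the output square, sample the photo at the mapped position.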

If the images of the game boards are already close to rectangular you can of course skip the rectification part and directly use the corner points to find your squares for OCR.

A lot of people have been suggesting Neural Networks. I am quite certain that throwing a neural network at this problem is totally unnecessary. NNs are (sometimes) good if you need to classify objects where the definition of the object is vague. "Find cars in image" is a problem that could have use for a Neural Network, since cars can look very different but share some features. Thus, given enough data, you can train your NN to detect cars. In this problem you have something that is very regular and always looks almost the same, so a NN will not make anything easier or better.

kigurai
+1 for "Be prepared to do some reading :)"
nikie
A: 

Without rejecting any of the other ideas, step 1 really should be the detection of the image rotation. You can do this by determining the local gradient direction at each point and creating a histogram of those directions. The histogram will have 4 major components at 90 degree offsets. Ideally, these would be at 0, 90, 180 and 270 degrees; if they're not, you should rotate your image. E.g. in the sample image you should start with a rotation of about 8 degrees clockwise.
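A minimal sketch of that histogram, assuming a grayscale image stored as a list of rows of numbers. Since the four peaks are expected 90 degrees apart, this version folds every gradient direction into the range 0 to 90 degrees and looks for a single peak, weighting each vote by gradient strength. The function names and the synthetic test pattern are made up for illustration.

```python
import math


def dominant_grid_angle(img, bins=90):
    """Estimate the grid's rotation offset in degrees, in [0, 90).

    Builds a magnitude-weighted histogram of gradient directions folded
    modulo 90 degrees and returns the lower edge of the strongest bin.
    """
    height, width = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat pixel, no direction to vote for
            angle = math.degrees(math.atan2(gy, gx)) % 90.0
            hist[int(angle * bins / 90.0) % bins] += magnitude
    peak = max(range(bins), key=hist.__getitem__)
    return peak * 90.0 / bins


def synthetic_grid(size, period, angle_deg):
    """Smooth grid pattern rotated by angle_deg, only for trying this out."""
    a = math.radians(angle_deg)
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            u = x * math.cos(a) + y * math.sin(a)
            v = -x * math.sin(a) + y * math.cos(a)
            row.append(min(math.cos(2 * math.pi * u / period),
                           math.cos(2 * math.pi * v / period)))
        img.append(row)
    return img
```

The angle is measured in image coordinates (y pointing down), so check the sign convention of whatever rotation routine you use before correcting the image. On a noisy webcam frame, smoothing the image first and ignoring weak gradients should help.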

MSalters