Does anyone know of a .NET programmable/usable API for reading an image file, and comparing it to an existing set of images?

e.g. I have three pictures of the letters A, B, and C. I then copy the picture of A, and modify it so that it is flipped 180 degrees. I'd like to be able to have a piece of software that detects that it is a match to the existing letter A.

I appreciate any help!

EDIT: Replace the letters A, B, and C with images of an apple, an orange, and a banana. This isn't so much about recognizing alphanumeric characters as it is about comparing shapes/images.

EDIT #2: Imagine it as a way to detect the result of rolled dice. Imagine a standard pipped, six-sided die. When you roll a six, there are six dots. It could land any way, but what I'd like to try to do is have a camera take a picture of the die and compare it with control images to detect the value.

A: 

If you have Office installed, you can use its OCR component.

Brann
I've used Office OCR, and it was buggy and painful.
boj
Will this work with non-alphanumeric images? The letters were just an example. You could replace A, B, and C with a picture of a banana, an orange, and an apple.
80bower
80bower: no, it won't.
Brann
+2  A: 

What you seem to be looking into (no pun intended) is part of the computer vision research programme. What exactly characterizes your target images? Are you only looking for stencil matches of exactly the same size but different orientations? Slight displacements cause significantly different pixel patterns for fine and slanted features. Should rescalings be recognized? What about slanted perspectives? Your edit suggests you are actually into much more difficult areas: full image recognition requires full AI. From what perspectives would you expect to recognize a banana, under what lighting conditions, and what kinds of bananas -- green, ripe, slightly squished from having sat on it... I hope you get my point!

Don't get me wrong: this is fun stuff, but it requires heavy artillery. Whatever libraries you find will help you with the heavy lifting in linear algebra and statistics, but you need to know a lot to apply them.

For more light-hearted reading (comparatively!), my introduction to the area came with Hofstadter's Gödel, Escher, Bach, and his Metamagical Themas on recognizing letter shapes. That got me interested in typography, too: I never knew there were so many ways of drawing a lower-case 'a'!

Pontus Gagge
Imagine it as a way to detect the result of rolled dice. Imagine a standard pipped, six-sided die. When you roll a six, there are six dots. It could land any way, but what I'd like to try to do is have a camera take a picture of the die and compare it with control images to detect the value.
80bower
You really should have put this in your question. It is very difficult to answer your question when the exact background to it is unknown.
RobS
A: 

Here is an open source Image Recognition program. It is in beta. It might be a start for you. From the description:

Search for images based on the characteristics of the image itself, very fast searching once the images are loaded. Shows a list of results sorted by likeness.

You can also search for duplicates within a library of images.

Khadaji
+1  A: 

In one of your comments above you mention that you want to detect die rolls using a camera system.

There are several approaches to this problem; here are two:

1) Very simple approach. Do circle detection using the Hough transform on the pictures of your die faces and count the number of circles. You'll know approximately the size of the pips on the dice, so that should help you set up the Hough algorithm.

2) Complex approach. Get images of each face of your die, compute the 2D Fourier transform of each, and extract the power spectrum (then collapse it across orientation). The power spectrum will give you a signature for each of the die faces independent of the orientation of the die relative to the camera. You can compare these signature power spectra with those from the die rolls. The closest match should be your pip count.
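For what it's worth, approach 2 might look roughly like the sketch below. It is pure Python rather than .NET and uses a naive DFT on tiny synthetic images, just to show the idea. The names (`power_spectrum`, `radial_signature`, `make_face`, `rotate90`, `closest`), the 16x16 image size, the pip positions and the six radial bins are all assumptions made for illustration, not part of any real library.

```python
import cmath
import math

def power_spectrum(img):
    """Naive 2D DFT power spectrum of a square image (list of rows)."""
    n = len(img)
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]  # roots of unity
    # The 2D DFT is separable: transform along x first, then along y.
    rows = [[sum(img[y][x] * w[(u * x) % n] for x in range(n))
             for u in range(n)] for y in range(n)]
    return [[abs(sum(rows[y][u] * w[(v * y) % n] for y in range(n))) ** 2
             for u in range(n)] for v in range(n)]

def radial_signature(img, bins=6):
    """Collapse the power spectrum across orientation into radial bins."""
    n = len(img)
    power = power_spectrum(img)
    max_r = math.hypot(n // 2, n // 2)
    sums = [0.0] * bins
    for v in range(n):
        for u in range(n):
            # Map DFT indices to signed frequencies so r is the distance from DC.
            fu = u if u <= n // 2 else u - n
            fv = v if v <= n // 2 else v - n
            r = math.hypot(fu, fv)
            sums[min(int(r / max_r * bins), bins - 1)] += power[v][u]
    total = sum(sums) or 1.0
    return [s / total for s in sums]  # normalise away overall brightness

def make_face(n, pips):
    """Hypothetical test image: black n x n square with 2x2 white pips."""
    img = [[0.0] * n for _ in range(n)]
    for py, px in pips:
        for dy in (0, 1):
            for dx in (0, 1):
                img[py + dy][px + dx] = 1.0
    return img

def rotate90(img):
    """Rotate a square image by a quarter turn."""
    n = len(img)
    return [[img[n - 1 - x][y] for x in range(n)] for y in range(n)]

def closest(signature, templates):
    """Name of the template whose signature is nearest (sum of abs diffs)."""
    return min(templates, key=lambda name: sum(
        abs(a - b) for a, b in zip(signature, templates[name])))

ONE = make_face(16, [(7, 7)])
SIX = make_face(16, [(3, 4), (7, 4), (11, 4), (3, 10), (7, 10), (11, 10)])
templates = {"one": radial_signature(ONE), "six": radial_signature(SIX)}
rolled = rotate90(SIX)  # pretend the die landed a quarter turn round
print(closest(radial_signature(rolled), templates))
```

On a pixel grid the signature is only exactly rotation-independent for quarter turns; for arbitrary angles, real lighting and camera noise it will be approximate, and you would want an FFT library instead of the naive DFT used here.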

Hope this helps a little.

RobS
Thanks for the ideas. I'm gonna look around for a library that can help with these.
80bower
+1  A: 

If I understand your question (esp. Edit 2) correctly, you want to search for circular patterns in a digital image from a camera or scanner.

As RobS already said, the Hough transform and template matching in the Fourier spectrum are good ways to do this. You will probably find lots of libraries for the Hough transform or FFT, but I'm not sure you'll be able to use one of those without actually understanding the algorithms. For example: the standard Hough transform only works for lines, so it has to be adapted for circles. Also, it needs some kind of preprocessing to find the edges of the circle. It has a few parameters (such as the size of the internal parameter space) that are hard to adjust if you don't know what they mean.
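To make those points concrete, here is a minimal pure-Python sketch of a Hough transform adapted for circles of one known radius. It assumes the image is already binarized (1 = pip, 0 = background). The function names, the crude edge detector, the voting tolerance `tol`, the `min_fraction` threshold and the synthetic test face are all illustrative assumptions, and they are exactly the kind of parameters you would have to tune for real images.

```python
import math

def edge_pixels(binary):
    """Crude edge detector: foreground pixels that touch background."""
    h, w = len(binary), len(binary[0])
    edges = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w) or not binary[ny][nx]:
                    edges.append((y, x))
                    break
    return edges

def hough_circles(binary, radius, tol=1.0, min_fraction=0.7):
    """Return estimated centres (y, x) of circles with the given radius."""
    h, w = len(binary), len(binary[0])
    reach = int(math.ceil(radius + tol))
    # Offsets lying roughly `radius` away: each edge pixel votes for these.
    ring = [(dy, dx)
            for dy in range(-reach, reach + 1)
            for dx in range(-reach, reach + 1)
            if abs(math.hypot(dy, dx) - radius) <= tol]
    acc = [[0] * w for _ in range(h)]  # parameter space: one cell per centre
    for y, x in edge_pixels(binary):
        for dy, dx in ring:
            cy, cx = y + dy, x + dx
            if 0 <= cy < h and 0 <= cx < w:
                acc[cy][cx] += 1
    # Assumed threshold: a fraction of the ideal circumference in pixels.
    min_votes = min_fraction * 2 * math.pi * radius
    centres = []
    while True:
        votes, cy, cx = max((acc[y][x], y, x)
                            for y in range(h) for x in range(w))
        if votes < min_votes:
            break
        centres.append((cy, cx))
        # Pips cannot overlap, so clear everything within one diameter.
        for y in range(h):
            for x in range(w):
                if math.hypot(y - cy, x - cx) <= 2 * radius:
                    acc[y][x] = 0
    return centres

def draw_face(h, w, pip_centres, radius):
    """Hypothetical test image: filled discs on an empty background."""
    return [[1 if any((y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2
                      for cy, cx in pip_centres) else 0
             for x in range(w)] for y in range(h)]

SIX_CENTRES = [(8, 8), (18, 8), (28, 8), (8, 22), (18, 22), (28, 22)]
face = draw_face(36, 30, SIX_CENTRES, radius=3)
print(len(hough_circles(face, radius=3)))  # number of pips found
```

A full circle Hough transform would also search over the radius, which adds a third dimension to the accumulator; fixing the radius keeps this sketch small because the pip size is roughly known in advance.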

If you can binarize your image, i.e. if the circular patterns you're looking for are significantly brighter or darker than the background, it might be simpler to

  • Binarize the image
  • Group connected areas of pixels, a.k.a. blobs (e.g. using flood fill)
  • Decide whether a blob is one of your patterns by comparing some of its characteristics (e.g. total area, number of boundary pixels, average brightness, average contrast) to the kind of pattern you'd expect

None of these subproblems (binarization, segmentation, pattern matching) is simple or even solvable in general, but if your problem is simple enough, you might just get away with a few very simple algorithms.
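As a rough illustration of the three steps above, here is a small pure-Python sketch. It assumes a grayscale image given as a list of rows with bright pips on a dark background, a fixed global threshold, and blob area as the only characteristic checked; the function names and all the numbers in the example are made up for the demonstration.

```python
def binarize(gray, threshold):
    """Step 1: 1 where the pixel is brighter than the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]

def find_blobs(binary):
    """Step 2: group 4-connected foreground pixels using flood fill."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            stack, blob = [(y, x)], []
            while stack:  # iterative flood fill, avoids deep recursion
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            blobs.append(blob)
    return blobs

def count_pips(gray, threshold, min_area, max_area):
    """Step 3: keep only blobs whose area looks like a pip, and count them."""
    blobs = find_blobs(binarize(gray, threshold))
    return sum(1 for blob in blobs if min_area <= len(blob) <= max_area)

# Made-up 12x12 example: three 2x2 pips plus one speck of noise.
gray = [[20] * 12 for _ in range(12)]
for py, px in ((2, 2), (5, 5), (8, 8)):
    for dy in (0, 1):
        for dx in (0, 1):
            gray[py + dy][px + dx] = 200
gray[10][1] = 255  # single bright pixel, too small to be a pip
print(count_pips(gray, threshold=128, min_area=3, max_area=6))  # prints 3
```

With a real camera image you would likely need a smarter threshold and a few more blob characteristics (e.g. roundness) to reject glare and shadows.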

Niki
Thanks, I'd forgotten that it is the generalized Hough transform that is required. But I think the highly constrained pattern matching required here should make the problem relatively easy to solve. Binarizing the image is a great start, so +1.
RobS
I'd probably go with the Hough transform, too, as it doesn't require that the circles are (much) brighter than the background. But it's pretty advanced for a beginner, especially GHT with 3-5 degrees of freedom. Finding/measuring blobs is much easier conceptually.
Niki