I've been interested in machine learning and computer vision for a while, so I've decided to attempt to build a simple Optical Character Recognition demo in C#.

I'm looking for a description of some common OCR algorithms and how I would go about implementing them in C#. It's a learning exercise so I'm not looking for an OCR library.

Any information would be appreciated, thanks.

+3  A: 

I've been interested in cracking CAPTCHAs (though I haven't had time to start writing anything yet). These are some bookmarks I was planning on starting with:

hypoxide
+9  A: 

OCR is a very broad field that includes things like image normalization (histogram equalization, color removal), feature extraction (textures, line segments, edge detection), and pattern classification / machine learning (neural networks, support vector machines, etc.). You'll probably need to implement at least some form of each of these stages (normalize, extract features, do machine learning).

It sounds like you want to play around, write some algorithms and learn about OCR. If that's the case, there's a wide literature on the subject that you can get at if you have access to academic journals (if not, go to the nearest university and spend a day making photocopies or printing things out).

This is a decent (if dated) survey:

Impedovo, S., Ottaviano, L., Occhinegro, S. Optical character recognition: a survey. Int. J. Pattern Recog. Artif. Intell., Vol. 5, no. 1-2, pp. 1-24, 1991.

That survey and IEEE PAMI would be good places to start.

Or, a Google Scholar search turns up quite a lot: http://scholar.google.com/scholar?hl=en&lr=&safe=off&client=firefox-a&q=optical+character+recognition&btnG=Search

You might also look in Duda, Hart, and Stork under "Tangent Distance" for a good example of a distance metric that is (was?) used in digit recognition.

You're going to need some data to play with, so instead of writing the numbers 0-9 4000 times each and scanning them, there's a UCI data set with all the numbers:

http://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits
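To give a feel for working with that data, here is a minimal sketch of reading one row. It assumes the preprocessed files from that page (`optdigits.tra` / `optdigits.tes`), where each line holds 64 comma-separated pixel counts in the range 0 to 16 followed by the digit label. The function names are made up for illustration, and Python is used only for brevity; the same logic ports to C# almost line for line.

```python
from typing import List, Tuple


def parse_digit_row(line: str) -> Tuple[List[int], int]:
    """Split one comma-separated row into (64 pixel values, class label)."""
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != 65:
        raise ValueError(f"expected 65 values, got {len(values)}")
    return values[:64], values[64]


def to_grid(pixels: List[int], width: int = 8) -> List[List[int]]:
    """Reshape the flat pixel list into rows of `width` values."""
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# A made-up row for illustration (not copied from the real data set):
# 63 empty cells, one fully inked cell, then the label 7.
sample_row = ",".join(["0"] * 63 + ["16", "7"])
pixels, label = parse_digit_row(sample_row)
grid = to_grid(pixels)
```

Reading a whole file is then just a loop over its lines, collecting `(pixels, label)` pairs.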

For a quick first start, try histogram-equalizing the images, then calculate tangent distances and do some clustering, then train a simple pattern classification algorithm on the resulting features.
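The histogram equalization step can be sketched roughly as below. This is the textbook CDF-remapping version for a flat list of grayscale values, not necessarily the exact variant any particular paper uses. The function name and the `levels` parameter are assumptions for illustration, and it is written in Python only for brevity (it translates to C# directly).

```python
from typing import List


def equalize_histogram(pixels: List[int], levels: int = 256) -> List[int]:
    """Remap gray values so their cumulative distribution is roughly linear."""
    if not pixels:
        return []

    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: cdf[v] = number of pixels with value <= v.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Every pixel has the same value; there is nothing to stretch.
        return list(pixels)

    # Stretch the CDF so the darkest level maps to 0 and the brightest to levels - 1.
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


# Four low-contrast values get spread across the full 0..255 range.
equalized = equalize_histogram([10, 20, 30, 40])
```

For the UCI digits, whose values run from 0 to 16, you would call it with `levels=17`.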

Enjoy.

Pete
Like Lance said below, you can simplify my suggested quick start using K-NN over the tangent-distance metric:
1) hist-eq the images
2) Calculate tangent distances
3) K-NN classification
Pete
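As a rough illustration of the K-NN step, here is a minimal sketch with a pluggable distance function. Note the assumption: it uses plain Euclidean distance as a stand-in, not tangent distance, which is more involved (see Duda, Hart, and Stork). The function names and toy data are made up for illustration, and Python is used only for brevity; a C# version would look much the same.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Straight-line distance; a placeholder for a better metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(
    query: Vector,
    training: List[Tuple[Vector, int]],
    k: int = 3,
    distance: Callable[[Vector, Vector], float] = euclidean,
) -> int:
    """Return the majority label among the k training samples closest to query."""
    nearest = sorted(training, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D data: two samples of class 0 near the origin, two of class 1 near (5, 5).
training = [([0, 0], 0), ([0, 1], 0), ([5, 5], 1), ([6, 5], 1)]
prediction = knn_classify([5, 6], training, k=3)
```

Swapping in another metric only means passing a different `distance` function, so the classifier itself does not change when you move from Euclidean to tangent distance.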
+4  A: 

This looks interesting: Basic OCR in OpenCV, using a K-nearest-neighbor algorithm for classification.

Lance Richardson
Good call on the K-NN for classification. Simple to implement, simple to understand, and should work well for digits.
Pete