Does anybody have experience with different fonts for OCR? I am generating an ID and then trying to scan it with Tesseract. At the moment I am just going through different fonts by trial and error, which seems pretty inefficient. I've tried the OCR* family of fonts, and various others such as Arial and Georgia. Tesseract tends to get confused by the OCR* fonts.

Is there any font specifically designed for tesseract, or any system font which works well with it?

A: 

I'd probably use the same font that banks use for the routing numbers at the bottom of checks:

http://morovia.com/font/micr.asp

It was specifically designed to be unambiguously machine-readable.

benjismith
Huh? Why the downmod? Not even an explanatory comment?
benjismith
MICR was designed for ideal reading with magnetic technology, not optically. While it is not bad, it is far from ideal for OCR.
Sparr
There was some entertaining stuff relating to MICR in the movie, "Catch Me If You Can".
erickson
It also needs to support alphanumeric characters.
Chris Lloyd
+1 not worth wasting a down-vote on
MusiGenesis
Who would buy a MICR font? http://sandeen.net/GnuMICR/
Joe Koberg
Tesseract-OCR is not trained out-of-the-box for working with MICR fonts, though that could be done...
sventech
+4  A: 

Okay, a Google search turns up this, a font made specifically for OCR: OCR Font

Looks like it's a standard adopted in 1973.

McWafflestix
I should have been more specific, read the updated question.
Chris Lloyd
+2  A: 

I have always had success simply using Times New Roman.

David
Yes, a Roman font should yield good results. Make sure the image is grayscale or bitonal, at between 200 and 300 dpi. But you would probably be better off training the engine for a limited domain (alphabet/words) for this type of use case.
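To make the resolution and limited-alphabet advice concrete, here is a minimal sketch that only builds a Tesseract command line. It assumes a Tesseract 4.1 or later CLI (older releases spell the option `-psm`, and the whitelist may be ignored by some engine versions), and it assumes the IDs use only uppercase letters and digits. Note that a character whitelist is a lighter-weight stand-in for the training suggested above, not the same thing. The helper name `build_tesseract_command` is hypothetical.

```python
import string

# Assumption: IDs consist of uppercase letters and digits only.
ID_ALPHABET = string.ascii_uppercase + string.digits


def build_tesseract_command(image_path, output_base, dpi=300, alphabet=ID_ALPHABET):
    """Return an argv list for the tesseract CLI. Nothing is executed here."""
    if not 200 <= dpi <= 300:
        raise ValueError("dpi is outside the 200-300 range suggested above")
    return [
        "tesseract", image_path, output_base,
        "--dpi", str(dpi),   # declare the resolution of the input image
        "--psm", "7",        # page segmentation mode 7: a single line of text
        "-c", f"tessedit_char_whitelist={alphabet}",  # restrict recognized characters
    ]
```

The returned list could be passed to `subprocess.run(..., check=True)` once the image has been saved as grayscale or bitonal.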
sventech
+1  A: 

I find that Calibri works best for me. We use OCR software daily in an automated system, and after testing dozens of fonts (including some OCR-specific ones) we found that Calibri is consistently the best.
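For anyone who wants to run a similar comparison instead of eyeballing results, here is a rough sketch of one way to score fonts. It assumes you have already collected (expected ID, OCR output) pairs per font; the function names and the sample data are made up for illustration, and the similarity measure (`difflib.SequenceMatcher`) is just one reasonable choice, not what our system uses.

```python
from difflib import SequenceMatcher


def similarity(expected, recognized):
    """Ratio in [0, 1]; 1.0 means the OCR output matches the ID exactly."""
    return SequenceMatcher(None, expected, recognized).ratio()


def rank_fonts(results):
    """results maps a font name to a list of (expected, recognized) pairs.

    Returns (font, mean similarity) tuples sorted best first.
    Fonts with no samples are skipped.
    """
    scores = {
        font: sum(similarity(e, r) for e, r in pairs) / len(pairs)
        for font, pairs in results.items()
        if pairs
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented sample data, only to show the shape of the input.
sample = {
    "FontA": [("AB12", "AB12"), ("ZX90", "ZX90")],
    "FontB": [("AB12", "A812"), ("ZX90", "2X9O")],
}

ranked = rank_fonts(sample)
```

With real data, each pair would come from rendering an ID in the candidate font and running it through the OCR engine.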

Good luck.

Chris