Hi,

I would like a library (preferably free) that can read characters from an image and convert them to text. The input is not a document but a set of bitmaps, each containing exactly one character; the library should return an ASCII character for each bitmap.

The input bitmaps will be binarized, and their size can vary from 10*10 pixels to about 100*100 pixels.

Thanks for any help provided.

Update

The program builds successfully, but when I run it, it exits at this line:

tessnet2.Tesseract tessocr = new tessnet2.Tesseract();
+2  A: 

Google's tesseract-ocr library has a .NET wrapper, tessnet2, that you can use (http://www.pixel-technology.com/freeware/tessnet2/).
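For reference, here is a minimal sketch of calling the wrapper on one single-character bitmap, following the usage pattern shown in the tessnet2 documentation. The file name char.bmp, the tessdata path, the "eng" language, and the whitelist are assumptions for illustration only. The project must reference the tessnet2 DLL that matches the bitness of the process.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

class SingleCharOcr
{
    static void Main()
    {
        // Hypothetical input: one binarized bitmap holding a single character.
        using (Bitmap image = new Bitmap("char.bmp"))
        {
            tessnet2.Tesseract ocr = new tessnet2.Tesseract();

            // Optional: restrict recognition to the characters you expect
            // (this particular set is an assumption).
            ocr.SetVariable("tessedit_char_whitelist",
                            "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");

            // Assumed location of the tessdata folder holding the English data files.
            ocr.Init(@"C:\tessdata", "eng", false);

            // Rectangle.Empty means "recognize the whole image".
            List<tessnet2.Word> result = ocr.DoOCR(image, Rectangle.Empty);
            foreach (tessnet2.Word word in result)
                Console.WriteLine("{0} : {1}", word.Confidence, word.Text);
        }
    }
}
```

Each returned Word carries the recognized text and a confidence value, so for single-character input you would normally read the first entry of the list.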

maranas
Thanks for your help. I already tried this wrapper, but without success. The documentation says "add a reference of the assembly Tessnet2.dll to your .NET project", but there is no Tessnet.dll file among the bin files.
mouthpiec
You have to choose which DLL to use: tessnet2_32.dll is for 32-bit platforms, while tessnet2_64.dll is for 64-bit platforms.
maranas
A: 

My company, Atalasoft, has a couple of .NET OCR engines that you can work with, including a .NET wrapper for tesseract.

Be aware that most good OCR engines try to use greater context for matching (words with dictionary lookup). Without that surrounding context, you're likely to get very low-confidence results.

plinth
Thanks, but I need the DLLs so that I can write the code myself.
mouthpiec
A: 

If you're OK with calling a web API to do the OCR, take a look at http://www.webservius.com/corp/docs/wisetrend.pdf. It uses an advanced OCR engine on the back end. You can test the recognition quality by emailing test images to [email protected] (you will get the OCR results back by email).

Eugene Osovetsky