Hi folks,

I am using MODI (Microsoft Office Document Imaging) to read TIFF images and work with the recognised text. Some images work fine, but other TIFF images always cause the method

OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true)

to fail. I have researched this and tried different variations, such as passing false, false for the last two parameters. I have also tried SYSDEFAULT instead of English, but I still get the error. Can anyone please tell me why it would fail on some TIFF images and not on others?

I have done some research and found this answer:

One possible cause is MODI trying to process a file without any recognisable text. A blank document, or one which has only drawings/scribbles and is effectively blank, will cause this exception.

Obviously this is not good enough, as there is no way I can have an app that decides to OCR some images and not others. I handle the exception, but the OCR object is then not initialised, so I can't do what I need to do from there.

This is a bloody nightmare! Why can't the method just do its bloody job and, if the image has some unreadable pages, just ignore them? I am using Windows 7 Ultimate and Office 2007 Ultimate.

The Visual Studio version is 2008.

Thanks,

IW

A: 

Hi!

I know it is not the answer you are expecting, but if you really want working OCR, why don't you try a commercial SDK with decent documentation, samples, and a support service?

One of the most advanced, though not cheap, is ABBYY: http://www.abbyy.com/ocr_sdk/

Best regards, Andrey

Tomato
And to add to that, the ABBYY engine is actually accessible on-demand on a pay-per-page basis through a web API: http://www.wisetrend.com/wisetrend_ocr_cloud.shtml
Eugene Osovetsky