Scenario: You have configured Indexing Service to index your files, which also include scanned images saved as hi-res TIFF files. You also have installed MS Office 2003+ and configured MS Office Document Imaging (MODI) correctly so you can perform OCR on your images and even embed the OCR'd text into TIFFs.

Awesomeness: Indexing Service is able to index and find those TIFFs that you manually OCR'd and re-saved with text data (using the MS Office Document Imaging tool).

Suck(TM): Whatever you do, you cannot make the Indexing Service OCR and index the TIFF files that have no text data. You scour the web, find out how to turn on MODI debugging, and see that CISVC is calling MODI, but somehow nothing seems to happen.

Hack: It turns out that Data Execution Prevention (DEP), which ships with Windows XP SP2, thinks MODI is malicious and refuses to let it do its magic. I was able to remove the Suck(TM) by turning DEP off completely, but (even though DEP hasn't yet saved my posterior that I can remember) I find this solution inelegant (read: the customer's IT manager's gonna fry you if xe hears about it).
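For anyone who wants to check which DEP mode a machine is actually in before (or after) flipping that switch, here is a rough sketch in Python. It assumes the Win32 function GetSystemDEPPolicy in kernel32 is available, which as far as I know means XP SP3 or later, so on a plain SP2 box it will just return None. The helper names (describe_dep_policy, current_dep_policy) are made up for this example, and it only reads the system-wide policy; it does not change anything or tell you whether a specific process is exempt.

```python
import ctypes
import sys

# Values of the DEP_SYSTEM_POLICY_TYPE enumeration returned by GetSystemDEPPolicy.
DEP_POLICIES = {
    0: "AlwaysOff",  # DEP disabled for everything
    1: "AlwaysOn",   # DEP enforced for everything, no exceptions
    2: "OptIn",      # DEP only for system components and opted-in programs
    3: "OptOut",     # DEP for everything except listed exceptions
}


def describe_dep_policy(value):
    """Map a raw policy value to its name (hypothetical helper)."""
    return DEP_POLICIES.get(value, "Unknown (%d)" % value)


def current_dep_policy():
    """Return the system DEP policy name, or None if it cannot be queried."""
    if sys.platform != "win32":
        return None
    try:
        # Assumption: kernel32 exports GetSystemDEPPolicy on this Windows version.
        raw = ctypes.windll.kernel32.GetSystemDEPPolicy()
    except AttributeError:
        return None
    return describe_dep_policy(raw)


if __name__ == "__main__":
    print(current_dep_policy())
```

If this reports AlwaysOff after the hack, that confirms DEP is off for the whole machine and not only for the indexing and OCR components.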

I'll share a better solution here, if I can come across it.

A: 

Hi Ishmaeel, I tried the same thing and hit some of the same limitations. I also found MODI just too slow for indexing large numbers of images.

Leon Bambrick
+3  A: 

There's a hotfix that appears to address this problem.

Greg Hurlman