I need to write directly to the SharePoint search index after uploading a .tif file to a document library. We have a custom OCR process that works quite well for scanned images, but we need to get the OCR results into the search index for the corresponding document in SharePoint. I know that SharePoint has a crawler that indexes the files, but what I am after is more of a forced index update, either through SharePoint's web services or by connecting to its SQL Server database.

+1  A: 

For this kind of scenario, I would suggest three different approaches:

  1. Develop a custom IFilter for your files. It calls your OCR code and reports the result back as search properties. Install the IFilter, and the crawler will process your files and pick up the OCR result automatically.
  2. Hook into the ItemUpdated list item event, call your OCR routines, and update the item properties with the result. Let the crawler do its work normally.
  3. Create a timer job that loops through all the files in the document library, downloads each file, calls the OCR, and updates the properties of the document library item based on the result. Let the crawler do its work normally.
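
The loop behind approach 3 can be sketched roughly as follows. This is only an illustration in Python, not SharePoint code: `LibraryItem`, `apply_ocr_to_library`, the `run_ocr` callback, and the `OcrText` property name are all hypothetical stand-ins for the real document library items, the timer job body, your OCR routine, and whatever column you choose to store the text in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class LibraryItem:
    """Hypothetical stand-in for a document library item (not a SharePoint type)."""
    name: str
    content: bytes
    properties: Dict[str, str] = field(default_factory=dict)


def apply_ocr_to_library(
    items: Iterable[LibraryItem],
    run_ocr: Callable[[bytes], str],
    property_name: str = "OcrText",  # assumed column name, pick your own
) -> int:
    """Run OCR on each unprocessed TIFF and store the text as an item property.

    Returns the number of items that were updated.
    """
    updated = 0
    for item in items:
        # Only scanned images are of interest here.
        if not item.name.lower().endswith((".tif", ".tiff")):
            continue
        # Skip items that already carry a non-empty OCR result.
        if item.properties.get(property_name):
            continue
        item.properties[property_name] = run_ocr(item.content)
        updated += 1
    return updated
```

In a real timer job, the items would come from the document library and the property write would be an item update, after which the crawler indexes the new property value on its next pass.
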
Magnus Johansson