I'm implementing an IFilter for indexing image formats. One problem is photos: many users have huge collections of them, the files are large, and loading each one and searching it for text is time-consuming.

Yes, sometimes people use cameras instead of scanners for digitizing documents, but IMO the potential problems far outweigh the benefit of handling the occasional document digitized with a camera. So my implementation will not extract text from photos at all.

What should the IFilter do once it detects that a given file is a photo image - indicate an error or return empty text?

+1  A: 

If a Word filter didn't handle tracked changes, it wouldn't throw an error; it would just skip them. Even though in your case you're skipping entire files, it's the same principle. This is not an error condition. Just return no text.
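As a rough illustration of "just return no text": in the IFilter protocol, the caller keeps asking for chunks until `GetChunk` reports `FILTER_E_END_OF_CHUNKS`, so a filter that reports this on the very first call has, in effect, returned an empty document without signalling a failure. The sketch below models only that behavior. It is not a real COM implementation: `ImageFilterSketch`, `ImageKind`, `HResult`, `kOk` and `kFilterEndOfChunks` are hypothetical stand-ins chosen so the example compiles without the Windows SDK, and the real `GetChunk` also fills in a `STAT_CHUNK` structure, which is omitted here.

```cpp
#include <cstdint>

// Stand-ins for the Windows SDK definitions (assumption: a real filter
// would use HRESULT, S_OK and FILTER_E_END_OF_CHUNKS from the SDK headers).
using HResult = std::uint32_t;
constexpr HResult kOk = 0;
constexpr HResult kFilterEndOfChunks = 0x80041700u;  // value of FILTER_E_END_OF_CHUNKS

// Hypothetical result of whatever photo-detection heuristic the filter uses.
enum class ImageKind { Document, Photo };

// Simplified model of the chunk-enumeration part of an IFilter.
class ImageFilterSketch {
public:
    explicit ImageFilterSketch(ImageKind kind) : kind_(kind) {}

    // Models IFilter::GetChunk: kOk means "a chunk is available",
    // kFilterEndOfChunks means "nothing (more) to index".
    HResult GetChunk() {
        // A photo yields no chunks at all: the indexer sees an empty
        // document rather than an error.
        if (kind_ == ImageKind::Photo) {
            return kFilterEndOfChunks;
        }
        // A scanned document yields a single text chunk in this sketch.
        if (chunkReturned_) {
            return kFilterEndOfChunks;
        }
        chunkReturned_ = true;
        return kOk;
    }

private:
    ImageKind kind_;
    bool chunkReturned_ = false;
};
```

With this shape, the indexer treats a skipped photo exactly like any other file whose content has been fully enumerated; `Init` would still succeed as usual.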

Jeremy Stein