I'm implementing an IFilter for indexing image formats. One problem is photos - many users have tons of photos, photos are huge and loading and searching for text on them is time consuming.
Yes, sometimes people use cameras instead of scanners for digitizing documents, but the potential problems IMO far outweight the possibility of encountering a document digitized with a photo camera. So my implementation will not extract text from photos at all.
What should the IFilter do once it detects that a given file is a photo image - indicate an error or return empty text?