I need to extract information from hundreds of résumés. Ideally the tool would convert .doc, .docx, .pdf, and .rtf files to HR-XML, but since more than 90% of the résumés are .doc, support for the other formats is not a must-have.
I'm looking to buy a third-party tool or a component.
Have you had any good or bad experiences solving a similar problem?
Clarification: I'm not looking to use MS Indexing Service, Lucene, or any other search indexing engine. The problem is not that straightforward: the layout and format differ from one résumé to the next, so simple indexing won't do.