I am looking to take a PDF and extract any text from it. I then want to make it available using Coldfusion's available Verity search to search the contents.
Are there any libraries out there that do this quite well already? I am including Java or .NET (Java prefered) libraries in the scope since they can be called from CF.
Any insights or experiences would be greatly appreciated... thanks!
Edit: Indexing PDF files works when the text is embedded in the PDF as far as I know with CF. The PDFs I'm having to deal with have the text scanned as an image.