How can I convert a PDF file that contains text into a file I can search?
+2
A:
I remember using Apache Lucene some time ago to search inside different types of documents from Java, among them PDF and Word files.
However, this question entirely depends on the programming language you're using, so if you're not using Java you might want to specify it.
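To give a rough idea of what a full-text index does, here is a minimal sketch in plain Java. It is not Lucene's API: the `TextIndex` class and its `add` and `search` methods are hypothetical names made up for illustration, and it assumes the text has already been extracted from the PDF or Word file by some other tool. Lucene does this kind of term-to-document bookkeeping far more thoroughly (ranking, phrase queries, and so on).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene or any PDF library.
class TextIndex {
    // Maps each lowercase term to the names of the documents containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index text that has ALREADY been extracted from a document (assumption).
    void add(String docName, String text) {
        // Split on anything that is not a letter or a digit.
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Return the documents containing the given single term (case-insensitive).
    Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(hits);
    }

    public static void main(String[] args) {
        TextIndex index = new TextIndex();
        index.add("report.pdf", "Quarterly revenue grew in Europe.");
        index.add("notes.doc", "Meeting notes: revenue forecast.");
        System.out.println(index.search("Revenue")); // prints [notes.doc, report.pdf]
        System.out.println(index.search("europe"));  // prints [report.pdf]
    }
}
```

The index only handles single-word lookups, so treat it as a picture of the idea rather than a replacement for a real search library.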
Seb
2009-03-16 15:57:37
Lucene looks interesting. +1
James McMahon
2009-03-16 16:03:01
Thank you! Lucene is also in MacPorts :)
Masi
2009-04-06 15:08:55
+2
A:
You can search PDF through Adobe Reader.
Programmatically, you may be able to search it through iText, which is available as a Java and .NET library.
I believe you would use the PDF parser class.
James McMahon
2009-03-16 15:58:31
Adobe Reader does not convert the text to a searchable form. Does Adobe have any program that does that? I remember that you can easily convert forms in Adobe so that you can edit them.
Masi
2009-03-16 17:15:25
I thought you might have just been talking about searching the PDF itself. If you want to do something programmatically, it is a different issue.
James McMahon
2009-03-16 17:53:11