tags:

views:

406

answers:

3

How can I convert a PDF file that contains text into a file that I can search?

+2  A: 

I remember using Apache Lucene some time ago to search inside different types of documents from Java, among them PDF and Word files.

However, the answer depends entirely on the programming language you're using, so if you're not using Java, you might want to specify which one.
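
To give a rough idea of what an indexing library does with the text once it has been pulled out of a PDF, here is a minimal sketch in plain Python. It is not Lucene's API, just a toy inverted index built on the standard library. The names `tokenize`, `build_index`, and `search` and the sample documents are hypothetical, and it assumes the text has already been extracted from the PDF by some other tool.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical text, standing in for what a PDF text extractor would return.
documents = {
    "report.pdf": "Quarterly sales report for the northern region",
    "manual.pdf": "Installation manual for the sales terminal",
}
index = build_index(documents)
print(search(index, "sales"))
print(search(index, "sales report"))
```

A real library adds a lot on top of this (stemming, ranking, on-disk storage), so treat it only as an illustration of the idea.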

Seb
Lucene looks interesting. +1
James McMahon
Thank you! Lucene is also in MacPorts :)
Masi
+2  A: 

You can search a PDF in Adobe Reader.

Programmatically, you may be able to search it through iText, which is available as a Java and .NET library.

I believe you would use the PDF parser class.

James McMahon
Adobe Reader does not convert the text to a searchable form. Does Adobe have any program that does that? I remember that you can easily convert forms in Adobe so that you can edit them.
Masi
I thought you might have just been talking about searching the PDF itself. If you want to do something programmatically, it is a different issue.
James McMahon
+1  A: 

I believe TallPDF allows for extracting text.

Ken H