What is the best way to convert PDF documents to HTML so they can be viewed in the browser? The site has several PDF documents, and a visitor should be able to click "View as HTML" and see the document rendered on screen as an HTML page.

Standard website running PHP, Linux, Apache.

A: 

There are lots of options:

http://www.google.com/search?hl=en&q=php+pdf+to+html

Matt Lacey
Sorry, I searched a lot but found nothing good enough.
ToughPal
Moreover, most of the Google results were for HTML-to-PDF converters, not PDF-to-HTML.
ToughPal
Are you saying you want to write a converter yourself? Is there a reason you can't use an existing converter and want to repeat the effort of creating one?
Matt Lacey
No, I don't want to write my own converter. Any open source or free PDF-to-HTML converter that can be run on PDF documents will do.
ToughPal
A: 

In Acrobat, click File/Export/HTML, and then choose which version of HTML you want.

William Leara
Yes, but how can you do this on the website, when users send you arbitrary PDF files?
ToughPal
+1  A: 

Have you considered keeping the PDF data in a database and then dynamically generating either the PDF or the HTML page, depending on what the visitor selects?

Ian Jacobs
+1  A: 

If you have command-line access at your hosting provider, there is a utility called pdftohtml in the poppler-utils package.

http://poppler.freedesktop.org/

It looks quite easy to use. I have not called it from PHP, but it should work.
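A minimal sketch of the command-line call, assuming pdftohtml is installed and on the PATH. The function names html_path_for and convert_pdf and the example paths are hypothetical, made up for illustration. The -noframes option asks pdftohtml for a single HTML file instead of a frameset.

```shell
#!/bin/sh
# Hypothetical helper: derive the HTML output path for a PDF,
# e.g. /srv/pdfs/report.pdf -> <outdir>/report.html
# (only a lowercase ".pdf" suffix is stripped).
html_path_for() {
    pdf=$1
    outdir=$2
    name=$(basename "$pdf" .pdf)
    printf '%s/%s.html\n' "$outdir" "$name"
}

# Hypothetical helper: convert one PDF into a single HTML file.
# Returns non-zero if pdftohtml is missing or the input file does not exist.
convert_pdf() {
    pdf=$1
    outdir=$2
    if ! command -v pdftohtml >/dev/null 2>&1; then
        echo "pdftohtml not found (it ships with the Poppler utilities)" >&2
        return 127
    fi
    if [ ! -f "$pdf" ]; then
        echo "no such file: $pdf" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1
    out=$(html_path_for "$pdf" "$outdir")
    # -noframes: write one HTML file rather than a frameset
    pdftohtml -noframes "$pdf" "$out"
}

# Example call (paths are placeholders):
#   convert_pdf /srv/pdfs/report.pdf /srv/html
```

From PHP, the same command could presumably be run with exec(), passing each file path through escapeshellarg() first so that uploaded file names cannot inject shell commands. That part is untested here.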

Kevin K
+1  A: 

pdftohtml works fine: it is fast and stable, but the HTML output is ugly at best. I used it for quite some time on a website that hosts many job resumes.

It is a good solution for extracting textual content, however.

I would give the Scribd API a try:

http://www.scribd.com/developers/api

or the Google Apps document API. Google does a great job of displaying and converting PDF files.

Alexis Perrier