Hello everybody,
I am using Xpdf to extract text from PDF files, which works well with the -raw
option. Now we want to convert the PDF files to HTML so that we can extract the HTML formatting tags, such as bold <b> and italics <i>, along with the text. Xpdf with the -html
option does not work. I have also tried pdf2html for this but did not find it reliable, as tags like <sup> and <sub> were missing.
We are now using Acrobat Reader to save the PDF files as HTML files, which gives us all the HTML formatting tags.
Is there a way to use Acrobat Reader in Perl to save multiple PDF files as HTML files?
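To show the kind of batch loop I mean, here is a minimal sketch for the pdftotext -raw step only. It assumes the pdftotext binary from Xpdf is on the PATH and that the output should land next to each PDF; the sub names pdftotext_command and convert_all are made up for illustration. I am looking for something equivalent for the HTML export.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Glob qw(bsd_glob);
use File::Spec;

# Build the argument list for one pdftotext call (hypothetical helper).
# -raw keeps the text in content-stream order.
sub pdftotext_command {
    my ($pdf) = @_;
    my ($name, $dir) = fileparse($pdf, qr/\.pdf$/i);
    my $txt = File::Spec->catfile($dir, "$name.txt");
    return ('pdftotext', '-raw', $pdf, $txt);
}

# Convert every *.pdf in a directory; returns the number of failures.
sub convert_all {
    my ($dir) = @_;
    my $failed = 0;
    for my $pdf (sort(bsd_glob(File::Spec->catfile($dir, '*.pdf')))) {
        my @cmd = pdftotext_command($pdf);
        # The list form of system() bypasses the shell, so file names
        # with spaces need no extra quoting.
        if (system(@cmd) != 0) {
            warn "pdftotext failed for $pdf\n";
            $failed++;
        }
    }
    return $failed;
}

# Process each directory named on the command line (none means no work).
convert_all($_) for @ARGV;
```

Called as "perl convert.pl somedir" (the script and directory names are placeholders), this would write one .txt file per PDF in that directory.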
Thank you.