Is there any java library for converting document from pdf to html? | ansaurus

Q:

Is there a Java library for converting documents from PDF to HTML?

An open-source implementation is preferred.

+1 A:

The only ones I know of have to be paid for.

Skunk 2008-12-11 11:08:28

+1 A:

Obviously, it isn't an easy task: PDF formatting is much richer than HTML's (plus you must extract the images and link them, etc.).
Simple text extraction is much simpler (although not trivial...).
In the sidebar of your question I see a similar question, Converting PDF to HTML with Python, which points to a library (poppler, which is apparently written in C++ and can perhaps be accessed through JNI/JNA) and to a related question that offers even more answers.
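
As a rough illustration of reaching poppler from Java, here is a minimal sketch that avoids JNI/JNA and instead runs poppler's `pdftohtml` command-line tool as a child process. It assumes poppler-utils is installed and `pdftohtml` is on the PATH; the class name `PdfToHtmlSketch` and its methods are hypothetical, chosen only for this example, and the quality of the resulting HTML depends entirely on the tool.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class PdfToHtmlSketch {

    // Builds the command line for poppler's pdftohtml tool (assumed to be on the PATH).
    // -noframes asks for a single HTML document instead of a frameset.
    static List<String> buildCommand(Path pdf, Path html) {
        return List.of("pdftohtml", "-noframes", pdf.toString(), html.toString());
    }

    // Runs the converter and returns its exit code (0 normally means success).
    static int convert(Path pdf, Path html) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf, html))
                .inheritIO() // forward the tool's messages to this program's console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: PdfToHtmlSketch <input.pdf> <output.html>");
            return;
        }
        int exitCode = convert(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("pdftohtml exited with code " + exitCode);
    }
}
```

Calling an external process keeps the Java side free of native bindings, at the cost of requiring the tool on every machine that runs the program. A JNI/JNA binding to the poppler library itself would remove that process hop but takes considerably more work to set up.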

PhiLho 2008-12-11 12:59:35
