An open source implementation is preferred.

+1  A: 

The only ones I know of are commercial:

BFO
JPedal

Skunk
+1  A: 

Obviously, this isn't an easy task: PDF formatting is much richer than HTML's (plus you must extract the images and link them, etc.).
Plain text extraction is much simpler (although not trivial...).
In the sidebar of your question I see a similar one, Converting PDF to HTML with Python, which points to a library (poppler, which is apparently written in C++ and could perhaps be accessed through JNI/JNA) and to a related question that offers even more answers.
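
If a JNI/JNA binding turns out to be more than you need, another route is to launch poppler's pdftohtml command-line tool from Java as a subprocess. Below is a minimal sketch of that idea, not a tested integration. It assumes poppler's utilities are installed and pdftohtml is on the PATH. The class and method names (PdfToHtmlSketch, buildCommand, convert) are made up for illustration, and the only option used is -noframes.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: shells out to poppler's pdftohtml instead of binding via JNI/JNA.
final class PdfToHtmlSketch {

    // Builds the argument list: pdftohtml -noframes <input.pdf> <output base name>.
    static List<String> buildCommand(Path pdf, Path htmlOut) {
        return List.of(
                "pdftohtml",          // assumed to be on the PATH
                "-noframes",          // ask for plain HTML without a frameset
                pdf.toString(),
                htmlOut.toString());
    }

    // Runs the tool, forwards its output to this process, and returns its exit code.
    static int convert(Path pdf, Path htmlOut) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf, htmlOut))
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

Calling PdfToHtmlSketch.convert(Path.of("input.pdf"), Path.of("output")) should return 0 on success. How the generated files are named, and how faithful the HTML is, depends on the poppler version, so check the result on your own documents.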

PhiLho