document-conversion

How can I generate previews of documents?

I'm working on a file-sharing website and need a way to take screenshots of the uploaded documents. The site will support several file formats, from plain text to office documents (doc, xls, ppt, ...), videos (mpeg, avi, ...), images (jpg, gif, png, ...), PDFs, OpenOffice, etc. Each document needs to have a "preview" of it, the good pa...

Doc conversion using OpenOffice SDK

I need to allow users to export the .doc files they upload to a variety of formats. I got started with the OO SDK and also set up some custom filters using XSLT. Everything works well, and I am able to export Word docs to PDF, etc. However, I want to run this as a web service. I wish to run this conversion se...

What Java framework might I use to provide a robust document conversion service?

I am starting a new open source project to develop an application that will provide services to convert various documents into other formats (e.g. doc -> html, pdf -> html, plain text -> html, etc.). It will use many other open source tools to facilitate the document conversion. I am looking for a framework that I can use for this p...

Convert pdf, doc, ppt to html5

I've googled (without any luck) for open source software that can convert doc, ppt, and pdf to HTML5 (exactly what Scribd does). Are there open source equivalents to the type of conversion Scribd does? If anyone knows of a paid service, that would also work. Scribd has an API, but that's for use with the Flash viewer. Also, I would like...

Issue with PPT conversion using python-Django

I was just trying to convert a PPT using the Python code from http://code.google.com/p/qifei/wiki/PDFConverter, and I could see the same thing happening with the command-line option too (python documentconverter.py /home/rajeev/Desktop/Downloads/Industry2.ppt /home/rajeev/Desktop/test.pdf). It appears that the image overlaps on some ...

Convert Word document with MergeFields to PDF with form fields

I have a document template in Word .doc format. The Word document contains merge fields that need to be populated dynamically. I need to convert the Word document to a PDF with form fields. This PDF can then be populated from our Java application quite easily with iText. The problem I am experiencing is when I try to convert the Word ...