Hi guys

I need to be able to convert dynamic HTML (HTML that is rendered on page load by JavaScript) to a PDF. I know there are plenty of HTML-to-PDF converters, but none of the ones I have found so far cope with dynamic HTML.

The given tool should be able to successfully convert the following page - http://www.simile-widgets.org/timeline/

Cheers Anthony

UPDATE:

I don't need the JavaScript functionality here, i.e. I don't need to be able to interact with the screen. I just want the final rendering of the screen to be captured in the PDF, like taking a photo after the page has loaded. In the example I provided, the JavaScript is only rendering divs to the screen, so it's nothing a converter shouldn't be able to handle as long as it "lets" the "page" render first.

A: 

Try xhtml2pdf. Here's the project page at python.org.

ghoppe
He wants a solution that will understand changes made to the page by JavaScript.
Matthew Flaschen
I can't see where I can test a live HTML page and it doesn't say much about javascript or dhtml.
vdh_ant
Sorry, there is a link to the python project page for Pisa. I've changed my link.
ghoppe
The question still remains... can it handle the case that I have described... i.e. dhtml generated via JavaScript on load...
vdh_ant
A: 

You could use a javascript: URI to alert the current DOM, e.g.:

javascript:alert("<html>" + document.documentElement.innerHTML + "</html>")

Copy the HTML and save to a file.
Then run it through the HTML2PDF converter.

Sean Hogan
It needs to be an automated process... i.e. the user clicks a button and they can download a report...
vdh_ant
If you mean a button in the page (rather than the browser) then you obviously control the site so you can use XMLHttpRequest to POST the HTML to the server and run it through the converter on the server.
Sean Hogan
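
A rough sketch of the approach described in the comment above, not a tested implementation. It assumes a hypothetical server endpoint, `/report/pdf`, that accepts raw HTML and runs it through whatever converter you choose. The function names are made up for illustration, and the XMLHttpRequest constructor is passed in only so the sketch can be exercised outside a browser.

```javascript
// Serialize the DOM as it stands right now, i.e. after the page's
// scripts have finished adding their elements.
function snapshotHtml(doc) {
  return "<!DOCTYPE html>\n" + doc.documentElement.outerHTML;
}

// POST the snapshot to the server. The URL is whatever endpoint your
// server exposes; "/report/pdf" in the usage note is hypothetical.
function postSnapshot(doc, url, XhrCtor) {
  const Xhr = XhrCtor || XMLHttpRequest; // falls back to the browser's own
  const xhr = new Xhr();
  xhr.open("POST", url, true);
  xhr.setRequestHeader("Content-Type", "text/html; charset=utf-8");
  xhr.send(snapshotHtml(doc));
  return xhr;
}
```

In the page this would be called from the button's click handler, e.g. `postSnapshot(document, "/report/pdf")`. Relative URLs for stylesheets and images in the snapshot will only resolve if the converter is told the page's base URL, so that part probably needs extra handling on the server.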
+1  A: 

There is no way it can be done. The interfaces available for scripts in PDF are extremely limited compared to the full DOM and BOM access you enjoy in a web browser. Such interaction as you can achieve in PDF is not readily translatable from how it works in a browser and would almost certainly need hand authoring.

Your example page has many effects that PDF, as an essentially static document layout format, simply cannot reproduce at all.

Edit:

I just want the final rendering of the screen to be captured in the PDF

Ah, OK, that's a far easier and more common problem then.

In that case you'll have to use and automate a real web browser (like Firefox), or a toolkit that provides all the logic of a web browser (like WebKit), then either:

  • export to PDF, either using built-in tools like ‘Print to file’ in Firefox (with background images/colours turned on) or one of the PDF export add-ons, or

  • take an image snapshot of the browser (and include the image in a PDF if you have to)

See these questions for some discussion of browser snapshotting.
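
As one possible way to automate the snapshot option, here is a minimal Node.js sketch that drives Firefox from the command line. It assumes a Firefox build recent enough to have headless mode and a `firefox` binary on the PATH; the function names and default window size are just illustrative. Firefox takes the screenshot once the page has loaded, so content that scripts add later may be missed.

```javascript
const { execFile } = require("node:child_process");

// Build the argument list for a headless Firefox screenshot.
// The default window size is an arbitrary choice for this sketch.
function firefoxScreenshotArgs(url, outPng, width = 1280, height = 1024) {
  return [
    "--headless",
    "--screenshot", outPng,
    `--window-size=${width},${height}`,
    url,
  ];
}

// Run Firefox; the callback receives (error, stdout, stderr).
function captureScreenshot(url, outPng, callback) {
  execFile("firefox", firefoxScreenshotArgs(url, outPng), callback);
}
```

For example, `captureScreenshot("http://www.simile-widgets.org/timeline/", "timeline.png", console.log)` should leave a PNG that can then be embedded in a PDF.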

bobince
Not true. Flash can be added to pdf files now. However I am not aware of a javascript to actionscript/flash conversion path. :)
ghoppe
A: 

The fact that it uses any JavaScript at all means a lot of converters won't work. The JavaScript may be simple, but you still need an interpreter to handle it.

I haven't used it myself, but you might try wkhtmltopdf. It uses the WebKit rendering engine, and I believe it includes full JavaScript support. You would need to be able to install the software and run the executable, but otherwise it should be fairly straightforward.
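
A minimal sketch of calling it from a server-side script, assuming Node.js and a `wkhtmltopdf` binary on the PATH. The function names and the 2-second delay are assumptions for illustration; `--javascript-delay` asks wkhtmltopdf to wait that many milliseconds for scripts to finish before rendering, and the right value depends on the page.

```javascript
const { execFile } = require("node:child_process");

// Build the wkhtmltopdf argument list: options first, then input URL,
// then output file. The default delay is a guess, not a recommendation.
function wkhtmltopdfArgs(url, outPdf, jsDelayMs = 2000) {
  return ["--javascript-delay", String(jsDelayMs), url, outPdf];
}

// Run the converter; the callback receives (error, stdout, stderr).
function renderPdf(url, outPdf, callback) {
  execFile("wkhtmltopdf", wkhtmltopdfArgs(url, outPdf), callback);
}
```

Something like `renderPdf("http://www.simile-widgets.org/timeline/", "timeline.pdf", console.log)` would be the equivalent of running `wkhtmltopdf --javascript-delay 2000 <url> timeline.pdf` by hand.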

BrianS