What is the best and easiest way of taking HTML and converting it into a PDF, similar to using CFDOCUMENT in ColdFusion?

UPDATE: I really appreciate all the comments and suggestions that people have given so far. However, I feel that the answers are missing the point.

1) The solution has to be free or open source. One person suggested Prince XML and another PD4ML. Both of these cost money (Prince XML costing an arm and a leg), which I'm not about to fork over.

2) It must be able to take in HTML (from a file, a URL, or a string variable) and then produce the PDF. Libraries like prawn, rpdf, and rtex build PDFs through their own methods rather than taking in HTML.

Please don't think I'm ungrateful for the suggestions. It's just that PDF generation seems like a real problem for people like me who use ColdFusion but want to convert to Rails.

A: 

If you want really good quality (proper layout, CSS, etc.), you need to create a process that embeds a real web browser, renders the page offscreen, and prints it via a PDF printer driver.

Johan Dahlin
Actually, I don't think that would be the best solution, as most websites have a "print" style that removes elements of the site.
rip747
A: 

There is certainly no direct way to do this in Ruby (at least there wasn't when I was trying to do it a couple of months ago). If you really can't avoid generating your PDFs from HTML, you should probably look around for external tools or services for this purpose.

Good luck!

Milan Novota
+1  A: 

Sounds like a job for Prince XML.

stesch
Thanks for the suggestion, but I just can't justify that price. I'm hoping for an open source solution.
rip747
A: 

The closest thing I found as a "solution" to this is by using JRuby and then using The Flying Saucer Project. I just wish that it would get ported over to Ruby so it could be a native solution. God I wish I was better at Java.

rip747
A: 
  • There is RPDF, in which you describe your PDFs as a view.
  • There is also RTEX, which uses (La)TeX to generate PDFs.
mat
A: 

I am trying the library from www.pd4ml.com

Here is a code snippet.

I pull the HTML into the String content. Then I do a couple of things, like replacing the radio buttons with checkboxes, and run it through the pd4ml library to render the PDF.
The result looks pretty much like the original page.

    // Grab the rendered HTML of the page as a string.
    String content = nextPage.generateResponse().contentString();

    // Strip the on-screen "Print" and "Back" labels.
    content = content.replace("Print", "");
    content = content.replace("Back", "");

    // Drop table borders and turn radio buttons into checkboxes.
    content = content.replace("border=\"1\"", "border=\"0\"");
    content = content.replace("radio", "checkbox");

    // Make image URLs absolute so the renderer can fetch them.
    java.net.InetAddress i = java.net.InetAddress.getLocalHost();
    String address = i.getHostAddress() + ":53000";
    content = content.replace("img src=\"/cgi-bin", "img src=\"http://" + address + "/cgi-bin");

    System.out.println(content);

    // Configure the PD4ML renderer.
    PD4ML html = new PD4ML();
    html.setPageSize(new java.awt.Dimension(650, 700));
    html.setPageInsets(new java.awt.Insets(30, 30, 30, 30));
    html.setHtmlWidth(750);
    html.enableImgSplit(false);
    html.enableTableBreaks(true);

    // Render the HTML into an in-memory PDF (baos is declared outside this snippet).
    StringReader isr = new StringReader(content);
    baos = new ByteArrayOutputStream();
    html.render(isr, baos);

    // Hand the PDF bytes to the page that serves them.
    PDFRegForm pdfForm = (PDFRegForm) pageWithName("PDFRegForm");
    pdfForm.baos = baos;
    pdfForm.generateResponse();
A: 

Have you tried looking at html2pdf? The home page is http://html2pdf.seven49.net/Web/

There is also a sourceforge project of it.

Alan

apolinsky
A: 

Not specific to Ruby, but the best solution I've found to this is the open source HTMLDOC.

Travis Beale
+1  A: 

After a lot of blood, sweat, and tears I managed to write a pretty simple Rails app that lets the user write letter templates via TinyMCE. These are then saved and can be viewed as PDF documents.

I decided to restrict the options in the wysiwyg editor as much as possible as some of the options don't work exactly as expected, but that's nothing a little gsub-ing couldn't solve if needed.

You'll need the Ruby gem that wraps HTMLDOC: PDF::HTMLDoc

Once you've got that, register the mime type, then you can do something like:

@letter_template = LetterTemplate.find(params[:id])

respond_to do |format|
  format.html
  format.pdf do
    send_data render_to_pdf(:action => 'show.rpdf', :layout => 'pdf_report'),
              :filename => @letter_template.name + ".pdf",
              :disposition => 'inline'
  end
end
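
The MIME type registration mentioned above is usually a one-liner in an initializer. This is a minimal sketch, assuming the conventional file config/initializers/mime_types.rb and a Rails version that does not already register :pdf (some versions do, in which case this line is unnecessary):

```ruby
# config/initializers/mime_types.rb (assumed location)
# Lets respond_to blocks use format.pdf for application/pdf requests.
Mime::Type.register "application/pdf", :pdf
```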

In the application controller I've added the render_to_pdf method like:

def render_to_pdf(options = nil)
  # Render the view to an HTML string, then hand it to HTMLDOC.
  data = render_to_string(options)
  pdf = PDF::HTMLDoc.new
  pdf.set_option :bodycolor, :white
  pdf.set_option :toc, false
  pdf.set_option :portrait, true
  pdf.set_option :links, false
  pdf.set_option :webpage, true
  pdf.set_option :left, '2cm'
  pdf.set_option :right, '2cm'
  pdf.set_option :footer, "../"
  pdf.set_option :header, "..."
  pdf.set_option :bottom, '2cm'
  pdf.set_option :top, '2cm'
  pdf << data
  # Returns the generated PDF data for send_data.
  pdf.generate
end

You'll find more documentation on the HTMLDOC site that Travis Beale linked to. Hope this helps get you on your way. Unless your documents are really complicated, it should suffice. Let us know how you get on.

tsdbrown
A: 

tsdbrown, what does your rpdf file look like? Are you generating the view content from a string saved in the database?

+6  A: 

Wicked PDF does that (http://github.com/mileszs/wicked_pdf)

Sebastian
You, sir, are the man. EXACTLY what I was looking for!
rip747
A: 

The project may have stagnated, but Pisa (aka XHTML2PDF) is a Python lib which does CSS+HTML to PDF. I used it in a (low-traffic) Rails app a while back by driving it as a command-line tool, and it worked pretty well.
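
For anyone wanting to try the same approach, here is a rough sketch of driving a command-line converter from Ruby by piping HTML to its standard input and reading the PDF from its standard output. The method name convert_with_external_tool is made up for illustration, and the command is left as a parameter because the exact invocation (and whether the tool supports stdin/stdout at all) depends on the converter you pick:

```ruby
require "open3"

# Pipes an HTML string to an external converter and returns whatever the
# converter writes to standard output (expected to be the PDF bytes).
# `command` is an argv array supplied by the caller; no specific converter
# or set of flags is assumed here.
def convert_with_external_tool(html, command)
  output, error, status = Open3.capture3(*command, stdin_data: html, binmode: true)
  raise "converter failed: #{error}" unless status.success?
  output
end
```

In a controller the returned string could be passed to send_data. Passing the command as an array rather than a single shell string avoids shell quoting problems with file names or options.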

Thomas
A: 

HTMLDOC? http://www.htmldoc.org/documentation.php

  • converts HTML to PDF;
  • Free and Open Source (commercial support available);
  • can be used as...
    • ...standalone application,
    • ...for batch document processing,
    • ...or run as a web-based service;
  • interface is CLI or GUI.
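
Since the question is about Rails, the CLI can also be called from Ruby. The following is only a sketch: the helper names htmldoc_command and html_file_to_pdf are invented for this example, it assumes the htmldoc binary is on the PATH, and it uses the --webpage, -t, and -f options as described in the HTMLDOC documentation:

```ruby
# Builds the argv for a file-to-file HTMLDOC run:
# --webpage treats the input as a plain web page (no book structure),
# -t pdf selects PDF output, -f names the output file.
def htmldoc_command(input_path, output_path)
  ["htmldoc", "--webpage", "-t", "pdf", "-f", output_path, input_path]
end

# Runs HTMLDOC and returns the output path. Raises if the command fails
# or the htmldoc binary cannot be found.
def html_file_to_pdf(input_path, output_path)
  system(*htmldoc_command(input_path, output_path)) or
    raise "htmldoc failed or is not installed"
  output_path
end
```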
pipitas