After 10 hours and trying 4 other HTML to PDF tools I'm about ready to explode.

wkhtmltopdf sounds like an excellent solution...the problem is that I can't execute a process with enough permissions from asp.net so...

Process.Start("wkhtmltopdf.exe","http://www.google.com google.pdf");

starts but doesn't do anything.

Is there an easy way to either:

-a) allow asp.net to start processes (that can actually do something) or
-b) compile/wrap/whatever wkhtmltopdf.exe into something I can use from C# like this: WkHtmlToPdf.Save("http://www.google.com", "google.pdf");

+1  A: 

Check this: http://stackoverflow.com/questions/1331926/asp-net-calling-exe/1698839#1698839

MK
Thanks, that answer is actually what I ended up using (kinda). `process.StandardOutput` was empty if I specified a filename other than " -"...If I specified " -" the StandardOutput would return a valid PDF, but it would be a blank page. I had wkhtmltopdf save the file with a GUID...Then I copied the file to the Response.OutputStream after using File.OpenRead(GUID). Closed the file then deleted it. It is not the most performant solution...but it works.
David Murdoch
I eliminated the process of writing pdf to disk then sending it to the client by using: Response.Headers.Add("Content-Disposition", string.Format("attachment;filename={0}.pdf", Guid.NewGuid())); Response.Headers.Add("Content-Type", "application/pdf"); string output = process.StandardOutput.ReadToEnd(); byte[] buffer = process.StandardOutput.CurrentEncoding.GetBytes(output); process.Close(); Response.BinaryWrite(buffer);
MK
@MK. When I did that it would always be a blank PDF page.
David Murdoch
Did you check the generated PDF before sending it to the client? Is it already blank when it is generated? Also, why are you copying the stream? You can simply do: if (System.IO.File.Exists(file)) { Response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}.pdf", Guid.NewGuid())); Response.AddHeader("Content-Type", "application/pdf"); Response.Clear(); Response.BinaryWrite(System.IO.File.ReadAllBytes(file)); Response.End(); }
MK
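A note on the blank-page problem discussed in the comments above: one plausible cause (an assumption, not something confirmed in this thread) is that `process.StandardOutput.ReadToEnd()` decodes the PDF as text, and converting that string back to bytes does not reproduce the original binary data. A minimal sketch of reading the raw bytes instead is below. The class name `WkHtmlToPdfPipe`, the method names and the `exePath` parameter are made up for illustration; the only wkhtmltopdf behavior relied on is the one described above, that `-` as the output file sends the PDF to standard output.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class WkHtmlToPdfPipe
{
    // Copies a stream into a byte array without any text decoding.
    public static byte[] ReadAllBytes(Stream source)
    {
        using (var memory = new MemoryStream())
        {
            byte[] buffer = new byte[32768];
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                memory.Write(buffer, 0, read);
            }
            return memory.ToArray();
        }
    }

    // exePath is hypothetical: the full path to wkhtmltopdf.exe on your server.
    // Returns the PDF bytes, or null on timeout or a non-zero exit code.
    public static byte[] Render(string exePath, string url)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = "\"" + url + "\" -",  // "-" as the output file means stdout
            UseShellExecute = false,           // required for redirection
            RedirectStandardOutput = true,     // stderr is not redirected here, so there is no unread pipe to fill up
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read the raw bytes via BaseStream, not the text-decoding StreamReader.
            // This call blocks until wkhtmltopdf closes stdout, so the timeout
            // below only covers the time after the output has been read.
            byte[] pdf = ReadAllBytes(process.StandardOutput.BaseStream);

            if (!process.WaitForExit(60000))
            {
                process.Kill();
                return null;
            }
            return process.ExitCode == 0 ? pdf : null;
        }
    }
}
```

If this works in your environment, the result could be sent with `Response.BinaryWrite(pdf)` after setting the content type to `application/pdf`, with no temporary file involved.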
A: 

Here is the actual code I used. Please feel free to edit this to get rid of some of the smells and other terribleness...I know it's not that great.

using System;
using System.Diagnostics;
using System.IO;
using System.Web;
using System.Web.UI;

public partial class utilities_getPDF : Page
{
    protected void Page_Load(Object sender, EventArgs e)
    {
        string fileName = WKHtmlToPdf(myURL);

        if (!string.IsNullOrEmpty(fileName))
        {
            string file = Server.MapPath("~\\utilities\\GeneratedPDFs\\" + fileName);
            if (File.Exists(file))
            {
                // copy the stream in 32 KB chunks (thanks to http://stackoverflow.com/questions/230128/best-way-to-copy-between-two-stream-instances-c)
                // the using block closes and disposes the file even if a write fails
                using (var openFile = File.OpenRead(file))
                {
                    byte[] buffer = new byte[32768];
                    int read;
                    while ((read = openFile.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        Response.OutputStream.Write(buffer, 0, read);
                    }
                }

                File.Delete(file);
            }
        }
    }

    public string WKHtmlToPdf(string Url)
    {
        var p = new Process();

        string switches = "";
        switches += "--print-media-type ";
        switches += "--margin-top 10mm --margin-bottom 10mm --margin-right 10mm --margin-left 10mm ";
        switches += "--page-size Letter ";
        // waits for a javascript redirect if there is one
        switches += "--redirect-delay 100";

        // Utils.GenerateGloballyUniqueFileName (helper, not shown) basically returns the given
        // filename with a GUID prepended to it (and checks for some other stuff too)
        string fileName = Utils.GenerateGloballyUniqueFileName("pdf.pdf");

        var startInfo = new ProcessStartInfo
                        {
                            FileName = Server.MapPath("~\\utilities\\PDF\\wkhtmltopdf.exe"),
                            Arguments = switches + " " + Url + " \"" +
                                        "../GeneratedPDFs/" + fileName
                                        + "\"",
                            UseShellExecute = false, // needs to be false in order to redirect output
                            RedirectStandardOutput = true,
                            RedirectStandardError = true,
                            RedirectStandardInput = true, // redirect all 3, as it should be all 3 or none
                            WorkingDirectory = Server.MapPath("~\\utilities\\PDF")
                        };
        p.StartInfo = startInfo;
        p.Start();

        // doesn't work correctly...
        // read the output here...
        // string output = p.StandardOutput.ReadToEnd();

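        // wait up to 60 seconds for exit; WaitForExit returns false on timeout,
        // and reading ExitCode from a process that is still running throws
        if (!p.WaitForExit(60000))
        {
            p.Kill();
            p.Close();
            return null;
        }

        // read the exit code, close process
        int returnCode = p.ExitCode;
        p.Close();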

        // if 0, it worked
        return (returnCode == 0) ? fileName : null;
    }
}
David Murdoch
It isn't "if 0 or 2, it worked". It only works when 0. Other values: 1 (or 8?): generic failure code, the value of EXIT_ERROR. 2: error 404, not found (and an empty PDF). 3: error 401, unauthorized. As per the Unix specification, any value other than 0 returned by a process signals some sort of error. This is true for Windows programs as well.
Christian Sciberras
Thanks, +1 for the correction. The code (and comments) are originally from http://stackoverflow.com/questions/1331926/asp-net-calling-exe/1698839#1698839
David Murdoch
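For anyone who wants to log why a conversion failed, the exit codes listed in Christian Sciberras's comment can be turned into a small lookup. This is only a sketch: the class and method names are made up, and the code meanings are taken from that comment alone, not checked against the wkhtmltopdf documentation, so they may differ between versions.

```csharp
using System;

public static class WkHtmlToPdfExitCodes
{
    // Only 0 counts as success, per the comment above.
    public static bool Succeeded(int exitCode)
    {
        return exitCode == 0;
    }

    // Meanings as reported in the comment above (unverified).
    public static string Describe(int exitCode)
    {
        switch (exitCode)
        {
            case 0: return "success";
            case 1: return "generic failure";
            case 2: return "404 not found (empty PDF)";
            case 3: return "401 unauthorized";
            default: return "unknown error (exit code " + exitCode + ")";
        }
    }
}
```

In the answer's code this could be called with `returnCode` just before the final `return`, for example to write `WkHtmlToPdfExitCodes.Describe(returnCode)` to a log.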