I have a client who has been struggling with slow-loading PDF files on the web. The client has some very large PDF files, almost 10 MB each, that take upwards of 3-4 minutes to download. A file will not display until the whole file has loaded. We and they have seen other sites where PDFs load one page at a time, so the end user can start reading while the rest of the file is still loading in the background. That gives the impression that the file has loaded faster.

According to the documentation they have found, IIS 6 should do this automatically if the PDF file is created with “Optimized for fast web view” checked. It is checked, and the file still will not load a page at a time.

They have searched and found nothing beyond the claim that IIS will do this automatically if the file is saved correctly.

How can they "stream" the PDF? Do the PDFs need to be saved in a special way? Is there a JavaScript that handles the download? Or does something need to change in IIS?

Thanks

Update: The file starts out like this:

%PDF-1.4
%âãÏÓ
171 0 obj << 0/Linearized 1

Linearized?

The PDF document isn't being served up from an aspx/asp page. (It's just posted directly to the site and linked to).

+1  A: 

Would it be possible to use a third party service, like Scribd? If you go this route you can embed their streaming viewer onto your client's website. Just a thought, although I know it's not really suitable for every type of business.

Marc Charbonneau
+1  A: 

This might happen if you are serving the PDF from an aspx page. To get the byte serving that linearized PDFs need, either the file must be served directly or you must implement the byte serving in the aspx code.
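As a rough illustration of what "providing the byte serving from code" involves, here is a minimal Python sketch (not ASP.NET, and not what IIS does internally). The function names `parse_range` and `serve` are made up for this example. It handles only a single `bytes=start-end` range and falls back to a full 200 response for anything else, which a server is allowed to do; a production handler would also deal with multiple ranges and 416 responses.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a single-range header such as "bytes=0-1023", "bytes=500-" or "bytes=-200".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: Optional[str], size: int) -> Optional[Tuple[int, int]]:
    """Return an inclusive (start, end) pair, or None to send the whole file."""
    if not header:
        return None
    match = _RANGE_RE.match(header.strip())
    if not match:
        return None  # multi-range or malformed: ignore the header
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the last N bytes of the file.
        count = int(last)
        if count == 0:
            return None
        start, end = max(size - count, 0), size - 1
    else:
        start = int(first)
        end = min(int(last), size - 1) if last else size - 1
    if start > end or start >= size:
        return None  # simplification: a stricter server would answer 416
    return start, end


def serve(data: bytes, range_header: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Build (status, headers, body) for a PDF held in memory."""
    headers = {"Content-Type": "application/pdf", "Accept-Ranges": "bytes"}
    span = parse_range(range_header, len(data))
    if span is None:
        headers["Content-Length"] = str(len(data))
        return 200, headers, data
    start, end = span
    body = data[start:end + 1]
    headers["Content-Range"] = f"bytes {start}-{end}/{len(data)}"
    headers["Content-Length"] = str(len(body))
    return 206, headers, body
```

The key point is the 206 status with a `Content-Range` header: a page that always writes the whole file with a 200 gives the viewer nothing to fetch page by page.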

Tony Edgecombe
+1 to that - I've pulled the same trick with PHP to manage serving byte ranges
Paul Dixon
+1  A: 

You need to linearize the PDF and not trust IIS to do this for you.

There are a number of apps that will do this for you. I have used CVision (their compression is second to none, but the licensing and SDK are a pain). There are also some cheaper alternatives here, but I don't know how well they work.

To clarify Tony's point... (I think)

If you have actually used these tools and your PDF is linearized, try converting the PDF to a byte array and calling Response.Write() on the byte array (with content headers, etc.) to send it to the client (in a new browser window or frame).

StingyJack
+1  A: 

Save one of the files and open it up in a text editor. If you don't see something like

<< /Linearized 1.0 /L <number> /H [<number> <number>] /O <number> /E <number> ...

in the first couple hundred bytes or so, then you're not getting a linearized (i.e., fast web view) PDF.
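The same check can be scripted. This is a small Python sketch with made-up function names; the 1024-byte window is an assumption chosen to be comfortably larger than "a couple hundred bytes". It only looks for the marker, so a file that was edited after being linearized may still pass even though it no longer behaves as a fast-web-view file.

```python
def looks_linearized(head: bytes, window: int = 1024) -> bool:
    """True if the start of the file has a PDF header and a /Linearized key."""
    chunk = head[:window]
    return chunk.startswith(b"%PDF-") and b"/Linearized" in chunk


def file_looks_linearized(path: str, window: int = 1024) -> bool:
    """Read only the first `window` bytes of the file and apply the check."""
    with open(path, "rb") as handle:
        return looks_linearized(handle.read(window), window)
```

For example, `file_looks_linearized("report.pdf")` (a hypothetical file name) reads only the start of the file, so it is quick even on a 10 MB document.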

plinth
+1  A: 

First, the document needs to be "linearized", as others have explained; you can linearize it in Acrobat or using pdfopt from Ghostscript. Second, the web server must be able to serve byte ranges (i.e., support the Range header); I have no idea how to configure IIS for this, but even if the document is linearized, the client has to be able to read particular byte ranges.
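One way to test the second condition from the outside is to request a small byte range and see whether the server answers with 206 Partial Content. The following Python sketch uses only the standard library; the function names and the 1024-byte probe size are assumptions made for illustration, and it says nothing about how to configure IIS.

```python
import urllib.request
from typing import Mapping


def indicates_range_support(status: int, headers: Mapping[str, str]) -> bool:
    """A 206 status with a Content-Range header means the range was honored."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return status == 206 and lowered.get("content-range", "").startswith("bytes ")


def probe(url: str, timeout: float = 10.0) -> bool:
    """Ask for the first 1024 bytes and report whether the server honored the range."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-1023"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return indicates_range_support(response.status, dict(response.headers.items()))
```

Calling `probe()` with the URL of one of the client's PDFs and getting `False` would suggest the server (or something in front of it, such as a proxy) is returning the whole file with a 200 instead of the requested slice.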

Jouni K. Seppänen