I need customers to be able to download PDFs of letters that have been sent to them.

I have read the threads about database vs filesystem storage of documents or images, and it does sound like the consensus is that, for anything more than just a few images, filesystem is the way to go.

What I want to know: would a reasonable alternative be to just store the letter details in the database, and recreate the PDF 'on the fly' when it is requested?

Is that approach superior or inferior to fetching the PDF from the filesystem?

A: 

I'm inclined to say "it depends".

When one document is requested many times, you may save work by composing it on the first request and retrieving the stored copy subsequently.

OTOH, if most documents are requested just once and the creation process doesn't eat up most of your server capacity, on-the-fly will have a clear advantage.
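The "compose on the first request, retrieve afterwards" idea can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `render_pdf`, `get_letter_pdf`, and the cache directory layout are hypothetical names, and the renderer is a stand-in for whatever PDF library is actually used.

```python
from pathlib import Path
from typing import Callable


def render_pdf(letter_id: int) -> bytes:
    """Hypothetical placeholder for the real (possibly expensive) PDF generator."""
    return f"%PDF-1.4 placeholder for letter {letter_id}".encode("ascii")


def get_letter_pdf(
    letter_id: int,
    cache_dir: Path,
    render: Callable[[int], bytes] = render_pdf,
) -> bytes:
    """Return the PDF bytes, composing them only on the first request."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"letter_{letter_id}.pdf"
    if path.exists():
        # Later requests: serve the stored copy without regenerating.
        return path.read_bytes()
    data = render(letter_id)
    # Write to a temporary name first, then rename, so a reader never
    # sees a half-written file.
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(path)
    return data
```

If most letters are only ever requested once, the stored copy is never reused and the cache only costs disk space, which is the case where plain on-the-fly generation wins.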

A: 

If you're using ASP.NET, why not cache the PDF? Your cache can be stored in the database if you like, or left in memory for as long as you need it. The Enterprise Library implements this for you in the Caching Application Block, and it's remarkably simple to use. If you cache the object, create a backing store in the database using the block, and then load it when you need it, you won't have to worry about re-creating it.

Odd
+8  A: 

If it is for archival purposes, I would definitely store the PDF because in future, your PDF generation script may change and then the letter will not be exactly the same as what was originally sent. The customer will be expecting it to be exactly the same.

It doesn't matter which approach is superior; sometimes it is better to go for the approach that is safer.

Adam Pierce
Everything I have ever done has required an archive method. We actually added a trigger to the table so that a new PDF was generated any time a record was inserted or updated.
Jarrett Meyer
A: 

A few things to consider: is the PDF generated based on data as it existed at some point in time, e.g. a bill based on data from the prior month?

If so, would you use the same template each month to generate this letter? What happens if/when the letter format changes? If you regenerate on the fly, it is no longer the same as what was sent to them. Is storing the PDF stream in the database a possibility?

I guess what I am getting at is: do you need an exact representation of what was sent to the user, or is that flexible?

Brian Schmitt
What I am proposing is to save all letter information as of the date it was originally created in its own table(s). So the paragraph text and customer data will not change. Am I gaining (or losing) anything with the on-the-fly approach?
Tony
For example, is this approach inherently more (or less) secure than having extant PDFs sitting on a drive somewhere?
Tony
Neither more nor less secure: a PDF could be regenerated and placed into the directory just as easily as the data could be edited. Neither is likely to happen, but both are a possibility. The trouble with file-based approaches is the actual management of the files. Generating on the fly gets away from this.
Brian Schmitt
However, the drawback shows up if/when the template changes, which may be a good or a bad thing. If you have a new masthead graphic, maybe the business would want that to apply to all letters, not just the new ones. It's a tough call that would need to be decided between you and the business.
Brian Schmitt
+2  A: 

Is there a forensic reason why you have to maintain records of letters sent to customers? If you are going to regenerate on the fly, how do you know that future code changes won't rewrite the letter (or, at least, the customer can make that argument in court if the information is used in a lawsuit)?

Toybuilder
+2  A: 

I'd store it off for two reasons:

1) If you ever change how you generate the PDF, you probably don't want historical items to change. If you generate them every time, either they will change or you need to keep compatibility code to generate "old-style" records.

2) Disk space is cheap. Users' patience isn't. Unless you're really pressed for storage, or pulling out of storage is harder than generating the PDF, be kind to your users and store it off.

Obviously, if you create thousands of these an hour from a sparse dataset, you may not have the storage. But if you have the space, I'd vote for "use it".
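To illustrate the "compatibility code" cost in point 1: if letters are regenerated on request, each stored letter would need to record which template it was sent with, and every old renderer would have to be kept around unchanged. A rough sketch, where the renderer names, the `template_version` field, and the "ACME Ltd." masthead are all made up for illustration, and plain text stands in for PDF output:

```python
from typing import Callable, Dict


# Hypothetical renderers. Once letters have been sent with a version,
# that function must never be edited again.
def render_v1(letter: dict) -> str:
    return f"Dear {letter['name']},\n{letter['body']}\n"


def render_v2(letter: dict) -> str:
    # Newer layout with a masthead line (illustrative only).
    return f"ACME Ltd.\n\nDear {letter['name']},\n\n{letter['body']}\n"


RENDERERS: Dict[int, Callable[[dict], str]] = {1: render_v1, 2: render_v2}


def render_letter(letter: dict) -> str:
    """Render with the template version recorded when the letter was sent."""
    return RENDERERS[letter["template_version"]](letter)
```

Every format change adds another entry that has to be maintained forever, which is the overhead that simply storing the generated file avoids.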

Philip Rieck
A: 

The question of whether to generate the PDFs dynamically or store them statically sounds more like a question of law than a question of programming.

If you don't have access to legal counsel who can provide guidance on this, then it is going to be far safer to err on the side of caution and store them statically.

Andrew Edgecombe
A: 

As long as the PDF document is of a permanent nature (not just a work doc, but something official, signed and sent somewhere else in the company or outside the company), you should have a copy of this PDF file on your network, and a link to this file in your database.

You cannot rely on the available data to reproduce the very same document at a different time, mainly because:

  1. Data can be changed (yes! suppose that the letter is set to be signed by the Head of Department, and the staff has changed?)
  2. Your report format will change (header, footer, logo, etc.)
  3. The document you produced is kept by somebody else, who will make use of the data available in the document.
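The "file on the network, link in the database" arrangement might look something like the sketch below. It uses SQLite from the Python standard library purely for illustration; the table name `letter_archive`, the function names, and the file naming scheme are assumptions, and the SHA-256 checksum is an optional extra that is not part of the advice above (it only makes later modification of an archived file detectable).

```python
import hashlib
import sqlite3
from pathlib import Path


def archive_letter(conn: sqlite3.Connection, archive_dir: Path,
                   letter_id: int, pdf: bytes) -> Path:
    """Write the PDF to disk and record its path and checksum in the database."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    path = archive_dir / f"letter_{letter_id}.pdf"
    path.write_bytes(pdf)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS letter_archive ("
        "letter_id INTEGER PRIMARY KEY, path TEXT NOT NULL, sha256 TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO letter_archive (letter_id, path, sha256) VALUES (?, ?, ?)",
        (letter_id, str(path), hashlib.sha256(pdf).hexdigest()),
    )
    conn.commit()
    return path


def load_letter(conn: sqlite3.Connection, letter_id: int) -> bytes:
    """Follow the database link to the file and verify it is unchanged."""
    row = conn.execute(
        "SELECT path, sha256 FROM letter_archive WHERE letter_id = ?",
        (letter_id,),
    ).fetchone()
    if row is None:
        raise KeyError(letter_id)
    data = Path(row[0]).read_bytes()
    if hashlib.sha256(data).hexdigest() != row[1]:
        raise ValueError("archived PDF no longer matches its recorded checksum")
    return data
```

Because the primary key rejects a second insert for the same letter, an archived entry cannot be silently replaced through this code path.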
Philippe Grondier