I have a web application that is generating hundreds of PDFs in batch, using ColdFusion 8 on a Windows/IIS server.

The process runs fine on my development and staging servers, but of course the client is cheap and is only paying for shared hosting, which isn't as fast as my dev/staging boxes. As a result, PDF generation threads are timing out.

The flow is something like this:

  1. Page is run to generate PDFs.
  2. A query is run to determine which PDFs need to be generated, and a loop fires off an application-scoped UDF call for each PDF that will need to be generated.
  3. That UDF looks up information for the given item, and then creates a thread for the PDF generation to work in, preventing generation from slowing down the page.
  4. The thread simply uses CFDocument to create a PDF and save it to disk, then terminates.

Threads do not re-join, and nothing is waiting for any of them to finish. The page that makes the UDF calls finishes in a few milliseconds; it's the threads themselves that are timing out.

Here is the code for the UDF (and thread creation):

<cffunction name="genTearSheet" output="false" returntype="void">
    <cfargument name="partId" type="numeric" required="true"/>
    <!--- saveLocation can be a relative or absolute path --->
    <cfargument name="saveLocation" type="string" required="true"/>
    <cfargument name="overwrite" type="boolean" required="false" default="true" />
    <cfset var local = structNew() />

    <!--- fix save location if we need to --->
    <cfif left(arguments.saveLocation, 1) eq "/">
     <cfset arguments.saveLocation = expandPath(arguments.saveLocation) />
    </cfif>

    <!--- get part info --->
    <cfif structKeyExists(application, "partGateway")>
     <cfset local.part = application.partGateway
     .getByAttributesQuery(partId: arguments.partId)/>
    <cfelse>
     <cfset local.part = createObject("component","com.admin.partGateway")
     .init(application.dsn).getByAttributesQuery(partId: arguments.partId)/>
    </cfif>

    <!--- define file name to be saved --->
    <cfif right(arguments.saveLocation, 4) neq ".pdf">
     <cfif right(arguments.saveLocation, 1) neq "/">
      <cfset arguments.saveLocation = arguments.saveLocation & "/" />
     </cfif>
     <cfset arguments.saveLocation = arguments.saveLocation & 
     "ts_#application.udf.sanitizePartNum(local.part.PartNum)#.pdf"/>
    </cfif>

    <!--- generate the new PDF in a thread so that page processing can continue --->
    <cfthread name="thread-genTearSheet-partid-#arguments.partId#" action="run" 
    filename="#arguments.saveLocation#" part="#local.part#" 
    overwrite="#arguments.overwrite#">
     <cfsetting requestTimeOut=240 />
     <cftry>
     <cfoutput>
     <cfdocument format="PDF" marginbottom="0.75" 
     filename="#attributes.fileName#" overwrite="#attributes.overwrite#">
      <cfdocumentitem type="footer">
       <center>
       <font face="Tahoma" color="black" size="7pt">
       pdf footer text here
       </font>
       </center>
      </cfdocumentitem>
      pdf body here
     </cfdocument>
     </cfoutput>
     <cfcatch>
     <cfset application.udf.errorEmail(application.errorEmail,
     "Error in threaded PDF save", cfcatch)/>
     </cfcatch>
     </cftry>
    </cfthread>
</cffunction>

As you can see, I've tried adding a <cfsetting requestTimeout=240 /> to the top of the thread to try and make it live longer... no dice. I also got a little excited when I saw that the CFThread tag has a timeout parameter, but then realized it only applies when joining threads (action=join).

Changing the default timeout in ColdFusion Administrator is not an option, as this is a shared host.

If anyone has any ideas on how to make these threads live longer, I would really appreciate them.

A: 

Take a look here http://www.bennadel.com/blog/749-Learning-ColdFusion-8-CFThread-Part-II-Parallel-Threads.htm

Stewart Robinson
I'm... running... parallel... threads...
Adam Tuttle
I think the -1 is harsh. I pointed out that URL because I don't see a <cfthread action="join"> in your code. That is the line that would wait for the threads to finish before ending the master page request.
Stewart Robinson
(It wasn't me who voted it down, I wouldn't waste my rep on it.) I don't *want* the threads to join, because there's no need. They should be started and forgotten. Problem is that they are individually timing out.
Adam Tuttle
+2  A: 

I don't think there is a way to make the thread live longer on a shared host where you don't have access to the cf admin. I think the cf admin limit always overrides the cfsetting requesttimeout value when it's activated, so I think your hands are pretty much tied on that front.

My advice would be to change the strategy: instead of creating all the PDFs within the current page, launch a separate request for each PDF that needs to be created. Non-displaying iframes, while inelegant, may be the simplest solution. You can simply output one iframe for each PDF that needs to be generated. That way it should never run into the timeout issue, since each individual PDF should generate within the time limit set by the host.
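A minimal sketch of the iframe idea, assuming a hypothetical page genTearSheetPage.cfm that generates a single PDF for the partId passed in the URL. The stand-in query and its part IDs are made up for illustration; in practice you would use whatever query already determines which PDFs need to be generated.

```cfm
<!--- Stand-in for the real "which PDFs need generating" query (hypothetical data) --->
<cfset partsToGenerate = queryNew("partId", "integer") />
<cfset queryAddRow(partsToGenerate, 2) />
<cfset querySetCell(partsToGenerate, "partId", 101, 1) />
<cfset querySetCell(partsToGenerate, "partId", 102, 2) />

<!--- One hidden iframe per PDF. Each iframe is a separate request,
      so each PDF gets its own request timeout.
      genTearSheetPage.cfm is a hypothetical page name, not part of the original code. --->
<cfoutput query="partsToGenerate">
    <iframe src="genTearSheetPage.cfm?partId=#urlEncodedFormat(partsToGenerate.partId)#"
            style="display:none;"></iframe>
</cfoutput>
```

Note that the browser will only open a limited number of connections to the same host at once, so with hundreds of iframes the requests would be queued on the client side rather than all firing together.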

Isaac Dealey
Isaac, each PDF is generated in its own thread running in parallel to the page request. The number of them will cause some queuing, but there's not much I can do about that.
Adam Tuttle
Yeah, admittedly, but it seems like that's kind of the way shared hosting tends to pan out.
Isaac Dealey
+1  A: 

While this doesn't directly answer my original question of increasing the timeout of a thread, I have been able to make the process work (prevent timeouts) by improving PDF generation time.

According to the livedocs, ColdFusion 8 added a localUrl attribute to the CFDocument tag that indicates that image files are located on the same physical machine, and should be included as local files instead of making an HTTP request for them.

Changing my CFDocument code to the following has made the process run fast enough that the threads don't time out.

<cfdocument format="PDF" marginbottom="0.75" 
filename="#attributes.fileName#" overwrite="#attributes.overwrite#" localUrl="yes">
 <cfdocumentitem type="footer">
  <center>
  <font face="Tahoma" color="black" size="7pt">
   pdf footer text here
  </font>
  </center>
 </cfdocumentitem>
 pdf body here
 <img src="images/foo/bar.gif"/>
</cfdocument>
Adam Tuttle
+1  A: 

We use a 90-second overall timeout on our server (set in the CF Administrator) but use cfsetting statements in CFM files to override that setting where needed. We also have the server log any request that lasts 30 seconds or more, so we know which ones need to be optimized and/or need the requesttimeout override. A timed-out request would be obvious for other reasons, but it's nice to have a list of your slowest transactions in C:\ColdFusion8\logs\server.log.

For example at the top of a CFM that is run as an overnight task I see:

<cfsetting enablecfoutputonly="Yes" showdebugoutput="No" requesttimeout="80000">

I can tell you that the last night it ran, it took 34,313 seconds to complete. Obviously there is room to improve that process, but it finishes hours before the office day begins. If that requesttimeout parameter wasn't set in the CFM file, that job would definitely time out at the 90-second mark.

Before we had the longer tasks overriding that setting, I had to start with a much higher request timeout, watch the failures, and rerun jobs as we tightened things up. Ideally, with good hardware, good code, and a good database structure, I'd continue tightening my CF admin timeout down to the CF8 default of 30 seconds. Unfortunately, my database structure and code are not up to that level yet.

pplrppl