Our CF server occasionally stops processing mail. This is problematic, as many of our clients depend on it.

We found suggestions online that mention zero-byte files in the undeliverable folder, so I created a task that removes them every three minutes. However, the stoppage has occurred again.

I am looking for suggestions for diagnosing and fixing this issue.

  • CF 8 standard
  • Win2k3

Added:

  • There are no errors in the mail log at the time the queue fails
  • We have not tried to run this without using the queue, due to the large amount of mail we send

Added 2:

  • It does not seem to be a problem with any of the files in the spool folder. When we restart the mail queue, they all seem to process correctly.

Added 3:

  • We are not using attachments.
+1  A: 

What does your cf mail error log say?

+3  A: 

Have you tried just bypassing the queue altogether? (In CF Admin, under Mail Spool settings, uncheck "Spool mail messages for delivery.")
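If changing the server-wide setting feels risky, the same bypass can be tried on a single message with the spoolenable attribute of cfmail, which overrides the Administrator setting for that message. A minimal sketch; the addresses and subject are placeholders, not anything from the original post:

```cfm
<!--- spoolenable="no" sends this message immediately instead of writing it to the spool folder --->
<cfmail to="recipient@example.com"
        from="sender@example.com"
        subject="Spool bypass test"
        spoolenable="no">
This message skips the mail spool.
</cfmail>
```

With spooling off, the request waits while the message is handed to the mail server, so it is worth timing a small batch before switching everything over.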

Patrick McElhaney
+2  A: 

I have the same problem sometimes, and it isn't due to a zero-byte file, though that problem did crop up in the past. It seems like one or two files (the oldest ones in the folder) will keep the queue from processing. What I do is move all of the messages to a holding folder, restart the mail queue, and copy the messages back a chunk at a time in reverse chronological order, waiting for each chunk to go out before moving more over. The messages that were holding up the queue go into a separate folder to be examined later.

You can probably do this programmatically by stopping the queue, moving the oldest file to another folder, then starting the mail queue and checking folder file counts and dates to see whether sending has resumed. If removing the oldest file doesn't work, repeat the process until all of the offending mail files have been moved and sending continues successfully.
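One pass of that procedure might look roughly like the sketch below. The spool path matches a default CF 8 install on Windows and the hold folder is a made-up name, so treat both as assumptions. The stop/start calls go through the undocumented ServiceFactory class, the same approach used in another answer on this page.

```cfm
<!--- Assumed paths; adjust to your installation. The hold folder is hypothetical. --->
<cfset spoolDir = "c:\coldfusion8\mail\spool\">
<cfset holdDir = "c:\coldfusion8\mail\hold\">

<!--- List the spool folder with the oldest entry in row 1 --->
<cfdirectory action="list" directory="#spoolDir#" name="spool" sort="datelastmodified ASC">

<!--- Only act when the oldest entry exists and is a file, not a subfolder --->
<cfif spool.recordcount gt 0 and spool.type[1] eq "File">
    <cfset sFactory = CreateObject("java", "coldfusion.server.ServiceFactory")>
    <cfset mailSpool = sFactory.mailSpoolService>
    <cfset mailSpool.stop()>

    <!--- Set the suspect message aside for later inspection --->
    <cfif not DirectoryExists(holdDir)>
        <cfdirectory action="create" directory="#holdDir#">
    </cfif>
    <cffile action="move" source="#spoolDir##spool.name[1]#" destination="#holdDir#">

    <cfset mailSpool.start()>
</cfif>
```

Run it again a few minutes later if the file count in the spool folder has not started to drop; each pass sets aside one more file.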

I hope this helps.

Dan Roberts
+1  A: 

There is/was an issue with the mail spooler and messages with attachments in CFMX 8 that was fixed with one of the Hotfixes. Version 8.0.1, at least, should have had that fixed.

Al Everett
+5  A: 

"We have not tried to run this without using the queue, due to the large amount of mail we send"

Regardless, have you tried turning off spooling? I've seen mail get sent at a rate of 500-600 messages in a half second, and that's on kind of a crappy server. With the standard page timeout at 60 seconds, that would be ~72,000 emails you could send before the page would time out. Are you sending more than 72,000 at a time?

An alternative I used before CFMail was this fast was to build a custom spooler. Instead of sending the emails on the fly, save them to a database table. Then set up a scheduled job to send a few hundred of the messages and reschedule itself for a few minutes later, until the table is empty.

We scheduled the job to run once a day, and it can reschedule itself to run again in a couple of minutes if the table isn't empty. Never had a problem with it.
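A rough sketch of what such a job page could look like follows. Every name in it is hypothetical: the datasource maildb, the table mail_queue and its columns, the task name, and the URL. The batch size and the three-minute delay are arbitrary picks within the "few hundred" and "few minutes" described above.

```cfm
<!--- Hypothetical schema: mail_queue(id, to_addr, from_addr, subject, body, sent) in datasource "maildb" --->
<cfset batchSize = 200>

<!--- Fetch the next batch of unsent messages, oldest first --->
<cfquery name="pending" datasource="maildb" maxrows="#batchSize#">
    SELECT id, to_addr, from_addr, subject, body
    FROM mail_queue
    WHERE sent = 0
    ORDER BY id
</cfquery>

<cfloop query="pending">
    <cfmail to="#pending.to_addr#" from="#pending.from_addr#" subject="#pending.subject#">#pending.body#</cfmail>
    <!--- Mark the row so the next run skips it --->
    <cfquery datasource="maildb">
        UPDATE mail_queue SET sent = 1
        WHERE id = <cfqueryparam value="#pending.id#" cfsqltype="cf_sql_integer">
    </cfquery>
</cfloop>

<!--- A full batch suggests more rows remain, so schedule a one-off follow-up run --->
<cfif pending.recordcount eq batchSize>
    <cfset nextRun = DateAdd("n", 3, Now())>
    <cfschedule action="update"
                task="DrainMailQueueRetry"
                operation="HTTPRequest"
                url="http://localhost/tasks/drainMailQueue.cfm"
                startdate="#DateFormat(nextRun, 'mm/dd/yyyy')#"
                starttime="#TimeFormat(nextRun, 'HH:mm')#"
                interval="once">
</cfif>
```

The follow-up run uses its own task name so that updating it does not overwrite the daily task that starts the cycle.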

Adam Tuttle
+4  A: 

What we ended up doing:

I wrote two scheduled tasks. The first checked to see if there were any messages in the queue folder older than n minutes (currently set to 30). The second reset the queue every night during low usage.

Unfortunately, we never really discovered why the queue would come off the rails, but it only seems to happen when we use Exchange -- other mail servers we've tried do not have this issue.

Edit: I was asked to post my code, so here's the one to restart when old mail is found:

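<!--- List the spool folder, oldest entry first --->
<cfdirectory action="list" directory="c:\coldfusion8\mail\spool\" name="spool" sort="datelastmodified ASC">
<cfset restart = 0>
<!--- Skip the check when the folder is empty; row 1 is the oldest entry --->
<cfif spool.recordcount gt 0 and datediff('n', spool.datelastmodified[1], now()) gt 30>
    <cfset restart = 1>
</cfif>
<cfif restart>
    <!--- Restart the mail spool service via the undocumented ServiceFactory class --->
    <cfset sFactory = CreateObject("java","coldfusion.server.ServiceFactory")>
    <cfset MailSpoolService = sFactory.mailSpoolService>
    <cfset MailSpoolService.stop()>
    <cfset MailSpoolService.start()>
</cfif>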
Ben Doom
I used to have this happen periodically, and like the weird "<." issue, we never identified the cause... we could send out 100 identical mails, and the 35th would choke... removing the oldest universally fixed it... setting up a proc to monitor that is a fine solution to those bizarre unlogged JRun hiccups.
OhkaBaka
We have been experiencing the same problem, but stopping and starting the MailSpoolService as described here has no effect on it. All that works for us is restarting ColdFusion. Does anyone have any other suggestions we could try that would save us from having to restart ColdFusion?
Loftx
@Loftx -- What email server are you using? We only experienced this problem when the powers that be forced us to use Exchange. Would it be possible to use a small SMTP relay system? The free version of MailEnable works great for this in our production environment.
Ben Doom
@Ben Doom - we're using the SMTP server provided with IIS in Windows Server 2003. The server had been running fine for a couple of years, and then we've had this issue occur maybe 4 times in the last 6 months or so. If you prefer, I can start a new question rather than continue this one.
Loftx
I don't mind you commenting here. I would be concerned you won't see new answers, though. Sounds like a different problem, perhaps. I was seeing 4 drops per *day*.
Ben Doom
A: 

Hi Ben,

Would you mind posting your scheduled script? I'm having the same issue and it would save me some time. Thanks!

Posted. If it works for you, feel free to give me an upvote. :-)
Ben Doom