For most production sites, you want to know when there has been an error as soon as possible. My question is how best to get this information.

Usually it's best to get the errors by email, since I'm not going to sit and watch error logs every day waiting for an error. That would be impossible anyway, since I have 20 or more production sites on different servers. These errors could be anything, including unset variables, invalid data received, or query errors.

At the moment I follow the example on PHP's website, found here. It builds a text string along with an XML file, which is then sent by email. I have modified this slightly to hold all of the errors until the script ends and then send one email with the XML files attached. (I have crashed a couple of mail servers by sending more than 500,000 emails because of an error in a loop.) Most of the time this works perfectly. (I have also created an object to do all of the error handling.)

The problem arises when there is a large amount of data for wddx_serialize_value() to process. If there are multiple errors as well, it ends up using a lot of memory, most of the time more than the script is allowed to use.

Because of this, I have added an additional gzcompress() call on the XML before storing it in the variable. This helps, but if the amount of data is very large, it still runs out of memory. (In a recent case it wanted to use around 2 GB.)

I'm wondering what other solutions there are to this or how you have modified this to make it work?

So a few requirements:

  • It must be able to send me more than just the error message, and it shouldn't make me log in to the server to figure out what happened (so I can check from my phone and decide whether it's urgent).
  • There needs to be a limit on the number of emails sent. The ideal is still 1.
  • It needs to log to a file as normal.

Edit: I need other information related to the error, not just the error string. Often I find it's nearly impossible to reproduce the error because it's caused by user input, which I don't know unless I get more information. I have tried my best to put in informative errors, but you never know how a user is going to use the system or what crap data they are going to put in. Therefore, I need more than just the error text/string.

Edit 2: Can't log errors to the database because for all I know the database may not be there. Need something that is pretty much guaranteed to run. Also, the websites are not all on 1 server and I often don't have access to cron on the server (stupid hosting companies).

+1  A: 

Instead of setting a custom error handler, I let the errors go to the error log as usual. I set up a cron job that runs periodically and checks the error log for changes. If the log has changed, it sends me an email containing only the changes. You can improve this process and parse the changes to better suit your needs, for example by sending only errors at or above a certain severity (such as E_WARNING and anything more serious).
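As a rough illustration of this approach (not the answerer's actual script), the cron job could remember how far it has already read and return only the newly appended lines. The sketch below is in Python and rests on a few assumptions: the function names, the state-file path, and the severity prefixes are all hypothetical, and it assumes log lines contain markers such as "PHP Warning" in the usual PHP error log style.

```python
import os

# Assumption: severe entries contain one of these markers somewhere in the line.
SEVERE_MARKERS = ("PHP Fatal error", "PHP Parse error", "PHP Warning")


def read_new_lines(log_path, state_path):
    """Return the lines appended to log_path since the previous call.

    The byte offset reached so far is stored in state_path (a hypothetical
    helper file) so that each cron run only sees what is new.
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)

    # If the log is now shorter than the saved offset, it was rotated or
    # truncated, so start again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()

    with open(state_path, "w") as f:
        f.write(str(new_offset))

    return data.decode("utf-8", errors="replace").splitlines()


def filter_severe(lines):
    """Keep only the lines that contain one of the severe markers."""
    return [line for line in lines if any(m in line for m in SEVERE_MARKERS)]
```

A cron entry would call `read_new_lines()` on each run and, if `filter_severe()` returns anything, pass those lines to whatever mailer is available on the host. Because only the new portion of the log is read, the memory cost stays proportional to what was logged since the last run.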

Eran Galperin
+1  A: 

One approach could be proper exception management in your application, i.e. to have control over which errors get logged.

Each raised exception would log the error details in a database.

Then, you could code a little application in order to search the error database, maybe just one for all your websites.

That way you avoid large unreadable log files, because everything is indexed and quickly searchable. When your database gets too large, you can truncate your log tables via cron jobs.
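To make the idea concrete, here is a minimal sketch of such an indexed error store, written in Python with SQLite purely for illustration. The table layout, column names, and function names are assumptions, not something described in this answer, and a real setup would use whatever database the sites already share.

```python
import sqlite3
import time


def open_error_db(path=":memory:"):
    """Open (or create) the error store; the schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS errors (
               id INTEGER PRIMARY KEY,
               site TEXT NOT NULL,
               logged_at REAL NOT NULL,
               message TEXT NOT NULL,
               context TEXT
           )"""
    )
    # The index keeps per-site, per-time-range searches fast as the table grows.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_errors_site_time ON errors (site, logged_at)"
    )
    return conn


def log_error(conn, site, message, context="", logged_at=None):
    """Record one error together with extra context (for example, user input)."""
    when = time.time() if logged_at is None else logged_at
    conn.execute(
        "INSERT INTO errors (site, logged_at, message, context) VALUES (?, ?, ?, ?)",
        (site, when, message, context),
    )
    conn.commit()


def search_errors(conn, site, since):
    """Return (logged_at, message) rows for one site from `since` onwards."""
    rows = conn.execute(
        "SELECT logged_at, message FROM errors "
        "WHERE site = ? AND logged_at >= ? ORDER BY logged_at",
        (site, since),
    )
    return rows.fetchall()


def purge_older_than(conn, cutoff):
    """Delete rows older than cutoff; returns how many rows were removed."""
    cur = conn.execute("DELETE FROM errors WHERE logged_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

`purge_older_than()` plays the role of the periodic cleanup mentioned above, and `search_errors()` stands in for the small search application.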

Franck
A: 

Anacron, a cron job that emails changes to the error log, and an error log file should suffice. The cron job can do all the processing required before sending the email.

partoa
A: 

One thing I have used in the past is epylog, a very flexible log monitoring app written in Python. You can set it up to monitor your error logs and include the errors (or parts of them) in a log summary that is emailed to you.

I'd lean towards storing the more detailed error data in a flat file on the server and sending you an email to tell you to check the log. A cron job that watches the error directory or files for changes and has a rate limit set would be a good way to minimize impact on your running application.
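The rate limit could be as simple as a timestamp file that records when the last notification went out. The following Python sketch is one possible shape for it; the function name, the stamp-file path, and the interval are all hypothetical choices made for this example.

```python
import os
import time


def should_send(stamp_path, min_interval_seconds, now=None):
    """Return True if enough time has passed since the last notification.

    When it returns True it also records `now` in stamp_path, so the caller
    is expected to send the email right after.
    """
    now = time.time() if now is None else now

    last = 0.0
    if os.path.exists(stamp_path):
        with open(stamp_path) as f:
            text = f.read().strip()
            last = float(text) if text else 0.0

    if now - last < min_interval_seconds:
        return False  # still inside the quiet period, skip this email

    with open(stamp_path, "w") as f:
        f.write(repr(now))
    return True
```

With an interval of, say, one hour, an error inside a loop produces at most one "check the log" email per hour, while the full details stay in the flat file on the server.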

Brian C. Lane