I'm new to this site monitoring thing, so please bear with me.

I'm looking for a good monitoring service for my website. What I'm trying to do is make sure the site is actually working, not just responding. For example, if the database is down, the site still responds, but every request returns an error page instead of the page you asked for. How do you normally handle this?

In my app, I have something like a fatal error mode. If something goes wrong that the app can't recover from (for example, the database is down), the app is set to fatal error mode and will always transfer to a page that says something like 'We're having some technical problem and will be back online soon'. This avoids returning an error with every request, which is unpleasant for users, and also avoids logging tons of errors that are all basically the same.

I'm thinking about writing a webservice that the monitoring service should call. The webservice should return a boolean value, so, if it returns true then the site is working but if it returns false, then this means that something is wrong. Is it possible to find a monitoring service that can check the value returned from the webservice and notify me if it's not the expected value?

Thanks for any suggestions

+1  A: 

Rather than creating services etc., you can create a simple ASP.NET web page, call it the "watchdog page", which can be invoked in one of two modes: verbose (human-oriented), or XML or simple text (for parsing by monitoring bots).
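As a rough illustration of the two modes, here is a minimal Python sketch (the suggestion above is an ASP.NET page; this only shows the idea). The function name `render_watchdog`, the check names, and the exact output format are assumptions made up for the example.

```python
def render_watchdog(checks, verbose=False):
    """Render health-check results for a human or for a monitoring bot.

    checks: dict mapping a check name to True (healthy) or False.
    The check names and output formats are illustrative assumptions.
    """
    overall = all(checks.values())
    if verbose:
        # Human-oriented mode: colour words that a page could turn into buttons.
        lines = ["System health: " + ("GREEN" if overall else "RED")]
        for name, healthy in checks.items():
            lines.append(f"  {name}: {'green' if healthy else 'red'}")
        return "\n".join(lines)
    # Bot-oriented mode: one key=value pair per line, easy to string-match.
    lines = ["status=" + ("OK" if overall else "FAIL")]
    for name, healthy in checks.items():
        lines.append(f"{name}={'OK' if healthy else 'FAIL'}")
    return "\n".join(lines)
```

A monitoring tool would then only need to look for `status=OK` on the first line of the plain-text response.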

You then use off-the-shelf web monitoring tools. Quite a few seem to be available as open source or freeware, but I cannot recommend any one in particular. In the commercial world, we've had good luck with WhatsUp Gold products like this one. You can configure these tools to call the watchdog page. Typically you'll want to have one "local" monitoring service/app and one remote. The monitoring software can be configured to call, page, or email support staff in case of errors. These tools typically have some logic to warn only after a confirmed outage/issue, to avoid crying wolf because of network burps and such.
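The "only warn after a confirmed outage" logic can be sketched as a counter of consecutive failed checks. This is a guess at the general idea, not how any particular product implements it. The class name `OutageDetector` and the default threshold of 3 are assumptions.

```python
class OutageDetector:
    """Signal an alert only after several consecutive failed checks.

    Hypothetical sketch: the name and the threshold are assumptions,
    not taken from any specific monitoring product.
    """

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive_failures = 0
        self.alerted = False

    def record(self, check_passed):
        """Record one check result; return True when an alert should be sent."""
        if check_passed:
            # Any success resets the counter and re-arms the alert.
            self.consecutive_failures = 0
            self.alerted = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.alerted:
            self.alerted = True  # alert once per outage, not once per check
            return True
        return False
```

A single failed check (a "network burp") never reaches the threshold, so nobody gets paged for it.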

The idea is to test the whole chain in the very same framework/technologies that the application is based upon. The test web page can/should even use the very same includes and assemblies referenced by the real application. Furthermore, the remote monitor tests Internet access itself as well (the Internet per se, your gateways, firewall, etc.).

The verbose mode is handy, as you can design this page with nice green, orange and red buttons to tell support staff the health of the system whenever they call this page from a plain browser. (You can even have a self-refresh snippet in there to be fancy.)

Finally, one thing to try to monitor is the monitors themselves. One of the tests on this web page can read, from some log or elsewhere, the date/time of the last call from the monitoring IP, and add an orange warning when no such call has taken place in the last, say, 10 minutes.
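The "monitor the monitors" test boils down to a timestamp comparison. A minimal sketch, assuming the time of the last call from the monitoring IP has already been read from a log or similar (the function name `monitor_status` is made up for the example, and how the timestamp is obtained is left out):

```python
from datetime import datetime, timedelta


def monitor_status(last_call, now, max_silence=timedelta(minutes=10)):
    """Return "orange" if the monitor has been silent for too long, else "green".

    last_call: datetime of the last request from the monitoring IP, or None
    if no call has ever been recorded (assumed to be read from a log).
    """
    if last_call is None or now - last_call > max_silence:
        return "orange"
    return "green"
```

For example, `monitor_status(last_call, datetime.now())` could feed one of the coloured indicators on the verbose page.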

All in all, a very simple trick. Low requirements (installs as part of application itself; it's just an extra web page), no need for setting up services at the level of the OS etc.

mjv
+2  A: 

Most external monitoring services can look for specific strings in the content returned by the URL they call. Take a look at alertsite.
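What such a string check does can be sketched in a few lines. This is only an illustration of the idea, not how any particular service implements it. The marker `STATUS_OK` and the function name are assumptions, and the forbidden phrase is borrowed from the error page described in the question.

```python
def page_looks_healthy(body, required="STATUS_OK", forbidden=("technical problem",)):
    """Return True if the page body contains the expected marker and no error text.

    "STATUS_OK" is a hypothetical marker the site would print when healthy.
    Matching is case-insensitive.
    """
    text = body.lower()
    if required.lower() not in text:
        return False
    return not any(phrase.lower() in text for phrase in forbidden)
```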

Matt Wrock
We use alertsite and are quite happy with it. Don't monitor things yourself unless you can run the monitoring software from completely independent infrastructure (else the same outage that takes down your site will take down your monitor).
Eric J.
+2  A: 

I think you would do better to use the HTTP status code, for example returning 500 Internal Server Error when your application encounters a fatal error.

Most monitoring tools are able to read the HTTP status code. If you are looking for a monitoring solution, I advise you to use Nagios. The check_http plugin should be able to monitor your website.
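To make the status-code idea concrete, here is a minimal sketch as a Python WSGI application (the site in the question is not Python; this only shows the principle). `make_app` and the `is_fatal` callback are hypothetical names, and how the app decides it is in fatal error mode is left to the caller.

```python
def make_app(is_fatal):
    """Build a WSGI app that answers 500 while the site is in fatal error mode.

    is_fatal: zero-argument callable returning True when the application
    cannot serve requests (hypothetical hook, e.g. the database is down).
    """
    def app(environ, start_response):
        if is_fatal():
            status = "500 Internal Server Error"
            body = b"We're having some technical problems and will be back online soon."
        else:
            status = "200 OK"
            body = b"OK"
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]
    return app
```

Users still see the friendly message, while a monitor that only looks at the status code sees the failure.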

RageZ
Thanks, where was my mind when I asked this question?! .. I'm going to check the application status in the error page and return error 500 if it's in fatal error mode
Waleed Eissa
+1  A: 

Webmetric and Alertfox are worth a try. Nagios is another good option.

Mahesh
Actually I contacted Webmetric last week, they seemed quite interested at first but after I told them about my simple requirements I never got a reply from them. I think they are more interested in bigger clients.
Waleed Eissa