I am writing a C# Windows service which will perform some background processing; basically, it is a consumer for a work queue.

It needs to not go down (stop processing new items), and if it does go down I need to be notified.

What are some design guidelines and considerations for a) ensuring that such a service is as reliable as possible, and b) sending out a notification if something does go wrong? I have considered, for instance, creating a watcher thread whose only job is to make sure the worker thread is still processing jobs.
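A rough sketch of the watcher idea, to make it concrete: the worker records a "heartbeat" after each job, and a timer checks periodically whether the last heartbeat is too old. The names `Heartbeat`, `Watcher`, and `onStalled` are made up for illustration, and the thresholds would depend on how long a job normally takes.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: the worker calls Beat(), the watcher calls IsStalled().
public sealed class Heartbeat
{
    private long _lastBeat;

    public Heartbeat() { Beat(); }

    // Called by the worker thread after each job (or each idle poll of the queue).
    public void Beat()
    {
        // Stopwatch timestamps are monotonic, so system clock changes do not matter.
        Interlocked.Exchange(ref _lastBeat, Stopwatch.GetTimestamp());
    }

    // True if the worker has not called Beat() within maxSilence.
    public bool IsStalled(TimeSpan maxSilence)
    {
        long last = Interlocked.Read(ref _lastBeat);
        double silentSeconds = (Stopwatch.GetTimestamp() - last) / (double)Stopwatch.Frequency;
        return silentSeconds > maxSilence.TotalSeconds;
    }
}

public static class Watcher
{
    // Checks the heartbeat every checkEvery and invokes onStalled while it is stale.
    // The caller must keep the returned Timer alive and dispose it on shutdown.
    public static Timer Start(Heartbeat heartbeat, TimeSpan maxSilence,
                              TimeSpan checkEvery, Action onStalled)
    {
        return new Timer(_ =>
        {
            if (heartbeat.IsStalled(maxSilence)) onStalled();
        }, null, checkEvery, checkEvery);
    }
}
```

Here `onStalled` would be whatever notification or recovery step is chosen; note that it fires on every check for as long as the worker stays silent, so it may need de-duplicating.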

A: 

There are a number of things you can do to improve reliability, and to gauge whether the solution is going to meet your needs.

Testing

First and foremost, your testing process needs to be very solid. Test for the "unexpected" situations, such as loss of network connection, and see what actually happens in each case. Notification on failure can be a bit of a mixed bag: for example, you can't e-mail yourself if you don't have a network connection available.

Proper Code Design

In addition to setting up valid test scenarios, be sure that your code is as bulletproof as possible. Since you are creating a Windows service, be sure that you are capturing, logging, and dealing with every error you can, because if an error bubbles up to the OS unhandled, your service will go down.
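As a minimal sketch of that idea, the processing loop can catch and log failures per item, so that one bad item does not take the whole service down. `QueueWorker`, `ProcessAll`, `handler`, and `log` are hypothetical names used only for illustration; a real service would pull from its actual queue and might route `log` to the Windows Event Log.

```csharp
using System;
using System.Collections.Generic;

public static class QueueWorker
{
    // Runs handler on every item. A failure in one item is logged and skipped,
    // so the loop keeps going. Returns the number of items that failed.
    public static int ProcessAll<T>(IEnumerable<T> items, Action<T> handler, Action<string> log)
    {
        int failures = 0;
        foreach (T item in items)
        {
            try
            {
                handler(item);
            }
            catch (Exception ex)
            {
                // Catching Exception broadly is deliberate here: this is the
                // outermost boundary of the worker, and the error is recorded.
                failures++;
                log("Failed to process item '" + item + "': " + ex);
            }
        }
        return failures;
    }
}
```

Whether a failed item should be retried, parked, or dropped is a separate decision; the sketch only shows that the failure is recorded and the loop survives.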

Monitoring

Consider putting monitoring in place. In my day job we use two types of monitoring. First, errors are reported to the Windows Event Log in some cases, and Microsoft MOM is used to notify us of any/all issues that are going on in the environment. Second, a scheduled job runs every X minutes and validates that the critical job is in a "Started" state; if it isn't, the scheduled job restarts it. Not elegant, but it works.
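The scheduled check could look something like the sketch below, assuming the critical job is itself a Windows service. It uses `ServiceController` from `System.ServiceProcess` (Windows only; the project needs a reference to that assembly or package). `ServiceWatchdog`, `EnsureRunning`, and the service name passed in are placeholders. Note that the "Started" state shown in the Services console corresponds to `ServiceControllerStatus.Running` in code.

```csharp
using System;
using System.ServiceProcess;

public static class ServiceWatchdog
{
    // Returns true if the service was already running.
    // Otherwise starts it if it is stopped, waits for it to reach Running,
    // and returns false. Throws System.ServiceProcess.TimeoutException if
    // the service does not reach Running within startTimeout.
    public static bool EnsureRunning(string serviceName, TimeSpan startTimeout)
    {
        using (var controller = new ServiceController(serviceName))
        {
            if (controller.Status == ServiceControllerStatus.Running)
                return true;

            // Start() is only valid from Stopped; for pending states we just wait.
            if (controller.Status == ServiceControllerStatus.Stopped)
                controller.Start();

            controller.WaitForStatus(ServiceControllerStatus.Running, startTimeout);
            return false;
        }
    }
}
```

A small console program run from Task Scheduler could call `EnsureRunning` and send a notification whenever it returns false or throws.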

Mitchel Sellers
A: 

I think MOM and/or SolarWinds, or whatever other monitoring application your system administrator might be using, could monitor the machine on which the service is deployed and take proper action (send e-mail, ring phones :)

Sunny