I am working on a Windows service that polls for a connection to network-enabled devices every 15 seconds. If the service is not able to connect to a device, it throws an exception and tries again in 15 seconds. All of this works great.

But let's say one of the devices is down for a day or more. I am filling up my exception log with the same exception every 15 seconds. Is there a standard way to prevent an exception from being written to the event log if the exception being thrown hasn't changed in the last x hours?

+1  A: 

Maybe use a workflow in which the polling interval increases after a certain number of failed polls. For example, poll every 15 seconds about three times; if those fail, increase the polling interval to one minute; if it fails another n times, increase the interval to one hour.
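A minimal sketch of that escalating schedule, for illustration only. The class name `PollSchedule`, the method `NextInterval`, and the value chosen for the second threshold are assumptions; the answer only fixes the first step at roughly three attempts and leaves n open.

```csharp
using System;

public static class PollSchedule
{
    // Illustrative thresholds: "3" comes from the answer, "10" stands in for n.
    public const int FastAttempts = 3;
    public const int SlowAfter = 10;

    // Maps the number of consecutive failed polls to the wait before the next poll.
    public static TimeSpan NextInterval(int consecutiveFailures)
    {
        if (consecutiveFailures < FastAttempts) return TimeSpan.FromSeconds(15);
        if (consecutiveFailures < SlowAfter) return TimeSpan.FromMinutes(1);
        return TimeSpan.FromHours(1);
    }
}
```

The service would reset its failure counter to zero after any successful connection, so a recovered device goes back to the 15 second interval.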

To be honest, the workflow above doesn't really solve your problem. If I were you, I would reverse it. Instead of the server polling for devices, why not do it the other way round? When a device connects to a networked machine, your client-side service sends a message to the server, so that the server knows the device is connected and alive.

Hope this helps...

RWendi
+1  A: 

If you use an exception handling block in your application, and I assume you do, you can switch between different exception handling policies. Start with a policy that writes exception information to the event log, and then after n tries or a set time period, switch to a policy that does not log to the event log.

Increasing the delay between connection attempts will probably solve your problem, e.g. newTimeout = n * atomicTimeout, where n is the number of attempts.
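The formula above could be sketched as follows. `LinearBackoff` and `Delay` are hypothetical names, and the `maxDelay` cap is an added assumption (the answer does not mention one) so that the wait cannot grow without bound during a long outage.

```csharp
using System;

public static class LinearBackoff
{
    // Returns attempt * baseDelay, capped at maxDelay.
    // attempt is 1 for the first retry, 2 for the second, and so on.
    public static TimeSpan Delay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Work in ticks (64-bit) to avoid fractional rounding.
        long ticks = Math.Min(baseDelay.Ticks * (long)attempt, maxDelay.Ticks);
        return TimeSpan.FromTicks(ticks);
    }
}
```

With a 15 second base, the fourth attempt waits one minute; with a 10 minute cap, every attempt from the 40th onward waits 10 minutes.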

+3  A: 

One good way to achieve what you need is to employ the Circuit Breaker design pattern.

I first read about this in the book "Release It! Design and Deploy Production-Ready Software" by Michael T. Nygard, from the Pragmatic Bookshelf, p. 104-107.

The idea of the circuit breaker is that it sits in the path of the connection between systems, passing connections through, watching for the "break condition". For example, it might trigger only if five connections in a row have all failed.

Once the circuit has broken, all calls through the circuit breaker fail immediately, without consulting the external service. This continues until a timeout occurs, when the breaker goes into a half-open state. The next call is attempted - a failure results in the timeout being reset, success in the breaker closing and the system resuming operation.
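As a rough illustration of the closed, open, and half-open states described above, here is a minimal single-threaded sketch. It is not taken from the book or from any library: the class shape, the `TryExecute` method, and the injectable clock are all assumptions made for the example, and it signals a short-circuited call by returning `false` rather than throwing.

```csharp
using System;

public class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openTimeout;
    private readonly Func<DateTime> _now;   // injectable clock (assumption, eases testing)
    private int _failures;
    private DateTime? _openedAt;            // null while the breaker is closed

    public CircuitBreaker(int failureThreshold, TimeSpan openTimeout, Func<DateTime> now = null)
    {
        _failureThreshold = failureThreshold;
        _openTimeout = openTimeout;
        _now = now ?? (() => DateTime.UtcNow);
    }

    public bool IsOpen => _openedAt.HasValue;

    // Returns true only if the action ran and succeeded.
    // onFailure is invoked only when the action was really attempted and threw.
    public bool TryExecute(Action action, Action<Exception> onFailure)
    {
        // Open and still inside the timeout: fail fast without calling the device.
        if (_openedAt.HasValue && _now() - _openedAt.Value < _openTimeout)
            return false;

        try
        {
            action();            // closed, or the single half-open trial call
            _failures = 0;
            _openedAt = null;    // success closes the breaker
            return true;
        }
        catch (Exception ex)
        {
            onFailure(ex);       // e.g. write to the event log
            _failures++;
            // Trip on reaching the threshold, or restart the timeout
            // if the half-open trial call failed.
            if (_openedAt.HasValue || _failures >= _failureThreshold)
                _openedAt = _now();
            return false;
        }
    }
}
```

In a polling service, the 15 second loop would keep calling `TryExecute` with the connection attempt as `action` and the event log write as `onFailure`; while the breaker is open, neither runs. A multi-threaded service would need locking around the state, which this sketch omits.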

A quick Google search found a post by Tim Ross that reads well and goes into more detail.

In your case, you could use a circuit breaker with a timeout of 10 minutes and a trigger of 5 failures. In the case of an all-day failure, your log files would then contain five exceptions logged for the original problem, and then just six more per hour (compared with 240 per hour at 15-second intervals), indicating that the problem persists.

Depending on your requirements, you could include a manual "reset" of the circuit breaker, or you could just leave it to reset automatically when the 10-minute timeout reveals that things are back to normal. The automatic reset could be useful: generally, the fewer things the sysadmins need to fuss with, the better they like it.

Bevan
nice answer, very well put and handy pattern to have in your vocabulary
dove
A: 

What about...

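 // Requires: using System; using System.Diagnostics; using System.Threading;
 int count = 0;
 while (true)
 {
      try
      {
           AttemptStuff();
           count = 0; // success: allow logging again for the next outage
      }
      catch (Exception ex)
      {
           // Log only the first 10 consecutive failures.
           if (count < 10)
           {
                EventLog.WriteEntry("my service", ex.ToString(), EventLogEntryType.Error);
                count++;
           }
      }
      Thread.Sleep(TimeSpan.FromSeconds(15)); // polling interval from the question
 }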
Matt Jacobsen
A: 

The Circuit Breaker pattern is a good idea, I'd say.

Check out this design for a PHP implementation; the same approach can be applied to any language.

http://artur.ejsmont.org/blog/PHP-Circuit-Breaker-initial-Zend-Framework-proposal

Art79