I have a Windows service written in C# that creates a truckload of threads and makes many network connections (WMI, SNMP, simple TCP, HTTP). When I stop the service from the Services MSC snap-in, the stop call returns relatively quickly, but the process keeps running for about 30 seconds.

The primary question is what could be the reason that it is taking 30+ seconds to stop. What can I look for and how do I go about looking for it?

The secondary question is why the Services MSC snap-in (service controller) returns even though the process is still running. Is there a way to make it return only when the process has actually exited?

Here is the code in the OnStop method of the service

protected override void OnStop()
{
   //doing some tracing
   //......

   //doing some minor single threaded cleanup here
   //......

   base.OnStop();

   //doing some tracing here
}

Edit in response to Thread cleanup answers

Many of you have answered that I should keep track of all my threads and then clean them up. I don't think that is a practical approach. Firstly, I don't have access to all managed threads in one location. The software is pretty big, with different components, projects and even 3rd party DLLs that could all be creating threads. There is no way I can keep track of all of them in one location or have a flag that all threads check. Even if I could have all threads check a flag, many threads block on things like semaphores, and while they are blocking they can't check. I would have to make them wait with a timeout, check this global flag, and then wait again.

The IsBackground flag is an interesting thing to check. Again though, how can I find out whether I have any foreground threads running around? I would have to check every section of the code that creates a thread. Is there any other way, maybe a tool that can help me find this out?

Ultimately though, the process does stop. It would seem that I only need to wait for something. However, if I wait in the OnStop method for X amount of time, then the process takes approximately 30 seconds + X to stop. No matter what I try, the process seems to need approximately 30 seconds (it's not always 30 seconds, it can vary) after OnStop returns before it actually stops.

+4  A: 

The service control manager (SCM) will return when you return from OnStop. So you need to fix your OnStop implementation to block until all the threads have finished.

The general approach is to have OnStop signal all your threads to stop, and then wait for them to stop. To avoid blocking indefinitely you can give the threads a time limit to stop, then abort them if they take too long.

Here is what I've done in the past:

  1. Create a global bool flag called Stop, set to false when the service is started.
  2. When the OnStop method is called, set the Stop flag to true, then do a Thread.Join on all the outstanding worker threads.
  3. Each worker thread is responsible for checking the Stop flag and exiting cleanly when it is true. This check should be done frequently, and always before a long-running operation, so that it does not delay the service shutdown for too long.
  4. In the OnStop method, also put a timeout on the Join calls to give the threads a limited time to exit cleanly, after which you just abort them.

Note that in #4 you should give your threads adequate time to exit in the normal case. Abort should only happen in the unusual case where a thread is hung. In that case doing an abort is no worse than the user or the system killing the process (the latter if the computer is shutting down).
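A minimal sketch of steps 1 to 4, assuming the worker threads are created in one place. The `WorkerPool` class and its members are hypothetical names for illustration, not part of the original service. It reports threads that miss the deadline instead of calling Thread.Abort, because Thread.Abort is not supported on .NET Core and later (it throws PlatformNotSupportedException there); on .NET Framework an abort could go where the message is written.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper illustrating the flag-plus-Join pattern.
public sealed class WorkerPool
{
    // volatile so each worker observes the write made by the stopping thread.
    private volatile bool _stop;
    private readonly List<Thread> _workers = new List<Thread>();

    // Step 1: clear the flag and start the workers.
    public void Start(int count)
    {
        _stop = false;
        for (int i = 0; i < count; i++)
        {
            var t = new Thread(WorkLoop) { Name = "worker-" + i };
            _workers.Add(t);
            t.Start();
        }
    }

    // Step 3: each worker checks the flag between short units of work.
    private void WorkLoop()
    {
        while (!_stop)
        {
            Thread.Sleep(10); // stand-in for one short unit of real work
        }
    }

    // Steps 2 and 4: call from OnStop. Sets the flag, then joins each worker
    // with a timeout. Returns how many workers did not exit in time.
    public int Stop(TimeSpan perThreadTimeout)
    {
        _stop = true;
        int stuck = 0;
        foreach (var t in _workers)
        {
            if (!t.Join(perThreadTimeout))
            {
                stuck++;
                Console.Error.WriteLine("Thread " + t.Name + " did not stop in time.");
            }
        }
        return stuck;
    }
}
```

In OnStop this would be called as something like `pool.Stop(TimeSpan.FromSeconds(3))` before `base.OnStop()`. A worker that is still running after the timeout keeps the process alive unless it is a background thread.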

DSO
+1, there's no way to solve this problem until you know what all of your components are doing and have some way to join (and terminate) their long-running operations executing on different threads. You can deal with blocking semaphores by setting a timeout on their Wait operations, executing the wait inside a loop, and checking your shutdown flag as the loop's exit condition.
Jeff Sternal
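The timed wait inside a loop from the comment above could look roughly like this. It is a sketch that assumes a SemaphoreSlim and a caller-supplied shutdown check; `TimedWaitDemo`, `WaitOrShutdown` and `sliceMs` are hypothetical names.

```csharp
using System;
using System.Threading;

public static class TimedWaitDemo
{
    // Waits on the semaphore in short slices instead of blocking forever.
    // Returns true if the semaphore was acquired, false if shutdown was
    // requested before it could be acquired.
    public static bool WaitOrShutdown(
        SemaphoreSlim semaphore,
        Func<bool> shutdownRequested,
        int sliceMs = 100)
    {
        while (!shutdownRequested())
        {
            // Wait(int) returns false when the timeout elapses without acquiring.
            if (semaphore.Wait(sliceMs))
            {
                return true;
            }
        }
        return false;
    }
}
```

The slice length bounds how long a blocked thread can delay shutdown. If the code uses WaitHandle-based primitives (Semaphore, ManualResetEvent), WaitHandle.WaitAny over the semaphore and a shutdown event gives the same effect without polling.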
A: 

Signal your threads to exit their loops, let them finish cleanly, and then Join them. Time each Join with a stopwatch to see where the problems are. Avoid abortive shutdown, for various reasons.

rama-jka toti
A: 

To answer the first question (why would the service continue to run for 30+ seconds): there are many possible reasons. For instance, when using WCF, stopping the host causes the process to stop accepting incoming requests, and it waits for all current requests to finish before stopping.

The same holds true for many other types of network operations: the operations attempt to complete before terminating. This is why most network requests have a built-in timeout value for when the request may have "hung" (server gone down, network problems, etc.).

Without more information on what exactly you are doing, there is no way to tell you specifically why it's taking 30 seconds, but it's probably a timeout.

To answer the second question (why is the service controller returning): I'm not sure. I know that the ServiceController class has a WaitForStatus method that allows you to wait until the given status is reached. It is possible that the service controller is waiting for a predetermined time (another timeout) and then forcibly terminating your application.

It is also very possible that the base.OnStop method has been called and the OnStop method has returned, signalling to the ServiceController that the service has stopped, when in fact some threads have not stopped. You are responsible for terminating these threads.

Oplopanax
+1  A: 

The call to stop the service returns as soon as your OnStop() callback returns. Based on what you've shown, your OnStop() method doesn't do much, which explains why it returns so fast.

There are a couple of ways to cause your service to exit.

First, you can rework the OnStop() method to signal all the threads to close and wait for them to close before exiting. As @DSO suggested, you could use a global bool flag to do this (make sure to mark it as volatile). I generally use a ManualResetEvent, but either would work. Signal the threads to exit. Then join the threads with some kind of timeout period (I usually use 3000 milliseconds). If the threads still haven't exited by then, you can call the Abort() method to terminate them. Generally, the Abort() method is frowned upon, but given that your process is exiting anyway, it's not a big deal. If you consistently have a thread that has to be aborted, you can rework that thread to be more responsive to your shutdown signal.

Second, mark your threads as background threads (see here for more details). It sounds like you are using the System.Threading.Thread class for threads, which are foreground threads by default. Doing this will make sure that the threads do not hold up the process from exiting. This will work fine if you are executing managed code only. If you have a thread that is waiting on unmanaged code, I'm not sure if setting the IsBackground property will still cause the thread to exit automatically on shutdown, i.e., you may still have to rework your threading model to make this thread respond to your shutdown request.
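As a small illustration of the second option, a thread can be marked as a background thread when it is created. `BackgroundThreadDemo` and `StartBackground` are hypothetical names used only for this sketch.

```csharp
using System;
using System.Threading;

public static class BackgroundThreadDemo
{
    // Creates and starts a thread that will not keep the process alive.
    public static Thread StartBackground(ThreadStart work, string name)
    {
        var thread = new Thread(work)
        {
            Name = name,
            // Threads created with new Thread(...) are foreground by default.
            // Set this before Start so the thread never runs as a foreground thread.
            IsBackground = true
        };
        thread.Start();
        return thread;
    }
}
```

Thread pool threads are already background threads, so this only matters for threads created directly with the Thread class. Background threads are simply stopped when the process exits, so any cleanup they still had pending does not run.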

Matt Davis
I accepted this answer because it mentioned the IsBackground thread property. That was the only thing I needed to change. I don't believe in creating a global flag that any and every component should use - that's too much coupling in my opinion. However, if threads are correctly marked as background threads then the service stops fine.
Mark
I would not use the global flag/event either. What I've done is create a wrapper around a System.Threading.Thread object. The constructor of this wrapper creates the thread, sets the name, and sets the IsBackground property. I've got public methods to start and stop the thread. The Stop() method, in particular, sets a private ManualResetEvent that signals the thread to stop running. To make it fully flexible, the constructor accepts what amounts to a System.Threading.ThreadStart delegate, allowing anyone to use this class without having to inherit from it.
Matt Davis
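One way such a wrapper might look, as a sketch only. The class name `StoppableThread` is hypothetical, and passing the stop event to the delegate as a WaitHandle is an assumption made here so the work can observe the signal; the comment above describes a ThreadStart-like delegate without saying how the work sees the event.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper: owns one thread and the event that tells it to stop.
public sealed class StoppableThread : IDisposable
{
    private readonly ManualResetEvent _stopEvent = new ManualResetEvent(false);
    private readonly Thread _thread;

    // The work delegate receives the stop event and should return soon after
    // the event is set (assumption for this sketch).
    public StoppableThread(string name, Action<WaitHandle> work)
    {
        _thread = new Thread(() => work(_stopEvent))
        {
            Name = name,
            IsBackground = true
        };
    }

    public void Start()
    {
        _thread.Start();
    }

    // Signals the thread to stop and waits up to the timeout.
    // Returns true if the thread is no longer running.
    public bool Stop(TimeSpan timeout)
    {
        _stopEvent.Set();
        if (!_thread.IsAlive)
        {
            return true; // never started or already finished
        }
        return _thread.Join(timeout);
    }

    public void Dispose()
    {
        _stopEvent.Dispose();
    }
}
```

A worker body would typically loop on `stop.WaitOne(intervalMs)`, doing one unit of work each time the wait times out. Each component can own its wrappers, so no process-wide flag is needed.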
A: 

A simple way to do this may look like this:
-first create a global event

ManualResetEvent shutdownEvent;

-at service start, create the manual reset event with an initial state of unsignaled
shutdownEvent = new ManualResetEvent(false);
-at service stop, signal the event
shutdownEvent.Set();

-do not forget to wait for the threads to end
do
{
 // ask the Service Control Manager for more time (e.g. RequestAdditionalTime)
 // keep track of how long you have waited for the threads to stop
}
while ( not_all_threads_stopped );

-each thread must test the event from time to time and stop when it is signaled

if ( shutdownEvent.WaitOne(delay, true) ) break;
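The wait loop above could be filled in along these lines. This is a sketch: `ShutdownWait`, `WaitForThreads` and the `requestAdditionalTime` callback are hypothetical names. Inside a ServiceBase subclass the callback could simply forward to RequestAdditionalTime(int milliseconds).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class ShutdownWait
{
    // Joins each thread in short slices. After every slice that times out it
    // asks for more time through the callback, until the overall limit is hit.
    // Returns true if all threads ended, false if the limit was reached first.
    public static bool WaitForThreads(
        IEnumerable<Thread> threads,
        TimeSpan limit,
        int sliceMs,
        Action<int> requestAdditionalTime)
    {
        var clock = Stopwatch.StartNew();
        foreach (var thread in threads)
        {
            while (!thread.Join(sliceMs))
            {
                if (clock.Elapsed >= limit)
                {
                    return false; // stop waiting; the caller decides what to do next
                }
                // Ask for a bit more than one slice so the request does not lapse
                // before the next check.
                requestAdditionalTime(sliceMs * 2);
            }
        }
        return true;
    }
}
```

In OnStop this would run right after `shutdownEvent.Set()`, with the list of threads the component owns.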
lsalamon