Now, this might be a very newbie question, but I don't really have experience with multithreaded programming and I haven't fully understood how threads work compared to processes.

When a process on my machine hangs, say it's waiting for some IO that never comes or something similar, I can kill and restart it because other processes aren't affected and can, for example, still operate my terminal. This is very obvious, of course.

I'm not sure whether it is the same with threads inside a process: if one hangs, are the others unaffected? In other words, can I run a "watchdog" thread which supervises the other threads and, for example, kills and recreates hanging threads? I'm thinking, for instance, of a threadpool that I don't want to be drained by occasional hangups.

+1  A: 

If a thread hangs, the others will continue executing. However, if the hung thread holds a semaphore, critical section, or other kind of synchronization object, and another thread attempts to lock the same object, that thread blocks as well, and you now have two stuck threads.
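To make the lock problem concrete, here is a minimal sketch in Python (my choice of language, since the question is conceptual). The names `hung_worker`, `try_lock`, and the `release` event are made up for illustration; the event just stands in for whatever the hung thread is really waiting on.

```python
import threading

lock = threading.Lock()
holding = threading.Event()   # set once the "hung" thread owns the lock
release = threading.Event()   # stands in for the event the hung thread waits on

def hung_worker():
    with lock:
        holding.set()
        release.wait()        # simulated hang while the lock is still held

def try_lock(timeout):
    """Return True if the lock could be taken within `timeout` seconds."""
    got_it = lock.acquire(timeout=timeout)
    if got_it:
        lock.release()
    return got_it

t = threading.Thread(target=hung_worker, daemon=True)
t.start()
holding.wait()                # make sure the worker really holds the lock

blocked = not try_lock(0.2)   # a second thread cannot get the lock

release.set()                 # un-hang the worker so the sketch can finish
t.join()
free_again = try_lock(0.2)    # the lock is available once the worker exits
```

Without the `timeout` argument, the second `acquire` would simply wait forever, which is the situation described above.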

It is possible to monitor other threads from a thread. Depending on your platform, there are applicable APIs; I refer you to those, as you haven't stated what OS you are writing for.
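One platform-neutral way to do the monitoring is a heartbeat: each worker records when it last made progress, and the supervising thread checks how old that record is. This is only a sketch of the idea, not any particular OS API; the `Heartbeat` class, the `stuck` event, and the timing values are all assumptions for illustration.

```python
import threading
import time

class Heartbeat:
    """Hypothetical helper: remembers when a worker last reported progress."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = time.monotonic()

    def beat(self):
        with self._lock:
            self._last = time.monotonic()

    def is_stale(self, max_silence):
        with self._lock:
            return time.monotonic() - self._last > max_silence

stuck = threading.Event()     # stands in for an operation that hangs

def worker(hb, healthy_rounds):
    for _ in range(healthy_rounds):
        hb.beat()             # report progress while things go well
        time.sleep(0.02)
    stuck.wait()              # simulated hang: no more heartbeats

hb = Heartbeat()
t = threading.Thread(target=worker, args=(hb, 3), daemon=True)
t.start()

time.sleep(0.03)
early = hb.is_stale(0.3)      # checked while the worker is still beating
time.sleep(0.6)
late = hb.is_stale(0.3)       # checked after the worker has gone silent

stuck.set()                   # un-hang the worker so the sketch can finish
t.join()
```

Note that this only detects the hang. What the watchdog can safely do about it is a separate question.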

John Källén
Thanks, my question was more conceptual than about a specific OS / language / API
Hanno Fietz
A: 

You didn't mention the platform, but as far as I know, the NT kernel schedules threads, not processes, and treats them independently in that respect. This is not true on every platform: some, like Windows 3.1, do not use preemptive multitasking, so if one thread goes into an infinite loop, everything is affected.

Mehrdad Afshari
A: 

The simple answer is yes.

Typically, though, the code in a thread will handle this possibility itself. Most APIs that perform operations that may hang have timeout features of their own.

Alternatively, a thread can wait not just on the operation that might hang but also on a timer. If the timer signals first, it's assumed the operation has hung.

For a watchdog thread to be useful in this scenario, it would need some co-operation from the code in the other threads anyway, so having the threads themselves set timeouts makes more sense than a watchdog.
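As a rough illustration of waiting with a timeout, here is a Python sketch using `concurrent.futures`. The function `operation_that_may_hang` and the `never_arrives` event are invented for the example and simply simulate IO that never completes.

```python
import concurrent.futures
import threading

never_arrives = threading.Event()   # hypothetical stand-in for IO that never completes

def operation_that_may_hang():
    never_arrives.wait()
    return "done"

pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
future = pool.submit(operation_that_may_hang)

try:
    # Wait for the operation, but give up after 0.2 seconds.
    result = future.result(timeout=0.2)
except concurrent.futures.TimeoutError:
    result = None                   # assume the operation has hung

# The timeout only freed the caller; the pool thread is still occupied.
worker_still_busy = future.running()

never_arrives.set()                 # let the simulated operation finish
late_result = future.result(timeout=5)
pool.shutdown(wait=True)
```

The caller gets control back, but the pool thread stays tied up until the operation itself returns, which is why timeouts inside the operation (socket timeouts and the like) are the more robust option.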

AnthonyWJones
A: 

Threads get scheduled independently of each other, so you could indeed stop and restart hanging threads. However, threads do not run in separate address spaces, so a misbehaving thread can still overwrite memory or take locks needed by other threads in the same process.

Mendelt
+3  A: 

Threads are independent, but there's a difference between a process and a thread, and that is that in the case of processes, the operating system does more than just "kill" it. It also cleans up after it.

If you start killing threads that seem to be hung, you will most likely leave resources locked or otherwise in limbo, resources that the operating system would release for you if you did the same to a process.

So for instance, if you open a file for writing, and start producing data and write it to the file, and this thread now hangs, for whatever reason, killing the thread will leave the file still open, and most likely locked, up until you close the entire program.

So the real answer to your question is: No, you cannot safely kill threads the hard way.

If you simply ask a thread to close, that's different because then the thread is still in control and can clean up and close resources before terminating, but calling an API function like "KillThread" or similar is bad.
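Asking a thread to close might look roughly like this in Python. The `stop_requested` event and the `cleaned_up` list are names made up for the sketch, and `os.devnull` stands in for a real output file.

```python
import os
import threading

stop_requested = threading.Event()
cleaned_up = []                     # records whether the worker closed its file

def worker():
    out = open(os.devnull, "w")     # stand-in for a real output file
    try:
        # wait() returns True as soon as a stop has been requested
        while not stop_requested.wait(timeout=0.01):
            out.write("data\n")
    finally:
        out.close()                 # the thread cleans up after itself
        cleaned_up.append(out.closed)

t = threading.Thread(target=worker)
t.start()

stop_requested.set()                # ask politely instead of killing
t.join(timeout=5)
stopped = not t.is_alive()
```

This only works if the thread gets back to checking the flag, so it does not help with a thread that is truly stuck inside a blocking call.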

Lasse V. Karlsen
Very nice answer, particularly the thread vs. process mention. In short, the OS cleans up after a process and it doesn't clean up after a thread!
D.Shawley
A: 

There's a pretty good overview of some of the pitfalls of killing and suspending threads in the Java documentation explaining why the methods that do it are deprecated. Basically, if you expect to be able to kill a thread, you have to be very, very careful to make it work without some sort of corruption. If a thread is hung it's probably because of a bug...in which case killing it will probably result in corruption.

http://java.sun.com/j2se/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html

If you need to be able to kill things, use processes.
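A small Python sketch of the process-based approach, assuming the risky work can be moved into a child process. The inline child script here just sleeps to simulate a hang.

```python
import subprocess
import sys

# The child simulates a hung task by sleeping for a long time.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)

still_running = child.poll() is None   # no exit code yet: still alive

child.kill()                 # hard kill; the OS reclaims the child's resources
child.wait(timeout=10)       # reap the child and collect its exit status
exit_code = child.returncode
```

The parent keeps running normally, and it can start a fresh child afterwards if the work needs to be retried.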

Erik Engbrecht