Our scenario is a network scanner.

It connects to a set of hosts and scans them in parallel for a while using low priority background threads.

I want to be able to schedule lots of work but have only, say, ten hosts (or whatever number I choose) scanned in parallel at any given time. Even if I create my own threads, the many callbacks and other asynchronous goodness use the ThreadPool, and I end up running out of resources. I should look at MonoTorrent...

If I use the ThreadPool, can I limit my application to some number of threads that will leave enough for the rest of the application to run smoothly?

Is there a thread pool that I can initialize to n long-lived threads?

[Edit] No one seems to have noticed that I made some comments on some responses so I will add a couple things here.

  • Threads should be cancellable both gracefully and forcefully.
  • Threads should have low priority leaving the GUI responsive.
  • Threads are long running but in Order(minutes) and not Order(days).

Work for a given target host is basically:

  For each test
    Probe target (work is done mostly on the target end of an SSH connection)
    Compare probe result to expected result (work is done on engine machine)
  Prepare results for host

Can someone explain why using SmartThreadPool is marked with a negative usefulness?

+5  A: 

The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself (or at least a significant percentage of the time it takes to execute the method). As you've seen, .NET itself consumes thread pool threads, so you can't reserve a block of them for yourself without risking starving the runtime.

Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have to roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
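
To make the "roll your own" option concrete, here is a minimal sketch of a fixed set of long-lived, low-priority worker threads draining a shared queue. The `FixedWorkerPool` class and its members are hypothetical names invented for this illustration, not an existing .NET type, and the `Thread.Sleep` call merely stands in for scanning one host. It only supports a graceful stop; forceful cancellation is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper: n long-lived worker threads that drain a shared work queue.
class FixedWorkerPool
{
    readonly Queue<Action> work = new Queue<Action>();
    readonly List<Thread> workers = new List<Thread>();
    readonly object locker = new object();
    bool stopping; // guarded by locker

    public FixedWorkerPool(int workerCount)
    {
        for (int i = 0; i < workerCount; i++)
        {
            Thread t = new Thread(WorkLoop);
            t.IsBackground = true;                   // don't keep the process alive
            t.Priority = ThreadPriority.BelowNormal; // leave the GUI responsive
            workers.Add(t);
            t.Start();
        }
    }

    public void Enqueue(Action item)
    {
        lock (locker)
        {
            work.Enqueue(item);
            Monitor.Pulse(locker); // wake one idle worker
        }
    }

    // Graceful stop: running items finish, items still queued are dropped.
    // Must not be called from inside a work item (it joins the workers).
    public void Stop()
    {
        lock (locker)
        {
            stopping = true;
            work.Clear();
            Monitor.PulseAll(locker);
        }
        foreach (Thread t in workers) t.Join();
    }

    void WorkLoop()
    {
        while (true)
        {
            Action item;
            lock (locker)
            {
                while (work.Count == 0 && !stopping) Monitor.Wait(locker);
                if (stopping) return;
                item = work.Dequeue();
            }
            try { item(); }
            catch (Exception ex) { Console.WriteLine("Work item failed: " + ex.Message); }
        }
    }
}

class Program
{
    static void Main()
    {
        const int hostCount = 50;
        FixedWorkerPool pool = new FixedWorkerPool(10); // at most 10 hosts at once
        int completed = 0;
        ManualResetEvent allDone = new ManualResetEvent(false);

        for (int i = 0; i < hostCount; i++)
        {
            pool.Enqueue(delegate
            {
                Thread.Sleep(50); // stand-in for scanning one host
                if (Interlocked.Increment(ref completed) == hostCount) allDone.Set();
            });
        }

        allDone.WaitOne();
        pool.Stop();
        Console.WriteLine("Scanned " + completed + " hosts");
    }
}
```

Because the worker count is fixed at construction, the concurrency limit is simply the number of threads, and none of them come from the CLR ThreadPool.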

Jeff Sternal
Background workers are meant for scheduling long-running tasks while keeping the UI responsive. They are not the right choice for application logic threading, however.
ntziolis
@ntziolis - I agree, though we don't have enough information about the application to totally rule it out: a 'network scanner' could mean [a GUI application](http://www.softperfect.com/products/networkscanner/).
Jeff Sternal
@Jeff - That is true, maybe the creator could add some additional information?
ntziolis
The reason for a pool is to schedule all the work and set a maximum for the total number of concurrent threads that should run at any given time. I also forgot to mention that the user should be able to abort the threads.
LogicMagic
A: 

Use the thread pool. It has good capabilities.

Alternatively, you can look at the SmartThreadPool implementation here: http://www.codeproject.com/KB/threads/smartthreadpool.aspx

or http://www.codeproject.com/KB/threads/ExtendedThreadPool.aspx for a limit on the maximum number of working threads

Kangkan
+4  A: 

In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction) you can specify that the Task is long running. We have had good experiences with that (long being days rather than minutes or hours).

You can use it in .NET 2 as well, but there it's actually an extension; check here.

In VS2010, debugging of parallel applications based on Tasks (not threads) has been radically improved. It's advised to use Tasks rather than raw threads whenever possible, since they let you handle parallelism in a more object-oriented way.

UPDATE
Tasks that are NOT specified as long running are queued into the thread pool (or whatever other scheduler is in use).
But if a task is specified to be long running, the default scheduler just creates a standalone Thread for it; no thread pool is involved.
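
As a rough illustration of how long-running Tasks could be combined with the question's limit on parallel hosts, the sketch below starts each scan with `TaskCreationOptions.LongRunning` and uses a `SemaphoreSlim` to cap how many run at once. The `ThrottledScan` class, `ScanHost`, `ScanAll`, and the host names are hypothetical, and the semaphore is my addition rather than something the TPL does for you. Cancellation here is cooperative only (the work must check the token); the TPL does not forcibly kill a running task.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class ThrottledScan
{
    // Hypothetical stand-in for scanning one host; checks the token between "tests".
    public static void ScanHost(string host, CancellationToken token)
    {
        for (int test = 0; test < 5; test++)
        {
            token.ThrowIfCancellationRequested(); // graceful cancellation point
            Thread.Sleep(20);                     // pretend to probe the target
        }
    }

    // Runs at most maxParallel scans at once; returns how many hosts completed.
    public static int ScanAll(IList<string> hosts, int maxParallel, CancellationToken token)
    {
        int completed = 0;
        SemaphoreSlim slots = new SemaphoreSlim(maxParallel);
        List<Task> tasks = new List<Task>();
        try
        {
            foreach (string host in hosts)
            {
                slots.Wait(token); // blocks while maxParallel scans are running
                string current = host;
                tasks.Add(Task.Factory.StartNew(() =>
                {
                    try
                    {
                        ScanHost(current, token);
                        Interlocked.Increment(ref completed);
                    }
                    finally { slots.Release(); }
                }, token, TaskCreationOptions.LongRunning, TaskScheduler.Default));
            }
        }
        catch (OperationCanceledException) { /* stop scheduling new hosts */ }

        try { Task.WaitAll(tasks.ToArray()); }
        catch (AggregateException) { /* some scans were cancelled or failed */ }
        return completed;
    }

    static void Main()
    {
        List<string> hosts = new List<string>();
        for (int i = 0; i < 50; i++) hosts.Add("host" + i);

        CancellationTokenSource cts = new CancellationTokenSource();
        int completed = ScanAll(hosts, 10, cts.Token);
        Console.WriteLine("Completed " + completed + " of " + hosts.Count);
    }
}
```

Calling `cts.Cancel()` from the UI would stop new hosts from being scheduled and let running scans exit at their next cancellation check.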

ntziolis
Doesn't TPL sit on top of the ThreadPool?
LogicMagic
The purpose of the TPL is to add a sensible layer of abstraction on top of threading and scheduling different tasks. The beauty of the TPL is that you can specify a **scheduler**. The scheduler **CAN** in fact be the thread pool, but you can just as well define your own. For more info check: http://bit.ly/aW4Lq4 and http://bit.ly/9VAkbf
ntziolis
+1  A: 

In your particular case, the best option would be neither threads, nor the thread pool, nor BackgroundWorker, but the async programming model (BeginXXX, EndXXX) provided by the framework.

The advantage of using the asynchronous model is that the TCP/IP stack uses callbacks whenever there is data to read, and the callback is automatically run on a thread from the thread pool.

Using the asynchronous model, you can control the number of requests initiated per time interval. If you want, you can also initiate all the requests from a lower-priority thread while processing the responses on a normal-priority thread, which means the packets will stay in the internal TCP queue of the networking stack for as short a time as possible.

Asynchronous Client Socket Example - MSDN

P.S. For multiple concurrent, long-running jobs that don't do a lot of computation but mostly wait on I/O (network, disk, etc.), the better option is always to use a callback mechanism rather than threads.
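
For illustration, here is a small sketch of the Begin/End pattern using `Socket.BeginConnect` and `Socket.EndConnect`. Note that it caps the number of outstanding connection attempts with a `Semaphore` rather than limiting requests per time interval, which is a simplification on my part. The `AsyncConnectDemo` class and `ConnectAll` method are hypothetical, and the demo connects to a local `TcpListener` only so that it can run without a real network target.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class AsyncConnectDemo
{
    // Connects to every endpoint with BeginConnect/EndConnect, keeping at most
    // maxPending attempts outstanding. Returns the number that succeeded.
    public static int ConnectAll(IPEndPoint[] targets, int maxPending)
    {
        if (targets.Length == 0) return 0;
        int succeeded = 0;
        int remaining = targets.Length;
        Semaphore slots = new Semaphore(maxPending, maxPending);
        ManualResetEvent allDone = new ManualResetEvent(false);

        foreach (IPEndPoint target in targets)
        {
            slots.WaitOne(); // throttle: wait for a free slot before starting
            Socket socket = new Socket(target.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
            socket.BeginConnect(target, delegate(IAsyncResult ar)
            {
                // Invoked when the attempt completes; no dedicated thread was
                // blocked while the connection was in progress.
                try
                {
                    socket.EndConnect(ar);
                    Interlocked.Increment(ref succeeded);
                }
                catch (SocketException) { /* host unreachable or connection refused */ }
                finally
                {
                    socket.Close();
                    slots.Release();
                    if (Interlocked.Decrement(ref remaining) == 0) allDone.Set();
                }
            }, null);
        }

        allDone.WaitOne();
        return succeeded;
    }

    static void Main()
    {
        // Local listener on an OS-assigned port, used only as a test target.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        IPEndPoint local = (IPEndPoint)listener.LocalEndpoint;

        IPEndPoint[] targets = new IPEndPoint[20];
        for (int i = 0; i < targets.Length; i++) targets[i] = local;

        int ok = ConnectAll(targets, 5);
        listener.Stop();
        Console.WriteLine(ok + " of " + targets.Length + " connections succeeded");
    }
}
```

A real scanner would continue with BeginSend/BeginReceive inside the callback instead of closing the socket immediately, and would also need to handle a `BeginConnect` call that throws synchronously.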

Pop Catalin
A: 

I'd create your own thread manager. In the following simple example, a Queue is used to hold waiting threads and a Dictionary is used to hold active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.

You can change the maximum number of running threads from your UI, and you can pass extra info to the ThreadDone callback for monitoring performance, etc. If a thread fails because of, say, a network timeout, you can put its work back into the queue (on a new Thread object, since a finished Thread cannot be restarted). Add extra control methods to Supervisor for pausing, stopping, etc.

using System;
using System.Collections.Generic;
using System.Threading;

namespace ConsoleApplication1
{
    public delegate void CallbackDelegate(int idArg);

    class Program
    {
        static void Main(string[] args)
        {
            new Supervisor().Run();
            Console.WriteLine("Done");
            Console.ReadKey();
        }
    }

    class Supervisor
    {
        Queue<System.Threading.Thread> waitingThreads = new Queue<System.Threading.Thread>();
        Dictionary<int, System.Threading.Thread> activeThreads = new Dictionary<int, System.Threading.Thread>();
        int maxRunningThreads = 10;
        object locker = new object();
        volatile bool done;

        public void Run()
        {
            // queue up some threads
            for (int i = 0; i < 50; i++)
            {
                Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
                newThread.IsBackground = true;
                waitingThreads.Enqueue(newThread);
            }
            LaunchWaitingThreads();
            while (!done) Thread.Sleep(200);
        }

        // keep starting waiting threads until we max out
        void LaunchWaitingThreads()
        {
            lock (locker)
            {
                while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
                {
                    Thread nextThread = waitingThreads.Dequeue();
                    activeThreads.Add(nextThread.ManagedThreadId, nextThread);
                    nextThread.Start();
                    Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
                }
                done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
            }
        }

        // this is called by each thread when it's done
        void ThreadDone(int threadIdArg)
        {
            lock (locker)
            {
                // remove thread from active pool
                activeThreads.Remove(threadIdArg);
            }
            Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
            LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
        }
    }

    class Worker
    {
        CallbackDelegate callback;
        public Worker(CallbackDelegate callbackArg)
        {
            callback = callbackArg;
        }

        public void DoWork()
        {
            System.Threading.Thread.Sleep(new Random().Next(100, 1000));
            callback(System.Threading.Thread.CurrentThread.ManagedThreadId);
        }
    }
}
ebpower