I'm trying to use WebClient to download a bunch of files asynchronously. From my understanding, this is possible, but you need to have one WebClient object for each download. So I figured I'd just throw a bunch of them in a queue at the start of my program, then pop them off one at a time and tell them to download a file. When the file is done downloading, they can get pushed back onto the queue.

Pushing stuff onto my queue shouldn't be too bad, I just have to do something like:

lock(queue) {
 queue.Enqueue(webClient);
}

Right? But what about popping them off? I want my main thread to sleep when the queue is empty (wait until another web client is ready so it can start the next download). I suppose I could use a Semaphore alongside the queue to keep track of how many elements are in the queue, and that would put my thread to sleep when necessary, but it doesn't seem like a very good solution. What happens if I forget to decrement/increment my Semaphore every time I push/pop something on/off my queue and they get out of sync? That would be bad. Isn't there some nice way to have queue.Dequeue() automatically sleep until there is an item to dequeue then proceed?

I'd also welcome solutions that don't involve a queue at all. I just figured a queue would be the easiest way to keep track of which WebClients are ready for use.

+2  A: 

What you want is a producer/consumer queue.

I have a simple example of this in my threading tutorial - scroll about half way down that page. It was written pre-generics, but it should be easy enough to update. There are various features you may need to add, such as the ability to "stop" the queue: this is often performed by using a sort of "null work item" token; you inject as many "stop" items in the queue as you have dequeuing threads, and each of them stops dequeuing when it hits one.
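The "stop item" idea can be sketched like this. A tiny wait/pulse queue is inlined only so the example stands alone; `null` plays the role of the stop token, and one token is injected per consumer thread:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class StopTokenDemo
{
    static readonly Queue<string> queue = new Queue<string>();
    static readonly object queueLock = new object();

    // Minimal blocking dequeue (wait/pulse), just enough for the demo.
    public static string Dequeue()
    {
        lock (queueLock)
        {
            while (queue.Count == 0)
                Monitor.Wait(queueLock);
            return queue.Dequeue();
        }
    }

    public static void Enqueue(string item)
    {
        lock (queueLock)
        {
            queue.Enqueue(item);
            Monitor.Pulse(queueLock);
        }
    }

    static void Main()
    {
        const int consumers = 2;
        var threads = new List<Thread>();
        for (int i = 0; i < consumers; i++)
        {
            var t = new Thread(() =>
            {
                string item;
                // Each consumer stops when it dequeues the null "stop" token.
                while ((item = Dequeue()) != null)
                    Console.WriteLine("processed " + item);
            });
            t.Start();
            threads.Add(t);
        }

        Enqueue("file1");
        Enqueue("file2");
        // Inject one stop token per consumer thread.
        for (int i = 0; i < consumers; i++)
            Enqueue(null);

        threads.ForEach(t => t.Join());
    }
}
```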

Searching for "producer consumer queue" may well provide you with better code samples - that one was really just to demonstrate waiting/pulsing.

IIRC, there are types in .NET 4.0 (as part of Parallel Extensions) which will do the same thing but much better :) I think you want a BlockingCollection wrapping a ConcurrentQueue.
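A minimal sketch of that .NET 4.0 approach, using `BlockingCollection<T>` and `ConcurrentQueue<T>` from `System.Collections.Concurrent` (the URLs are placeholders):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class BlockingCollectionDemo
{
    static void Main()
    {
        // BlockingCollection uses a ConcurrentQueue by default; it's passed
        // explicitly here to make the pairing visible.
        var queue = new BlockingCollection<string>(new ConcurrentQueue<string>());

        var consumer = new Thread(() =>
        {
            // Blocks while the collection is empty, and ends cleanly
            // once CompleteAdding has been called and the queue drains.
            foreach (var url in queue.GetConsumingEnumerable())
                Console.WriteLine("would download " + url);
        });
        consumer.Start();

        queue.Add("http://example.com/a.zip"); // hypothetical URLs
        queue.Add("http://example.com/b.zip");
        queue.CompleteAdding(); // the built-in "stop" signal

        consumer.Join();
    }
}
```

`CompleteAdding` replaces the hand-rolled "stop token" trick: every consumer's `GetConsumingEnumerable` loop finishes once the collection is drained.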

Jon Skeet
Why do you need the while loop around `Monitor.Wait(listLock);`? Shouldn't the consumer only be woken when there is an item ready for consumption?
Mark
Also, can't you `lock` on any object? Why not lock on the queue rather than having a separate lock object?
Mark
+1. Another good threading tutorial with a producer/consumer queue is this one: http://www.albahari.com/threading/part2.aspx#_Synchronization_Essentials (just do a find on "producer")
RichardOD
@Mark: If you don't have the while loop, the consumer could be woken but then *another* consumer could come and grab the item before the woken one. Wait calls should almost *always* have a while loop testing some condition. And yes, you *can* (unfortunately IMO) lock on any object - but I believe it's a bad idea. It means you don't know what other code is going to lock on it. Does Queue take a lock on itself? If it does, what would happen? Maybe it would be okay in this case... but I find it easier to think about if I keep separate private locks.
Jon Skeet
Oh that's just confusing! Because all the consumers merrily enter that lock block, and chill at `Monitor.Wait`. You'd think that because it's a lock, there would only be one thread in there, but that isn't the case because `Monitor.Wait` releases the lock and lets everyone else in. That, and now you have a form of lazy busy-waiting; "take a nap and check back later." I can't say I'm very fond of this solution. You might save on a counter, but the overhead of wrapping your head around it isn't worth it, to me. Good to know though, so thanks.
Mark
It's not busy-waiting - it doesn't just "check back later"; the thread is awoken when the monitor is pulsed. As for the behaviour of Monitor.Wait - if you're going to write your own threading collections, you *really* should understand these things before you start. Once you do, it's not a problem at all.
Jon Skeet
Yes, but I mean when it's pulsed, it has to check the queue count again. Potentially, one thread could get stuck in that loop and check that count many more times than is necessary. Interestingly though, these monitors came in handy for solving an unrelated problem :D Releasing the lock is handy.
Mark
@Mark: That's not busy-waiting in any practical way though. Yes, it's theoretically possible for another thread to come in at the exact time required to "steal" the next item every single time - but how often do you think that really happens in real life?
Jon Skeet
+1  A: 

Here's an example using a Semaphore. IMO it is a lot cleaner than using a Monitor:

public class BlockingQueue<T>
{
    Queue<T> _queue = new Queue<T>();
    // The semaphore's count mirrors the number of items in the queue.
    Semaphore _sem = new Semaphore(0, Int32.MaxValue);

    public void Enqueue(T item)
    {
        lock (_queue)
        {
            _queue.Enqueue(item);
        }

        // Signal one waiting consumer that an item is available.
        _sem.Release();
    }

    public T Dequeue()
    {
        // Blocks until at least one item has been enqueued.
        _sem.WaitOne();

        lock (_queue)
        {
            return _queue.Dequeue();
        }
    }
}
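For the WebClient pooling described in the question, usage might look something like this (the class above is repeated so the sketch compiles on its own; the URLs are placeholders):

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;

// The BlockingQueue<T> from above, repeated so this sketch stands alone.
public class BlockingQueue<T>
{
    Queue<T> _queue = new Queue<T>();
    Semaphore _sem = new Semaphore(0, Int32.MaxValue);

    public void Enqueue(T item)
    {
        lock (_queue) { _queue.Enqueue(item); }
        _sem.Release();
    }

    public T Dequeue()
    {
        _sem.WaitOne();
        lock (_queue) { return _queue.Dequeue(); }
    }
}

class WebClientPoolSketch
{
    static void Main()
    {
        var pool = new BlockingQueue<WebClient>();
        for (int i = 0; i < 4; i++)
            pool.Enqueue(new WebClient());

        // Hypothetical URL list; substitute your own.
        string[] urls = { "http://example.com/a.zip", "http://example.com/b.zip" };

        foreach (var url in urls)
        {
            // Blocks until a client is free.
            WebClient client = pool.Dequeue();
            // When a download finishes, return the client to the pool.
            client.DownloadFileCompleted += (s, e) => pool.Enqueue((WebClient)s);
            client.DownloadFileAsync(new Uri(url), System.IO.Path.GetFileName(url));
        }
        // A real program would wait for all downloads before exiting.
    }
}
```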
DSO
You don't need a separate counter variable anyway - the queue maintains its own count.
Jon Skeet
Yeah... that's what I was thinking too. Just find it strange that this isn't a "standard" class. I'll probably go with this solution, thanks!
Mark
@Jon: It maintains its own count, yes, but getting the count isn't thread-safe is it? So you'd have to lock the queue to get the count, and then if the count is 0, release it, wait, grab the lock again..... it just gets nastier, doesn't it?
Mark
If you've already got a lock whenever you access the queue (which you have in my example) then you're fine. My version works without a problem but doesn't have a separate counter variable.
Jon Skeet
True about the queue having its own count (not sure why the other answer using a Monitor declared an explicit count). Still, I think the Semaphore is a bit more elegant than a Monitor for this particular problem, but I guess that's subjective.
DSO
I agree with DSO. Simpler to understand, and no loops. It's working out quite well in my program :D
Mark
A: 

I use a BlockingQueue to deal with exactly this type of situation. You can call .Dequeue when the queue is empty, and the calling thread will simply wait until there is something to Dequeue.

public class BlockingQueue<T> : IEnumerable<T>
{
    private int _count = 0;
    private Queue<T> _queue = new Queue<T>();

    public T Dequeue()
    {
        lock (_queue)
        {
            // Wait releases the lock while blocked and reacquires it on
            // wakeup, so the count must be rechecked in a loop.
            while (_count <= 0)
                Monitor.Wait(_queue);
            _count--;
            return _queue.Dequeue();
        }
    }

    public void Enqueue(T data)
    {
        if (data == null)
            throw new ArgumentNullException("data");
        lock (_queue)
        {
            _queue.Enqueue(data);
            _count++;
            // Wake one thread blocked in Monitor.Wait, if any.
            Monitor.Pulse(_queue);
        }
    }

    IEnumerator<T> IEnumerable<T>.GetEnumerator()
    {
        // An infinite consumer stream: blocks whenever the queue is empty.
        while (true)
            yield return Dequeue();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<T>) this).GetEnumerator();
    }
}

Just use this in place of a normal Queue and it should do what you need.
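For instance, the enumerator lets a consumer be written as a plain `foreach` loop. A sketch (the queue is repeated in condensed form, without the explicit count, so this compiles on its own):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Condensed copy of the BlockingQueue<T> above so this sketch stands alone.
public class BlockingQueue<T> : IEnumerable<T>
{
    private Queue<T> _queue = new Queue<T>();

    public T Dequeue()
    {
        lock (_queue)
        {
            while (_queue.Count == 0)
                Monitor.Wait(_queue);
            return _queue.Dequeue();
        }
    }

    public void Enqueue(T data)
    {
        lock (_queue)
        {
            _queue.Enqueue(data);
            Monitor.Pulse(_queue);
        }
    }

    IEnumerator<T> IEnumerable<T>.GetEnumerator()
    {
        while (true) yield return Dequeue();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<T>)this).GetEnumerator();
    }
}

class ConsumerSketch
{
    static void Main()
    {
        var queue = new BlockingQueue<string>();

        // The enumerator never ends, so run the foreach on its own thread;
        // it simply blocks whenever the queue is empty.
        var consumer = new Thread(() =>
        {
            foreach (var item in queue)
                Console.WriteLine("got " + item);
        });
        consumer.IsBackground = true; // let the process exit while it's blocked
        consumer.Start();

        queue.Enqueue("one");
        queue.Enqueue("two");
        Thread.Sleep(200); // hypothetical pause so the consumer can drain
    }
}
```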

Richard
Any reason for using your own count instead of asking the queue?
Jon Skeet
Don't ever use Monitor.Pulse, there are always better alternatives.
erikkallen
@erikkallen: That's a bold statement to make without giving any reason. It seems reasonable to me in this case - what would you do differently?
Jon Skeet