views:

656

answers:

10

Hello good people,

I was using a foreach loop to go through a list of data to process (removing said data once processed--this was inside a lock). This method caused an ArgumentException now and then.

Catching it would have been expensive so I tried tracking down the issue but I couldn't figure it out.

I have since switched to a for loop and the problem seems to have went away. Can someone explain what happened? Even with the exception message I don't quite understand what took place behind the scenes.

Why is the for loop apparently working? Did I set up the foreach loop wrong or what?

This is pretty much how my loops were set up:

foreach (string data in new List<string>(Foo.Requests))
{
    // Process the data.

    lock (Foo.Requests)
    {
        Foo.Requests.Remove(data);
    }
}

and

for (int i = 0; i < Foo.Requests.Count; i++)
{
    string data = Foo.Requests[i];

    // Process the data.

    lock (Foo.Requests)
    {
        Foo.Requests.Remove(data);
    }
}

EDIT: The for* loop is in a while setup like so:

while (running)
{
    // [...]
}

EDIT: Added more information about the exception as requested.

System.ArgumentException: Destination array was not long enough. Check destIndex and length, and the array's lower bounds
  at System.Array.Copy (System.Array sourceArray, Int32 sourceIndex, System.Array destinationArray, Int32 destinationIndex, Int32 length) [0x00000] 
  at System.Collections.Generic.List`1[System.String].CopyTo (System.String[] array, Int32 arrayIndex) [0x00000] 
  at System.Collections.Generic.List`1[System.String].AddCollection (ICollection`1 collection) [0x00000] 
  at System.Collections.Generic.List`1[System.String]..ctor (IEnumerable`1 collection) [0x00000]

EDIT: The reason for the locking is that there is another thread adding data. Also, eventually, more than one thread will be processing data (so if the entire setup is wrong, please advise).

EDIT: It was hard to pick a good answer.

I found Eric Lippert's comment deserving but he didn't really answer (up-voted his comment anyhow).

Pavel Minaev, Joel Coehoorn and Thorarin all gave answers I liked and up-voted. Thorarin also took an extra 20 minutes to write some helpful code.

I which I could accept all 3 and have it split the reputation but alas.

Pavel Minaev is the next deserving so he gets the credit.

Thanks for the help good people. :)

+2  A: 

Three things:
- I wouldn't put them lock within the for(each) statement, but outside of it.
- I wouldn't lock the actual collection, but a local static object
- You can not modify a list/collection that you're enumerating

For more information check:
http://msdn.microsoft.com/en-us/library/c5kehkcz(VS.80).aspx

lock (lockObject) {
   foreach (string data in new List<string>(Foo.Requests))
        Foo.Requests.Remove(data);
}
Zyphrax
He's not modifying the collection that is being enumerated. He made a copy of it.
Pavel Minaev
In the second example (the for code) he's clearly modifying the existing collection. Not a copy.
Zyphrax
Yes, but this won't make the `for` loop fail (though it may give incorrect results if you don't account for it). His question is why `foreach` fails.
Pavel Minaev
If you lock the inside of the foreach the List isn't fully locked. It could possible change with each loop-cycle.
Zyphrax
+5  A: 

Your locking scheme is broken. You need to lock Foo.Requests() for the entire duration of the loop, not just when removing an item. Otherwise the item might become invalid in the middle of your "process the data" operation and enumeration might change in between moving from item to item. And that assumes you don't need to insert the collection during this interval as well. If that's the case, you really need to re-factor to use a proper producer/consumer queue.

Joel Coehoorn
Enumeration won't change, because he makes a copy of the collection specifically to enumerate. And `Remove` won't complain if it's passed an item that isn't in the collection (e.g. because another thread removed it already).
Pavel Minaev
That might not be what you want. Processing the data can potentially take a long time, while new data gets added asynchronously. Then again, that would mean there is still data to be processed after the loop has ended, seeing he's making a shallow copy of Foo.Requests.
Thorarin
@Pavel: He's copying references. I'm talking more about another thread removing (and probably processing as well) the referred object from the base Requests collection. He could end up with double work. Or a thread could insert in the middle and he could end up missing something.
Joel Coehoorn
+2  A: 

To be completely honest, I would suggest refactoring that. You are removing items from the object while also iterating over that. Your loop could actually exit before you've processed all items.

Polaris878
If he were removing items from the source he's iterating over, he'd be looking at a very big exception *every* time. That's not the case here.
Thorarin
In his second case with the for loop he is...
Polaris878
The question is about the foreach however. But I agree on your suggestion to refactor :)
Thorarin
+9  A: 

Your problem is that the constructor of List<T> that creates a new list from IEnumerable (which is what you call) isn't thread-safe with respect to its argument. What happens is that while this:

 new List<string>(Foo.Requests)

is executing, another thread changes Foo.Requests. You'll have to lock it for the duration of that call.

[EDIT]

As pointed out by Eric, another problem List<T> isn't guaranteed safe for readers to read while another thread is changing it, either. I.e. concurrent readers are okay, but concurrent reader and writer are not. And while you lock your writes against each other, you don't lock your reads against your writes.

Pavel Minaev
+4  A: 

After seeing your exception; it looks to me that Foo.Requests is being changed while the shallow copy is being constructed. Change it to something like this:

List<string> requests;

lock (Foo.Requests)
{
    requests = new List<string>(Foo.Requests);
}

foreach (string data in requests)
{
    // Process the data.

    lock (Foo.Requests)
    {
        Foo.Requests.Remove(data);
    }
}


Not the question, but...

That being said, I somewhat doubt the above is what you want either. If new requests are coming in during processing, they will not have been processed when your foreach loop terminates. Since I was bored, here's something along the lines that I think you're trying to achieve:

class RequestProcessingThread
{
    // Used to signal this thread when there is new work to be done
    private AutoResetEvent _processingNeeded = new AutoResetEvent(true);

    // Used for request to terminate processing
    private ManualResetEvent _stopProcessing = new ManualResetEvent(false);

    // Signalled when thread has stopped processing
    private AutoResetEvent _processingStopped = new AutoResetEvent(false);

    /// <summary>
    /// Called to start processing
    /// </summary>
    public void Start()
    {
        _stopProcessing.Reset();

        Thread thread = new Thread(ProcessRequests);
        thread.Start();
    }

    /// <summary>
    /// Called to request a graceful shutdown of the processing thread
    /// </summary>
    public void Stop()
    {
        _stopProcessing.Set();

        // Optionally wait for thread to terminate here
        _processingStopped.WaitOne();
    }

    /// <summary>
    /// This method does the actual work
    /// </summary>
    private void ProcessRequests()
    {
        WaitHandle[] waitHandles = new WaitHandle[] { _processingNeeded, _stopProcessing };

        Foo.RequestAdded += OnRequestAdded;

        while (true)
        {
            while (Foo.Requests.Count > 0)
            {
                string request;
                lock (Foo.Requests)
                {
                    request = Foo.Requests.Peek();
                }

                // Process request
                Debug.WriteLine(request);

                lock (Foo.Requests)
                {
                    Foo.Requests.Dequeue();
                }
            }

            if (WaitHandle.WaitAny(waitHandles) == 1)
            {
                // _stopProcessing was signalled, exit the loop
                break;
            }
        }

        Foo.RequestAdded -= ProcessRequests;

        _processingStopped.Set();
    }

    /// <summary>
    /// This method will be called when a new requests gets added to the queue
    /// </summary>
    private void OnRequestAdded()
    {
        _processingNeeded.Set();
    }
}


static class Foo
{
    public delegate void RequestAddedHandler();
    public static event RequestAddedHandler RequestAdded;

    static Foo()
    {
        Requests = new Queue<string>();
    }

    public static Queue<string> Requests
    {
        get;
        private set;
    }

    public static void AddRequest(string request)
    {
        lock (Requests)
        {
            Requests.Enqueue(request);
        }

        if (RequestAdded != null)
        {
            RequestAdded();
        }
    }
}

There are still a few problems with this, which I will leave to the reader:

  • Checking for _stopProcessing should probably be done after every time a request is processed
  • The Peek() / Dequeue() approach won't work if you have multiple threads doing processing
  • Insufficient encapsulation: Foo.Requests is accessible, but Foo.AddRequest needs to be used to add any requests if you want them processed.
  • In case of multiple processing threads: need to handle the queue being empty inside the loop, since there is no lock around the Count > 0 check.
Thorarin
+1 based on the exception trace.
Daniel Earwicker
Some people are advocating putting the lock around the entire foreach loop. I was taught the lock should be as short as possible (much like what's being done here. How viable is this approach? Will it work with multiple threads?
Fake Code Monkey Rashid
It will work, but what is best in your situation depends on a number of things. How long does the processing take? If you put the lock around the loop, the process that's adding the request will block while requests are being processed. I imagine this is not what you want, because in that case it would be far easier to just use a single threaded solution.
Thorarin
I've added a bunch of code which should do what I *think* you want to do...
Thorarin
This is vaguely what I'm doing if you reduce it to pseudo code but it's not exactly what I'm doing. This is an approach that I hadn't thought of and can learn from. Of course it won't work with multiple threads as-is but it's still interesting nonetheless so thanks for sharing. :)
Fake Code Monkey Rashid
+2  A: 

The problem is the expression

new List<string>(Foo.Requests)

inside your foreach, because it's not under a lock. I assume that while .NET copies your requests collection into a new list, the list is modified by another thread

chris166
+1  A: 
foreach (string data in new List<string>(Foo.Requests))
{
    // Process the data.
    lock (Foo.Requests)
    {
        Foo.Requests.Remove(data);
    }
}

Suppose you have two threads executing this code.

at System.Collections.Generic.List1[System.String]..ctor

  • Thread1 starts processing the list.
  • Thread2 calls the List constructor, which takes a count for the array to be created.
  • Thread1 changes the number of items in the list.
  • Thread2 has the wrong number of items.

Your locking scheme is wrong. It's even wrong in the for loop example.

You need to lock every time you access the shared resource - even to read or copy it. This doesn't mean you need to lock for the whole operation. It does mean that everyone sharing this shared resource needs to participate in the locking scheme.

Also consider defensive copying:

List<string> todos = null;
List<string> empty = new List<string>();
lock(Foo.Requests)
{
  todos = Foo.Requests;
  Foo.Requests = empty;
}

//now process local list todos

Even so, all those that share Foo.Requests must participate in the locking scheme.

David B
A: 

You are trying to remove objects from list as you are iterating through list. (OK, technically, you are not doing this, but that's the goal you are trying to achieve).

Here's how you do it properly: while iterating, construct another list of entries that you want to remove. Simply construct another (temp) list, put all entries you want to remove from original list into the temp list.

List entries_to_remove = new List(...);

foreach( entry in original_list ) {
   if( entry.someCondition() == true ) { 
      entries_to_remove.add( entry );
   }
}

// Then when done iterating do: 
original_list.removeAll( entries_to_remove );

Using "removeAll" method of List class.

Valters Vingolds
A: 

your for loop and your foreach loop is not equal to each other when it comes to the work performed. As noted in a comment to the question you're not visiting all items in the for loop if your list starts out with 10 elements when you've removed 5 the count property will be 5 and i will be 5 as well exiting the for loop.

EDIT: I was tired when I read the code. As John points out my comment on the new'ing must have comed from the abyss. Any how the statement that the for loop only does half the work (which was the important one) still stands

Rune FS
Why do you think it will create a new list in every iteration?
John Saunders
A: 

I know it's not what you asked for, but just for the sake of my own sanity, does the following represent the intention of your code:

private object _locker = new object();

// ...

lock (_locker) {
    Foo.Requests.Clear();
}
John Saunders
Hmm, nope. I don't think that's it. I'm receiving data and putting it into a list to be processed. Even while said list is being processed, data is still being added it to. So under no circumstance would I want to just clear the whole list.
Fake Code Monkey Rashid
I see. I missed the significance of your comment "process the data".
John Saunders