views:

551

answers:

7

I was testing out some synchronization constructs and I noticed something that confused me. When I was enumerating through a collection while writing to it at the same time, it threw an exception (this was expected), but when I looped through the collection using a for loop, it did not. Can someone explain this? I thought that a List does not allow a reader and writer to operate at the same time. I would have expected looping through the collection to exhibit the same behavior as using an enumerator.

UPDATE: This is a purely academic exercise. I undersand that enumerating a list is bad if it is being written to at the same time. I also understand that I need a synchronization construct. My question again was about why operation one throws an exception as expected but the other does not.

Code is below:

   class Program
   {
    private static List<string> _collection = new List<string>();
    static void Main(string[] args)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(AddItems), null);
        System.Threading.Thread.Sleep(5000);
        ThreadPool.QueueUserWorkItem(new WaitCallback(DisplayItems), null);
        Console.ReadLine();
    }

    public static void AddItems(object state_)
    {
        for (int i = 1; i <= 50; i++)
        {
            _collection.Add(i.ToString());
            Console.WriteLine("Adding " + i);
            System.Threading.Thread.Sleep(150);
        }
    }

    public static void DisplayItems(object state_)
    {
        // This will not throw an exception
        //for (int i = 0; i < _collection.Count; i++)
        //{
        //    Console.WriteLine("Reading " + _collection[i]);
        //    System.Threading.Thread.Sleep(150);
        //}

        // This will throw an exception
        List<string>.Enumerator enumerator = _collection.GetEnumerator();
        while (enumerator.MoveNext())
        {
            string value = enumerator.Current;
            System.Threading.Thread.Sleep(150);
            Console.WriteLine("Reading " + value);
        }
    }
}
+1  A: 

The enumerator becomes invalid once the list is changed. If you are changing the list while enumerating over the list, you will need to rethink your strategy a little.

Get a fresh enumerator when you begin your display function and lock the list while this is going on. Alternatively, do a deep copy of your List into a new _displayCollection List and enumerate through this separate collection, which will not be written to except being filled before the display process begins. Hope this helps.

Chris Ballance
This does not fully explain why the code snippet posted does not work
Mitch Wheat
Not a problem with the code, per se. It is a problem with not understanding how enumerators work. It could be fixed by locking the enumerator while displaying and getting a fresh enumerator before each display.
Chris Ballance
It is a problem with the code, per se.
Mitch Wheat
A: 

The code is flawed, in that you are sleeping for 5 seconds but not all of the items have been added to the list. This means you start to display the items on one thread before the first thread has finished adding items to the list, causing the underlying collection to chnage and invalidating the enumerator.

Removing the Thread.Sleep from the Add code highlights this:

public static void AddItems(object state_)
{     
   for (int i = 1; i <= 50; i++)      
   {        
       _collection.Add(i.ToString());      
       Console.WriteLine("Adding " + i);  
   }   
}

Rather than sleeping you should use a synchronisation mechanism that waits for the first thread to complete it's work of adding items.

Mitch Wheat
And why the downvote?
Mitch Wheat
This has nothing to do with the sleep mechanism. The problem is with the synchronization as you listed below the code snippet. (so I removed the downvote)
Chris Ballance
@Chris Ballance: if you take a look at my answer I am explaining why the code posted is bahaving as described. I clearly state why the enumerator is becoming invalid and throwing an exception. sometimes SO is crazy!
Mitch Wheat
BTW, I am not suggesting that removeing the Thread.Sleep() from the Add loop is a fix!!
Mitch Wheat
A: 

You can't change a collection WHILE ENUMERATING through it.

the problem is that you start to enumerate while your collection is NOT FULL, and try to KEEP ADDING items WHILE ENUMERATING

roman m
+1  A: 

The difference is when you say you are "looping through the collection" you aren't actually looping through the collection, you're iterating over integers between 1 and 50, and adding to the collection at those indices. This has no effect on the fact that the numbers between 1 and 50 still exist.

When you enumerate the list, you are enumerating the items, not the indices. So when you add items while enumerating, you invalidate the enumeration. It's built that way to prevent cases like what you are doing, where potentially you could enumerate to item 6 in the list at the same time as inserting an item at index 6, where you would enumerate potentially the old or the new item, or some undefined state.

Look for a "threadsafe" List if you want to do this, but be prepared to deal with the inaccuracies of read+write at the same time :)

Chris Cameron
+11  A: 

You can't modify a collection while enumerating over it. That rule exists even without threading issues considered. From MSDN:

An enumerator remains valid as long as the collection remains unchanged. If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and its behavior is undefined.

A integer-based for loop is not actually an enumerator. In most scenarios is accomplishes the same thing. However, the interface for IEnumerator guarantees that you can iterate through the entire collection. The platform enforces this internally by throwing an exception if a call to MoveNext occurs after the collection has been modified. This exception is thrown by the enumerator object.

The integer-based for loop only goes through its list of numbers. When you index the collection by integer, you are just getting the item at that position. If something has been inserted or deleted from the list, you may skip an item or run the same item twice. This can be useful in certain situations when you need to modify a collection while traversing it. The for loop has no enumerator object to guarantee the IEnumerator contract, therefore no exception is thrown.

Matt Brunell
that is correct, but it does not actually explain WHY the posted code is throwing.
Mitch Wheat
It looks like the code was designed specifically to violate IEnumerator given the length of the Thread.Sleep() calls. I understood the question more as 'How are these two iteration stragies different?'
Matt Brunell
@Matt Brunell, we have a constant running history of your posts so you don't need to pepper your answers with "Update:" or "Edit:". In one click I can see the edits you made if I wish. Just strive to make your post as clear as possible without the pseudo update/edit tags. Good answer.
Simucal
Good point. Thanks.
Matt Brunell
+1  A: 

The list has an internal version counter, that is updated when you change the content of the list. The enumerator keeps track of the version and throws an exception when it sees that the list has changed.

When you are just looping the list there is nothing that keeps track of the version, so there is nothing that catches that the list has changed.

If you change the list while you are looping it you may get unwanted effects, which is what the enumerator protects you against. If you for example remove an item from the list without changing the loop index so that it still points to the same item, you can miss items in the loop. Likewise if you insert items without correcting the index, you may iterate over the same item more than once.

Guffa
+1  A: 

To answer your actual question...

When enumerating, you will get an IEnumerator that is bound to the state of the list as it was when you asked for it. Further operations operate on the enumerator(MoveNext, Current).

When using a for loop, you are making a sequence if calls to get a particular item by index. There isn't an outside context such as the enumerator that knows that you're in a loop. For all the collection knows, you're only asking for one item. Since the collection never handed out an enumerator, there's no way for it to know that the reason you're asking for item 0, then item 1, then item 2, etc is because you're walking the list.

If you're mucking with the list at the same time as walking it, you're going to get errors either way. If adding items, then the for loop may skip some silently, while the foreach loop will throw. If removing items, then the for loop may throw an index out of range if you're unlucky, but will probably work most of the time.

But I think you understand all that, your question was simply why two ways of iterating behaved differently. The answer to that is the state of the collection is known(to the collection) when you call GetEnumerator in one case, and when you call get_Item in the other case.

Darren Clark