Pseudo-code:

for each x in someArray {
    // possibly add an element to someArray
}

I forget the name of the exception this throws in some languages.

I'm curious to know why some languages prohibit this use case, whereas other languages allow it. Are the allowing languages unsafe -- open to some pitfall? Or are the prohibiting languages simply being overly cautious, or perhaps lazy (they could have implemented the language to handle this case gracefully, but simply didn't bother)?
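For example (if I have the details right), Python allows it for lists but forbids it for dicts:

lst = [1, 2, 3]
for x in lst:
    if x == 2:
        lst.append(4)   # allowed: the new element simply gets visited later

d = {"a": 1}
for k in d:
    d["b"] = 2          # raises RuntimeError: dictionary changed size during iteration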

Thanks!

+3  A: 

Suppose your array has 10 elements. You get to the 7th element and decide you need to add a new element earlier in the array. Uh-oh! That element never gets iterated on! To me, foreach has the semantics of operating on each and every element of the array, once and only once.
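To make that concrete, here is a small Python sketch (relying on CPython's index-based list iteration):

lst = list(range(5))          # [0, 1, 2, 3, 4]
inserted = False
for x in lst:                 # the list iterator just advances an index
    print(x)
    if x == 2 and not inserted:
        lst.insert(0, 99)     # add an element *before* the current position
        inserted = True
# prints 0, 1, 2, 2, 3, 4 -- the new element 99 is never visited,
# and 2 is visited twice because everything shifted one slot to the right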

Nick Lewis
+1  A: 

Your pseudo-code example would lead to an infinite loop. If you add one element for every element you look at, the collection always stays ahead of the iteration: with at least 1 element to start with, after i iterations (i being the iteration counter) there are still at least i + 1 elements to visit.
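A Python sketch of the always-add case (with a safety cap so it actually terminates):

lst = [1]
for x in lst:
    lst.append(x + 1)   # one new element per element visited
    if len(lst) > 10:   # safety cap -- without it, this loop never ends
        break
print(lst)              # [1, 2, 3, ..., 11]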

Alex
He says "possibly add" not "always add" :)
nos
+1  A: 

Arrays typically have a fixed number of elements. You get flexible sizing through wrapper objects (such as List) that provide the flexibility on top of the array. I suspect a language may run into trouble if the mechanism it uses creates a whole new array to accommodate the edit.
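As a rough sketch of why that matters (purely illustrative, not any particular runtime's implementation): a growable list is often a fixed-size array that gets replaced wholesale when it fills up, so an iterator still pointing at the old array would be left behind.

class GrowableList:
    """Toy growable list backed by a fixed-size array (illustrative only)."""
    def __init__(self):
        self._backing = [None] * 4    # fixed-size backing array
        self._count = 0

    def append(self, value):
        if self._count == len(self._backing):
            bigger = [None] * (2 * len(self._backing))
            bigger[:self._count] = self._backing   # copy into a brand-new array
            self._backing = bigger                 # the old array is abandoned here
        self._backing[self._count] = value
        self._count += 1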

Chance
+4  A: 

Depending on the language (or platform, as .Net), iteration may be implemented differently.

Typically a foreach creates an Iterator or Enumerator object on the array, which internally keeps track of where it is in the iteration. If you modify the array (by adding or deleting an element), that internal state becomes inconsistent with the new state of the array.
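A toy sketch of such an iterator in Python (purely illustrative -- real runtimes use their own bookkeeping, e.g. a modification counter):

class CheckedIterator:
    """Iterator that snapshots the collection's length and refuses to
    continue if the collection changes size underneath it."""
    def __init__(self, data):
        self._data = data
        self._index = 0
        self._expected_len = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if len(self._data) != self._expected_len:
            raise RuntimeError("collection was modified during iteration")
        if self._index >= len(self._data):
            raise StopIteration
        value = self._data[self._index]
        self._index += 1
        return value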

Platforms such as .Net allow you to define your own enumerators which may not be susceptible to adding/removing elements of the underlying array.

A generic solution to the problem of adding/removing elements while iterating is to collect the elements in a new list/collection/array, and add/remove the collected elements after the enumeration has completed.
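In Python, for instance, that pattern might look like this (names are only illustrative):

numbers = [3, 12, 7, 20, 5]
to_remove = []
for n in numbers:
    if n > 10:
        to_remove.append(n)   # remember the change instead of applying it now
for n in to_remove:           # apply the collected changes afterwards
    numbers.remove(n)
print(numbers)                # [3, 7, 5]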

devio
+9  A: 

What would you want the behavior to be?

list = [1,2,3,4]
foreach x in list:
    print x
    if x == 2: list.remove(1)

possible behaviors:

list is some linked-list type iterator, where deletions don't affect your current iterator:

[1,2,3,4]

list is some array, where your iterator iterates via pointer increment:

[1,2,4]

same as before, only the system tries to cache the iteration count

[1,2,4,<segfault>]

The problem is that the different collections implementing the enumerable/sequence interface that foreach relies on can each behave differently.
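For what it's worth, running the snippet as actual Python (whose list iterator advances an index) gives the second behaviour:

lst = [1, 2, 3, 4]
for x in lst:
    print(x)
    if x == 2:
        lst.remove(1)   # drop the value 1 mid-iteration
# prints 1, 2, 4 -- the 3 shifted left into a slot the iterator had already passed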

Jimmy
A: 

Implementing the lists and enumerators to handle this would mean a lot of overhead. That overhead would always be there, yet it would only be useful in a vast minority of cases.

Also, whatever implementation was chosen would not always make sense. Take the simple case of inserting an item into the list while enumerating it: should the new item always be included in the enumeration, always excluded, or should that depend on where in the list it was added? And if I insert the item at the current position, does that change the value of the enumerator's Current property, and should the enumerator then skip the item that was current and has now become the next item?

Guffa
A: 

This only happens within foreach blocks. Use an index-based for loop instead and you'll be allowed to modify the collection. Just make sure to iterate backwards, so that deleting items doesn't shift the indices of the elements you haven't visited yet.
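A Python sketch of the backwards index loop (illustrative values):

lst = [3, 12, 7, 20, 5]
# walk the indices backwards so deletions only shift elements we've already seen
for i in range(len(lst) - 1, -1, -1):
    if lst[i] > 10:
        del lst[i]
print(lst)   # [3, 7, 5]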

Spencer Ruport
A: 

Off the top of my head, there are two ways iteration over a collection could be implemented:

  1. the iterator iterates over the collection for which it was created

  2. the iterator iterates over a copy of the collection for which it was created

When changes are made to the collection on the fly, the first option must either update its iteration sequence (which can be very hard or even impossible to do reliably) or simply deny the possibility (throw an exception). The latter is obviously the safer option.

With the second option, changes can be made to the original collection without disturbing the iteration sequence, but those changes will not be visible in the iteration, which can be confusing for users (a leaky abstraction).
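Python makes it easy to opt into the second scenario yourself by iterating over a snapshot copy, for example:

lst = [3, 12, 7, 20, 5]
for x in list(lst):      # iterate over a copy...
    if x > 10:
        lst.remove(x)    # ...so it's safe to mutate the original
print(lst)               # [3, 7, 5]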

I could imagine languages/libraries implementing any of these possibilities with equal merit.

NomeN
+1  A: 

Many compiled languages implement "for" loops with the assumption that the number of iterations will be calculated once at loop startup (or better yet, compile time). This means that if you change the value of the "to" variable inside the "for i = 1 to x" loop, it won't change the number of iterations. Doing this allows a legion of loop optimizations, which are very important in speeding up number-crunching applications.
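As an analogy, Python's range-based for loop behaves the same way: the bound is evaluated once, when the loop starts.

n = 5
for i in range(n):   # range(n) is evaluated once, up front
    print(i)
    n = 100          # reassigning n here does not extend the loop
# prints 0 through 4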

If you don't like that semantics, the idea is that you should use the language's "while" construct instead.

Note that in this view of the world, C and C++ don't have proper "for" loops, just fancy "while" loops.

T.E.D.
A: 

In a word, efficiency. If a data structure is guaranteed, at compile time, to have a fixed size, then, as one of the earlier respondents noted, all sorts of useful access optimisations can be applied. And for those of us who play linear algebra on computers this behaviour is ideal. Of course, the actual size may be allocated at run time, but the compiler knows that the size won't change while the array remains allocated, so it can apply many optimisations.

A lot of these optimisations will arise because the program can compute a dope vector for the array once and read or write any element with one address calculation. If you allow variable sizes you either:

-- have to allocate more space for the resized array and move the data across from old to new array; this is very expensive in time; OR

-- start using pointers to find next elements, or jump over deleted elements; this can be expensive in space and somewhat expensive in time too.

Other optimisations come from the compiler knowing that if you want the next, say, 24 array elements, it only has to shift the next 24 x b bytes from memory to cache, where b is the size of each element in the array. Since memory bandwidth is one of the key bottlenecks in high-performance computing this sort of behaviour is very very desirable.
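To make the address-calculation point concrete, here is a toy sketch (hypothetical numbers) of row-major indexing into a fixed-size 2-D array:

base = 0x1000              # hypothetical base address
nrows, ncols = 4, 6        # fixed at allocation time
itemsize = 8               # bytes per element

def element_address(i, j):
    # a couple of multiply-adds, possible only because ncols and itemsize never change
    return base + (i * ncols + j) * itemsize

print(hex(element_address(2, 3)))   # 0x1078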

I'll stick my neck out and claim that if it isn't fixed in size it isn't an array, it's something else: a linked list, a stack, a set, a what-have-you but not an array. When you get to 2D arrays (and I mean real 2D arrays, not 1D arrays of 1D arrays) removing or adding single elements is conceptually a difficult problem too.

So, to answer your question, some languages implement the Array abstract data type correctly; that is, they don't provide resizing operations, because Arrays, as an ADT, don't have resizing operations.

Regards

Mark

High Performance Mark