views:

213

answers:

9

I would think that Iterator.copy() would be quite a handy function. You could implement iterator filters in a much better way.

For example, the only reason in Googles Java Collection for the filter (and similar) functions to use UnmodifiableIterator (which is just an Iterator without remove) is because you cannot implement such a filter Iterator otherwise without being able to copy it at some point. (Really, that is not possible with the current interface; try yourself.)

Another advantage would be that you could use an iterator in a for-each-loop: Because a copy-able iterator would automatically also be iterable. See also this question. Right now, the main design reason to not allow this is because an Iterator which implements Iterable and Iterator<T> iterator() { return this; } would invalidate the iterator. By having a copy function, it is as simple as Iterator<T> iterator() { return copy(); } and it would not invalidate the original iterator. Thus there is no reason anymore to not allow this.

Is there any reason? Just to make it less complicated to implement it?

+4  A: 

Although they usually are, Iterators do not theoretically have to be linked to a collection. The copy method over an input stream, for instance, would be difficult to implement, and would very easily cause obscure memory problems.

ILMTitan
When copying an iterator, I don't meant to copy the underlying collection. Naturally, they would link of course to the same collection. (You have that also in other languages. That is just the most natural behavior.)
Albert
But streams don't hold very much data, so to copy a stream based iterator, one would have to store that data so it could be re-iterated.
ILMTitan
But you can already create multiple separate iterators of the same stream. Copying an iterator would naturally behave in the same way as another iterator on the same stream which was created in a somewhat other way.
Albert
I would think that an iterator and its copy would always return the same sequence of objects. Two different iterators from the same stream, however, I would expect to always pull the next element from the stream (leaving aside that two living iterators of the same stream would have all sorts of concurrency issues).
ILMTitan
A: 

Is there any reason? Just to make it less complicated to implement it?

It would be simple to design and implement an Iterator wrapper class that supported a copy operation. I'm not sure it would be generally useful though, not least because in the general case it would be an expensive operation. This alone would be sufficient reason for the Java designers to not want to add copy() to the Iterator interface.

FOLLOWUP

This is the kind of thing I'm thinking of:

public class CopyableIterator<T> implements Iterator<T> {
    private Iterator<T> it;
    private List<T> copy = new ArrayList<T>();
    private int pos;
    public CopyableIterator(Iterator<T> it) {
        while (it.hasNext()) {
            copy.append(it.next());
        }
        this.it = copy.iterator();
    }
    public T next() {
        T res = next();
        pos++;
        return res;
    }
    public boolean hasNext() {
        return it.hasNext();
    }
    public Iterator<T> copy() {
        return copy.sublist(pos, copy.size()).iterator();
    }
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

The reasoning is this:

  • If I'm wrapping an opaque Iterator, then the only way I can copy it is by reading it using next() and hasNext() and constructing the copy Iterator from that.

  • But I have to do that before I start using the original iterator.

  • The simple way to do that is to make that copy of the iterator's content before I start using it. (It could possibly be done with a lazy incremental copying, but the implementation could get very complicated ... especially when you consider copying copied iterators.)

The method proposed in the other answer is limited to plain collection iterators. If you have a wrapped iterator, or an iterator from some other source that (for instance) doesn't implement Iterable, then you are toasted.

And even with that precondition, the method above doesn't return a true copy of the iterator. Rather, it returns a new iterator for the underlying collection. That is an important distinction. Unless you actually copy the iterated element references, there's no guarantee that the iterators will return the same sequence. Look at the documented behaviour of the iterators for the Concurrent... collection types.

Stephen C
`Iterator` wrapper around what? And why would it be expensive? I cannot think of a single case where copying an iterator would be an expensive operation (as it is not much more than a pointer).
Albert
An Iterator wrapper around an Iterator. And it would be expensive because it has to keep its own internal copy of the iterated objects. The only cases where you can implement `copy` efficiently are for iterators that directly iterate an in-memory collection or a random access (seekable) stream.
Stephen C
Why do they need a copy of the iterated objects?
Albert
+1  A: 

What exactly would it mean to copy an Iterator? Do you mean than it should be able to create a new Iterator just like itself except starting at the beginning? That's the responsibility of an Iterable... it doesn't make sense to duplicate that functionality, especially given the stateful nature of iterators... it would just confuse things.

What would you expect to happen if you wrote:

Iterator<Foo> iter = someIterable.iterator();
iter.next();
iter.next();
for (Foo foo : iter) {
  ...
}

Would you expect the for loop to iterate over every item the iterator would return, or every one except the first two? Would you expect the iterator to be empty after the for loop completes?

ColinD
A copy of an iterator would just mean the most natural thing: having an independent copy of it (like in most other languages which have an iterator concept). I.e. advancing the first one would not advance the second one.
Albert
And what I would expect is quite obvious and natural: The for-loop would iterate from third element to the end.
Albert
@Albert: While it seems obvious and natural to you, I don't think it can be objectively said that it IS natural. The expectation with an iterator is that as you iterate through it, its cursor moves forward. After you've iterated to the end, you're done. It would be unnatural to iterate through the elements in a for loop and have the iterator still at the beginning after it completes. That said, I think the main issue is the extra burden on implementations of `Iterator`.
ColinD
@CollinD I think Albert means a bit different thing than you do. The iterator copying would be handy if you for example want to go through a list of numbers and mark those that are divisible by 3 by storing their iterator to some other list for further processing. You cannot really implement that without copying the current iterator in the for loop.
dark_charlie
@dark_charlie: I'm not sure what you mean by "storing their iterator to some other list". If what you mean is that you want to get a list of numbers that are divisible by 3 from the iterator, you could do that just fine without copying the iterator, unless you're saying you want the iterator to remain unchanged. In that case, the object you're working with should probably be an `Iterable` and not an `Iterator` to begin with since that's exactly the function of an `Iterable`!
ColinD
The numbers probably weren't a good example. Consider a case when you loop over a list and need to make "bookmarks" when you find an interesting object. You do not want to save references to the interesting objects themselves because you want to be able to take the bookmark and find an item in the list that follows/precedes it (for example). The most elegant way to achieve this is to store a copy of the iterator as the bookmark.
dark_charlie
@dark_charlie: Even if you could make a copy of an iterator, that wouldn't work well for your example because iterators are for iterating, not marking specific locations in a list. If you found an item and made a copy of the iterator there, there wouldn't be any convenient way to get the item you're "bookmarking" because an iterator only has `next()` (`prev()` too for list iterators) and no `current()`. Iterator cursors are always BETWEEN items, not on them. That only differs for `remove()`, which uses the most recent item the iterator returned.
ColinD
@ColinD You can always dereference an iterator (what would be the point of it if that weren't possible), therefore it can be used for marking positions. As you say, in Java this is done using the next() method which is another weird design in Java iterators - the method actually has two responsibilities. This, of course, makes "bookmarking" more difficult. What do you suggest to use for the "bookmarking" if not iterators? What objects should one store instead?
dark_charlie
@dark_charlie: There's nothing weird about `next()` if you consider iterators as objects that are solely intended for _iteration_ use, which they are. I'd say that if you have some concept of a specific location in the sequence, you should probably be using a `List` and storing indices in that list as bookmarks. If you're dealing with something other than a list, you may need to come up with some other solution. Iterators just aren't intended for such a thing.
ColinD
@ColinD The Java iterators apparently are not. In other languages the iterator concept is used as a more general term, more like pointers to the sequence that can be dereferenced, incremented (and possibly decremented). Is there any generic interface in Java that serves this generalized "bookmarking" purpose? I suppose not. And that's what Albert's question is basically about - why not? Storing indices is not really a solution, just a workaround.
dark_charlie
+1  A: 

The only reason why Google have UnmodifiableIterator is to basically guarantee immutability in their collections. They're making sure that there's no way that you can change the internal state of a collection.

Don't forget that the original idea for an iterator is that it's a pointer to the current element during transveral, and it manages to next/previous transversal (for reverse for doubly-linked iterators) to the element next/previous to it.

There is no actual reason why iterators aren't Cloneable, quite simply, cloning an iterator will still mean having an iterator pointing to the same collection elements (except it now lives in 2 different address space). Unless you want the cloned iterator to point to another collections, there is no point.

The Elite Gentleman
No, that is not the reason why they use `UnmodifiableIterator` for functions like `filter`. There is just no reason to not have `filter` or similar functions also for normal iterators. By doing that, it would also work for `Unm..Iterator` and it would be more general. The point is, this is not possible because `Iterator` is not copyable. Check out their implementation. Or try to implement `filter` yourself on a normal `Iterator`.
Albert
@Albert, Don't forget that Google strictly follow Joshua Bloch Collections API (which is standard in Java) and thus is compatible with the java JDK. `ImmutableMap`, `ImmutableList`, `ImmutableSet` all are immutable collections and according to the Collections API, you need to return an iterator. If you can provide a non-modifiable iterator, I can simple do `iterator.remove()` and the state of the collection has changed. That's not what Google wanted. Download Guava and read the source code.
The Elite Gentleman
@The Elite: I have. That was also not my point. My point is, it is not possible to implement `filter` otherwise. If it would be, i.e. if there would be a `copy`, they would have made it more general. And by using `Unm..Iterator` together with that more general `filter`, it would still be unmodifiable.
Albert
+1  A: 

You can always implement your own CopyableIterator that implements Iterator. And then you can do

new CopyableItereator(collection);

The class would be like this

class CopyableIterator implements Iterator{
Iterator iterator;
Collection collection;
int index=0;

public CopyableIterator(Collection collection){
super();
this.collection = collection;
this.iterator = collection.iterator();
}

public CopyableIterator(Collection collection, int index){
super();
this.collection =collection;
this.iterator = collection.iterator();
this.advanceToIndex(iterator,index); //This function just moves the iterator till the index.
this.index=index;
}

//Override the functions of Iterator here returning iterator.function()

@Override
public Object next(){
index++;
return this.iterator.next();
}

public CopyableIterator copy(){
return new CopyableIterator(this.collection,this.index)

}

}

Disclaimer: This is roughly the class. It has not been tested.

pakore
The problem is that all standard containers do not implement this interface.
Albert
Yeah, I know, but it's a solution that allows you to do what you want without changing the JDK :P.
pakore
A: 

Iterators were created to step through all the objects in a collection one at a time, using data backed by said collection.

Iterator<T> is almost always implemented using a private inner class that may use state that is part of the outer class. Thus, you can't really modify an Iterator's behavior without writing your own Collection (or whatever) as well.

Copying the Iterator could cause a host of new problems, such as falling out of sync with the backing collection.

R. Bemrose
But you already have the same situation already if you call `iterator()` several times on an `Iterable` object.
Albert
+1  A: 

I wanted something like this, here is what I've done (based on some work done on Lambdaj).
The main flaw is that this Iterator will basically fill a List with all the supposed content of the Iterator which could be really heavy in memory.

Why did I used a List, because sometimes an Iterator iterates in a specific order, so the "sub-Iterators" must do the same (and the ListIterator really helps me here).

public class IterableIterator<T> implements Iterable<T>, Iterator<T> {
    //The content of the given iterator. Will be filled by its iterators.
    private final List<T> iteratorContent = new ArrayList<T>();
    private final Iterator<T> originalIterator;
    private final Iterator<T> innerIterator;

    public IterableIterator(Iterator<T> originalIterator) {
        this(originalIterator, false);
    }

    public IterableIterator(Iterator<T> originalIterator, boolean cache) {
        if (originalIterator == null) {
            throw new IllegalArgumentException("Parameter can't be null");
        }

        this.originalIterator = originalIterator;
        if (cache) {
            while (originalIterator.hasNext()) {
                iteratorContent.add(originalIterator.next());
            }
        }

        innerIterator = iterator();
    }

    @Override
    public Iterator<T> iterator() {
        return new IteratorIterator();
    }

    @Override
    public boolean hasNext() {
        return innerIterator.hasNext();
    }

    @Override
    public T next() {
        return innerIterator.next();
    }

    @Override
    public void remove() {
        innerIterator.remove();
    }

    private class IteratorIterator implements Iterator<T> {
        private ListIterator<T> innerIterator = iteratorContent.listIterator();

        @Override
        public boolean hasNext() {
            return innerIterator.hasNext() || originalIterator.hasNext();
        }

        @Override
        public T next() {
            if (!innerIterator.hasNext() && originalIterator.hasNext()) {
                T item;
                synchronized (originalIterator) {
                    item = originalIterator.next();
                    iteratorContent.add(item);
                }
                innerIterator = iteratorContent.listIterator(innerIterator.nextIndex());
            }
            if (innerIterator.hasNext()) {
                try {
                    return innerIterator.next();
                } catch (ConcurrentModificationException e) {
                    //Quick and dirty solution if you have a concurrent modification.
                    //It can't happen from the outside, so you can easily suppose that another originalIterator
                    //from this class has been called and had added elements to the list.
                    //Best thing to do, reset the originalIterator to the current position.
                    innerIterator = iteratorContent.listIterator(innerIterator.nextIndex());
                    return innerIterator.next();
                }
            }

            throw new NoSuchElementException();
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }
}
Colin Hebert
A: 

An iterator represents a position in a stream (or Iterable in java speak), and there is no guarantee that the stream can be copied. The fact that the iterators you usually see are over a collection is incidental.

Christoffer Hammarström
A: 

Ok, seems that no one really can explain it, so the answer seems to be:

Bad design decision.

Albert