views:

337

answers:

8

I've written my first C# iterator today. Woohoo.

Interestingly, it has side effects. My iterator filters out invalid files from a directory and returns a sequence of valid files to process. Wheneve it encounters an invlaid file, it moves it to another directory.

I tried implementing it as a LINQ query, but really don't like the fact that the predicate for the where clause has side effects. That's a definite smell.

I could implement it explicitly, looping over all files and handling the good or the bad in turn, but it's not very elegant. A better solution is to split it into two lists (good and bad) and process each in turn.

But then I remembered iterators. And I've now got an iterator that yields the valid files and handles (moves) the invalid ones.

So, my question is this: is it a bad idea for an iterator to have side effects such as this? Am I hiding too much functionality in an iterator?

Cheers Matt

+3  A: 

I'd say it's generally a bad idea to have side-effects in an iterator, but it's not a complete no-no. If you have side-effects, it makes it hard/impossible for callers to work in a purely functional way. This may or may not be a problem depending on your use case.

I suggest you have two ways of getting the iterator - one which has the side-effect (which is presumably an optimisation, basically) and one which doesn't (slower, but simpler to reason about). This could just be by passing a flag into the method, or having two methods named differently.

Jon Skeet
+1  A: 

Iterators that are logically enumerations over collections should not have side effects, no. In particular, they won't be idempotent when restarted with the IEnumerator.Reset() method.

However, the fact that iterators are effectively a kind of coroutine, they can be useful for implementing some things that are awkward to implement in other ways, e.g. steps in an asynchronous workflow.

Barry Kelly
IEnumerator.Reset() is very rarely implemented, mind you - and *never* implemented in an iterator block.
Jon Skeet
Sure - but the fact that it is part of the API indicates that it is a logical expectation that the enumeration is stable and restarting is idempotent.
Barry Kelly
The fact that it's part of the API was basically a mistake :) But yes, I think it's reasonable to expect that at least *most* iterators are idempotent and repeatable. That won't always be the case, of course - e.g. a LINQ query fetching results from a webservice may give different results each time.
Jon Skeet
(Continued) But one would hope that the reason for giving different results would be a "natural" one as opposed to "because you called me last time".
Jon Skeet
+4  A: 

Iterators with side-effects are BAD mkay? :)

If you have the sequence containing all files, you can have something visitor-ish that visits all the items and calls a function for each case. The discrimination in the visitor can be either as a predicate you can supply, or intrinsic in the visitor.

So, I don't speak C#, but something like this pseudo code:

good_handler = new FileHandler() {
  handle(File f) { print "Yay!"; }
}

bad_handler = new FileHandler() {
  handle(File f) { print "Nay!"; }
}

files = YourFileSequence();
visitor = new Visitor(good_handler, bad_handler);
visitor.visit(files);
Henrik Gustafsson
A: 

My rule of thumb is if I'm iterating over a collection, no. But in Python, a for loop is often used idiomatically to execute code a certain number of times, in which case I have no problem using it with side-effects.

J.T. Hurley
A: 

Thanks folks - and blimey! quick responses!

I have to agree that side effects in an iterator are a bad idea. The fact that I had to ask indicates a smell. Should have listened to my spidey-sense.

I think the main reason I asked was because my side effect was quite isolated from the main task, and so neatly encapsulated within the iterator. However, it's still hidden functionality, which is not very nice.

Also, I think I conflated the visitor idea with the iterator, which is also not a good idea.

I've since changed my implementation to produce 2 sequences from my original sequence of all files - one good, one bad. I can now process them in a more obvious and intuitive manner. Hooray.

So, I still haven't used an iterator in the Real World. Oh well.

Thanks! Matt

A: 

I would say side effects are a bad idea but not harmful. If you have side effects, you're essentially doing two operations. It is better to separate these operations into two functions so the code is easier to maintain and you could do them separately.

In this case, you're moving bad files out of the folder and something else to good files. Separating these operations allows you to move bad files without selecting good ones or lets you operate on good files (such as count them) without moving bad ones. Your code will also be more compartmentalized so it will be easier to optimize one of these operations should you need to.

Dan Goldstein
A: 

I think that there's actually a more immediate concern than the fact that your iterator has hidden side effects. Which is: you're changing the membership of the collection that it's iterating over. Even if the side effects didn't have a bad code smell, this would be something you'd have to be cautious about. There are ways you could implement this that might seem sensible (cache the file list and reuse it when resetting the iterator, say) that break if you're removing things from the collection.

Robert Rossney
A: 

I won't comment on the general case but in your case, I think that's dangerous. A good metric for interface quality is how easy it is to use the interface correctly and how hard it is to use it wrongly.

Applying that metric, your design scores quite low because it's incredibly easy to use it wrongly: just iterate over it twice.

I'd actually go further than Jon and say: don't even offer the option. It might be helpful but the price of potentially using this wrong might be too high. On the other hand, it could be argued that if the user intentionally makes a choice, (s)he has to deal with the consequences.

Konrad Rudolph