tags:

views:

297

answers:

9

So I ran into a situation today where some production code was failing precisely because a method performed exactly as documented in MSDN. Shame on me for not reading the documentation. However, I'm still scratching my head as to why it behaves this way, even if "by design", since this behavior is exactly opposite what I would have expected (and other, known behaviors) and therefore seems to violate the principle of least surprise.

The All() method allows you to supply a predicate (such as a lambda expression) to test an IQueryable, returning a Boolean value that indicates whether all collection members match the test. So far so good. Here's where it gets weird. All() also returns true if the collection is empty. This seems completely backwards to me, for the following reasons:

  • If the collection is empty, a test like this is, at best, undefined. If my driveway is empty, I cannot assert that all cars parked there are red. With this behavior, on an empty driveway all cars parked there are red AND blue AND checkerboard - all of these expressions would return true.
  • For anyone familiar with the SQL notion that NULL != NULL, this is unexpected behavior.
  • The Any() method behaves as expected, and (correctly) returns false because it does not have any members that match the predicate.

So my question is, why does All() behave this way? What problem does it solve? Does this violate the principle of least surprise?

I tagged this question as .NET 3.5, though the behavior also applies to .NET 4.0 as well.

EDIT Ok, so I grasp the logic aspect to this, as so excellently laid out by Jason and the rest of you. Admittedly, an empty collection is something of an edge case. I guess my question is rooted in the struggle that, just because something is logical doesn't mean it necessarily makes sense if you're not in the correct frame of mind.

+2  A: 

"If the collection is empty, a test like this is, at best, undefined. If my driveway is empty, I cannot assert that all cars parked there are red."

Yes you can.

To prove me wrong, show me a car on your empty driveway that is not red.

For anyone familiar with the SQL notion that NULL != NULL, this is unexpected behavior.

This is a quirk of SQL (and not quite true: NULL = NULL and NULL <> NULL are both undefined, and neither will match any rows.)

finnw
So show me a car on my empty driveway that is not red, that is red.
GalacticCowboy
Both statements are true. (1) There is no car on your driveway that is red and (2) There is no car on your driveway that is **not** red. No contradiction there.
finnw
I think that was what I was getting at - divorced from the whole All => !(!Any) question, the test can be both true and false from the perspective that there is not an element that does *not* match, but there is not an element that *does* match, either.
GalacticCowboy
+3  A: 

If the length of the items that return true is the same as the length of all the items, then return true. Simple as that

Driveway.Cars(a=>a.Red).Count() == Driveway.Cars.Count()

See: http://stackoverflow.com/questions/145509/why-does-abcd-startswith-return-true

Dested
+1  A: 

Returning true is also logical. You have two statements: "Have a car?" and "Is it red?" If the first statement is false, it doesn't matter what the second statement is, the result is true by modus ponens.

codekaizen
A: 

False means the query didn't return any results even without the predicate. Which is just an incomplete way of stating what Dested posted as I was typing this.

Wade
+2  A: 

I think it makes sense. In logic, the complement of FOR ALL is NOT (THERE EXIST). FOR ALL is like All(). THERE EXIST is like Any()

So IQueryable.All() is equivalent to !IQueryable.Any(). If your IQueryable is empty, then both returns true based on MSDN doc.

David
A: 

It's very similar to the basic concept of the number zero. Even though it represents the existence of absence, it still possesses and represents a value. IQueryable.All() should return true, because it will successfully return all of the members of the collection. It just so happens that if the collection is empty, the function won't return any members, but not because the function couldn't return any members. It was only because there were no members to return. That being said, why should IQueryable.All() have to experience failure due to the lack of support from the collection? It was willing, it was able...it was capable. Sounds to me like the collection couldn't hold up their end of the bargain...

http://mathforum.org/dr.math/faq/faq.divideby0.html

Neil T.
+1  A: 

You will find this behaviour quite often in other areas of mathematics or computer science.

The SUM operator in Math will return 0 (the neutral element of +) in cases where the ranges are invalid (the SUM from 0 up to -1). The MULTIPYL operator will return 1 (neutral element for multiplication).

Now if you have Boolean expressions, it's quite similar: The neutral element for OR is false (a OR false = a) whereas the neutral element for AND is true.

Now on Linq's ANY and ALL: They are similar to this:

ANY = a OR b OR c OR d ...
ALL = a AND b AND c AND d ...

So this behavior is just what "you would expect" if you have a math/cs background.

Marcel J.
+10  A: 

If my driveway is empty, I cannot assert that all cars parked there are red.

Consider the following statements.

S1: My driveway is empty.

S2: All the cars parked in my driveway are red.

I claim that S1 implies S2. That is, the statement S1 => S2 is true. I will do this by showing that it's negation is false. In this case, the negation of S1 => S2 is S1 ^ ~S2; this is because S1 => S2 is false only when S1 is true and S2 is false. What is the negation of S2? It is

~S2: There exists a car parked in my driveway that is not red.

What is the truth value of S1 ^ ~S2? Let's write it out

S1 ^ ~S2: My driveway is empty and there exists a car parked in my driveway that is not red.

The only way that S1 ^ ~S2 can be true is if both S1 and ~S2 are true. But S1 says that my driveway is empty and S2 says that there exists a car in my driveway. My driveway can not be both empty and contain a car. Thus, it is impossible for S1 and ~S2 to both be true. Therefore, S1 ^ ~S2 is false so its negation S1 => S2 is true.

Therefore, if your driveway is empty you can assert that all cars parked there are red.

So now let's consider an IEnumerable<T> elements and a Predicate<T> p. Let us suppose that elements is empty. We wish to discover the value of

bool b = elements.All(x => p(x));

Let's consider its negation

bool notb = elements.Any(x => !p(x));

For notb to be true, there must be at least one x in elements for which !p(x) is true. But elements is empty so it is impossible to find an x for which !p(x) is true. Therefore notb can not be true so it must be false. Since notb is false, its negation is true. Therefore b is true and elements.All(x => p(x)) must be true if elements is empty.

Here's one more way to think of this. The predicate p is true if for all x in elements you can not find any for which it is false. But if there are no items in elements then it is impossible to find any for which it is false. Thus, for an empty collection elements, p is true for all x in elements

Now, what about elements.Any(x => p(x)) when elements is an empty IEnumerable<T> and p is a Predicate<T> as above? We already know the result will be false because we know its negation is true, but let's reason through it anyway; the intuition is valuable. For elements.Any(x => p(x)) to be true there must be at least one x in elements for which p(x) is true. But if there aren't any x in elements it is impossible to find any x for which p(x) is true. Therefore, elements.Any(x => p(x)) is false if elements is empty.

Finally, here's a related explanation on why s.StartsWith(String.Empty) is true when s is a non-null instance of string:

Jason
S1: My driveway is empty. S2: All cars parked on my driveway are actually red balloons that are floating five feet above my driveway.
DanM
+1  A: 

Any() and All() are just implementations of the usual mathematical operators ∃ (the "existential quatifier" or "there exists") and ∀ (the "universal quatifier" or "for all").

"Any" means that there exists some item for which the predicate is true. For the empty collection, this would be false.

"All" means that there does not exist any item for which the predicate is false. For the empty collection, this would always be true.

Jeffrey L Whitledge