Can anyone please explain why the third query below is orders of magnitude slower than the others when it oughtn't to take any longer than doing the first two in sequence?
var data = Enumerable.Range(0, 10000).Select(x => new { Index = x, Value = x + " is the magic number"}).ToList();
var test1 = data.Select(x => new { Original = x, Match = data.Single(y => y.Value == x.Value) }).Take(1).Dump();
var test2 = data.Select(x => new { Original = x, Match = data.Single(z => z.Index == x.Index) }).Take(1).Dump();
var test3 = data.Select(x => new { Original = x, Match = data.Single(z => z.Index == data.Single(y => y.Value == x.Value).Index) }).Take(1).Dump();
EDIT: I've added a .ToList() to the original data generation because I don't want any repeated generation of the data clouding the issue.
By the way, I'm just trying to understand why this code is so slow; I'm not looking for a faster alternative unless it sheds some light on the matter. I would have thought that, since LINQ is lazily evaluated and I'm only asking for the first item (Take(1)), test3's:
data.Select(x => new { Original = x, Match = data.Single(z => z.Index == data.Single(y => y.Value == x.Value).Index) }).Take(1);
could reduce to:
data.Select(x => new { Original = x, Match = data.Single(z => z.Index == 0) }).Take(1);
in O(N), since the first item in data (Index 0) is matched after one full scan of the data by the inner Single(), leaving one more sweep of the data by the remaining Single(). So it should still be O(N) overall.
It's evidently being processed in a more long-winded way, but I don't really understand how or why.
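One way to check the O(N) assumption is to count how often each predicate actually runs. The following is only a rough sketch: the `PredicateCounter` class, the `Count` method, and the `outer`/`inner` counters are names made up for illustration, N is reduced to 100 to keep it quick, and `.Dump()` is replaced by `.ToList()` so it runs outside LINQPad.

```csharp
using System;
using System.Linq;

public static class PredicateCounter
{
    // Hypothetical helper: runs the test3 query shape over n items and
    // returns how many times each Single() predicate was invoked.
    public static (long Outer, long Inner) Count(int n)
    {
        var data = Enumerable.Range(0, n)
            .Select(x => new { Index = x, Value = x + " is the magic number" })
            .ToList();

        long outer = 0; // calls to the z => ... predicate
        long inner = 0; // calls to the y => ... predicate

        data.Select(x => new
            {
                Original = x,
                Match = data.Single(z =>
                {
                    outer++;
                    // This inner Single() sits inside the outer predicate,
                    // so it is evaluated again for every z that is tested.
                    return z.Index == data.Single(y =>
                    {
                        inner++;
                        return y.Value == x.Value;
                    }).Index;
                })
            })
            .Take(1)
            .ToList(); // force evaluation of the single requested item

        return (outer, inner);
    }

    public static void Main()
    {
        var (outer, inner) = Count(100);
        Console.WriteLine($"outer predicate calls: {outer}");
        Console.WriteLine($"inner predicate calls: {inner}");
    }
}
```

With n = 100 this should report 100 outer calls and 10000 inner calls for the single item requested, which suggests that the inner Single() is not computed once and reused but re-run for every element the outer Single() examines.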
test3 takes a couple of seconds to run, by the way, so I think we can safely assume that if your answer features the number 10^16, you've made a mistake somewhere along the line.