tags:

views:

79

answers:

1

I am messing around with LINQ and I am curious to see what I can do with it. I would like to know if it is possible to have a LINQ query that imposes a condition over the resultant set. For example, let's say I have a List of several words, and I wish to find sets of words that form a chain (i.e. last letter of a word = first letter of next word, no constraint on first or last word in the chain). Something like "hello, old, dairy, yellow, world...".

From these sets, I would then want to take the set that forms the longest chain.

Can LINQ do something like this?

var chains = (from word in words
             select word
             where result.IsChain()).Max(x => x.Length);
+2  A: 

LINQ can do almost anything - although I had to introduce a constraint that words can only appear once in any chain otherwise I kept getting stack overflow errors.

var words = new[]
{
    "old", "dairy", "yellow",
    "world", "dog", "dad",
    "yard", "yolk", "yeah",
    "king", "weld", "goat",
    "hello",
};

Func<IEnumerable<IEnumerable<string>>, IEnumerable<string>, IEnumerable<IEnumerable<string>>> lengthenChains = (css, ws) =>
{
    var endsWith = from cs in css
                   select new
                   {
                       Letter = cs.Last().Last(),
                       Chain = cs,
                   };

    var startsWith = from w in ws
                     select new
                     {
                         Letter = w.First(),
                         Word = w,
                     };

    return from ew in endsWith
           join sw in startsWith on ew.Letter equals sw.Letter
           where ew.Chain.Contains(sw.Word) == false
           select ew.Chain.Concat(new[] { sw.Word });
};

Func<IEnumerable<string>, IEnumerable<IEnumerable<string>>> makeChain = ws =>
        from w in ws
        select (new[] { w, }).AsEnumerable();

Func<IEnumerable<IEnumerable<string>>, IEnumerable<string>, IEnumerable<IEnumerable<string>>> makeChains = null;

makeChains = (css, ws) =>
    css.Any()
    ? css.Concat(makeChains(lengthenChains(css, ws), ws))
    : Enumerable.Empty<IEnumerable<string>>();

var chains = from cs in makeChains(makeChain(words), words)
             select String.Join(", ", cs.ToArray());

chains.Run(chain => Console.WriteLine(chain));

I'll leave it for you to get the maximum length chain. It wasn't clear from your question if the length of the chain is a count of the number of words or if it is the character length of the concatenated words.

Here's the last 8 that get generated from the above code:

yellow, world, dairy, yeah, hello, old, dad, dog, goat
yellow, world, dad, dairy, yeah, hello, old, dog, goat
yellow, weld, dairy, yeah, hello, old, dad, dog, goat
yellow, weld, dad, dairy, yeah, hello, old, dog, goat
yeah, hello, old, dairy, yellow, world, dad, dog, goat
yeah, hello, old, dairy, yellow, weld, dad, dog, goat
yeah, hello, old, dad, dairy, yellow, world, dog, goat
yeah, hello, old, dad, dairy, yellow, weld, dog, goat

Enjoy.


Roly wanted more of a "prolog backtracking algorithm" - although his question didn't say that! ;-)

Here it is:

var starting = from w in words
               let c = (new[] { w }).AsEnumerable()
               select new Working(c.ToArray(), words.Except(c).ToArray());

var chains = (from cs in Chains(starting)
              select String.Join(", ", cs.ToArray())).ToArray();

IEnumerable<IEnumerable<string>> Chains(IEnumerable<Working> workings)
{
    foreach (var w in workings)
    {
        yield return w.Chain;
        var last = w.Chain.Last().Last();
        var nexts = (from r in w.Remaining
                     where r.First() == last
                     let c = (new[] { r }).AsEnumerable()
                     select new Working(w.Chain.Concat(c).ToArray(), w.Remaining.Except(c).ToArray()));
        foreach (var chain in Chains(nexts))
        {
            yield return chain;
        }
    }
}

This method is backtracking by using an iterator method, the CLR stack, and recursive calls. Prolog would do this more elegantly, but it turns out my comment on the probable efficiency of this method was wrong. It's actually about two times quicker than my first method.

I also feel that this second method is straying further from the use of "pure" LINQ, but it is cleaner, smaller and more efficient. I know I'd rather maintain this version.

Oh, the Working class (used to track the working state) is essentially this:

class Working
{
  string[] Chain { get; set; }
  string[] Remaining { get; set; }
}

The output from this approach clearly shows that it is backtracking:

...
yeah, hello, old, dog
yeah, hello, old, dog, goat
yeah, hello, old, dad
yeah, hello, old, dad, dairy
yeah, hello, old, dad, dairy, yellow
yeah, hello, old, dad, dairy, yellow, world
yeah, hello, old, dad, dairy, yellow, world, dog
yeah, hello, old, dad, dairy, yellow, world, dog, goat
yeah, hello, old, dad, dairy, yellow, weld
yeah, hello, old, dad, dairy, yellow, weld, dog
yeah, hello, old, dad, dairy, yellow, weld, dog, goat
yeah, hello, old, dad, dairy, yard
yeah, hello, old, dad, dairy, yard, dog
yeah, hello, old, dad, dairy, yard, dog, goat
yeah, hello, old, dad, dairy, yolk
yeah, hello, old, dad, dairy, yolk, king
yeah, hello, old, dad, dairy, yolk, king, goat
yeah, hello, old, dad, dog
yeah, hello, old, dad, dog, goat
...
Enigmativity
Definitely an impressive display of linq and lambdas. But I guess my question was really geared at asking if LINQ could perform the backtracking required to answer this question recursively the way a language like Prolog would.
Roly
Yes, there is a way to simulate prolog backtracking with LINQ, but it's not true backtracking and, because of the imperative nature of C#, it can be very inefficient. My method is reasonably efficient in comparison (although it is not overly optimized).
Enigmativity
Cool...where could I learn more about that? (just out of my own curiosity)
Roly
@`Roly` - I updated the answer with a backtracking algorithm.
Enigmativity