tags:

views:

288

answers:

5

Hi everyone, Let's say I have a sequence.

IEnumerable<int> sequence = GetSequenceFromExpensiveSource();
// sequence now contains: 0,1,2,3,...,999999,1000000

Getting the sequence is not cheap and is dynamically generated, and I want to iterate through it once only.

I want to get 0 - 999999 (i.e. everything but the last element)

I recognize that I could do something like:

sequence.Take(sequence.Count() - 1);

but that results in two enumerations over the big sequence.

Is there a LINQ construct that lets me do:

sequence.TakeAllButTheLastElement();

?

Thanks!

-Mike

+6  A: 

I don't know a Linq solution - But you can easily code the algorithm by yourself using generators (yield return).

public static IEnumerable<T> TakeAllButLast<T>(this IEnumerable<T> source) {
    var it = source.GetEnumerator();
    bool hasRemainingItems = false;
    bool isFirst = true;
    T item = default(T);

    do {
        hasRemainingItems = it.MoveNext();
        if (hasRemainingItems) {
            if (!isFirst) yield return item;
            item = it.Current;
            isFirst = false;
        }
    } while (hasRemainingItems);
}

static void Main(string[] args) {
    var Seq = Enumerable.Range(1, 10);

    Console.WriteLine(string.Join(", ", Seq.Select(x => x.ToString()).ToArray()));
    Console.WriteLine(string.Join(", ", Seq.TakeAllButLast().Select(x => x.ToString()).ToArray()));
}

Or as a generalized solution discarding the last n items (using a queue like suggested in the comments):

public static IEnumerable<T> SkipLastN<T>(this IEnumerable<T> source, int n) {
    var  it = source.GetEnumerator();
    bool hasRemainingItems = false;
    var  cache = new Queue<T>(n + 1);

    do {
        if (hasRemainingItems = it.MoveNext()) {
            cache.Enqueue(it.Current);
            if (cache.Count > n)
                yield return cache.Dequeue();
        }
    } while (hasRemainingItems);
}

static void Main(string[] args) {
    var Seq = Enumerable.Range(1, 4);

    Console.WriteLine(string.Join(", ", Seq.Select(x => x.ToString()).ToArray()));
    Console.WriteLine(string.Join(", ", Seq.SkipLastN(3).Select(x => x.ToString()).ToArray()));
}
Dario
And this solution will actually work ;-)
Dario
Well done for copying most of my solution. Would generally be more courteous to fix it instead of taking the rep for yourself.
Noldorin
I was already writing my solution when you posted yours. But I can upvote yours if it makes you feel happy.
Dario
@Dario: Sheer coincidence that they have the same names and structure then? Meh, I'll take your word for it anyway, as I know it happens from experience. +1 to you.
Noldorin
Now can you generalize it to take all but the final n?
Eric Lippert
Done ( .... 15 chars ... )
Dario
Nice. One small error; queue size should be initialized to n + 1 since that is the max size of the queue.
Eric Lippert
Right - Fixed it
Dario
+4  A: 

Nothing in the BCL (or MoreLinq I believe), but you could create your own extension method.

public static IEnumerable<T> TakeAllButLast<T>(this IEnumerable<T> source)
{
    var enumerator = source.GetEnumerator();
    bool first = true;
    T prev;
    while(enumerator.MoveNext())
    {
        if (!first)
            yield return prev;
        first = false;
        prev = enumerator.Current;
    }
}
Noldorin
This code won't work... should probably be `if (!first)` and pull `first = false` out of the if.
Caleb
This code looks off. The logic seems to be 'return the uninitialized `prev` in the first iteration, and loop forever after that'.
Novelocrat
The code seems to have "compile time" error. May be you would like to correct it.But yes, we can write an extender which iterates once and returns all but the last item.
Amby
@Caleb: You are absolutely right - I wrote this in a real rush! Fixed now. @Amby: Erm, not sure what compiler you're using. I had nothing of the sort. :P
Noldorin
+1  A: 

As an alternative to creating your own method and in a case the elements order is not important, the next will work:

var result = sequence.Reverse().Skip(1);
Kamarey
Note that this requires that you have enough memory to store the entire sequence, and of course it STILL iterates the entire sequence twice, once to build the reversed sequence and once to iterate it. Pretty much this is worse than the Count solution no matter how you slice it; it's slower AND takes far more memory.
Eric Lippert
I don't know how the 'Reverse' method work, so I'm not sure about amount of memory it use. But I agree about two iterations. This method shouldn't be used on large collections or if a performance is important.
Kamarey
Well, how would *you* implement Reverse? Can you figure out a way *in general* to do it without storing the entire sequence?
Eric Lippert
Don't see anything better than O(n) of space. Sure that better to use solution of custom method for this purpose.
Kamarey
I like it, and it does meet the criteria of not generating the sequence twice.
David B
+2  A: 

Because I think Dario's solution is less elegant than it could be, an alternative:

public static IEnumerable<T> DropLast<T>(this IEnumerable<T> source)
{
    if (source == null)
        throw new ArgumentNullException("source");

    T buffer = default(T);
    bool buffered = false;

    foreach (T x in source)
    {
        if (buffered)
            yield return buffer;

        buffer = x;
        buffered = true;
    }
}

As per Eric Lippert's suggestion, it easily generalizes to n items:

public static IEnumerable<T> DropLast<T>(this IEnumerable<T> source, int n)
{
    if (source == null)
        throw new ArgumentNullException("source");

    if (n < 0)
        throw new ArgumentOutOfRangeException("n", 
            "Argument n should be non-negative.");

    Queue<T> buffer = new Queue<T>(n + 1);

    foreach (T x in source)
    {
        buffer.Enqueue(x);

        if (buffer.Count == n + 1)
            yield return buffer.Dequeue();
    }
}

Where I changed the order of buffering to handle n == 0 without special casing.

Joren
A: 

Why not just .ToList on the seq, then call count and take like you did originally.. but since it's been pulled into a list, it shouldnt do an expensive enumeration twice. Right?

boomhauer