tags:

views:

180

answers:

5

I have an array of elements where the element has a Flagged boolean value.

1 flagged
2 not flagged
3 not flagged
4 flagged
5 not flagged
6 not flagged
7 not flagged
8 flagged
9 not flagged

I want to break it into arrays based on the flagged indicator

output >

array 1 {1,2,3}
array 2 {4,5,6,7}
array 3 {8,9}
A: 

I don't think LINQ is suited for this very well. It could be done with Aggregate() but I think you'd be better of just looping with a foreach() building up the result.

Frank Schwieterman
+3  A: 

Linq doesn't have an operator for this, but I've written an extension method that you may be able to use (in the process of submitting it to MoreLinq, which you should also check out):

Using the operator below, you would write:

var result = 
   items.Segment( (item,prevItem,idx) => item.Flagged )
        .Select( seq => seq.ToArray() )  // converts each sequence to an array
        .ToList();

Here's the code of the extension method:

public static IEnumerable<IEnumerable<T>> Segment<T>(IEnumerable<T> sequence, Func<T, T, int, bool> newSegmentIdentifier) 
     { 
         var index = -1; 
         using (var iter = sequence.GetEnumerator()) 
         { 
             var segment = new List<T>(); 
             var prevItem = default(T); 

             // ensure that the first item is always part 
             // of the first segment. This is an intentional 
             // behavior. Segmentation always begins with 
             // the second element in the sequence. 
             if (iter.MoveNext()) 
             { 
                 ++index; 
                 segment.Add(iter.Current); 
                 prevItem = iter.Current; 
             } 

             while (iter.MoveNext()) 
             { 
                 ++index; 
                 // check if the item represents the start of a new segment 
                 var isNewSegment = newSegmentIdentifier(iter.Current, prevItem, index); 
                 prevItem = iter.Current; 

                 if (!isNewSegment) 
                 { 
                     // if not a new segment, append and continue 
                     segment.Add(iter.Current); 
                     continue; 
                 } 
                 yield return segment; // yield the completed segment 

                 // start a new segment... 
                 segment = new List<T> { iter.Current }; 
             } 
             // handle the case of the sequence ending before new segment is detected 
             if (segment.Count > 0) 
                 yield return segment; 
         } 
     } 
LBushkin
Why is the condition in the lambda function 'item.Flagged != prevItem.Flagged'? You would create a new segment if the flag changes from false to true. I think the condition should be just 'item.Flagged', and not dependant from the previous item at all.
Philip Daubmeier
@phild: You are correct, `item.Flagged` is sufficient. I was in a hurry earlier and misinterpreted the OP's expectations. I've updated my response.
LBushkin
+1  A: 

I don't think LINQ is the right tool for this task. What about this:

public static List<List<T>> PartitionData<T>(T[] arr, Func<T, bool> flagSelector){
    List<List<T>> output = new List<List<T>>();
    List<T> partition = null;
    bool first = true;

    foreach(T obj in arr){
        if(flagSelector(obj) || first){
            partition = new List<T>();
            output.Add(partition);
            first = false;
        }
        partition.Add(obj);
    }

    return output;
}

A small example, with the Data from Fábio Batistas post:

var arrayOfElements = new[] {
    new { Id = 1, Flagged = true },
    new { Id = 2, Flagged = false },
    new { Id = 3, Flagged = false },
    new { Id = 4, Flagged = true },
    new { Id = 5, Flagged = false },
    new { Id = 6, Flagged = false },
    new { Id = 7, Flagged = false },
    new { Id = 8, Flagged = true },
    new { Id = 9, Flagged = false }
};

var partitioned = PartitionData(arrayOfElements, x => x.Flagged);
Philip Daubmeier
I think this needs a output.Add(currentList); after the foreach statement, or your last array never gets added.
Kenoyer130
I updated it, and made a generic function out of it.
Philip Daubmeier
+1  A: 

Considering:

var arrayOfElements = new[] {
    new { Id = 1, Flagged = true },
    new { Id = 2, Flagged = false },
    new { Id = 3, Flagged = false },
    new { Id = 4, Flagged = true },
    new { Id = 5, Flagged = false },
    new { Id = 6, Flagged = false },
    new { Id = 7, Flagged = false },
    new { Id = 8, Flagged = true },
    new { Id = 9, Flagged = false }
};

You can write:

var grouped = 
    from i in arrayOfElements
    where i.Flagged
    select 
        (new[] { i.Id })
        .Union(arrayOfElements.Where(i2 => i2.Id > i.Id).TakeWhile(i2 => !i2.Flagged).Select(i2 => i2.Id))
        .ToArray();

This works if your elements are ordered by the Id attribute. If they don't, you'll have to inject a Sequence on your original array, that should be easy to do with linq as well, so you'll get a sequence.

Also, a better alternative should be:

// for each flagged element, slice the array,
// starting on the flagged element until the next flagged element
var grouped = 
    from i in arrayOfElements
    where i.Flagged
    select 
        arrayOfElements
            .SkipWhile(i2 => i2 != i)
            .TakeWhile(i2 => i2 == i || !i2.Flagged)
            .Select(i2 => i2.Id)
            .ToArray();

Note that those answers are using pure linq.

Fábio Batista
I like Linq a lot, and use it where I can, but dont you think this is rather ugly compared to the classical approach (see my post)? I even think this is going to be worse in performance.
Philip Daubmeier
I evolved it a little bit, to remove the need to order them by Id. But I agree, it's worse on performance. I liked the @LBushkin answer, creating a new extension method for that.
Fábio Batista
+1 I like your second alternative. Nice to see how many tasks can be done with pure linq, that I would not even have thought about doing in linq.
Philip Daubmeier
+2  A: 

I had a similar problem with this, and solved it using GroupBy and closure.

//sample data
var arrayOfElements = new[] {
    new { Id = 1, Flagged = true },
    new { Id = 2, Flagged = false },
    new { Id = 3, Flagged = false },
    new { Id = 4, Flagged = true },
    new { Id = 5, Flagged = false },
    new { Id = 6, Flagged = false },
    new { Id = 7, Flagged = false },
    new { Id = 8, Flagged = true },
    new { Id = 9, Flagged = false }
};

//this is the closure which will increase each time I see a flagged
int flagCounter = 0;

var query = 
    arrayOfElements.GroupBy(e => 
        {
            if (e.Flagged)
                flagCounter++;
            return flagCounter;
        });

What it does is grouping on an int (flagCounter), which is increased each time a Flagged element is found.
Please note this won't work with AsParallel().

Testing the results:

foreach(var group in query)
{
    Console.Write("\r\nGroup: ");
    foreach (var element in group)
        Console.Write(element.Id);
}

Outputs:

Group: 123
Group: 4567
Group: 89

Alex Bagnolini
Very interesting approach. Very different to the other solutions, but you know, many roads lead to rome...
Philip Daubmeier
While interesting, I think it's kinda "hackish" to introduce state to the query. The end result, however, is nice (a linq grouping). Maybe there's a way to do it w/o the closure?
Fábio Batista
@Fábio: Closures are a great addition to C#, and are most suited for use in delegates (like in this case). My knowledge can't get me into a thread-safe solution using them yet, but surely this code is as readable as performant and maintainable.
Alex Bagnolini