ansaurus

Question

Algorithm for text selection intersection

Answer 1

+5 A:

Put all the users' begin and end selection points in one list and sort it by position. Walk the list from the top, incrementing a counter for each begin point and decrementing it for each end point. The begin point at which the counter reaches its highest value is the start of your most highlighted / most overlapped section of text. The next item in the list is the end of that section.
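The walk described above can be sketched roughly as follows. This is an illustration only, not code from the answer: the `SelectionSweep` class, the `MostSelected` method, and the tuple shapes are hypothetical names, and it assumes each selection is an inclusive (Start, End) pair of character positions (C# 7.3 or later for the tuple syntax).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectionSweep
{
    // Hypothetical helper: returns the inclusive (Start, End) range covered by
    // the most selections, together with how many selections cover it.
    // Returns (-1, -1, 0) when there are no selections.
    public static (int Start, int End, int Count) MostSelected(
        IEnumerable<(int Start, int End)> selections)
    {
        // One +1 event per begin point, one -1 event per end point.
        var events = selections
            .SelectMany(s => new[] { (Pos: s.Start, Delta: +1), (Pos: s.End, Delta: -1) })
            // Sort by position; at equal positions handle begins first, because
            // inclusive ranges that share a single character do overlap there.
            .OrderBy(e => e.Pos)
            .ThenByDescending(e => e.Delta)
            .ToList();

        int counter = 0, best = 0, bestStart = -1, bestEnd = -1;
        bool awaitingEnd = false;

        foreach (var e in events)
        {
            if (e.Delta > 0)
            {
                counter++;
                if (counter > best)
                {
                    // New highest counter value: this begin point starts
                    // the most overlapped section seen so far.
                    best = counter;
                    bestStart = e.Pos;
                    awaitingEnd = true;
                }
            }
            else
            {
                // The first end point after the peak closes that section.
                if (awaitingEnd) { bestEnd = e.Pos; awaitingEnd = false; }
                counter--;
            }
        }
        return (bestStart, bestEnd, best);
    }
}
```

For two selections covering positions 10..19 and 15..24, `SelectionSweep.MostSelected(new[] { (10, 19), (15, 24) })` returns `(15, 19, 2)`. When several sections tie for the highest count, this sketch keeps the first one it meets.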

dthorpe 2010-07-08 20:46:36

What if two selections don't have the same start position and length but still overlap?

Comma 2010-07-08 20:49:11

+1: This looks correct.

Moron 2010-07-08 20:56:36

James Curran 2010-07-08 21:05:09

Answer 2

+1 A:

This should convert your data into the structure needed for dthorpe's algorithm. If I had more time, I could probably LINQ-ify the rest.

UPDATE: OK, now complete (if still untested...).

UPDATE 2: Now actually tested and working!

// Existing data structure
class StartLen
{
    public int Start {get; set;}
    public int Len   {get; set;}
    public string UserId {get; set;}
}

// Needed data struct
class StartEnd
{
    public int Pos {get; set;}
    public bool IsStart {get; set;}
}

class Segment
{
    public int Start { get; set; }
    public int End { get; set; }
    public int Count { get; set; }
}    

// Running state for the sweep below; a later revision could get rid of these.
int count = 0, lastStart = -1;

// This can't be a lambda (it uses yield), so it has to be a real function.
// End position is inclusive: Start + Len - 1.
IEnumerable<StartEnd> SplitSelection(StartLen sl)
{
    yield return new StartEnd() { Pos = sl.Start, IsStart = true };
    yield return new StartEnd() { Pos = sl.Start + sl.Len - 1, IsStart = false };
}

List<StartLen> startLen = new List<StartLen>();
// we fill it with data for testing
// pretending to be the real data
startLen.Add(new StartLen() { Start=10, Len=10, UserId="abc123" });
startLen.Add(new StartLen() { Start=15, Len=10, UserId="xyz321" });

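var mostSelected =
    startLen.SelectMany<StartLen, StartEnd>(SplitSelection)
        // Sort by position; on ties, starts come first because end positions are inclusive
        .OrderBy(se => se.Pos)
        .ThenBy(se => !se.IsStart)
        .Select(se =>
        {
            if (se.IsStart)
            {
                lastStart = se.Pos;
                count++;
            }
            else
            {
                int depth = count;   // selections overlapping just before this end point
                count--;
                if (lastStart >= 0)  // >= 0 so a selection starting at position 0 is not skipped
                {
                    var seg = new Segment
                        { Start = lastStart, End = se.Pos, Count = depth };
                    lastStart = -1;
                    return seg;
                }
            }
            // Placeholder, because something must be returned; Count = -1 sorts it last
            return new Segment { Start = 0, End = 0, Count = -1 };
        })
        .OrderByDescending(seg => seg.Count)
        .First();

// mostSelected holds the Start & End positions (and Count) of the most-selected segment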

James Curran 2010-07-08 20:58:23

Comma 2010-07-08 21:12:12

James Curran 2010-07-08 21:30:36

thORpe! thORpe! :P

dthorpe 2010-07-08 21:42:51

@dthorpe: I is a programmer, not a speller. (Feel free to reverse the middle letters of my last name... ;-)

James Curran 2010-07-08 21:49:51

OK, if I understand this correctly, the next step is to find the highest IsStart = true and the lowest IsStart = false?

Comma 2010-07-08 21:52:33

James Curran 2010-07-08 21:58:49

Comma 2010-07-08 23:32:57

@James: Fair enough!;>

dthorpe 2010-07-09 05:36:39
