ansaurus

Question

Algorithm/pattern for selecting sub-collections using LINQ and C#

Answer 1

+4 A:

I'm not sure what the list of page breaks is for. I would think of it this way: you have a collection of strings, a page number, and a page size. Then you could do something like:

List<string> strings = ...
int pageNum = ...
int pageSize = ...

if (pageNum < 1) pageNum = 1;
if (pageSize < 1) pageSize = 1;

List<string> pageOfStrings = strings.Skip( pageSize*(pageNum-1) ).Take( pageSize ).ToList();

In the case where the number of lines varies per page, as per your comment, try something like the code below. You may need to adjust the edge condition checking...

List<string> strings = ...
List<int> sizes = ...

int pageNum = ...
int itemsToSkip =  0;
int itemsToTake = 1;

if (pageNum < 1) pageNum = 1;

// Skip every line on the pages before the requested one.
itemsToSkip = sizes.Take( pageNum - 1 ).Sum();

if (pageNum <= sizes.Count)
{
    itemsToTake = sizes[pageNum-1];
}

List<string> pageOfStrings = strings.Skip( itemsToSkip ).Take( itemsToTake ).ToList();

tvanfosson 2008-10-26 18:03:11

Yep, was going to answer this. The paging pattern in LINQ is implemented by the Skip and Take methods. If i is the number of items in a page, and j is the number of the page you wish to view, you would skip i * (j-1) items and take i.

Will 2008-10-26 18:07:28

That's a great answer, but the number of lines can vary from page to page, so the page break collection holds the index of each line that starts a new page.

Guy 2008-10-26 18:55:34

This variable page solution has some merit. First page is bugged and an assignment is missing.

David B 2008-10-27 00:31:37

Answer 2

+2 A:

"Pure" Linq isn't a good fit for this problem. The best fit is to rely on the methods and properties of List(T). There aren't -that- many special cases.

//requestedPage is zero-based.
List<string> GetPage(List<string> docList, List<int> pageBreaks, int requestedPage)
{

  // no page breaks case: the whole document is one page
  if (pageBreaks.Count == 0)
  {
    return docList;
  }

  int lastPage = pageBreaks.Count;

  //requestedPage is after the lastPage case
  if (requestedPage > lastPage)
  {
    requestedPage = lastPage;
  }


  int firstLine = requestedPage == 0 ? 0  :
      pageBreaks[requestedPage-1];
  int lastLine = requestedPage == lastPage ? docList.Count :
      pageBreaks[requestedPage];

  //lastLine is excluded.  e.g. firstLine 3, lastLine 6: 6 - 3 = 3 lines (3, 4, 5)

  int howManyLines = lastLine - firstLine;

  return docList.GetRange(firstLine, howManyLines);
}

You don't want to replace the .Count property with linq's .Count() method. You don't want to replace the .GetRange() method with linq's .Skip(n).Take(m) methods.

Linq would be a better fit if you wanted to project these collections into other collections:

IEnumerable<Page> pages =
  Enumerable.Repeat(0, 1)
  .Concat(pageBreaks)
  .Select
  (
    (p, i) => new Page()
    {
      PageNumber = i,
      Lines = 
        docList.GetRange(p, ((i != pageBreaks.Count) ? pageBreaks[i] : docList.Count)  - p)
    }
  );
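
To try the projection above, a Page type is needed, and the answer does not show one. The following is a minimal sketch, assuming a hypothetical Page class with only the two properties used above and some made-up sample data; BuildPages and PageProjectionDemo are illustrative names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Page type: the answer does not define it, so this shape is an assumption.
public class Page
{
    public int PageNumber { get; set; }
    public List<string> Lines { get; set; }
}

public static class PageProjectionDemo
{
    // pageBreaks holds the index of each line that starts a new page (after the first page).
    public static List<Page> BuildPages(List<string> docList, List<int> pageBreaks)
    {
        return Enumerable.Repeat(0, 1)          // the first page always starts at line 0
            .Concat(pageBreaks)                 // every later page starts at a page break
            .Select((p, i) => new Page
            {
                PageNumber = i,
                // A page ends where the next one starts, or at the end of the document.
                Lines = docList.GetRange(
                    p,
                    ((i != pageBreaks.Count) ? pageBreaks[i] : docList.Count) - p)
            })
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample: six lines, with new pages starting at index 2 and index 5.
        var docList = new List<string> { "a", "b", "c", "d", "e", "f" };
        var pageBreaks = new List<int> { 2, 5 };

        foreach (Page page in BuildPages(docList, pageBreaks))
        {
            Console.WriteLine(page.PageNumber + ": " + string.Join(",", page.Lines));
        }
    }
}
```

With this sample data the three pages hold a,b then c,d,e then f. The sketch assumes pageBreaks is sorted and every index is within docList; otherwise GetRange throws.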

David B 2008-10-26 19:05:21

Would you mind elaborating on why GetRange() is a better option than Skip() and Take()?

Joel Mueller 2008-10-26 21:08:30

GetRange leverages the internal implementation of List. Skip/Take uses the contract of IEnumerable (which involves enumerating). If the List has a million elements in it, you'll get to elements 950,000 through 950,010 faster using GetRange.
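
A small sketch of that comparison, using a made-up list of a million integers (the class and method names are illustrative only). Both calls return the same eleven elements; GetRange copies a slice by index, while Skip/Take goes through the IEnumerable contract. How much work Skip actually does can depend on the runtime version, so no timings are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeComparisonDemo
{
    // Index-based: List<T>.GetRange(index, count) copies a slice of the list.
    public static List<int> WithGetRange(List<int> source, int start, int count)
    {
        return source.GetRange(start, count);
    }

    // Sequence-based: works through IEnumerable<T> via Skip and Take.
    public static List<int> WithSkipTake(List<int> source, int start, int count)
    {
        return source.Skip(start).Take(count).ToList();
    }

    public static void Main()
    {
        // Made-up data: the integers 0 through 999,999.
        List<int> numbers = Enumerable.Range(0, 1000000).ToList();

        // Elements 950,000 through 950,010 inclusive are eleven items.
        List<int> byIndex = WithGetRange(numbers, 950000, 11);
        List<int> bySequence = WithSkipTake(numbers, 950000, 11);

        Console.WriteLine(byIndex.SequenceEqual(bySequence));
    }
}
```

One behavioral difference to keep in mind: GetRange throws an ArgumentException when the requested range runs past the end of the list, whereas Skip/Take simply returns fewer elements.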

David B 2008-10-26 23:18:38

"`pageBreaks.Count != 0`" Do you mean `pageBreaks.Count == 0`? `pageBreaks.Any()` is generally preferable: it doesn't require counting.

Jay Bazuzi 2008-10-27 05:45:43

.Count is a property on List(T) that does not require counting (also does not require allocation of an Enumerator). I think you mean Enumerable.Count() which is a method I'm not using in this post.

David B 2008-10-27 14:51:47
