views:

115

answers:

2

Edit: Entity Framework seems to be the issue, discussed further in question Entity Framework & Linq performance problem.

I am supporting a PagedList (using linq/generics) class written by someone long departed - and it has 2 lines that have very bad performance - on a dataset of just 2 thousand rows it takes up to one minute to run.

The two offending lines are:

TotalItemCount = source.Count();

I probably can fix that by passing in the count as a parameter. But the other line that is slow is the AddRange in this snippet:

IQueryable<T> a = source.Skip<T>((index) * pageSize).Take<T>(pageSize);
AddRange(a.AsEnumerable());

I don't understand why AddRange is so slow or what I can do to improve it?

The entire class source listing is

public class PagedList<T> : List<T>, IPagedList<T>
{
    public PagedList(IEnumerable<T> source, int index, int pageSize)
     : this(source, index, pageSize, null)
    {
    }

    public PagedList(IEnumerable<T> source, int index, int pageSize, int? totalCount)
    {
        Initialize(source.AsQueryable(), index, pageSize, totalCount);
    }

    public PagedList(IQueryable<T> source, int index, int pageSize)
     : this(source, index, pageSize, null)
    {
    }

    public PagedList(IQueryable<T> source, int index, int pageSize, int? totalCount)
    {
        Initialize(source, index, pageSize, totalCount);
    }

    #region IPagedList Members

    public int PageCount { get; private set; }
    public int TotalItemCount { get; private set; }
    public int PageIndex { get; private set; }
    public int PageNumber { get { return PageIndex + 1; } }
    public int PageSize { get; private set; }
    public bool HasPreviousPage { get; private set; }
    public bool HasNextPage { get; private set; }
    public bool IsFirstPage { get; private set; }
    public bool IsLastPage { get; private set; }

    #endregion

    protected void Initialize(IQueryable<T> source, int index, 
            int pageSize, int? totalCount)
    {
        //### argument checking
        if (index < 0)
        {
            throw new ArgumentOutOfRangeException("PageIndex cannot be below 0.");
        }
        if (pageSize < 1)
        {
            throw new ArgumentOutOfRangeException("PageSize cannot be less than 1.");
        }

        //### set source to blank list if source is null to prevent exceptions
        if (source == null)
        {
            source = new List<T>().AsQueryable();
        }

        //### set properties
        if (!totalCount.HasValue)
        {
            TotalItemCount = source.Count();
        }
        PageSize = pageSize;
        PageIndex = index;
        if (TotalItemCount > 0)
        {
            PageCount = (int)Math.Ceiling(TotalItemCount / (double)PageSize);
        }
        else
        {
            PageCount = 0;
        }
        HasPreviousPage = (PageIndex > 0);
        HasNextPage = (PageIndex < (PageCount - 1));
        IsFirstPage = (PageIndex <= 0);
        IsLastPage = (PageIndex >= (PageCount - 1));

        //### add items to internal list
        if (TotalItemCount > 0)
        {
            IQueryable<T> a = source.Skip<T>((index) * pageSize).Take<T>(pageSize);
            AddRange(a.AsEnumerable());
        }
    }
}
+2  A: 

It's probably not AddRange that is slow, it's probably the query over the source. The call to AsEnumerable() will come back immediately but the query actually won't be executed until inside of the AddRange call when the sequence is actually enumerated.

You can prove this by changing this line:

AddRange(a.AsEnumerable());

to:

T[] aa = a.ToArray();
AddRange(aa);

You will probably see that the ToArray() call is what takes most of the time because this is when the query is actually executed. Now why that is slow is anyone's guess. If you're using LINQ to SQL or Entity Framework, you can try profiling the database to see where the bottleneck is. But it's probably not the PagedList class.

Josh Einstein
When I'm stepping through the code it doesn't come back immediately, its a significant delay. Yes its using EF. I will fire up the sql profiler and see what that says. And the `data` parameter is just a simple EF getAll() query (so no complex joins or etc that would result in bad sql performance)
JK
Now that's strange: sql profiler shows for 1958 rows selected, that it made 3916 (thats 1958*2) separate selects of the entire contents of another table (the child table of a master-child). There was no where clause in these 3916 selects
JK
Should I be asking a separate question since this seems to be an EF problem? All it does it does is `var data =_service.GetAll()`, followed by `data.Skip((index) * pageSize).Take(pageSize)`. This shouldn't have a performance problem
JK
Yeah I would definitely suggest opening another question. It also might be a good idea to see if you can reproduce the problem outside of your application using something like LINQPad. That way you can just post the LINQ query in question and that should help to isolate the issue.
Josh Einstein
But for what it's worth, it sounds like it's an issue of eager-loading relationship data. There's probably a better way of doing it in EF but I don't have a ton of experience with it to know.
Josh Einstein
Asked as http://stackoverflow.com/questions/3615138/entity-framework-linq-performance-problem
JK
Accepted as answer because it lead to finding where the real problem lays
JK
A: 

Why don't you just make a an IEnumerable? Then you won't have to call AsEnumerable() at all.

Although come to think of it, the query doesn't actually execute until AsEnumerable is called, so it probably won't make any difference here.

Robert Harvey
a is already IEnumerable by way of IQueryable which derives from IEnumerable. I suspect the OP probably threw in the AsEnumerable to try and isolate the bottleneck.
Josh Einstein
That's the code exactly as I inherited, so cant say why it does that.
JK