Hi All

I have the following method:

    /// <summary>
    /// Calculates the last trade date.
    /// </summary>
    /// <param name="tradesDictionary">The trades dictionary.</param>
    /// <returns>The last trade date.</returns>
    public DateTime CalculateLastTradeDate(ConcurrentDictionary<Guid, TradeRecord> tradesDictionary)
    { 
        // Calculate the last trade date
        _lastTradeDate = (from tradeRecord in tradesDictionary
                          where (tradeRecord.Value.OrderRecord.PairRecord.Id == _pairId)
                          select tradeRecord.Value.Date)
                         .Max();
        // Return _lastTradeDate
        return _lastTradeDate;
    }

which takes roughly 129 seconds (about 2 minutes) to execute on a ConcurrentDictionary of 21,353 objects in memory. Is there anything I can do in the query above to drastically reduce its execution time?

Any help would be appreciated!

+1  A: 

Well, the first thing to note is that you're not really doing most of the query on a ConcurrentDictionary or anything related to it. You're doing it on a copy of the values, in a list.

My first port of call would be to work out where the time is going - separate out the tradesDictionary.Values.ToList() call from the rest of the query. 4 minutes does sound way over the top though. Once you've worked out which part is causing the problem, I would consider using a non-parallel query, just for comparison purposes.
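
As a rough sketch of that separation, the two phases can be timed independently with a Stopwatch. The `QueryTiming` class, the `Measure` method, and the two delegate parameters are hypothetical names introduced for illustration; they stand in for the real filter on the pair ID and the real date accessor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class QueryTiming
{
    // Hypothetical helper: times the snapshot copy and the query separately.
    // Throws InvalidOperationException if no item matches (Max on an empty sequence).
    public static (long CopyMs, long QueryMs, DateTime Latest) Measure<T>(
        ConcurrentDictionary<Guid, T> trades,
        Func<T, bool> matchesPair,   // assumption: wraps the pair ID comparison
        Func<T, DateTime> getDate)   // assumption: wraps the Date accessor
    {
        // Phase 1: copy the values out of the dictionary.
        var stopwatch = Stopwatch.StartNew();
        List<T> snapshot = trades.Values.ToList();
        long copyMs = stopwatch.ElapsedMilliseconds;

        // Phase 2: run the sequential (non-parallel) query on the copy.
        stopwatch.Restart();
        DateTime latest = snapshot.Where(matchesPair).Select(getDate).Max();
        long queryMs = stopwatch.ElapsedMilliseconds;

        return (copyMs, queryMs, latest);
    }
}
```

If nearly all of the time lands in the second phase, the property accessors inside the filter are the likely suspects rather than the dictionary itself.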

Beyond that, it really depends on what the various properties of the records are doing. Are they accessing a database or something like that? Does your computer appear to be idle for those four minutes, or is it running at full pelt?

It does strike me that you don't really need to order the whole set - you just need to find the maximum value. However, that should only take the complexity from O(n log n) to O(n), and it's relatively hard to do within "normal" LINQ to Objects or Parallel LINQ.
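
For comparison, a plain loop finds the single extreme value (here, the latest date) in one O(n) pass without sorting anything. This is only a minimal sketch: `LatestDate.Find` and its delegate parameters are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class LatestDate
{
    // Returns the latest date among matching items, or null if nothing matches.
    public static DateTime? Find<T>(
        IEnumerable<T> items,
        Func<T, bool> matches,       // assumption: wraps the pair ID comparison
        Func<T, DateTime> getDate)   // assumption: wraps the Date accessor
    {
        DateTime? latest = null;
        foreach (T item in items)
        {
            if (!matches(item))
            {
                continue;
            }

            DateTime date = getDate(item);
            if (latest == null || date > latest.Value)
            {
                latest = date;
            }
        }
        return latest;
    }
}
```

Unlike Max() on an empty sequence of DateTime values, this version returns null instead of throwing when no record matches.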

Jon Skeet
Hi Jon. I changed the query to use Max, without converting to a list. Query performance has improved by 100%; however, 2 minutes is still way too slow! Anything else I can try? I'm not doing anything with the database. The objects are all stored in memory. Please see the changes I made to the original question.
c0D3l0g1
+1  A: 
TradeRecord.OrderRecord.PairRecord

It looks like there are three levels of database records involved. Are you 100% certain that all the records are in memory? Did you check by setting the Log property of your DataContext, or by watching SQL Profiler?

David B
Hi David. You got me thinking in the right direction. These objects are all stored in memory, but their foreign objects are lazy loaded. Every time TradeRecord.OrderRecord.PairRecord is accessed, a database lookup happens underneath. Thanks so much - this is one of those "gotcha" problems!
c0D3l0g1
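
To illustrate why the lazy loading described above is so costly, here is a small simulation. Everything in it is hypothetical: `LazyTrade`, its `Lookups` counter, and the assumption that the record holds the pair's foreign key in memory are stand-ins, not the real TradeRecord types. It only shows that filtering through a lazy-loaded navigation property costs one lookup per record, while filtering on a locally held key costs none.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a record with a lazy-loaded navigation property.
public class LazyTrade
{
    public static int Lookups;        // counts simulated database round trips
    public Guid PairId { get; set; }  // assumption: foreign key is held in memory
    public DateTime Date { get; set; }

    // Simulates the navigation property: every call "hits the database".
    public Guid LoadPairIdViaNavigation()
    {
        Lookups++;
        return PairId;
    }
}

public static class LazyLoadingDemo
{
    // One simulated lookup per record examined.
    public static DateTime MaxViaNavigation(IEnumerable<LazyTrade> trades, Guid pairId) =>
        trades.Where(t => t.LoadPairIdViaNavigation() == pairId).Max(t => t.Date);

    // No lookups: the filter reads only in-memory data.
    public static DateTime MaxViaForeignKey(IEnumerable<LazyTrade> trades, Guid pairId) =>
        trades.Where(t => t.PairId == pairId).Max(t => t.Date);
}
```

With 21,353 records, the first form would trigger 21,353 simulated lookups for a single call, which is consistent with a query that takes minutes instead of milliseconds.
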
A: 

From the information here, it strikes me that the max trade date may be obtained more efficiently by just querying the database.

    SELECT MAX(TradeRecordTable.Date) AS MaxTradeDate
    FROM   <appropriate table join>
    WHERE  PairRecordTable.Id = _pairId

If the number of matching rows in PairRecordTable isn't large, this should run very quickly. I realize there are many reasons this may not be a solution for you, but I've often seen the simple solution overlooked.

Sorax