views:

283

answers:

4

I have a Dictionary<int, MyClass>

It contains 100,000 items

10,000 items value is populated whilst 90,000 are null.

I have this code:

var nullitems = MyInfoCollection.Where(x => x.Value == null).ToList();
nullitems.ForEach(x => LogMissedSequenceError(x.Key + 1));

private void LogMissedSequenceError(long SequenceNumber)
        {
            DateTime recordTime = DateTime.Now;

            var errors = MyXDocument.Descendants("ERRORS").FirstOrDefault();
            if (errors != null)
            {

                errors.Add(
                    new XElement("ERROR",
                        new XElement("DATETIME", DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff")),
                        new XElement("DETAIL", "No information was read for expected sequence number " + SequenceNumber),
                        new XAttribute("TYPE", "MISSED"),
                        new XElement("PAGEID", SequenceNumber)
                        )
                );
            }
        }

This seems to take about 2 minutes to complete. I can't seem to find where the bottleneck might be or if this timing sounds about right?

Can anyone see anything to why its taking so long?

+2  A: 

This is what I would most likely do.

private void BuildErrorNodes()
{
    const string nodeFormat = @"<ERROR TYPE=""MISSED""><DATETIME>{0}</DATETIME><DETAIL>No information was read for expected sequence number {1}</DETAIL><PAGEID>{1}</PAGEID></ERROR>";

    var sb = new StringBuilder("<ERRORS>");
    foreach (var item in MyInfoCollection)
    {
        if (item.Value == null) 
        {
            sb.AppendFormat(
                nodeFormat,
                DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff"),
                item.Key + 1
            );
        }
    }

    sb.Append("</ERRORS>");

    var errorsNode = MyXDocument.Descendants("ERRORS").FirstOrDefault();
    errorsNode.ReplaceWith(XElement.Parse(sb.ToString()));
}
ChaosPandion
+3  A: 

If your MyInfoCollection is huge, I wouldn't call ToList() on it just so you can use the ForEach extension method. Calling ToList() is going to create and populate a huge list. I'd remove the ToList() call, and make the .ForEach into a for each statement, or write a .ForEach extension method for IEnumerable<T>.

Then profile it and see how long it takes. One other thing to do is remove the find and null check of the ERRORS element. If it's not there, then don't call the for each statement above. That way you null check it one time instead of 90,000 times.

Plus as Michael Stum pointed out, I'd define a string to hold the value DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff"), then reference it or pass it in. Plus, you don't even use this call:

DateTime recordTime = DateTime.Now;
Aaron Daniels
So far I have just implemented the foreach loop rather than the ForEach extension method and they seem the same speed. The ToList() was quick its just the logging taking the time.
Jon
A: 

Did you run a profiler? I'd be interested at why you run DateTime.now twice (as it's relatively expensive to call) and how slow it really is when called 90k times...

Michael Stum
I've removed the DateTime and only call it once and pass it as a string. I haven't got Visual Studio Team etc with the Profiler in it.
Jon
+1  A: 

How about replacing the method call with a LINQ query?

static void Main(string[] args)
{

    var MyInfoCollection = (from key in Enumerable.Range(0, 100000)
                            let value = (MoreRandom() % 10 != 0)
                                                    ? (string)null
                                                    : "H"
                            select new { Value = value, Key = key }
                           ).ToDictionary(k => k.Key, v => v.Value);

    var MyXDocument = new XElement("ROOT",
                                    new XElement("ERRORS")
                                  );
    var sw = Stopwatch.StartNew();
    //===
    var errorTime = DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff");
    var addedIndex = MyInfoCollection.Select((item, index) =>
                                                    new
                                                    {
                                                        Value = item.Value,
                                                        Key = item.Key,
                                                        Index = index
                                                    });
    var errorQuery = from item in addedIndex
                     where string.IsNullOrEmpty(item.Value)
                     let sequenceNumber = item.Key + 1
                     let detail = "No information was read for expected " +
                                  "sequence number " + sequenceNumber
                     select new XElement("ERROR",
                        new XElement("DATETIME", errorTime),
                        new XElement("DETAIL", detail),
                        new XAttribute("TYPE", "MISSED"),
                        new XElement("PAGEID", sequenceNumber)
                        );

    var errors = MyXDocument.Descendants("ERRORS").FirstOrDefault();
    if (errors != null)
        errors.Add(errorQuery);
    //===
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds); //623
}
static RandomNumberGenerator rand = RandomNumberGenerator.Create();
static int MoreRandom()
{
    var buff = new byte[1];
    rand.GetBytes(buff);
    return buff[0];
}
Matthew Whited
You da man! What is the keyword 'let' for?
Jon
@Jon: the "let" allows you assign a contextual variable. It's like create a variable assignment inside of the foreach loop. You could put the same information in the "select" statement, but I find this model to be much easier to understand and reuse.
Matthew Whited
I also did a test with using the DateTime.Now inside of the LINQ query. It does slightly slow down the process, but nothing like looking up the "ERRORS" element over and over.
Matthew Whited
Thanks! On the PageID element instead of sequence number I need the position in the collection. Is that possible?
Jon
@Jon: the sequenceNumber on the let statement is calculated the same way as your example. But there is an overload for .Select that allow you to get the position of the item in the IEnumerable<> set. I will add an example to my answer.
Matthew Whited
@Jon: I updated my example to include using the postion out of the IEnumerable<> set.
Matthew Whited