views: 289
answers: 8

I'm trying to determine which approach to removing part of a string is the fastest.

I simply get the start and end time and show the difference.

But the results vary widely; for example, the same method can take anywhere from 60 ms to 231 ms from run to run.

What is a better method to get more accurate results?


using System;
using System.Collections;
using System.Collections.Generic;

namespace TestRemoveFast
{
    class Program
    {
        static void Main(string[] args)
        {
            for (int j = 0; j < 10; j++)
            {
                string newone = "";
                List<string> tests = new List<string>();
                for (int i = 0; i < 100000; i++)
                {
                    tests.Add("{http://company.com/Services/Types}ModifiedAt");
                }

                DateTime start = DateTime.Now;
                foreach (var test in tests)
                {
                    //newone = ((System.Xml.Linq.XName)"{http://company.com/Services/Types}ModifiedAt").LocalName;
                    newone = Clean(test);
                }

                Console.WriteLine(newone);
                DateTime end = DateTime.Now;
                TimeSpan duration = end - start;
                Console.WriteLine(duration.ToString());
            }

            Console.ReadLine();
        }

        static string Clean(string line)
        {
            int pos = line.LastIndexOf('}');
            if (pos > 0)
                return line.Substring(pos + 1, line.Length - pos - 1);
                //return line.Substring(pos + 1);
            else
                return line;
        }
    }
}
+11  A: 

You should use System.Diagnostics.Stopwatch, and you might want to consider a large sample. For example, repeat this test something like 10,000 times and average the results. If you think about it scientifically, it makes sense. The larger the sample, the better. You can weed out a lot of edge cases this way and really see what the core performance is like.

Another thing to consider is that JIT compilation and object creation can definitely skew your results, so make sure that you start and stop your Stopwatch at the appropriate times, and call the methods you want to test at least once before you begin your tests. Try to segregate just the parts you want to test from the rest of your code as much as possible.
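
A minimal sketch of how that might look, reusing the Clean method from the question; the 10,000-iteration count and the single warm-up call are illustrative choices, not something prescribed by the answer beyond what it describes:

using System;
using System.Diagnostics;

class StopwatchBenchmark
{
    static void Main()
    {
        const int iterations = 10000;
        string input = "{http://company.com/Services/Types}ModifiedAt";

        // Warm-up call so JIT compilation of Clean is not included in the timing.
        Clean(input);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Clean(input);
        }
        sw.Stop();

        // Average cost of a single call, in milliseconds.
        Console.WriteLine(sw.Elapsed.TotalMilliseconds / iterations);
    }

    static string Clean(string line)
    {
        int pos = line.LastIndexOf('}');
        return pos > 0 ? line.Substring(pos + 1) : line;
    }
}

The result of Clean is discarded here; in principle a JIT could optimize away an unused result, so consuming it somewhere (as the question does by printing newone) is a cheap precaution.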

Scott Anderson
Also, I usually make sure to invoke the methods under test once beforehand, to avoid having JIT compilation time included in the measurement.
Fredrik Mörk
Yeah, great point, Fredrik. You want to make sure to break your test into just the bits you want to monitor. A lot of times people don't think about how creating new objects and JIT can skew results.
Scott Anderson
Definitely repeat the test. For runs shorter than about 10 seconds, the system handing CPU time to other tasks can seriously affect the results; a test of around 10 minutes is more reliable. Also, kill all other CPU-intensive and disk-intensive processes.
SF.
A: 

You can use the Stopwatch class.

The Stopwatch measures elapsed time by counting timer ticks in the underlying timer mechanism. If the installed hardware and operating system support a high-resolution performance counter, then the Stopwatch class uses that counter to measure elapsed time.

// requires: using System.Diagnostics;
var sw = new Stopwatch();

sw.Start();
// ... code to measure ...
sw.Stop();

Console.WriteLine(sw.Elapsed);
João Angelo
A: 

You may have to apply some statistical techniques here to iron out the variance. Try running the same piece of code 1000 times, then take the average time and compare that. Simulations usually employ methods like this to 'clean up' the numbers, and this is one of them.
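
A rough sketch of that idea, assuming the work being measured is the Clean loop from the question: time the whole loop many times, then report the mean and standard deviation of the per-run times.

using System;
using System.Diagnostics;
using System.Linq;

class RepeatedRuns
{
    static void Main()
    {
        const int runs = 1000;
        var times = new double[runs];

        for (int r = 0; r < runs; r++)
        {
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < 100000; i++)
            {
                Clean("{http://company.com/Services/Types}ModifiedAt");
            }
            sw.Stop();
            times[r] = sw.Elapsed.TotalMilliseconds;
        }

        double mean = times.Average();
        double variance = times.Sum(t => (t - mean) * (t - mean)) / runs;
        Console.WriteLine("mean {0:F2} ms, std dev {1:F2} ms", mean, Math.Sqrt(variance));
    }

    static string Clean(string line)
    {
        int pos = line.LastIndexOf('}');
        return pos > 0 ? line.Substring(pos + 1) : line;
    }
}

Comparing two candidate implementations then becomes a comparison of two means, and the standard deviation shows how much of the 60 ms to 231 ms swing in the question is just noise.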

Extrakun
+3  A: 

If you're only worried about testing it in your own code, use a System.Diagnostics.Stopwatch.

I usually prefer breaking this kind of thing out of my code and using a true profiler like RedGate's Performance Profiler.

Justin Niessner
+5  A: 

Three simple notes:

  1. Use System.Diagnostics.Stopwatch.

  2. Don't profile your code on the same input one million times. Try to find your expected distribution of inputs and profile on that. That is, profile on real-world input, not laboratory input.

  3. Run the Clean method once before entering the profiling loop to eliminate JITting time. Sometimes this is important.

Of these, notes 1. and 2. are by far the most important.

Your profiling results are meaningless if you are not using a high-resolution timer. Note that we don't time Usain Bolt using a water clock.

Your profiling results are meaningless if you are not testing on real-world input. Note that crash tests crash cars head-on into other cars at 35 MPH, not into walls made of marshmallows at 5 MPH.

Thus:

// expectedInput is string[1000000]
// populate expectedInput with real-world input
Clean(expectedInput[0]); // warm-up call: JIT-compile Clean before timing (note 3)
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 1000000; i++) {
    string t = Clean(expectedInput[i]);
}
sw.Stop();
Console.WriteLine(sw.Elapsed);

One complex note:

If you really need to do profiling, get a profiler like ANTS.

Jason
A: 

Generally: don't look at the wall-clock time but at the CPU time consumed by your process to determine how long it ran. Especially for purely computational work this is much more reliable, because it is unaffected by other processes running at the same time.
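
In .NET the process's CPU time can be read from Process.TotalProcessorTime; a minimal sketch, where the loop simply stands in for whatever computation is being measured:

using System;
using System.Diagnostics;

class CpuTimeExample
{
    static void Main()
    {
        Process proc = Process.GetCurrentProcess();
        TimeSpan before = proc.TotalProcessorTime;   // CPU time consumed so far by this process

        long sum = 0;
        for (int i = 0; i < 100000000; i++)          // computation-only work to measure
        {
            sum += i % 7;
        }

        proc.Refresh();                              // discard cached process information
        TimeSpan cpu = proc.TotalProcessorTime - before;
        Console.WriteLine("CPU time: {0} (result {1})", cpu, sum);
    }
}

TotalProcessorTime includes both user and kernel time; UserProcessorTime is also available if only user-mode computation should be counted.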

Joey
This can be problematic, however, because the CPU time will also not count things like waiting for I/O, or for a page fault (?). These affect the performance of the real system and need to be taken into account.
sleske
That's why I noted that this is mainly just for computation-only tasks.
Joey
A: 

Hi

The trouble with all the clock-based approaches is that you are never quite certain what you are timing. You might, whether you realise it or not, include in your timing:

  • delays while the o/s pre-empts your processor;
  • context switching;
  • stalls while the program waits for data;
  • and much, much more.

I'm not saying that all of these apply to this particular timing, but they apply to timings in general. So you should supplement any timing you do with some consideration of how many basic operations your alternative codes execute -- a consideration of complexity that does not ignore (as we usually do) the constant terms.

In your particular case you should aim to time much longer executions; when your times are sub-second the o/s is extremely likely to mess you around. So run 10^6 iterations, and average over enough runs to give you a meaningful estimate of the mean and variance. And make sure, if you take this approach, that you don't inadvertently speed up the later trials by having data already loaded after the end of the first trial; each of the 10^6 trials has to do exactly what the 1st trial does.

Have fun

Mark

High Performance Mark
A: 

I'll have to recommend the profiler included in Visual Studio Team Suite or Development Edition (or the upcoming Visual Studio 2010 Premium or Ultimate, which is even better) as the best way. It is highly configurable, extremely powerful, dead simple to use, very fast, and works with both native and managed code. I'm not familiar with the workflow of ANTS, but that appears to be another option. Without a doubt, using a profiler is the only option for a developer who is concerned with their application's performance. There is no substitute, and you really can't take seriously any commercial developer working on performance who would pass up a profiler.

However, if you are interested in measurements for one of the following, you may not have access to such a profiler, in which case the Stopwatch class can form a basis for your measurement technique:

  • A student or hobbyist interested in the performance of their projects (a commercial profiler may be out of reach for financial reasons)
  • In a publicly released application, you may want to time sections of code that execute on the UI thread and report statistics to make sure the operations never cause noticeable delays for any of your users (see the sketch below). The Office team used a method like this with enormous success (Outlook 2007 SP2, anyone?), and I know the Visual Studio team has this code in at least the 2010 release.
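
A hypothetical sketch of that kind of lightweight instrumentation; the TimedOperation name and the console logging are made up for illustration, and a shipping application would report to whatever statistics store it actually uses:

using System;
using System.Diagnostics;

// Illustrative helper: wraps a Stopwatch in an IDisposable so a section of
// code can be timed with a using block and the elapsed time reported.
sealed class TimedOperation : IDisposable
{
    private readonly string _name;
    private readonly Stopwatch _sw = Stopwatch.StartNew();

    public TimedOperation(string name)
    {
        _name = name;
    }

    public void Dispose()
    {
        _sw.Stop();
        // Stand-in for reporting to a statistics/telemetry sink.
        Console.WriteLine("{0}: {1} ms", _name, _sw.ElapsedMilliseconds);
    }
}

class Demo
{
    static void Main()
    {
        using (new TimedOperation("LoadFolder"))
        {
            // Work that runs on the UI thread and should never feel slow.
            System.Threading.Thread.Sleep(50);
        }
    }
}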
280Z28