I know plenty about the different ways of parsing text for information. When parsing integers, for example, what kind of performance can be expected? I am wondering if anyone knows of any good stats on this. I am looking for some real numbers from someone who has tested this.

Which of these offers the best performance in which situations?

Parse(...)  // Just let it crash, because the failure case is extremely rare (0.0001%)

if (SomethingIsValid) // Check the value before parsing
    Parse(...)

TryParse(...) // Using TryParse

try
{
    Parse(...)
}
catch
{
    // Catch any thrown exceptions
}
+19  A: 

Always use T.TryParse(string str, out T value). Throwing exceptions is expensive and should be avoided if you can handle the situation a priori. Using a try-catch block to "save" on performance (because your invalid data rate is low) is an abuse of exception handling at the expense of maintainability and good coding practices. Follow sound software engineering development practices, write your test cases, run your application, THEN benchmark and optimize.

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%" -Donald Knuth

Therefore you assume, as arbitrarily as carbon credits are assigned, that try-catch performs worse and TryParse performs better. Only after we've run our application and determined that we have some sort of slowdown w.r.t. string parsing would we even consider using anything other than TryParse.
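
For reference, a minimal sketch of the TryParse pattern recommended above. The class name ParseExample, the helper ParseOrDefault, and the fallback value are hypothetical and chosen only for illustration; Int32.TryParse itself is the standard .NET method.

```csharp
using System;

public static class ParseExample
{
    // Hypothetical helper: returns the parsed value, or fallback
    // when the input is not a valid Int32.
    public static int ParseOrDefault(string input, int fallback)
    {
        int value;
        // TryParse reports failure through its bool return value
        // instead of throwing an exception.
        return Int32.TryParse(input, out value) ? value : fallback;
    }
}
```

Calling ParseExample.ParseOrDefault("X123", -1) takes the failure branch without any exception being thrown or caught.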

(edit: since it appears the questioner wanted timing data to go with good advice, here is the timing data requested)

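Times for various failure rates on 10,000 inputs from the user (for the unbelievers). Slowdown is the extra time try-catch takes relative to TryParse, i.e. (Try-Catch - TryParse) / TryParse: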

Rate      Try-Catch          TryParse        Slowdown
  0%   00:00:00.0131758   00:00:00.0120421      0.1
 10%   00:00:00.1540251   00:00:00.0087699     16.6
 20%   00:00:00.2833266   00:00:00.0105229     25.9
 30%   00:00:00.4462866   00:00:00.0091487     47.8
 40%   00:00:00.6951060   00:00:00.0108980     62.8
 50%   00:00:00.7567745   00:00:00.0087065     85.9
 60%   00:00:00.7090449   00:00:00.0083365     84.1
 70%   00:00:00.8179365   00:00:00.0088809     91.1
 80%   00:00:00.9468898   00:00:00.0088562    105.9
 90%   00:00:01.0411393   00:00:00.0081040    127.5
100%   00:00:01.1488157   00:00:00.0078877    144.6


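using System;
using System.Diagnostics;

/// <summary>Times Int32.Parse wrapped in try-catch over randomly generated input.</summary>
/// <param name="errorRate">Rate of errors in user input</param>
/// <param name="seed">Seed for the random input generator</param>
/// <param name="count">Number of inputs to parse</param>
/// <returns>Total time taken</returns>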
public static TimeSpan TimeTryCatch(double errorRate, int seed, int count)
{
    Stopwatch stopwatch = new Stopwatch();
    Random random = new Random(seed);
    string bad_prefix = @"X";

    stopwatch.Start();
    for(int ii = 0; ii < count; ++ii)
    {
        string input = random.Next().ToString();
        if (random.NextDouble() < errorRate)
        {
           input = bad_prefix + input;
        }

        int value = 0;
        try
        {
            value = Int32.Parse(input);
        }
        catch(FormatException)
        {
            value = -1; // we would do something here with a logger perhaps
        }
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
}

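/// <summary>Times Int32.TryParse over the same randomly generated input.</summary>
/// <param name="errorRate">Rate of errors in user input</param>
/// <param name="seed">Seed for the random input generator</param>
/// <param name="count">Number of inputs to parse</param>
/// <returns>Total time taken</returns>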
public static TimeSpan TimeTryParse(double errorRate, int seed, int count)
{
    Stopwatch stopwatch = new Stopwatch();
    Random random = new Random(seed);
    string bad_prefix = @"X";

    stopwatch.Start();
    for(int ii = 0; ii < count; ++ii)
    {
        string input = random.Next().ToString();
        if (random.NextDouble() < errorRate)
        {
           input = bad_prefix + input;
        }

        int value = 0;
        if (!Int32.TryParse(input, out value))
        {
            value = -1; // we would do something here with a logger perhaps
        }
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
}

public static void TimeStringParse()
{
    double errorRate = 0.1; // 10% of the time our users mess up
    int count = 10000; // 10000 entries by a user

    TimeSpan trycatch = TimeTryCatch(errorRate, 1, count);
    TimeSpan tryparse = TimeTryParse(errorRate, 1, count);

    Console.WriteLine("trycatch: {0}", trycatch);
    Console.WriteLine("tryparse: {0}", tryparse);
}
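
The table above covers failure rates from 0% to 100%, while TimeStringParse only measures the 10% case. A driver that sweeps all the rates might look roughly like the sketch below. ParseSweep, Time, Slowdown, and Run are hypothetical names, and the slowdown formula is inferred from the figures in the table rather than stated in the answer. Timings will differ by machine and runtime.

```csharp
using System;
using System.Diagnostics;

public static class ParseSweep
{
    // Times `count` parses at the given failure rate.
    // useTryParse selects TryParse; otherwise Parse inside try-catch is used.
    public static TimeSpan Time(bool useTryParse, double errorRate, int seed, int count)
    {
        Random random = new Random(seed);
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int ii = 0; ii < count; ++ii)
        {
            string input = random.Next().ToString();
            if (random.NextDouble() < errorRate)
            {
                input = "X" + input; // same bad prefix as the benchmark above
            }

            int value;
            if (useTryParse)
            {
                if (!Int32.TryParse(input, out value)) { value = -1; }
            }
            else
            {
                try { value = Int32.Parse(input); }
                catch (FormatException) { value = -1; }
            }
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Relative slowdown (assumed formula): (try-catch - TryParse) / TryParse.
    public static double Slowdown(TimeSpan tryCatch, TimeSpan tryParse)
    {
        return (tryCatch.Ticks - tryParse.Ticks) / (double)tryParse.Ticks;
    }

    // Prints one row per failure rate, from 0% to 100% in steps of 10.
    public static void Run(int count)
    {
        Console.WriteLine("Rate   Try-Catch          TryParse           Slowdown");
        for (int percent = 0; percent <= 100; percent += 10)
        {
            double errorRate = percent / 100.0;
            TimeSpan tryCatch = Time(false, errorRate, 1, count);
            TimeSpan tryParse = Time(true, errorRate, 1, count);
            Console.WriteLine("{0,3}%   {1}   {2}   {3,8:F1}",
                percent, tryCatch, tryParse, Slowdown(tryCatch, tryParse));
        }
    }
}
```

Calling ParseSweep.Run(10000) prints a table in the same shape as the one above. Both strategies use the same seed, so they see identical input sequences.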
sixlettervariables
Aristotle never would have got his hands dirty by running an experiment. Shame, shame. You need to assert something as obviously true. It's the internet way!!!
chris
@chris: I got modded down for some reason... I guess the truth hurts.
sixlettervariables
Thanks for doing the quickie benchmark, even if the try-catch version is wrong-headed to begin with, so the fact that TryParse() is faster shouldn't even need to be proven...
Michael Burr
@Mike B, it is extremely important to run benchmarks and test things whether you think you know the answer or not. I just modified this code to add in a case not handling exceptions at all for the 0% case. That is one aspect I also wanted, and found it is equivalent to the TryParse.
Brendan Enrick
Sorry - I threw the 'wrong-headed' bit in there because "sixlettervariables" made a few jabs at the fact he believes the exception-throwing version of Parse() is simply bad design/wrong.
Michael Burr
@Mike B: now I'm a smart man, but I can't take credit for the sound recommendations made by lots of smart(er) people at Microsoft/Sun/et al. :D
sixlettervariables
If you're going to use that quote, you should use the full quote - "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%".
Scott Dorman
+5  A: 

Try-catch will always be slower. TryParse will be faster.

The if-check and TryParse perform the same.

Daok
To be completely clear, Try-Catch will only be slower if the parse fails; not throwing/catching an exception doesn't cost anything.
technophile
Yes, part of the reason I was asking is because I am wondering what the cost of doing the try-catch block is versus maybe doing nothing at all.
Brendan Enrick
If the error is unlikely to happen, what kind of performance can be expected? This is why I asked for some stats on this and not just "This one is faster"
Brendan Enrick
@benrick: it is more an abuse of the exception framework than a performance issue. Therefore you assume try-catch will always be slower and TryParse will always be faster.
sixlettervariables
@Daok: +1, no stats necessary unless you're a member of the unwashed masses.
sixlettervariables
Thx sixlettervariables
Daok
+4  A: 

Although I haven't personally profiled the different ways, this chap has:

http://blogs.msdn.com/ianhu/archive/2005/12/19/505702.aspx

Kev
Nice information. thanks for pointing it out.
Vijesh VP
A: 
Option 1: Will throw an exception on bad data.
Option 2: SomethingIsValid() could be quite expensive - particularly if you are pre-checking a string for Integer parsability.
Option 3: I like this.  You need to check the returned bool afterwards, but that's pretty cheap.
Option 4 is definitely the worst.

Exception handling is comparatively expensive, so avoid it if you can.

In particular, bad inputs are to be expected, not exceptional, so you shouldn't use exceptions for this situation.

(Although, before TryParse existed, Option 4 may have been the best option.)
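
To illustrate why the pre-check in Option 2 can cost about as much as the parse itself, here is a sketch of what a correct integer pre-check has to do. LooksLikeInt32 is a hypothetical stand-in for SomethingIsValid, and it deliberately handles only plain ASCII digits with an optional sign, which is narrower than what Int32.Parse accepts.

```csharp
using System;

public static class PreCheckExample
{
    // Hypothetical validator: walks the whole string and tracks overflow,
    // which is essentially the work the parser repeats afterwards.
    public static bool LooksLikeInt32(string input)
    {
        if (string.IsNullOrEmpty(input)) return false;

        int start = (input[0] == '-' || input[0] == '+') ? 1 : 0;
        if (start == input.Length) return false; // a sign with no digits

        bool negative = input[0] == '-';
        // Largest allowed magnitude: 2147483648 for negatives, 2147483647 otherwise.
        long limit = negative ? -(long)Int32.MinValue : Int32.MaxValue;
        long magnitude = 0;
        for (int i = start; i < input.Length; ++i)
        {
            char c = input[i];
            if (c < '0' || c > '9') return false; // not an ASCII digit
            magnitude = magnitude * 10 + (c - '0');
            if (magnitude > limit) return false; // would overflow Int32
        }
        return true;
    }
}
```

Even after a successful pre-check the caller still pays for Int32.Parse, so the input is scanned twice, whereas TryParse validates and converts in a single call.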

chris