I have values stored as strings in a DataTable where each value could really represent an int, double, or string (they were all converted to strings during an import process from an external data source). I need to test and see what type each value really is.

What is more efficient for the application (or is there no practical difference)?

  1. Try to convert to int (and then double). If the conversion works, return true. If an exception is thrown, return false.
  2. Regular expressions designed to match the pattern of an int or double
  3. Some other method?
+4  A: 

I would use double.TryParse. It has performance benefits because it reports failure through its return value instead of throwing an exception.

gil
+1  A: 

I'd personally use int.TryParse, then double.TryParse. Both methods are quite fast, and both return a Boolean. If both fail then you have a string, per how you defined your data.
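
The order described above can be sketched roughly as follows. ValueSniffer and Classify are hypothetical names chosen for illustration, and the invariant culture is an assumption (the original import may have used another culture's number format).

```csharp
using System;
using System.Globalization;

public static class ValueSniffer
{
    // Hypothetical helper: returns the narrowest of int, double, or string
    // that fits the text. Assumes invariant-culture number formatting.
    public static Type Classify(string text)
    {
        int asInt;
        // NumberStyles.Integer allows surrounding whitespace and a leading sign.
        if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out asInt))
        {
            return typeof(int);
        }

        double asDouble;
        // NumberStyles.Float also allows a decimal point and an exponent.
        if (double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out asDouble))
        {
            return typeof(double);
        }

        // Neither parse succeeded (this includes null), so treat the value as a string.
        return typeof(string);
    }
}
```

With this sketch, "3" classifies as int, "3.5" as double, and "Three" as string. No exception is thrown for values that fail to parse.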

Matt Dawdy
+4  A: 

I would say, don't worry so much about such micro performance. It is much better to just get something to work, and then make it as clear and concise and easy to read as possible. The worst thing you can do is sacrifice readability for an insignificant amount of performance.

In the end, the best way to deal with performance issues is to save them for when you have data that indicates there is an actual performance problem... otherwise you will spend a lot of time micro-optimizing and actually cause higher maintenance costs for later on.

If you find this parsing situation is really the bottleneck in your application, THEN is the time to try and figure out what the fastest way to solve the problem is. I think Jeff (and many others) have blogged about this sort of thing a lot.

Mike Stone
+3  A: 

You'll get different results for the different methods depending on whether you compile with optimisations on. You basically have a few options:

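object o = 42; // example value; also test with something that is not an int

// check with the is operator
bool isInt = o is int;

// compare the runtime type
bool sameType = o.GetType() == typeof( int );

// cast and catch the exception
int j;
try { j = (int) o; }
catch ( InvalidCastException ) { j = 0; }

// use TryParse on the string form
bool parsed = int.TryParse( Convert.ToString( o ), out j );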

You can easily set up a console app that tries each of these 10,000 times and returns durations for each (test when o is an int and when it's something else).
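
A minimal sketch of such a timing harness, assuming System.Diagnostics.Stopwatch is an acceptable timer. CheckTimer and its method names are hypothetical, and the numbers it produces will vary with the machine, the runtime, and the build settings.

```csharp
using System;
using System.Diagnostics;

public static class CheckTimer
{
    // Hypothetical helper: runs one check repeatedly and returns the elapsed time.
    public static TimeSpan Time(Func<object, bool> check, object value, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            check(value);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // The is operator.
    public static bool IsOperator(object o)
    {
        return o is int;
    }

    // Cast and catch. Assumes o is not null (unboxing null throws a different exception).
    public static bool CastAndCatch(object o)
    {
        try
        {
            int j = (int)o;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    // TryParse on the string form of the object.
    public static bool TryParseString(object o)
    {
        int j;
        return int.TryParse(Convert.ToString(o), out j);
    }
}
```

For example, CheckTimer.Time(CheckTimer.CastAndCatch, "text", 10000) times the failing cast path, and passing 42 instead times the successful one.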

The try-catch method is the quickest if the object does hold an int, and by far the slowest if it doesn't (even slower than GetType). int.TryParse is pretty quick if you have a string, but if you have an unknown object it's slower.

Interestingly, with .NET 3.5 and optimisations turned on, the o is int check takes the same time as try-catch when o actually is an int. o is int is only slightly slower if o is actually something else.

Annoyingly FxCop will throw up warnings if you do something like:

if( o is int )
{
    int j = (int) o;
}

But I think that's a bug in FxCop: it doesn't know int is a value type and recommends using o as int instead, which does not compile for a non-nullable value type.

If your input is always a string, int.TryParse is best; otherwise the is operator is quickest.

Since you have a string, I'd check whether you really need to distinguish an int from a double. If int.TryParse succeeds then so will double.TryParse, so you could halve the number of checks: return either double or string, and floor the doubles when you expect an int.
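
A rough sketch of that single-check idea. NumericOrText and its methods are hypothetical names, and the invariant culture is an assumption about the imported data.

```csharp
using System;
using System.Globalization;

public static class NumericOrText
{
    // Hypothetical helper: a single parse decides between "number" and "string".
    // Assumes invariant-culture number formatting.
    public static bool TryGetNumber(string text, out double number)
    {
        return double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out number);
    }

    // Floors the double where an int is expected.
    // Assumes the value lies within the range of int.
    public static int ToIntFloor(double number)
    {
        return (int)Math.Floor(number);
    }
}
```

The trade-off is that the distinction between 3 and 3.0 is lost, so this only fits when the caller already knows where an int is expected.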

Keith
+3  A: 

The trouble you have is that there could be situations where the answer could be all three types.

3 could be an int, a double or a string!

It depends upon what you are trying to do and how important it is that they are a particular type. It might be best just to leave them as they are for as long as you can or, alternatively, come up with a method to mark each one (if you have control of the source of the original string).

gods gift
The ultimate goal was to try to determine the most exclusive data type for the object. 3 would be an int. 3.5 would be a double. "Three" would be a string. I eventually put together a function that tried a series of TryParse calls until it could determine the "best fit" data type.
Yaakov Ellis