What's the cleanest/best way in C# to convert something like 400AMP or 6M to an integer? I won't always know what the suffix is, and I just want whatever it is to go away and leave me with the number.

+13  A: 

You could use a regular expression:

Regex reg = new Regex("[0-9]*");
int result = Convert.ToInt32(reg.Match(input));
ck
Note that there's still a risk of overflowing the integer. You might also want to limit the max number of digits you take to help mitigate that.
Joel Coehoorn
@Joel - good point. I should also mention that obviously this will also remove a prefix, and letters within a number.
ck
Woah, this got upvoted quickly! I was thinking regex too, but it's not going to be very fast (unless you compile it, in which case it *might* be reasonably so).
Noldorin
Perhaps test the code before posting? It blows up with an InvalidCastException...
Guffa
This will also grab numbers from anywhere within the string, not necessarily at the beginning. You might want to put a "^" at the front of the regex if you insist on the integer being at the front of the string.
Chris Farmer
doesn't handle negative numbers; in the EE realm here negative numbers do matter (voltage!)
jcollum
And shouldn't the "*" really be a "+"? Presumably you really want at least one digit to be matched. And why not use "\d" instead of "[0-9]"? So "^\d+"
Chris Farmer
I'm not sure that this passes the Turkey Test. So if you're considering I18N, you want to do something different. http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html
Jeff Yates
This was only posted as an example...
ck
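Pulling together the points raised in the comments on this answer (take `.Value` from the match, anchor with `^`, require at least one digit, allow a sign, guard against overflow), a minimal sketch might look like the following. The class and method names are hypothetical, and `[0-9]` is used instead of `\d` on the assumption that only ASCII digits should be accepted, since .NET's `\d` also matches other Unicode decimal digits unless `RegexOptions.ECMAScript` is set.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class LeadingIntRegex
{
    // Hypothetical helper, not part of the original answer.
    // ^       anchor at the start, so digits later in the string are ignored
    // [+-]?   optional sign
    // [0-9]+  at least one ASCII digit
    private static readonly Regex LeadingInt =
        new Regex(@"^[+-]?[0-9]+", RegexOptions.Compiled);

    public static bool TryParseLeadingInt(string input, out int result)
    {
        result = 0;
        if (input == null) return false;

        Match m = LeadingInt.Match(input);

        // int.TryParse returns false on overflow instead of throwing,
        // and the invariant culture keeps the sign handling predictable.
        return m.Success && int.TryParse(
            m.Value,
            NumberStyles.AllowLeadingSign,
            CultureInfo.InvariantCulture,
            out result);
    }
}
```

With this sketch, `"400AMP"` and `"-12V"` parse successfully, while `"AMP400"` (no leading number) and `"99999999999A"` (too large for an `int`) return `false` rather than throwing.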
+6  A: 

It's possibly not the cleanest method, but it's reasonably simple (a one liner) and I would imagine faster than a regex (uncompiled, for sure).

var str = "400AMP";
var num = Convert.ToInt32(str.Substring(0, str.ToCharArray().TakeWhile(
    c => char.IsDigit(c)).Count()));

Or as an extension method:

public static int GetInteger(this string value)
{
    // Count the leading digits, then parse only that prefix.
    return Convert.ToInt32(value.Substring(0, value.ToCharArray().TakeWhile(
        c => char.IsDigit(c)).Count()));
}

Equivalently, you could construct the numeric string from the result of the TakeWhile function, as such:

public static int GetInteger(this string value)
{
    // Build a string from the leading digits and parse it.
    return Convert.ToInt32(new string(value.ToCharArray().TakeWhile(
        c => char.IsDigit(c)).ToArray()));
}

Haven't benchmarked them, so I wouldn't know which is quicker (though I'd very much suspect the first). If you wanted to get better performance, you would just convert the LINQ (extension method calls on enumerables) to a for loop.

Hope that helps.

Noldorin
TakeWhile returns an IEnumerable<char> here - that's not a valid second argument to Substring, surely.
Jon Skeet
@Jon: Yeah, silly mistake. I just fixed the post to add the .Count() as you posted that comment.
Noldorin
+1  A: 

There are several options...

Like using a regular expression:

int result = int.Parse(Regex.Match(input, @"^\d+").Groups[0].Value);

Among the fastest; simply looping to find digits:

int i = 0;
while (i < input.Length && Char.IsDigit(input, i)) i++;
int result = int.Parse(input.Substring(0, i));

Use LastIndexOfAny to find the last digit:

int i = input.LastIndexOfAny("0123456789".ToCharArray()) + 1;
int result = int.Parse(input.Substring(0, i));

(Note: this breaks with strings that have digits after the suffix, like "123asdf123".)

Probably fastest; parse it yourself:

int i = 0;
int result = 0;
while (i < input.Length) {
 char c = input[i];
 if (!Char.IsDigit(c)) break;
 result *= 10;
 result += c - '0';
 i++;
}
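
One caveat with the hand-rolled loop: unless the project is compiled with overflow checking, `result *= 10` wraps silently on overflow instead of throwing, and `Char.IsDigit` also returns true for non-ASCII decimal digits, for which `c - '0'` does not yield the digit's value. A minimal sketch of the same loop with both points addressed (the class and method names are hypothetical):

```csharp
using System;

static class ManualParse
{
    // Hypothetical helper: same idea as the loop above, but overflow
    // throws OverflowException and only ASCII '0'..'9' are accepted.
    public static int ParseLeadingDigits(string input)
    {
        int result = 0;
        checked
        {
            for (int i = 0; i < input.Length; i++)
            {
                char c = input[i];
                if (c < '0' || c > '9') break;   // stop at the first non-digit
                result = result * 10 + (c - '0');
            }
        }
        return result;   // 0 if the string has no leading digits
    }
}
```

Like the original loop, this returns 0 when there are no leading digits, so callers that need to tell "0AMP" apart from "AMP" would have to check for that separately.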
Guffa
+6  A: 

Okay, here's a long-winded solution which should be reasonably fast. It's similar to Guffa's middle answer, but I've put the conditions inside the body of the loop as I think that's simpler (and allows us to fetch the character just once). It's a matter of personal taste really.

It deliberately doesn't limit the number of digits that it matches, because if the string is an integer which overflows Int32, I think I'd rather see an exception than just a large integer :)

Note that this also handles negative numbers, which I don't think any of the other solutions so far do...

using System;

class Test
{
    static void Main()
    {
        Console.WriteLine(ParseLeadingInt32("-1234AMP"));
        Console.WriteLine(ParseLeadingInt32("+1234AMP"));
        Console.WriteLine(ParseLeadingInt32("1234AMP"));
        Console.WriteLine(ParseLeadingInt32("-1234"));
        Console.WriteLine(ParseLeadingInt32("+1234"));
        Console.WriteLine(ParseLeadingInt32("1234"));
    }

    static int ParseLeadingInt32(string text)
    {
        // Declared before loop because we need the
        // final value
        int i;
        for (i=0; i < text.Length; i++)
        {
            char c = text[i];
            if (i==0 && (c=='-' || c=='+'))
            {
                continue;
            }
            if (char.IsDigit(c))
            {
                continue;
            }
            break;
        }
        return int.Parse(text.Substring(0, i));
    }
}
Jon Skeet
Yeah, that's basically what I suggested in order to achieve top performance. Good to point out allowing for + and - at the first char.
Noldorin
+1 for handling negative numbers and providing test cases
jcollum
A: 

If all you want to do is remove an unknown postfix from what would otherwise be an int, here is how I would do it:

I like a utility static method I call IsInt(string possibleInt) which will, as the name implies, return True if the string will parse into an int. You could write this same static method into your utility class (if it's not there already) and try:

string foo = "12345SomePostFix";
while (!Tools.ToolBox.IsInt(foo))
{
    foo = foo.Remove(foo.Length - 1);
}
int fooInt = int.Parse(foo);
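
The answer names `IsInt` but does not show its body. A minimal sketch of what such a helper might look like, assuming it simply wraps `int.TryParse` (the `Tools.ToolBox` name is taken from the snippet above; the implementation is an assumption):

```csharp
using System;

namespace Tools
{
    public static class ToolBox
    {
        // Assumed implementation: returns true if the whole string
        // parses as an Int32 under the current culture.
        public static bool IsInt(string possibleInt)
        {
            int ignored;
            return int.TryParse(possibleInt, out ignored);
        }
    }
}
```

With a helper like this, the loop above would throw an `ArgumentOutOfRangeException` for input with no leading integer (for example "AMP"), because `foo` shrinks to an empty string and `Remove` is then called with a negative index; adding a `foo.Length > 0` check to the loop condition would avoid that.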
AllenG