What's the quickest way to extract a 5-digit number from a string in C#?
I've got
string.Join(null, System.Text.RegularExpressions.Regex.Split(expression, "[^\\d]"));
Any others?
Do you mean convert a string to a number? Or find the first 5 digit string and then make it a number? Either way, you'll probably be using decimal.Parse or int.Parse.
I'm of the opinion that Regular Expressions are the wrong approach. A more efficient approach would simply be to walk through the string looking for a digit, then check whether the next 4 characters are all digits as well. If they are, you've got your substring. It's not as robust, no, but it doesn't have the overhead either.
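The walk described above can be sketched roughly as follows. This is only a minimal sketch under a few assumptions: the names FiveDigitScan and FindFirstFiveDigits are made up for illustration, only ASCII 0-9 count as digits, and a longer run of digits yields its first five characters rather than being rejected.

```csharp
using System;

public static class FiveDigitScan
{
    // Hypothetical helper: returns the first run of five ASCII digits, or null if there is none.
    public static string FindFirstFiveDigits(string text)
    {
        const int Length = 5;
        for (int i = 0; i + Length <= text.Length; i++)
        {
            if (!IsAsciiDigit(text[i]))
                continue;

            // Found a digit: check whether the next four characters are digits too.
            int j = 1;
            while (j < Length && IsAsciiDigit(text[i + j]))
                j++;

            if (j == Length)
                return text.Substring(i, Length);

            // text[i + j] is not a digit, so no window that covers it can match.
            // Jump to it; the loop's i++ then moves past it.
            i += j;
        }
        return null;
    }

    private static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }
}
```

Calling FiveDigitScan.FindFirstFiveDigits("asdfkki12345afdkjsdl") returns "12345", and a string with fewer than five consecutive digits gives null. The result can then be passed to int.Parse if a number is needed.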
Use a regular expression (\d{5}) to find the occurrence(s) of the 5-digit number in the string and use int.Parse or decimal.Parse on the match(es).
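In the case where there is only one number in the text:
// Requires: using System.Text.RegularExpressions;
int? value = null;
string pat = @"\d{5}";
Regex r = new Regex(pat);
Match m = r.Match(text);
if (m.Success)
{
    // m.Value holds the first five consecutive digits found in text.
    value = int.Parse(m.Value);
}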
Don't use a regular expression at all. It's way more powerful than you need - and that power is likely to hit performance.
If you can give more details of what you need it to do, we can write the appropriate code... (Test cases would be ideal.)
If the numbers are mixed in with other characters, regular expressions are a good solution.
E.g. ([0-9]{5}) will match the 12345 in asdfkki12345afdkjsdl, 12345adfaksk, or akdkfa12345.
If you have a simple test case like "12345" or even "12345abcd", don't use regex at all. Regular expressions are not known for their speed.
For most strings a brute force method is going to be quicker than a RegEx.
A fairly noddy example would be:
string strIWantNumFrom = "qweqwe23qeeq3eqqew9qwer0q";
int num = int.Parse(
    string.Join( null, (
        from c in strIWantNumFrom.ToCharArray()
        where c == '1' || c == '2' || c == '3' || c == '4' || c == '5' ||
              c == '6' || c == '7' || c == '8' || c == '9' || c == '0'
        select c.ToString()
    ).ToArray() ) );
No doubt there are much quicker ways, and lots of optimisations that depend on the exact format of your string.
The regex approach is probably the quickest to implement but not the quickest to run. I compared a simple regex solution to the following manual search code and found that the manual search code is ~2x-2.5x faster for large input strings and up to 4x faster for small strings:
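static string Search(string expression)
{
    // Note: only runs of exactly five digits match. A longer run is skipped,
    // whereas the regex \d{5} would match its first five digits.
    int run = 0; // length of the current run of consecutive digits
    for (int i = 0; i < expression.Length; i++)
    {
        char c = expression[i];
        if (Char.IsDigit(c))
            run++;
        else if (run == 5)
            return expression.Substring(i - run, run);
        else
            run = 0;
    }
    // A run that reaches the end of the string is never followed by a
    // non-digit, so it has to be checked here.
    if (run == 5)
        return expression.Substring(expression.Length - run, run);
    return null;
}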
const string pattern = @"\d{5}";

static string NotCached(string expression)
{
    return Regex.Match(expression, pattern, RegexOptions.Compiled).Value;
}

static Regex regex = new Regex(pattern, RegexOptions.Compiled);

static string Cached(string expression)
{
    return regex.Match(expression).Value;
}
Results for a ~50-char string with a 5-digit string in the middle, over 10^6 iterations, latency per call in microseconds (smaller number is faster):
Simple search: 0.648396us
Cached Regex: 2.1414645us
Non-cached Regex: 3.070116us
Results for a ~40K string with a 5-digit string in the middle over 10^4 iterations, latency per call in microseconds (smaller number is faster):
Simple search: 423.801us
Cached Regex: 1155.3948us
Non-cached Regex: 1220.625us
A little surprising: I would have expected Regex -- which is compiled to IL -- to be comparable to the manual search, at least for very large strings.
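For anyone who wants to repeat the comparison, a per-call latency figure like the ones above could be collected with a small Stopwatch loop. This is only a sketch of one possible harness, not the code that produced the numbers in this answer; the names MicroBenchmark and MicrosecondsPerCall are invented for illustration, and the single warm-up call is an assumption about how to keep JIT time out of the measurement.

```csharp
using System;
using System.Diagnostics;

public static class MicroBenchmark
{
    // Hypothetical helper: average latency per call, in microseconds.
    public static double MicrosecondsPerCall(Func<string, string> extract, string input, int iterations)
    {
        // Warm-up call so JIT compilation (and any regex construction) is not timed.
        extract(input);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            extract(input);
        sw.Stop();

        // Convert total elapsed milliseconds to microseconds per call.
        return sw.Elapsed.TotalMilliseconds * 1000.0 / iterations;
    }
}
```

Passing Search, Cached and NotCached in turn, with the same input string and iteration count, gives figures that can be compared directly. Absolute values will differ by machine and runtime version.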
This might be faster...
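public static string DigitsOnly(string inVal)
{
    char[] newPhon = new char[inVal.Length];
    int i = 0;
    foreach (char c in inVal)
        if (c >= '0' && c <= '9') // inclusive bounds, so '0' and '9' are kept
            newPhon[i++] = c;
    // Build the string from the filled part of the buffer only;
    // char[].ToString() would just return the type name.
    return new string(newPhon, 0, i);
}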
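If you want to limit it to at most five digits, then:
public static string DigitsOnly(string inVal)
{
    char[] newPhon = new char[inVal.Length];
    int i = 0;
    foreach (char c in inVal)
        if (c >= '0' && c <= '9' && i < 5) // stop copying once five digits are stored
            newPhon[i++] = c;
    // Build the string from the filled part of the buffer only;
    // char[].ToString() would just return the type name.
    return new string(newPhon, 0, i);
}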