How to split a string on numbers and substrings?
Input: 34AG34A
Expected output: {"34","AG","34","A"}
I have tried the Regex.Split() function, but I cannot figure out what pattern would work.
Any ideas?
First, you ask for "numbers" but don't specify what you mean by that.
If you mean "digits in 0-9" then you need the character class [0-9]. There is also the character class \d, which in addition to 0-9 matches some other characters.
\d matches any decimal digit. It is equivalent to the \p{Nd} regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets.
I assume that you are not interested in negative numbers, numbers containing a decimal point, foreign numerals such as 五, etc.
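To make the difference between [0-9] and \d concrete, here is a minimal sketch. The class and method names (DigitClassDemo, IsOnly) are made up for illustration, and the Arabic-Indic digits are just one example of non-ASCII characters in the Nd category. It also shows that RegexOptions.ECMAScript restricts \d to 0-9.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of any library.
public static class DigitClassDemo
{
    // True when the whole input consists only of characters from charClass.
    public static bool IsOnly(string input, string charClass,
                              RegexOptions options = RegexOptions.None)
    {
        return Regex.IsMatch(input, "^" + charClass + "+$", options);
    }

    public static void Main()
    {
        string ascii = "34";
        // ARABIC-INDIC DIGIT THREE and FOUR, both in Unicode category Nd.
        string arabicIndic = "\u0663\u0664";

        Console.WriteLine(IsOnly(ascii, "[0-9]"));        // True
        Console.WriteLine(IsOnly(ascii, @"\d"));          // True
        Console.WriteLine(IsOnly(arabicIndic, "[0-9]"));  // False
        Console.WriteLine(IsOnly(arabicIndic, @"\d"));    // True

        // With RegexOptions.ECMAScript, \d is equivalent to [0-9].
        Console.WriteLine(IsOnly(arabicIndic, @"\d", RegexOptions.ECMAScript)); // False
    }
}
```

If only 0-9 should ever count as a digit, spelling out [0-9] keeps that intent visible in the pattern itself.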
Split is not the right solution here. What you appear to want to do is tokenize the string, not split it. You can do this by using Matches instead of Split:
// Requires: using System.Linq; using System.Text.RegularExpressions;
string[] output = Regex.Matches(s, "[0-9]+|[^0-9]+")
    .Cast<Match>()
    .Select(match => match.Value)
    .ToArray();
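For completeness, the snippet above can be wrapped into a small runnable program. This is only a sketch: the Tokenizer class and Tokenize method are names I made up, and the input is the one from the question. The second call shows what goes wrong with Split when the digits are used as the separator.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the Matches-based approach.
public static class Tokenizer
{
    // Returns alternating runs of digits (0-9) and non-digits, in input order.
    public static string[] Tokenize(string s)
    {
        return Regex.Matches(s, "[0-9]+|[^0-9]+")
            .Cast<Match>()
            .Select(match => match.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Prints: 34,AG,34,A
        Console.WriteLine(string.Join(",", Tokenize("34AG34A")));

        // Split treats the matched digits as separators, so they are
        // not returned as elements of the result.
        string[] parts = Regex.Split("34AG34A", "[0-9]+");
        Console.WriteLine(string.Join(",", parts));
    }
}
```

Note that [^0-9]+ matches any run of non-digits, including spaces and punctuation, so every character of the input ends up in some token.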
The regular expression (\d+|[A-Za-z]+) will return the groups you require.
I think you have to look for two patterns: a run of letters and a run of digits. Hence, I'd use ([a-z]+)|([0-9]+).
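For instance, System.Text.RegularExpressions.Regex.Matches("asdf1234be56qq78", "([a-z]+)|([0-9]+)") returns 6 matches, containing "asdf", "1234", "be", "56", "qq", "78". Note that [a-z] only matches lowercase letters, so for an input such as 34AG34A use [A-Za-z] or pass RegexOptions.IgnoreCase.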
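One thing the two separate capturing groups make possible is telling which kind of token each match is. A rough sketch, assuming the pattern above plus RegexOptions.IgnoreCase so that uppercase input works; GroupDemo and Classify are hypothetical names, and the "letters:"/"digits:" labels are only for display.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of any library.
public static class GroupDemo
{
    // Labels each token by checking which capturing group participated.
    public static List<string> Classify(string s)
    {
        var result = new List<string>();
        var matches = Regex.Matches(s, "([a-z]+)|([0-9]+)", RegexOptions.IgnoreCase);
        foreach (Match m in matches)
        {
            // Group 1 is the letter alternative, group 2 the digit alternative.
            string kind = m.Groups[1].Success ? "letters" : "digits";
            result.Add(kind + ":" + m.Value);
        }
        return result;
    }

    public static void Main()
    {
        // Prints: digits:34 letters:AG digits:34 letters:A
        Console.WriteLine(string.Join(" ", Classify("34AG34A")));
    }
}
```

If you only need the token text and not its kind, the groups are unnecessary and m.Value alone is enough.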
Don't use Regex.Split, use Regex.Match:
var m = Regex.Match("34AG34A", "([0-9]+|[A-Z]+)");
while (m.Success) {
    Console.WriteLine(m);
    m = m.NextMatch();
}
Converting this to an array is left as an exercise for the reader. :-)
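For anyone who wants to skip the exercise, one possible way is to collect the values in a List<string> inside the same loop. This is a sketch only; MatchLoop and ToTokens are made-up names, and the pattern is the one from the loop above, so it only recognises digits and uppercase A-Z.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper built on the Match/NextMatch loop.
public static class MatchLoop
{
    public static string[] ToTokens(string s)
    {
        var tokens = new List<string>();
        Match m = Regex.Match(s, "([0-9]+|[A-Z]+)");
        while (m.Success)
        {
            tokens.Add(m.Value);   // text of the current match
            m = m.NextMatch();     // continue searching after this match
        }
        return tokens.ToArray();
    }

    public static void Main()
    {
        // Prints: 34,AG,34,A
        Console.WriteLine(string.Join(",", ToTokens("34AG34A")));
    }
}
```

Characters that match neither alternative, such as lowercase letters, are silently skipped by this pattern.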