I'm working in C# on some OCR work and have extracted the text I need. Now I need to parse a line of it using regular expressions.
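string checkNum;
string routingNum;
string accountNum;

// Check number: digits between two \u9288 delimiters.
Regex regEx = new Regex(@"\u9288\d+\u9288");
Match match = regEx.Match(numbers);
if (match.Success)
    // Strip the leading and trailing delimiter characters.
    checkNum = match.Value.Substring(1, match.Value.Length - 2);

// Routing number: exactly nine digits between two \u9286 delimiters.
regEx = new Regex(@"\u9286\d{9}\u9286");
match = regEx.Match(numbers);
if (match.Success)
    routingNum = match.Value.Substring(1, match.Value.Length - 2);

// Account number: ten digits followed by a \u9288 delimiter.
regEx = new Regex(@"\d{10}\u9288");
match = regEx.Match(numbers);
if (match.Success)
    // Strip only the trailing delimiter character.
    accountNum = match.Value.Remove(match.Value.Length - 1, 1);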
The problem is that the string does contain the necessary Unicode characters when I call .ToCharArray() and inspect its contents, but the regular expressions never seem to match those characters when I parse the string. I thought strings in C# were Unicode by default.
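One thing that may be worth checking is whether 9288 and 9286 are decimal values, since `\uXXXX` in a .NET regex expects four hexadecimal digits. The sketch below is only a diagnostic under that assumption: the sample line, the class name `MicrEscapeCheck`, and the helper `ToHexEscapes` are hypothetical, and it assumes the OCR output uses the MICR symbols U+2446 and U+2448 (decimal 9286 and 9288). It prints the hex escape of the first character and compares the two pattern spellings.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class MicrEscapeCheck
{
    // Hypothetical helper: formats each UTF-16 code unit as a hex escape such as \u2448.
    public static string ToHexEscapes(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
            sb.Append("\\u").Append(((int)c).ToString("X4"));
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical sample line; assumes the delimiters are U+2446 and U+2448.
        string numbers = "\u24481234\u2448 \u2446123456789\u2446 1234567890\u2448";

        // Show the first character as a hex escape.
        Console.WriteLine(ToHexEscapes(numbers.Substring(0, 1)));

        // 9288 read as hexadecimal is U+9288, a different character.
        Console.WriteLine(Regex.IsMatch(numbers, @"\u9288\d+\u9288"));

        // 9288 decimal is 0x2448, so this escape names the delimiter itself.
        Console.WriteLine(Regex.IsMatch(numbers, @"\u2448\d+\u2448"));
    }
}
```

If the values shown for the char array are decimal, the first pattern would be searching for U+9288 rather than the delimiter, which would explain why no match is found even though the string itself is fine.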