Hello. I am working with ASP.NET and need to process a string typed by the user in order to extract some information. The user enters ordinary text (words and numbers), but may sometimes type a mathematical expression in MathML. These expressions are always an XML string enclosed in a <math> tag. I want to extract every math segment from the typed text. For example, suppose the user typed this text:
string input = "My name is Dorry and here is a math expression: <math>---some math1---</math> ah, there is another expression: <math>---some math2---</math> and do not forget this too <math>---some math3---</math>.";
Well, the first regex solution I came up with is this:
string pattern1 = @"\<math(.+)\<\/math\>";
To get the matches I obviously use:
Regex r = new Regex(pattern1, RegexOptions.IgnoreCase);
MatchCollection res = r.Matches(input); // Regex.Matches returns a MatchCollection, not a string[]
It seemed to work but, too bad, it does not. Instead of giving me three results (using Regex.Matches), "---some math1---", "---some math2---" and "---some math3---", this expression gives me only one: "---some math1---</math> ah, there is another expression: <math>---some math2---</math> and do not forget this too <math>---some math3---". Can you see? It takes the first <math> and the last </math> and merges everything in between, WITHOUT CARING about any other </math> or <math> elements in the way!
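Here is a minimal console program that, as far as I can tell, reproduces the behaviour. It is only a sketch: the class name GreedyRepro and the console wrapper are made up for illustration, and the result is stored in a MatchCollection because that is what Regex.Matches returns. The input and the pattern are the ones shown above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class, used only to make the snippet runnable.
class GreedyRepro
{
    static void Main()
    {
        string input = "My name is Dorry and here is a math expression: <math>---some math1---</math> ah, there is another expression: <math>---some math2---</math> and do not forget this too <math>---some math3---</math>.";

        // Same pattern as in the question; (.+) is a greedy quantifier.
        string pattern1 = @"\<math(.+)\<\/math\>";

        Regex r = new Regex(pattern1, RegexOptions.IgnoreCase);
        MatchCollection res = r.Matches(input);

        // Number of matches found; prints 1 with the input above, not 3.
        Console.WriteLine(res.Count);

        // The single match runs from the first "<math" to the last "</math>".
        foreach (Match m in res)
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

Running it shows a single match that swallows the two inner expressions and the text between them, which is exactly the problem I am describing.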
Well, I suppose this is a well-known issue with regular expressions. Is there a solution? How can I tell the regex engine to be a little more... aware?
Thank you very much in advance.