ansaurus

Question

.NET Regular Expression: How to get a text enclosed by two tags

Answer 1

A:

If you're using the .NET BCL Regex class, you should be able to use balancing groups to achieve what you need:

http://blog.stevenlevithan.com/archives/balancing-groups
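
As a rough sketch of what that could look like for nested math tags: the pattern below uses a named group as a depth counter. The class name, the group name depth, and the sample input are made up for illustration, and the sketch assumes the tags are never self-closing (a `<math/>` would be counted as an opening tag that is never closed).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BalancedMath
{
    // Illustrative pattern, built in pieces so each part can be commented.
    private const string Pattern =
        @"<math\b[^<>]*>" +                 // outermost opening tag
        @"(?:" +
            @"<math\b[^<>]*>(?<depth>)" +   // nested opening tag: push onto "depth"
            @"|</math>(?<-depth>)" +        // nested closing tag: pop (fails if the stack is empty)
            @"|(?!</?math\b)[\s\S]" +       // any character that does not start a math tag
        @")*" +
        @"(?(depth)(?!))" +                 // fail if a nested tag is still open
        @"</math>";                         // closing tag that pairs with the outermost opening tag

    // Returns each outermost <math>...</math> block, nested blocks included.
    public static string[] OutermostBlocks(string input) =>
        Regex.Matches(input, Pattern, RegexOptions.IgnoreCase)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        // Hypothetical sample input with one nested pair.
        string input = "a <math>x <math>y</math> z</math> b <math>w</math>";
        foreach (string block in OutermostBlocks(input))
            Console.WriteLine(block);
    }
}
```

On this sample the first block should include the nested pair as a whole, whereas a plain lazy pattern would stop at the first closing tag it sees.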

Lucero 2010-10-14 08:09:03

Answer 2

A:

Hi,

You can use the regex <math>[\s\S]*?</math>. It worked fine with the example string you provided and gave me 3 matches.

I hope this is what you want to get.

Shekhar 2010-10-14 08:11:27

Yeah it matches... thank you very much

Andry 2010-10-14 11:05:36

Answer 3

+1 A:

Using regular expressions to match XML-/HTML-like tags is usually a bad idea and very error-prone. I don't know whether the balancing groups that .NET regexes provide solve this, so just be warned.
Your problem has bitten many others before: quantifiers are greedy by default. .+ can match everything (including </math>), so it first consumes the rest of the input. Then, because the rest of the regex has not matched yet, the engine starts backtracking until it can. As a result, the </math> subpattern matches only the last closing tag. To make the quantifier non-greedy, add a ? after the + (or *, for that matter).
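
To see the difference, here is a minimal sketch that counts matches for the greedy and the non-greedy version of the pattern. The class name, helper name, and sample input are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Counts how many times the pattern matches in the input.
    public static int CountMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Count;

    public static void Main()
    {
        // Hypothetical sample input with two separate pairs of tags.
        string input = "<math>a</math> and <math>b</math>";

        // Greedy: .+ runs to the end and backtracks to the LAST </math>.
        Console.WriteLine(CountMatches(input, "<math>.+</math>"));   // prints 1

        // Non-greedy: .+? stops at the FIRST </math> it can reach.
        Console.WriteLine(CountMatches(input, "<math>.+?</math>"));  // prints 2
    }
}
```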

delnan 2010-10-14 08:14:32

Well, I found a correct pattern. As for what you said, I'll keep it in mind and do more research to better understand where regex is considered a good solution and good practice. Thank you for the information.

Andry 2010-10-14 11:08:00

Answer 4

A:

Give this a go..

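// Requires: using System.Text.RegularExpressions;
// [\s\S]*? lazily matches any character, including newlines, up to the nearest </math>.
// '<', '>' and '/' need no escaping in .NET regexes.
string pattern1 = @"<math[\s\S]*?</math>";
Regex r = new Regex(pattern1, RegexOptions.IgnoreCase);
MatchCollection res = r.Matches(input);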

Nick

Nicholas Mayne 2010-10-14 08:17:30

Thank you Nick, it runs correctly... thanks again

Andry 2010-10-14 11:05:17

Answer 5

A:

This is the regex you need:

  <math>.*?</math>

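It matches every pair of math tags, as long as the content does not span multiple lines. By default . does not match a newline, so pass RegexOptions.Singleline if it can.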

If the opening tag might contain attributes, use this regex instead:

  <math\b[^><]*>.*?</math>
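
If what you want is only the text between the tags, one option is to wrap the lazy part in a capture group. The following is a sketch under a few assumptions: the class and method names are made up, the sample input is hypothetical, and RegexOptions.Singleline is added so that . also matches newlines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MathContent
{
    // Returns the text enclosed by each pair of math tags, without the tags.
    public static List<string> InnerTexts(string input)
    {
        // Group 1 captures the enclosed text; the opening tag may carry attributes.
        var regex = new Regex(@"<math\b[^><]*>(.*?)</math>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        var result = new List<string>();
        foreach (Match m in regex.Matches(input))
            result.Add(m.Groups[1].Value);
        return result;
    }

    public static void Main()
    {
        // Hypothetical input: one tag with an attribute and multi-line content.
        string input = "<math display=\"block\">x +\ny</math> text <math>z</math>";
        foreach (string text in InnerTexts(input))
            Console.WriteLine("[" + text + "]");
    }
}
```

Without RegexOptions.Singleline, the first pair in this sample would not match at all, because its content contains a line break.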

Vantomex 2010-10-14 11:06:53

OK, thanks, that's good too :)

Andry 2010-10-14 12:12:51
