views:

109

answers:

3

Through my code I get the output content as XML. It contains multiple pairs of XML tags, as follows:

<p>December10</p>
<p>
</p>
<p>
</p>
<p>
</p>
<p>
</p>
<p>
</p>
<p> Welcome to this space </p>
<p>
</p>
<p>
</p>
<p>Hai, Today is Tuesday</p>
<p>
</p>
<p>
</p>
<p>
</p>
<p>This a xml tag</p>

I want a regular expression that meets the following requirement:

Wherever empty tags are repeated as shown above, I want only one empty pair, <p></p>. I do not want runs of repeated empty pairs, however many there are.

Please help me find a regular expression that solves this.

+1  A: 

If this is .NET, you could try something like this:

Regex.Replace(content, @"<p>\s*</p>(\s*<p>\s*</p>)*", "<p></p>")

Or, even better:

Regex.Replace(content, @"<p>\s*</p>(\s*<p>\s*</p>)*", "<p/>")

(Edited to add Gumbo's suggestion)
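
For readers not on .NET, here is a minimal Python sketch of the same idea: collapse each run of whitespace-only `<p>` elements into a single `<p></p>`. The function name `collapse_empty_p` and the exact pattern are assumptions made for illustration. The pattern matches one empty element followed by any further empty elements that are preceded by optional whitespace, so whitespace after the run is left untouched.

```python
import re

# One empty <p> element, then any number of further empty <p> elements,
# each optionally preceded by whitespace. Whitespace after the run is not matched.
EMPTY_P_RUN = re.compile(r"<p>\s*</p>(?:\s*<p>\s*</p>)*")


def collapse_empty_p(content: str) -> str:
    """Replace every run of empty <p> elements with a single <p></p>."""
    return EMPTY_P_RUN.sub("<p></p>", content)


sample = (
    "<p>December10</p>\n"
    "<p>\n</p>\n"
    "<p>\n</p>\n"
    "<p> Welcome to this space </p>"
)
print(collapse_empty_p(sample))
```

Paragraphs that contain text are not affected, because `\s*` only allows whitespace between `<p>` and `</p>`.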

Konamiman
Most browsers don't support these tags: http://jsbin.com/edufi
Kobi
Don’t forget the whitespace between the elements.
Gumbo
@Kobi: Do you mean the shortened tags?
Konamiman
Yes, I tried it at least on Firefox and Chrome, though didn't play with doc type.
Kobi
@Gumbo: that's what the `\s*` is intended for. Is it incorrect?
Konamiman
You need another `\s*?` after `</p>`. (`?` because we don't want to match the trailing newline).
Tim Pietzcker
Indeed, you are right.
Konamiman
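
The effect of the trailing quantifier discussed in these comments can be checked directly. The sketch below uses Python's `re` module; other backtracking engines are expected to behave the same way, but that is an assumption not verified here. The variant names are hypothetical labels. Since nothing follows the repeated group, the lazy `\s*?` never has to expand, so empty elements separated by a newline are matched one at a time instead of as a run.

```python
import re

text = "<p>\n</p>\n<p>\n</p>\nText"

# Hypothetical labels for the three patterns being compared.
variants = {
    # Greedy whitespace inside the group: merges the run, eats the trailing newline.
    "greedy_inside": r"(?:<p>\s*</p>\s*)+",
    # Lazy whitespace inside the group: keeps the newline, but does not merge the run.
    "lazy_inside": r"(?:<p>\s*</p>\s*?)+",
    # Whitespace only *before* each further element: merges the run, keeps the newline.
    "whitespace_before": r"<p>\s*</p>(?:\s*<p>\s*</p>)*",
}

results = {name: re.sub(pattern, "<p></p>", text) for name, pattern in variants.items()}
for name, result in results.items():
    print(f"{name}: {result!r}")
```

Only the third form both merges the run and leaves the following newline in place.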
+2  A: 

Oh God, please don't let bobince see you asking this question.

See: "RegEx match open tags except XHTML self-contained tags" or "Parsing Html The Cthulhu Way"
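
If a parser is preferred over a regular expression, something along these lines might work. This is only a sketch in Python using the standard `xml.etree.ElementTree` module. It assumes the fragment is well-formed XML; the `<root>` wrapper and the function name `dedupe_empty_paragraphs` are hypothetical and used only to make the fragment parseable.

```python
import xml.etree.ElementTree as ET


def dedupe_empty_paragraphs(fragment: str) -> str:
    """Keep only the first of each run of consecutive empty <p> elements."""
    # Wrap the fragment so that it parses as a single XML document.
    root = ET.fromstring(f"<root>{fragment}</root>")
    previous_was_empty = False
    for p in list(root):  # iterate over a copy, since children may be removed
        is_empty = p.tag == "p" and len(p) == 0 and not (p.text or "").strip()
        if is_empty and previous_was_empty:
            root.remove(p)   # drop the repeated empty element
        elif is_empty:
            p.text = None    # normalise "<p>\n</p>" to "<p></p>"
        previous_was_empty = is_empty
    # Serialise the children only; tostring() includes each element's tail text.
    return (root.text or "") + "".join(
        ET.tostring(child, encoding="unicode", short_empty_elements=False)
        for child in root
    )


print(dedupe_empty_paragraphs("<p>A</p>\n<p>\n</p>\n<p> </p>\n<p>B</p>"))
```

Unlike the regex approaches, this fails loudly with a parse error on malformed input instead of silently producing odd output.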

brianegge
he is coming! bobince is coming! oh God no! xD
Arnis L.
+2  A: 
 s/<p>\s*<\/p>(\s*<p>\s*<\/p>)*/<p><\/p>/g;

This one works for me (meaning I tested it with your tag soup). It is Perl/sed syntax: in s///g, the 's' means substitute and the 'g' means global.

Pierre Guilbert