So this is purely a question of curiosity...

Say I have a set of tags:

<tag>
  <sub>A</sub>
  <sub>B</sub>
  <sub>C</sub>
</tag>
<tag>
  <sub>1</sub>
  <sub>2</sub>
  <sub>3</sub>
</tag>

Is it possible, in a single Regex.Replace call, to aggregate the contents of all <sub> tags within a <tag> into one <sub>?

Like so:

<tag><sub>ABC</sub></tag>
<tag><sub>123</sub></tag>

My guess is no, but I figured I'd give it a shot.

+4  A: 

Is the theoretical tag set always this clean? If so, replacing </sub>\s+<sub> with nothing would do it.

jball
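For illustration, here is the substitution from the answer above as a small Python sketch (Python's `re.sub` standing in for .NET's `Regex.Replace`; the variable names are made up for the example). It assumes the input is exactly as clean as in the question, with nothing but whitespace between one `</sub>` and the next `<sub>`.

```python
import re

# Sample input copied from the question
xml = """<tag>
  <sub>A</sub>
  <sub>B</sub>
  <sub>C</sub>
</tag>
<tag>
  <sub>1</sub>
  <sub>2</sub>
  <sub>3</sub>
</tag>"""

# Delete every closing </sub> together with the whitespace and the
# opening <sub> that follow it, which fuses neighbouring elements
merged = re.sub(r"</sub>\s+<sub>", "", xml)
print(merged)
```

The whitespace between `<tag>` and its remaining `<sub>` is left alone, so the result keeps the original line breaks around each merged element. In .NET the equivalent call would be along the lines of `Regex.Replace(input, @"</sub>\s+<sub>", "")`.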
Better replace `\n` with `\s+`.
Bart Kiers
That's true, will do.
jball
Ah, I hadn't even thought of doing it that way, but let's assume the set isn't nearly that clean. Is it possible to aggregate groups/captures in such a way?
climbage
Using lookahead, sure, but it really depends on how far from a cleanly structured and well-matched set of tags you get. It's much easier to find failures in regexes made for these kinds of situations than it is to write them in the first place.
jball
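One guess at what a lookahead-based version might look like (this is an assumption, not a pattern anyone in the thread posted): a negative lookahead that lets the match skip over junk between `</sub>` and the next `<sub>`, but never across a `<tag>` or `</tag>` boundary. The sample input and names are hypothetical, and the sketch is in Python rather than .NET.

```python
import re

# Hypothetical messy input: stray text between some of the <sub> elements
messy = "<tag><sub>A</sub> junk <sub>B</sub>\n<sub>C</sub></tag><tag><sub>1</sub></tag>"

# (?:(?!</?tag\b).)*? consumes characters lazily but refuses to step onto
# a <tag> or </tag>, so a single match never spans two <tag> blocks
boundary = re.compile(r"</sub>(?:(?!</?tag\b).)*?<sub>", re.DOTALL)

joined = boundary.sub("", messy)
print(joined)
```

Note that anything sitting between two `<sub>` elements is silently thrown away, and nested or malformed tags will still trip it up, which is the kind of failure the comment above warns about.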
If the input isn't nearly that clean, you want an XML parser, not regex.
bobince
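A minimal sketch of the parser route, using Python's standard `xml.etree.ElementTree` purely as an example (the thread doesn't name a specific parser). It assumes the fragment is well-formed XML; the `<root>` wrapper is added only because the sample has more than one top-level element.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question
fragment = """<tag>
  <sub>A</sub>
  <sub>B</sub>
  <sub>C</sub>
</tag>
<tag>
  <sub>1</sub>
  <sub>2</sub>
  <sub>3</sub>
</tag>"""

# Wrap in a synthetic root so the parser sees a single document element
root = ET.fromstring(f"<root>{fragment}</root>")

for tag in list(root.iter("tag")):
    subs = tag.findall("sub")
    if not subs:
        continue
    # Put the concatenated text into the first <sub>, then drop the rest
    subs[0].text = "".join(s.text or "" for s in subs)
    for extra in subs[1:]:
        tag.remove(extra)

merged_texts = [tag.findtext("sub") for tag in root.iter("tag")]
print(merged_texts)
for tag in root.iter("tag"):
    print(ET.tostring(tag, encoding="unicode"))
```

Because the parser works on elements rather than characters, extra whitespace, attributes, or comments between the `<sub>` elements don't need special handling here.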
bobince speaks truth.
jball