First, I'm not a regex expert, so I'm pretty sure I'm doing something wrong.
Here is my regular expression:
<(list)(\b[^>]*)>(<\1\b[^>]*>.*?<\/\1>|.)*?<\/\1>
This is the input string:
...
<list title="Lorem ipsum dolor sit amet, consectetur adipiscing elit...">
<li>
<list title="Lorem adipiscing...">
<li>Lorem ipsum dolor sit amet, consectetur adipiscing elit</li>
<li>Lorem ipsum dolor sit amet, consectetur adipiscing elit</li>
</list>
</li>
<li>
<list title="Lorem ipsum...">
<li>Lorem ipsum dolor sit amet, consectetur adipiscing elit</li>
</list>
</li>
<li>Lorem ipsum dolor sit amet, consectetur adipiscing elit
</li>
<li>Lorem ipsum dolor sit amet, consectetur adipiscing elit
</li>
</list>
...
I want to match the external <list> and capture all of its content, including the internal <list> elements, but when I try to read group \3 it is empty, although groups \1 and \2 are fine.
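In case it helps, here is a minimal sketch that reproduces the behaviour in Python. The flavor I am actually using may differ, so treat the details as assumptions: the re.DOTALL flag (so that . also matches line breaks), the shortened SAMPLE input, and the WRAPPED variant are made up for illustration. In Python's re module a group that repeats keeps only its last repetition, which with this sample is the line break just before the closing tag, so group 3 looks empty.

```python
import re

# Pattern from the question, unchanged. re.DOTALL is an assumption so that
# "." also matches the line breaks in the input.
PATTERN = re.compile(
    r'<(list)(\b[^>]*)>(<\1\b[^>]*>.*?<\/\1>|.)*?<\/\1>',
    re.DOTALL,
)

# Hypothetical variant: the repetition is moved into a non-capturing group
# and wrapped in one outer capturing group, so group 3 spans the whole body.
WRAPPED = re.compile(
    r'<(list)(\b[^>]*)>((?:<\1\b[^>]*>.*?<\/\1>|.)*?)<\/\1>',
    re.DOTALL,
)

# Shortened stand-in for the input in the question (not the real data).
SAMPLE = """<list title="Outer">
<li>
<list title="Inner">
<li>Lorem ipsum</li>
</list>
</li>
<li>Lorem ipsum
</li>
</list>"""

match = PATTERN.search(SAMPLE)
wrapped_match = WRAPPED.search(SAMPLE)

# Groups 1 and 2 hold the tag name and the attributes, as expected.
print(repr(match.group(1)))
print(repr(match.group(2)))

# Group 3 was repeated many times; only the last repetition is kept,
# which here is the single line break before the final closing tag.
print(repr(match.group(3)))

# With the wrapped variant, group 3 holds everything between the outer tags.
print(repr(wrapped_match.group(3)))
```

With this sample, the original pattern still matches the whole outer element, and only the wrapped variant exposes the inner content through group 3. I have not checked whether other regex flavors behave the same way.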
Any ideas would be very much appreciated.