I don't think regex is the right tool for this job, but something like this will "work" some of the time.
public class SelfClosingTags {
    public static void main(String[] args) {
        String text =
            " <rect width='10px' height ='20px'/> \n" +
            " <rect width='20px' height ='22px'/> \n" +
            " <circle radius='20px' height ='22px'/> \n" +
            " <square/> <rectangle></rectangle> \n" +
            " <foo @!(*#&^#@/> <bar (!@*&(*@!#> </whatever>";
        // Group 1 is the tag name, group 2 is everything up to the "/>".
        System.out.println(
            text.replaceAll("<([a-z]+)([^>]*)/>", "<$1$2></$1>")
        );
    }
}
The above Java snippet prints (leading and trailing whitespace trimmed here):
<rect width='10px' height ='20px'></rect>
<rect width='20px' height ='22px'></rect>
<circle radius='20px' height ='22px'></circle>
<square></square> <rectangle></rectangle>
<foo @!(*#&^#@></foo> <bar (!@*&(*@!#> </whatever>
The regex is this, written as a slash-delimited literal, which is why the / is escaped (see also on rubular.com):
/<([a-z]+)([^>]*)\/>/
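Essentially, we capture what we hope is a tag name in group 1 and everything else up to the /> in group 2, then use both captured strings in the substitution. Because the tag name is matched with [a-z]+ and the rest with [^>]*, the pattern skips uppercase tag names and any tag whose attribute values contain a >.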
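To make the two groups concrete, here is a minimal sketch that prints what each group captures and then runs the same replacement on a few inputs where it falls short. The class name SelfClosingDemo and the helper expand are made up for this illustration; the pattern is the one from the answer, in Java string form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SelfClosingDemo {
    // Same pattern as in the answer; in a Java string the '/' needs no escaping.
    static final Pattern SELF_CLOSING = Pattern.compile("<([a-z]+)([^>]*)/>");

    // Hypothetical helper: rewrites <name .../> as <name ...></name>.
    static String expand(String input) {
        return SELF_CLOSING.matcher(input).replaceAll("<$1$2></$1>");
    }

    public static void main(String[] args) {
        Matcher m = SELF_CLOSING.matcher("<rect width='10px'/>");
        if (m.find()) {
            System.out.println("group 1: [" + m.group(1) + "]"); // tag name
            System.out.println("group 2: [" + m.group(2) + "]"); // text before "/>"
        }

        // Inputs the pattern handles badly or not at all:
        System.out.println(expand("<h1/>"));            // digit ends up in group 2
        System.out.println(expand("<Rect/>"));          // uppercase name, no match
        System.out.println(expand("<a title='x>y'/>")); // '>' in an attribute, no match
    }
}
```

The first failure case is the nastiest one: the regex still matches, but only the lowercase letters land in group 1, so the closing tag comes out with the wrong name. That kind of silent breakage is why a real XML parser is the safer choice whenever the input is not tightly controlled.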