Here is some Java code that may be helpful:
String text = "A | 1\n" +
              "A | 2\n" +
              "B | 1\n" +
              "B | 2\n" +
              "B | 3\n" +
              "A | x\n" +
              "D | y\n" +
              "D | z\n";
String[] sections = text.split("(?<=(.) . .)\n(?!\\1)");
StringBuilder sb = new StringBuilder();
for (String section : sections) {
    sb.append(section.substring(0, 1) + " {")
      .append(section.substring(3).replaceAll("\n.", ""))
      .append(" }\n");
}
System.out.println(sb.toString());
This prints:
A { 1 | 2 }
B { 1 | 2 | 3 }
A { x }
D { y | z }
The idea is to do this in two steps:
- First, split into sections
- Then transform each section
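To make the two steps easier to see in isolation, here is a minimal sketch that pulls each one into its own method and prints the intermediate sections. The class and method names (SectionSplitDemo, splitSections, transform) are mine, purely for illustration, and like the code above it assumes every key and every value is exactly one character long.

```java
class SectionSplitDemo {
    // Step 1: split at every newline whose following line starts with a
    // different key than the line just before it. The lookbehind captures
    // the key of the preceding "K | V" line; the lookahead rejects the
    // split if the next character is that same key.
    static String[] splitSections(String text) {
        return text.split("(?<=(.) . .)\n(?!\\1)");
    }

    // Step 2: keep the key once, then drop each "\nK" so that only
    // " V | V | ..." remains between the braces.
    static String transform(String section) {
        return section.substring(0, 1) + " {"
                + section.substring(3).replaceAll("\n.", "")
                + " }";
    }

    public static void main(String[] args) {
        String text = "A | 1\nA | 2\nB | 1\nA | x\n";
        for (String section : splitSections(text)) {
            // Show newlines as a visible \n so each section fits on one line.
            System.out.println(section.replace("\n", "\\n") + "  ->  " + transform(section));
        }
    }
}
```

Note that the trailing newline also counts as a split point, but String.split discards the resulting trailing empty string, so no empty section shows up.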
A single replaceAll variant
If the input already has { and } interspersed, so that they can be captured and rearranged in the output, this is possible with a single replaceAll (i.e. an entirely regex solution):
String text = "{ A | 1 }" +
              "{ A | 2 }" +
              "{ B | 1 }" +
              "{ B | 2 }" +
              "{ B | 3 }" +
              "{ C | 4 }" +
              "{ D | 5 }";
System.out.println(
    text.replaceAll("(?=\\{ (.))(?<!(?=\\1).{7})(\\{)( )(.) .|(?=\\}. (.))(?:(?<=(?=\\5).{6}).{5}|(?<=(.))(.))", "$4$3$2$7$6")
);
This prints (see output on ideone.org):
A { 1 | 2 } B { 1 | 2 | 3 } C { 4 } D { 5 }
Unfortunately no, I don't think this is worth explaining: it's way too complicated for what's being accomplished. Essentially, though, it relies on lots of assertions, nested assertions, and capture groups (some of which will be empty strings, depending on which assertion passes).
This is, without a doubt, the most complicated regex I've written.
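The one piece that is easy to show on its own is the "empty capture group" trick. This is a toy sketch, not a breakdown of the regex above: the pattern and the class name EmptyGroupDemo are made up for illustration. When a pattern is an alternation and each branch has its own group, only the groups in the branch that matched are set, and in Java's replaceAll a reference to an unset group contributes nothing to the replacement.

```java
class EmptyGroupDemo {
    // Two alternatives, each with its own capture group. For any single
    // match only one of $1 / $2 is set; the other expands to nothing.
    static String tag(String input) {
        return input.replaceAll("(\\d)|([a-z])", "<$1|$2>");
    }

    public static void main(String[] args) {
        // 'a' matches the second branch, '1' matches the first.
        System.out.println(tag("a1")); // prints <|a><1|>
    }
}
```

That is how one replacement string such as "$4$3$2$7$6" can produce different output depending on which branch of the pattern matched.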